Reinforcement Learning from Human Feedback (RLHF) for Human–AI Collaboration

Contract Type

Internship offer

Working Time

Full-time

Compensation

4.35€ / hour

Role

Intern

The goal of this internship is to explore Reinforcement Learning from Human Feedback (RLHF) in a cooperative setting. Instead of learning only from environment rewards, the agent’s policy will also be shaped by a human partner’s feedback during collaboration. Specifically, the human teammate will assign positive or negative rewards according to how helpful, efficient, or intuitive the agent’s actions feel during the joint task. The intern will investigate how such feedback influences learning stability, team performance, and the perceived fluency of the collaboration.