Reinforcement Learning from Human Feedback with Hugging Face

Name: Reinforcement Learning from Human Feedback with Hugging Face
Start: 2024-03-07T16:30:00-05:00
End: 2024-03-07T18:00:00-05:00
Location: Private Location (sign in to display)

Training/Workshop, Programming Languages, Research & Data Analysis

Thu, Mar 7, 2024

4:30 PM – 6 PM EST (GMT-5)

49 Registered

Registration

Registration is now closed (this event already took place).

Details

This workshop explores recent advances in training large language models (LLMs) with reinforcement learning from human feedback (RLHF). The speaker will discuss the pioneering work in RL and RLHF and how it led to the creation of ChatGPT. He will also cover the challenges and accomplishments of open-source RLHF, along with more recent techniques such as Direct Preference Optimization (DPO). The workshop concludes with a hands-on exercise in Google Colab, where you will gain practical experience using Hugging Face's open-source RLHF library to train your own models.

Speakers

Costa Huang

Machine learning engineer

Hugging Face

Costa Huang is a machine learning engineer at Hugging Face, specializing in reinforcement learning from human feedback (RLHF). He holds a Ph.D. from Drexel University, where his research focused on efficient and reproducible reinforcement learning. He is also the creator of CleanRL, a user-friendly RL library designed for researchers.

Hosted By

Research Computing
Co-hosted with: GradFUTURES