YoVDO

RLHF Courses

Direct Preference Optimization - Fine-Tuning LLMs Without Reinforcement Learning
Serrano.Academy via YouTube
Mastering ChatGPT (AI) and PowerPoint presentation
Udemy
Reinforcement Learning with TorchRL and TensorDict - NeurIPS Hacker Cup AI
Weights & Biases via YouTube
Reinforcement Learning from Human Feedback (RLHF) Explained
IBM via YouTube
PUZZLE: Efficiently Aligning Large Language Models through Light-Weight Context Switching
USENIX via YouTube
Scalable and Flexible Distributed Reinforcement Learning Systems
Finnish Center for Artificial Intelligence FCAI via YouTube
Direct Preference Optimization (DPO) - Advanced Fine-Tuning Technique
Trelis Research via YouTube
End-to-end Modern Machine Learning in Production - Part 2
MLOps.community via YouTube
RLHF Data Collection in Practice - Part 2
MLOps.community via YouTube
Less is More with PostgresML - ML and DB Seminar Series
CMU Database Group via YouTube