Combined Preference and Supervised Fine-Tuning with ORPO
Offered By: Trelis Research via YouTube
Course Description
Overview
Explore advanced fine-tuning techniques in this comprehensive video tutorial on Combined Preference and Supervised Fine-Tuning with ORPO. Learn about the evolution of fine-tuning methods, understand the differences between unsupervised, supervised, and preference-based approaches, and delve into cross-entropy and odds ratio loss functions. Discover why preference fine-tuning enhances performance through a hands-on notebook demonstration of SFT and ORPO. Evaluate the results using lm-evaluation-harness and compare SFT and ORPO performance across various benchmarks. Gain insights into the practical benefits of ORPO and access valuable resources for further exploration and implementation.
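For reference, the loss functions mentioned above can be written out as a short sketch, following the formulation in the ORPO paper (Hong et al., 2024, "ORPO: Monolithic Preference Optimization without Reference Model"); the exact notation used in the video may differ. Here L_SFT is the usual cross-entropy (negative log-likelihood) on the chosen response y_w, y_l is the rejected response, sigma is the sigmoid, and lambda weights the odds-ratio term.

% Odds of a completion under the current model
\[ \mathrm{odds}_\theta(y \mid x) = \frac{P_\theta(y \mid x)}{1 - P_\theta(y \mid x)} \]
% Odds-ratio loss on a (chosen, rejected) pair
\[ \mathcal{L}_{OR} = -\log \sigma\!\left( \log \frac{\mathrm{odds}_\theta(y_w \mid x)}{\mathrm{odds}_\theta(y_l \mid x)} \right) \]
% Combined ORPO objective: supervised cross-entropy plus a weighted odds-ratio term
\[ \mathcal{L}_{ORPO} = \mathbb{E}_{(x,\, y_w,\, y_l)}\!\left[ \mathcal{L}_{SFT} + \lambda \, \mathcal{L}_{OR} \right] \]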
        
Syllabus
 Preference and Supervised Fine-tuning at the Same Time!
 A short history of fine-tuning methods
 Video Overview/Agenda
 Difference between Unsupervised, Supervised and Preferences
 Understanding cross-entropy and odds ratio loss functions
 Why preference fine-tuning improves performance
 Notebook demo of SFT and ORPO (see the training sketch after this syllabus)
 Evaluation with lm-evaluation-harness (see the example evaluation call after this syllabus)
 Results: Comparing SFT and ORPO with gsm8k, arithmetic and mmlu
 Evaluation with Carlini's practical benchmark
 Is it worth doing ORPO? Yes!
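To make the notebook demo concrete, below is a minimal training sketch. It assumes the Hugging Face TRL library's ORPOTrainer and ORPOConfig, a preference dataset with prompt/chosen/rejected columns, and a small base model; the specific model and dataset names are illustrative assumptions, not necessarily the ones used in the video.

# Minimal ORPO training sketch (assumptions: TRL's ORPOTrainer/ORPOConfig,
# a preference dataset with chosen/rejected pairs; names are placeholders).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_model = "HuggingFaceTB/SmolLM2-135M"      # placeholder base model
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Placeholder preference dataset with chosen/rejected responses.
train_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

config = ORPOConfig(
    output_dir="orpo-demo",
    beta=0.1,                        # weight on the odds-ratio term (lambda in the paper)
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,      # older TRL releases use tokenizer= instead
)
trainer.train()
trainer.save_model("orpo-demo")

A practical appeal of ORPO is that the preference term is added directly to the supervised cross-entropy loss, so training needs no separate reference model or reward model.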
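For the evaluation step, here is a hedged sketch of scoring a fine-tuned checkpoint with EleutherAI's lm-evaluation-harness Python API (installed via pip install lm-eval). The checkpoint path and batch size are placeholder assumptions; the task names mirror the benchmarks listed in the syllabus.

# Sketch: evaluating a fine-tuned checkpoint with lm-evaluation-harness.
# The local path "./orpo-demo" and the batch size are placeholder assumptions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                              # Hugging Face causal LM backend
    model_args="pretrained=./orpo-demo",     # path or hub id of the checkpoint
    tasks=["gsm8k", "arithmetic", "mmlu"],   # benchmarks compared in the video
    batch_size=8,
)
print(results["results"])                    # per-task metrics dictionary

The same tasks can also be run from the command line with the lm_eval entry point that ships with the harness.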
Taught by
Trelis Research
Related Courses
TensorFlow: Working with NLP (LinkedIn Learning)
Introduction to Video Editing - Video Editing Tutorials (Great Learning via YouTube)
HuggingFace Crash Course - Sentiment Analysis, Model Hub, Fine Tuning (Python Engineer via YouTube)
GPT3 and Finetuning the Core Objective Functions - A Deep Dive (David Shapiro ~ AI via YouTube)
How to Build a Q&A AI in Python - Open-Domain Question-Answering (James Briggs via YouTube)