Model Evaluation Courses
Snorkel AI via YouTube The Challenges and Opportunities of Continual Learning in Real-Time Machine Learning
Snorkel AI via YouTube Why AI Needs New Data Benchmarks and Quality Metrics
Snorkel AI via YouTube Direct Preference Optimization (DPO): How It Works and How It Topped an LLM Eval Leaderboard
Snorkel AI via YouTube Student Lightning Talks: Text Generation, Reward Consistency, and Evaluation Validity
Center for Language & Speech Processing (CLSP), JHU via YouTube Hierarchical Prompting Taxonomy - A Universal Evaluation Framework for Large Language Models
Unify via YouTube Validate and Monitor Your AI and Machine Learning Models
Toronto Machine Learning Series (TMLS) via YouTube Beyond Accuracy: Behavioral Testing of NLP Models with CheckList
Toronto Machine Learning Series (TMLS) via YouTube DynaTask: A New Open Source Approach for AI Benchmarking
Toronto Machine Learning Series (TMLS) via YouTube Interpretability Tools as Feedback Loops in Machine Learning Training
Toronto Machine Learning Series (TMLS) via YouTube