
LLM Evaluation: Challenges and Best Practices - MLOps Podcast #210

Offered By: MLOps.community via YouTube

Tags

MLOps Courses Fine-Tuning Courses

Course Description

Overview

Explore the intricacies of Large Language Model (LLM) evaluation in this 56-minute podcast featuring Aparna Dhinakaran, Co-Founder and Chief Product Officer of Arize AI. Delve into the complexities of LLM assessment, the significance of the Phoenix evaluations library, and the importance of tailored evaluations in software applications. Examine the nuances of AI fine-tuning, debate the merits of open-source versus private models, and understand the urgency of deploying models into production for early bottleneck identification. Learn about the relevance of retrieved information, output legitimacy, and the operational advantages of Phoenix in supporting LLM evaluations. Gain insights from Dhinakaran's extensive experience in ML infrastructure and AI observability as she discusses real-world challenges and solutions in LLM implementation and evaluation.

Syllabus

AI in Production Conference
Aparna's preferred coffee
Takeaways
Shout out to the Arize team for being a sponsor of the MLOps Community since 2020!
Please like, share, and subscribe to our MLOps channels!
Evaluation space
Chatbots Prevent Misinformation
Evaluating AI response based on factual retrieval
Balancing eval response and impact on speed
Context length, placement, and information recall study
GPT-4 excels, prompt iterations affect outcomes
Multiple sub-steps and requiring visibility on Application calls
Evaluate calls, breakdown, score, and application evaluation
Rata classification for effective evaluation Research
Benchmarks on Hugging Face and Twitter reliability
Power of observability and retrieval embeddings
Tweaking data points
Hot take
Bottlenecks and errors from rapid production


Taught by

MLOps.community

Related Courses

TensorFlow: Working with NLP
LinkedIn Learning
Introduction to Video Editing - Video Editing Tutorials
Great Learning via YouTube
HuggingFace Crash Course - Sentiment Analysis, Model Hub, Fine Tuning
Python Engineer via YouTube
GPT3 and Finetuning the Core Objective Functions - A Deep Dive
David Shapiro ~ AI via YouTube
How to Build a Q&A AI in Python - Open-Domain Question-Answering
James Briggs via YouTube