Lessons Learned from Scaling Large Language Models in Production

Offered By: MLOps World: Machine Learning in Production via YouTube

Tags

MLOps Courses, Vector Databases Courses, Scaling Courses, Inference Courses

Course Description

Overview

This 40-minute conference talk from MLOps World: Machine Learning in Production covers the challenges of scaling large language models (LLMs) in production and ways to address them. Matt Squire, CTO of Fuzzy Labs, shares lessons learned from building and running LLMs at scale for customers. The talk addresses three problems: high-traffic demands, slow LLM inference, and expensive GPU resources. Using real code examples, it covers performance profiling techniques, optimizing GPU utilization, and implementing guardrails. It also looks at what changes when scaling beyond basic RAG applications with open-source models such as Mistral.

Syllabus

Lessons learned from scaling large language models in production

Taught by

MLOps World: Machine Learning in Production

Related Courses

Vector Similarity Search (Data Science Dojo via YouTube)
Supercharging Semantic Search with Pinecone and Cohere (Pinecone via YouTube)
Search Like You Mean It - Semantic Search with NLP and a Vector Database (Pinecone via YouTube)
The Rise of Vector Data (Pinecone via YouTube)
NER Powered Semantic Search in Python (James Briggs via YouTube)