Exploring the Latency, Throughput, and Cost Space for LLM Inference

Offered By: MLOps.community via YouTube

Tags

MLOps, Cost Optimization, Mistral AI

Course Description

Overview

Explore the intricacies of LLM inference stacks in this 30-minute conference talk by Timothée Lacroix, CTO of Mistral AI. Delve into the process of selecting the optimal model for a given task, choosing appropriate hardware, and implementing efficient inference code. Examine popular inference stacks and setups, uncovering the factors that drive inference costs. Gain insights into leveraging current open-source models effectively, and learn about the limitations of existing open-source serving stacks. Discover the potential advancements that future generations of models may bring to LLM inference.
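To make the latency/throughput/cost tradeoff the talk explores concrete, here is a minimal back-of-envelope sketch (not taken from the talk itself): cost per generated token is roughly the hardware's hourly price divided by sustained throughput, and batching typically raises throughput at the expense of per-request latency. All numbers below are illustrative assumptions, not measurements of any particular model or GPU.

def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_second: float) -> float:
    """Dollar cost to generate one million tokens at a sustained throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000

# Hypothetical setup: a $2.50/hour GPU. Larger batches raise aggregate
# throughput (cheaper tokens) but add per-request latency.
for batch_size, throughput in [(1, 40), (8, 250), (32, 700)]:
    usd = cost_per_million_tokens(gpu_hourly_usd=2.50, tokens_per_second=throughput)
    print(f"batch={batch_size:>2}  ~{throughput} tok/s  ->  ${usd:.2f} per 1M tokens")

Under these assumed figures, moving from unbatched to heavily batched serving cuts the cost per token by more than an order of magnitude, which is the kind of tradeoff space the talk maps out.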

Syllabus

Exploring the Latency/Throughput & Cost Space for LLM Inference // Timothée Lacroix // CTO Mistral


Taught by

MLOps.community

Related Courses

Comprehensive Guide to Large Language Models in 2024 - Usage and Selection (MattVidPro AI via YouTube)
Mistral Large with Function Calling - Review and Implementation (Sam Witteveen via YouTube)
Intro to Mistral AI (Scrimba)
Intro to Mistral AI (Scrimba via Coursera)
Getting Started with Mistral (DeepLearning.AI via Coursera)