Efficient Serving of LLMs for Experimentation and Production with Fireworks.ai
Offered By: MLOps.community via YouTube
Course Description
Overview
This 12-minute talk by Dmytro Dzhulgakov, co-founder and CTO of Fireworks.ai, covers the challenges of deploying Large Language Models (LLMs) in production environments and ways to address them. Learn how the Fireworks.ai GenAI Platform helps developers move from early experimentation to high-load production deployments while keeping cost and latency under control. Topics include handling multiple model variants, scaling up usage, and reducing cost-to-serve and latency. See how Fireworks.ai's high-performance, low-cost LLM inference service can help you experiment with and productionize large models. Dzhulgakov draws on his expertise as a PyTorch core maintainer and his experience in transitioning PyTorch from a research framework to numerous production applications across Meta's AI use cases and the broader industry.
Syllabus
Efficient Serving of LLMs for Experimentation and Production with Fireworks.ai // Dmytro Dzhulgakov
Taught by
MLOps.community
Related Courses
Machine Learning Operations (MLOps): Getting Started (Google Cloud via Coursera)
Design and Implementation of Machine Learning Systems [Проектирование и реализация систем машинного обучения] (Higher School of Economics via Coursera)
Demystifying Machine Learning Operations (MLOps) (Pluralsight)
Machine Learning Engineer with Microsoft Azure (Microsoft via Udacity)
Machine Learning Engineering for Production (MLOps) (DeepLearning.AI via Coursera)