YoVDO

Efficient Serving of LLMs for Experimentation and Production with Fireworks.ai

Offered By: MLOps.community via YouTube

Tags

Model Deployment Courses, PyTorch Courses, MLOps Courses, Cost Optimization Courses

Course Description

Overview

Explore the challenges and solutions of deploying Large Language Models (LLMs) in production environments in this 12-minute talk by Dmytro Dzhulgakov, co-founder and CTO of Fireworks.ai. Learn how the Fireworks.ai GenAI Platform helps developers navigate the journey from early experimentation to high-load production deployments while managing cost and latency. Gain insights into handling multiple model variants, scaling up usage, and optimizing cost-to-serve and latency. Discover how Fireworks.ai's high-performance, low-cost LLM inference service can help you experiment with and productionize large models effectively. The talk draws on Dzhulgakov's experience as a PyTorch core maintainer, where he helped transition PyTorch from a research framework into production across Meta's AI use cases and the broader industry.

Syllabus

Efficient Serving of LLMs for Experimentation and Production with Fireworks.ai // Dmytro Dzhulgakov


Taught by

MLOps.community

Related Courses

Developing a Tabular Data Model
Microsoft via edX
Data Science in Action - Building a Predictive Churn Model
SAP Learning
Serverless Machine Learning with Tensorflow on Google Cloud Platform (Japanese version)
Google Cloud via Coursera
Intro to TensorFlow (Brazilian Portuguese version)
Google Cloud via Coursera
Serverless Machine Learning with TensorFlow on GCP (Spanish version)
Google Cloud via Coursera