YoVDO

Efficient Serving of LLMs for Experimentation and Production with Fireworks.ai

Offered By: MLOps.community via YouTube

Tags

Model Deployment Courses, PyTorch Courses, MLOps Courses, Cost Optimization Courses

Course Description

Overview

Explore the challenges and solutions of deploying Large Language Models (LLMs) in production environments in this 12-minute talk by Dmytro Dzhulgakov, co-founder and CTO of Fireworks.ai. Learn how the Fireworks.ai GenAI Platform helps developers navigate the journey from early experimentation to high-load production deployments while managing cost and latency. Gain insights into handling multiple model variants, scaling up usage, and optimizing cost-to-serve and latency. Discover how Fireworks.ai's high-performance, low-cost LLM inference service can help you experiment with and productionize large models effectively. The talk draws on Dzhulgakov's experience as a PyTorch core maintainer, where he helped transition PyTorch from a research framework into production across Meta's AI use cases and the broader industry.

Syllabus

Efficient Serving of LLMs for Experimentation and Production with Fireworks.ai // Dmytro Dzhulgakov


Taught by

MLOps.community

Related Courses

Developing a Tabular Data Model
Microsoft via edX
Data Science in Action - Building a Predictive Churn Model
SAP Learning
Serverless Machine Learning with Tensorflow on Google Cloud Platform (Japanese version)
Google Cloud via Coursera
Intro to TensorFlow (Brazilian Portuguese version)
Google Cloud via Coursera
Serverless Machine Learning with TensorFlow on GCP (Spanish version)
Google Cloud via Coursera