YoVDO

Making LLM Inference Affordable - Part 2

Offered By: MLOps.community via YouTube

Tags

Machine Learning Courses
MLOps Courses
Quantization Courses

Course Description

Overview

Explore techniques for making large language model (LLM) inference more affordable and efficient in this 32-minute conference talk by Daniel Campos at the LLMs in Production Conference. Learn about the challenges of using foundation models and APIs, and discover alternatives like self-hosting models. Delve into methods for optimizing model performance within latency and inference budgets, including pseudo-labeling, knowledge distillation, pruning, and quantization. Gain insights from Campos' extensive experience in NLP, ranging from his work at Microsoft on Bing's ranking system to his current Ph.D. research on efficient LLM inference and robust dense retrieval at the University of Illinois Urbana-Champaign.

Syllabus

Making LLM Inference Affordable // Daniel Campos // LLMs in Production Conference Part 2


Taught by

MLOps.community

Related Courses

Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent