YoVDO

Unlocking LLM Performance with eBPF - Optimizing Training and Inference Pipelines

Offered By: CNCF [Cloud Native Computing Foundation] via YouTube

Tags

eBPF Courses PyTorch Courses Memory Profiling Courses

Course Description

Overview

Explore how to optimize Large Language Model (LLM) performance using eBPF in this 38-minute conference talk from the Cloud Native Computing Foundation (CNCF). Discover techniques for achieving non-disruptive observability of LLM training and inference, including memory profiling of model and training-data loading, network profiling of data exchange, and GPU profiling for analyzing Model FLOPs Utilization (MFU) and locating performance bottlenecks. Learn about the practical effects of applying eBPF-based observability to PyTorch LLM applications and the llm.c project to improve training and inference performance. Gain insight into the challenges of raising GPU utilization in LLM workloads that process vast amounts of data and consume significant computational resources.
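The talk's GPU-profiling angle centers on Model FLOPs Utilization (MFU). As an illustration only (not code from the talk), MFU is commonly estimated with the ~6N FLOPs-per-token approximation for transformer training (forward plus backward pass); the model size, throughput, and peak-FLOPS numbers below are hypothetical:

```python
def model_flops_utilization(n_params: float, tokens_per_sec: float,
                            peak_flops: float) -> float:
    """Estimate MFU using the common ~6N FLOPs-per-token approximation
    for transformer training (forward + backward pass)."""
    achieved_flops = 6.0 * n_params * tokens_per_sec
    return achieved_flops / peak_flops

# Hypothetical numbers: a 7B-parameter model training at 3,000 tokens/s
# on a GPU with a 312 TFLOPS peak (e.g. an A100's BF16 dense peak).
mfu = model_flops_utilization(7e9, 3000, 312e12)
print(f"MFU = {mfu:.1%}")  # roughly 40%
```

In practice, the achieved-throughput term is what eBPF-based profiling would feed into such an estimate; the formula itself is just arithmetic over published peak specs.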

Syllabus

Unlocking LLM Performance with eBPF: Optimizing Training and Inference Pipelines - Yang Xiang


Taught by

CNCF [Cloud Native Computing Foundation]

Related Courses

Logging, Monitoring and Observability in Google Cloud en Français
Google Cloud via Coursera
Logging, Monitoring and Observability in Google Cloud
Pluralsight
Memory Profiler - The Tool for Troubleshooting Memory-Related Issues
Unity via YouTube
Logging, Monitoring and Observability in Google Cloud
Google Cloud via edX
Profiling CPU and Memory on Linux, with Opensource Graphical Tools
Linux Foundation via YouTube