
Model Compression Courses

Quant-LLM: Accelerating Large Language Model Serving via FP6-Centric Algorithm-System Co-Design (USENIX via YouTube)
Implicit Deep Learning - Seminar Series (VinAI via YouTube)
Efficient AI: From Supercomputers to Smartphones (Scalable Parallel Computing Lab, SPCL @ ETH Zurich via YouTube)
How to Quantize a Large Language Model with GGUF or AWQ (Trelis Research via YouTube)
Unlock Faster and More Efficient LLMs with SparseGPT (Neural Magic via YouTube)
Pruning and Quantizing ML Models With One Shot Without Retraining (Neural Magic via YouTube)
Applying Second-Order Pruning Algorithms for SOTA Model Compression (Neural Magic via YouTube)
Sparse Training of Neural Networks Using AC/DC (Neural Magic via YouTube)
How Well Do Sparse Models Transfer? - Exploring Transfer Performance in Computer Vision and NLP (Neural Magic via YouTube)
Leveraging Pruning and Quantization for Efficient Real-Time Audio Applications (ADC - Audio Developer Conference via YouTube)