Model Compression Courses
MIT HAN Lab via YouTube
Quantization in Neural Networks - Lecture 5
MIT HAN Lab via YouTube
Neural Architecture Search for Efficient Deep Learning - Lecture 9
MIT HAN Lab via YouTube
Neural Architecture Search (Part II) - Lecture 8
MIT HAN Lab via YouTube
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
MIT HAN Lab via YouTube
TinyML and Efficient Deep Learning Computing - Lecture 24: Course Summary
MIT HAN Lab via YouTube
TinyML and Efficient Deep Learning Computing - Course Summary
MIT HAN Lab via YouTube
Efficient Inference of Extremely Large Transformer Models
Toronto Machine Learning Series (TMLS) via YouTube
Composable Interventions for Language Models
USC Information Sciences Institute via YouTube
OpenAI Model Distillation and Evals Explained
echohive via YouTube