Efficient Transformers - Lecture 20

Offered By: MIT HAN Lab via YouTube

Tags

Transformers, Neural Networks, Quantization, Distillation, TinyML, Neural Architecture Search, Model Compression

Course Description

Overview

Explore efficient transformers in this lecture from MIT's TinyML and Efficient Deep Learning Computing course. Dive into techniques for optimizing transformer models to run on resource-constrained devices like mobile phones and IoT hardware. Learn about model compression, pruning, quantization, neural architecture search, and knowledge distillation approaches to reduce the computational and memory requirements of transformer architectures. Discover how to apply these methods to enable powerful natural language processing capabilities on edge devices. Gain practical insights for deploying transformer-based AI applications in mobile and embedded systems. Access accompanying slides and resources to reinforce key concepts covered in the 1 hour 18 minute video lecture led by Professor Song Han of the MIT HAN Lab.
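To make one of the compression techniques named above concrete, the sketch below (not taken from the lecture) applies PyTorch's post-training dynamic quantization to a toy transformer encoder: the feed-forward Linear layers' weights are converted to int8 without retraining or calibration data, shrinking the serialized model. The layer sizes and the helper name model_size_mb are arbitrary choices for illustration only.

    import io
    import torch
    import torch.nn as nn

    def model_size_mb(m: nn.Module) -> float:
        """Serialized state_dict size in megabytes (hypothetical helper)."""
        buf = io.BytesIO()
        torch.save(m.state_dict(), buf)
        return buf.getbuffer().nbytes / 1e6

    # Toy stand-in for a transformer model: 2 encoder layers, d_model = 256.
    model = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=256, nhead=4, dim_feedforward=1024),
        num_layers=2,
    ).eval()

    # Post-training dynamic quantization: nn.Linear weights become int8,
    # activations stay float, so no calibration pass is required.
    quantized = torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    x = torch.randn(32, 1, 256)  # (sequence, batch, embedding) -- default layout
    with torch.no_grad():
        y = quantized(x)

    print(y.shape)                                    # torch.Size([32, 1, 256])
    print(f"fp32 size: {model_size_mb(model):.2f} MB")
    print(f"int8 size: {model_size_mb(quantized):.2f} MB")

Dynamic quantization is only one of the approaches the lecture surveys; pruning, knowledge distillation, and neural architecture search trade off accuracy, latency, and memory in different ways and are often combined for edge deployment.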

Syllabus

Lecture 20 - Efficient Transformers | MIT 6.S965


Taught by

MIT HAN Lab

Related Courses

Machine Learning Modeling Pipelines in Production
DeepLearning.AI via Coursera
MLOps for Scaling TinyML
Harvard University via edX
Parameter Prediction for Unseen Deep Architectures - With First Author Boris Knyazev
Yannic Kilcher via YouTube
SpineNet - Learning Scale-Permuted Backbone for Recognition and Localization
Yannic Kilcher via YouTube
Synthetic Petri Dish - A Novel Surrogate Model for Rapid Architecture Search
Yannic Kilcher via YouTube