AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Offered By: MIT HAN Lab via YouTube

Course Description

Overview

Watch this 19-minute conference talk from MLSys 2024 presenting the Best Paper "AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration." Researchers from MIT HAN Lab describe Activation-aware Weight Quantization (AWQ), their approach to compressing and accelerating Large Language Models (LLMs), and cover the methodology, results, and implications of the work. Additional resources, including the project website, the full paper, and the code repository, are available for viewers who want to study AWQ further or implement it in their own projects.

Syllabus

MLSys'24 Best Paper - AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Taught by

MIT HAN Lab
