Speculative Decoding: Techniques for Faster LLM Inference
Offered By: Trelis Research via YouTube
Course Description
Overview
Explore the concept of speculative decoding in this 38-minute video from Trelis Research. Dive into various decoding techniques including naive speculative decoding, prompt-based n-gram speculation, lookahead decoding, and assisted decoding. Learn how these methods can significantly speed up inference in large language models. Follow along with performance testing and analysis of results to understand the practical implications of these techniques. Gain valuable tips for achieving faster inference in your own projects. Access additional resources, including free templates and paid guides, to further enhance your knowledge and implementation of advanced inference techniques.
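The core idea behind speculative decoding is a draft-and-verify loop: a cheap draft model proposes several tokens, and the large target model checks them in a single pass, keeping the longest verified prefix. The sketch below illustrates that loop with hypothetical toy models (deterministic next-token rules, not the video's actual implementation), so the accept/reject mechanics are easy to follow.

```python
def target_next(context):
    # Toy stand-in for the large "target" model: deterministic next-token rule.
    return (context[-1] + 1) % 10

def draft_next(context):
    # Toy stand-in for the small "draft" model: agrees with the target
    # except when the last token is 4, where it guesses wrong.
    return 9 if context[-1] == 4 else target_next(context)

def speculative_decode(context, num_tokens, k=4):
    """Generate num_tokens tokens, drafting k at a time and verifying."""
    out = list(context)
    while len(out) - len(context) < num_tokens:
        # 1. Draft k tokens cheaply with the small model.
        draft, tmp = [], list(out)
        for _ in range(k):
            tok = draft_next(tmp)
            draft.append(tok)
            tmp.append(tok)
        # 2. Verify with the target model: keep the longest agreeing prefix,
        #    then substitute the target's own token at the first mismatch.
        tmp = list(out)
        for tok in draft:
            correct = target_next(tmp)
            tmp.append(correct)
            if tok != correct:
                break
        out = tmp
    return out[:len(context) + num_tokens]
```

When the draft model agrees with the target, each verification pass accepts k tokens at once; when it disagrees, progress falls back to one token per pass, which is why speedups depend on the draft model's acceptance rate.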
Syllabus
Faster inference with Speculative Decoding
Video Overview
How speculative decoding works
Naive speculative decoding
Prompt based n-gram speculation
Lookahead decoding
Assisted decoding
Summary of Decoding Techniques
Performance Testing
Summary of Results
Tips for faster inference
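Among the techniques listed above, prompt-based n-gram speculation needs no second model at all: the draft tokens are looked up from the prompt itself, on the assumption that generated text often repeats earlier spans. A minimal sketch of that lookup (hypothetical helper, not the video's code) could look like this:

```python
def ngram_propose(tokens, n=2, k=4):
    """Propose up to k draft tokens by matching the last n tokens
    against an earlier occurrence in the sequence (prompt lookup)."""
    if len(tokens) < n:
        return []
    key = tokens[-n:]
    # Scan right-to-left for the most recent earlier occurrence of the n-gram.
    for i in range(len(tokens) - n - 1, -1, -1):
        if tokens[i:i + n] == key:
            # Propose the tokens that followed that occurrence.
            return tokens[i + n:i + n + k]
    return []
```

The proposed tokens are then verified by the target model exactly as in ordinary speculative decoding; this works especially well for tasks like summarization or code editing, where the output copies long stretches of the input.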
Taught by
Trelis Research
Related Courses
Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent