YoVDO

Running Llama 2 with Extended Context Length - Up to 32k Tokens

Offered By: Trelis Research via YouTube

Tags

Quantization Courses
RunPod Courses

Course Description

Overview

Learn how to scale Llama 2 to a 32k-token context length in this 22-minute tutorial video. Discover techniques for reaching up to 16k tokens on a 40 GB GPU in Google Colab and 32k tokens on an 80 GB A100 rented through platforms like RunPod, AWS, or Azure. Explore the use of Flash Attention, BetterTransformer, and GPTQ quantization to reduce memory use and speed up inference. Gain insights on running GPTQ models in Colab, streaming Llama 2 13B at various context lengths, and adjusting parameters like max token output and temperature. Access a free Jupyter notebook for implementation, or consider the PRO version for extra features like conversation saving and document analysis. Delve into the theory behind extending context length, compare different models, and pick up practical tips for working with long context lengths in language models.
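To give a feel for why long contexts call for 40 GB and 80 GB GPUs, here is a rough back-of-the-envelope sketch of the key-value cache size alone. It is not taken from the video or its notebook. It assumes the publicly documented Llama 2 shapes (7B: 32 layers, hidden size 4096; 13B: 40 layers, hidden size 5120), a cache stored in 16-bit precision, and a single sequence; the helper name kv_cache_bytes is made up for this illustration. Model weights and activations come on top of these figures.

```python
def kv_cache_bytes(n_layers, hidden_size, n_tokens, bytes_per_value=2):
    """Estimate KV cache size for one sequence (hypothetical helper).

    Each layer stores one key and one value vector of width hidden_size
    per token, hence the leading factor of 2. bytes_per_value=2 assumes
    a 16-bit cache.
    """
    return 2 * n_layers * hidden_size * n_tokens * bytes_per_value


GIB = 1024 ** 3

# Assumed model shapes (Llama 2 7B and 13B, which use standard multi-head attention).
MODELS = [("Llama 2 7B", 32, 4096), ("Llama 2 13B", 40, 5120)]

for name, layers, hidden in MODELS:
    for tokens in (16 * 1024, 32 * 1024):
        gib = kv_cache_bytes(layers, hidden, tokens) / GIB
        print(f"{name} @ {tokens} tokens: {gib:.1f} GiB KV cache (16-bit)")
```

Under these assumptions the cache grows linearly with context length, which is one reason quantizing the weights with GPTQ leaves more room for long prompts.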

Syllabus

How to run Llama 2 with longer context length
Run Llama 2 with 16k context in Google Colab
How to run a GPTQ model in Colab
Run Llama 2 7B with 32k context length using RunPod
Run Llama 2 13B for better performance! 16k context length
Streaming Llama 2 13B on 16k context length
Adjusting max token output and temperature
Streaming Llama 2 13B on 16k context length and 0 temperature
Streaming Llama 2 13B on 32k context length!
PRO NOTEBOOK - Save Chats and Files. Easily adjust context length.
THEORY BONUS: How to get longer context length?
How does GPTQ work?
How does Flash attention work?
What is the best model for long context length?
Which is better: Llama 2, Code Llama, or YaRN?
Tips for long context lengths


Taught by

Trelis Research

Related Courses

Epic Web UI DreamBooth Update - New Best Settings - Stable Diffusion Training Compared on RunPods
Software Engineering Courses - SE Courses via YouTube
How to Train Stable Diffusion on Your Photos on a Remote GPU - Using RunPod and Dreambooth
AI Tutorials with Kris Kashtanova via YouTube
Train Stable Diffusion on Your Own Photos - Updated Tutorial
AI Tutorials with Kris Kashtanova via YouTube
ComfyUI Master Tutorial - Stable Diffusion XL - Install on PC, Google Colab and RunPod
Software Engineering Courses - SE Courses via YouTube
Stable Diffusion- Training SDXL 1.0 - Finetune, LoRA, D-Adaptation, Prodigy
kasukanra via YouTube