YoVDO

Fine-tuning Llama 3 on Wikipedia Datasets for Low-Resource Languages

Offered By: Trelis Research via YouTube

Tags

Fine-Tuning, LoRA (Low-Rank Adaptation), Language Models, Wikipedia, Low-Resource Languages, Catastrophic Forgetting

Course Description

Overview

This 44-minute tutorial walks through fine-tuning Llama 3 for a low-resource language using a Wikipedia dataset. It covers creating a HuggingFace dataset with WikiExtractor, setting up Llama 3 fine-tuning with LoRA, and blending datasets to prevent catastrophic forgetting, followed by trainer setup, parameter selection, and inspection of the losses. It closes with guidance on learning rates and annealing, plus further tips for improving fine-tuning results. Slides, dataset links, and code repositories are provided.
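
As a rough illustration of the dataset blending idea mentioned above, the sketch below mixes target-language rows with a small share of general "replay" rows, so the model keeps seeing data like its original training mix. This is not the tutorial's code: the function name `blend_datasets`, the `replay_fraction` parameter, the 10% ratio, and the toy rows are all assumptions made for illustration, and the video may use different tooling and proportions.

```python
import random


def blend_datasets(target_rows, replay_rows, replay_fraction=0.1, seed=0):
    """Mix target-language rows with a sample of general 'replay' rows.

    replay_fraction is the share of the final blend drawn from replay_rows.
    (Hypothetical helper; the ratio is an illustrative assumption.)
    """
    if not 0.0 <= replay_fraction < 1.0:
        raise ValueError("replay_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Choose enough replay rows that they form replay_fraction of the blend.
    n_replay = round(len(target_rows) * replay_fraction / (1.0 - replay_fraction))
    n_replay = min(n_replay, len(replay_rows))
    blended = list(target_rows) + rng.sample(list(replay_rows), n_replay)
    # Shuffle so replay rows are spread across training, not grouped at the end.
    rng.shuffle(blended)
    return blended


# Toy stand-ins for Wikipedia rows and general-purpose rows.
wiki_rows = [{"text": f"target article {i}", "source": "wiki"} for i in range(90)]
english_rows = [{"text": f"english sample {i}", "source": "replay"} for i in range(50)]
blend = blend_datasets(wiki_rows, english_rows, replay_fraction=0.1)
```

The fixed seed keeps the blend reproducible between runs, which makes loss curves easier to compare when only the blend ratio changes.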

Syllabus

Fine-tuning Llama 3 for a low-resource language
Overview of Wikipedia Dataset and Loss Curves
Video overview
HuggingFace Dataset creation with WikiExtractor
Llama 3 fine-tuning setup, incl. LoRA
Dataset blending to avoid catastrophic forgetting
Trainer setup and parameter selection
Inspection of losses and results
Learning Rates and Annealing
Further tips and improvements
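
For the learning rates and annealing topic, here is a minimal sketch of one common schedule: linear warmup followed by cosine annealing. The listing does not say which schedule the video uses, so the cosine shape, the function name `lr_at_step`, and the default values are assumptions chosen only to show how an annealed learning rate changes over training.

```python
import math


def lr_at_step(step, total_steps, peak_lr=1e-4, warmup_steps=10, min_lr=0.0):
    """Learning rate at a given step: linear warmup, then cosine annealing.

    (Hypothetical helper; defaults are illustrative, not recommendations.)
    """
    if step < warmup_steps:
        # Ramp linearly up to peak_lr over the warmup steps.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the post-warmup phase completed, clamped to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    # Cosine curve from peak_lr (progress 0) down to min_lr (progress 1).
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


schedule = [lr_at_step(s, total_steps=100) for s in range(101)]
```

Plotting `schedule` next to the training loss is a simple way to see whether the loss keeps improving as the rate decays toward `min_lr`.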


Taught by

Trelis Research

Related Courses

Amazon SageMaker JumpStart Foundations (Japanese)
Amazon Web Services via AWS Skill Builder
AWS Flash - Generative AI with Diffusion Models
Amazon Web Services via AWS Skill Builder
AWS Flash - Operationalize Generative AI Applications (FMOps/LLMOps)
Amazon Web Services via AWS Skill Builder
AWS SimuLearn: Automate Fine-Tuning of an LLM
Amazon Web Services via AWS Skill Builder
AWS SimuLearn: Fine-Tune a Base Model with RLHF
Amazon Web Services via AWS Skill Builder