SliceGPT Explained - Compressing Large Language Models
Offered By: Unify via YouTube
Course Description
Overview
Explore a 55-minute session featuring Saleh Ashkboos, a PhD student at ETH Zurich, as he delves into SliceGPT, a novel approach for compressing large language models. Learn how this technique can remove up to 25% of model parameters while maintaining high zero-shot task performance for models such as LLAMA2-70B, OPT 66B, and Phi-2. Discover the intricacies of the SliceGPT method, which deletes entire rows and columns of weight matrices to achieve significant compression without substantial performance loss. Gain insights into the speaker's broader work on accelerating deep neural network training and building systems for large-scale graph processing. Access additional resources, including the original research paper, AI research newsletters, and blogs on AI deployment. Connect with the Unify community through various platforms to stay updated on AI optimization, LLM compression, and related topics.
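The core slicing idea described above can be illustrated with a toy dense layer. This is a minimal numpy sketch of deleting rows and columns of a weight matrix, not the actual SliceGPT implementation, which first applies PCA-derived orthogonal transforms to each layer so that the least important directions end up in the trailing rows and columns:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dense layer: y = x @ W, with W of shape (d_in, d_out).
d_in, d_out = 8, 8
W = rng.standard_normal((d_in, d_out))
x = rng.standard_normal((4, d_in))

# "Slicing": keep only the leading dimensions after a (here omitted)
# orthogonal change of basis, deleting the trailing rows and columns.
keep = 6                       # slice away 25% of the hidden dimension
W_sliced = W[:keep, :keep]     # delete trailing rows and columns of W
x_sliced = x[:, :keep]         # inputs restricted to the kept subspace

y = x_sliced @ W_sliced
print(W.size, W_sliced.size)   # 64 36 -- the sliced matrix is smaller
print(y.shape)                 # (4, 6) -- activations shrink too
```

Because both the weights and the activations shrink, slicing reduces memory and compute at inference time without needing sparse kernels, which is the practical appeal of the approach discussed in the session.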
Syllabus
SliceGPT Explained
Taught by
Unify
Related Courses
LLaMA2 for Multilingual Fine Tuning - Sam Witteveen via YouTube
Set Up a Llama2 Endpoint for Your LLM App in OctoAI - Docker via YouTube
AI Engineer Skills for Beginners: Code Generation Techniques - All About AI via YouTube
Training and Evaluating LLaMA2 Models with Argo Workflows and Hera - CNCF [Cloud Native Computing Foundation] via YouTube
LangChain Crash Course - 6 End-to-End LLM Projects with OpenAI, LLAMA2, and Gemini Pro - Krish Naik via YouTube