YoVDO

Enable Generative AI Everywhere with Ubiquitous Hardware and Open Software

Offered By: Linux Foundation via YouTube

Tags

Generative AI Courses Performance Improvement Courses

Course Description

Overview

Save Big on Coursera Plus. 7,000+ courses at $160 off. Limited Time Only!
Explore optimization techniques for Generative AI and Large Language Models (LLMs) in this informative conference talk. Learn about strategies to reduce inference latency and improve performance, including low precision inference, Flash Attention, Efficient Attention in scaled dot product attention (SDPA), optimized KV cache access, and Kernel Fusion. Discover how these optimizations, implemented within PyTorch and Intel Extension for PyTorch, can significantly enhance model efficiency on CPU servers with 4th generation Intel Xeon Scalable Processors. Gain insights into scaling up and out model inference on multiple devices using Tensor Parallel techniques, enabling the deployment of generative AI across various hardware configurations.

Syllabus

Enable Generative AI Everywhere with Ubiquitous Hardware and Open Software - Guobing Chen, Intel


Taught by

Linux Foundation

Tags

Related Courses

Building and Managing Superior Skills
State University of New York via Coursera
ChatGPT et IA : mode d'emploi pour managers et RH
CNAM via France Université Numerique
Digital Skills: Artificial Intelligence
Accenture via FutureLearn
AI Foundations for Everyone
IBM via Coursera
Design a Feminist Chatbot
Institute of Coding via FutureLearn