
AI Inference Performance Acceleration: Methods, Tools, and Deployment Workflows

Offered By: CNCF [Cloud Native Computing Foundation] via YouTube

Tags

Model Optimization Courses

Course Description

Overview

This 43-minute conference talk by Yifei Zhang and 磊 钱 of ByteDance covers methods, tools, and deployment workflows for accelerating AI inference. It explains how inference performance affects user experience and how optimization can reduce costs. Topics include cloud-native solutions to storage performance problems, strategies that use technologies such as Fluid and model optimization to improve inference performance, and ways to make inference more efficient through GPU selection, serving framework configuration, and model and data loading. Hardware guidance is based on a performance and cost analysis of various GPUs. The talk also introduces a performance testing tool that evaluates inference performance across different configurations, recommends the best combinations of models, hardware, and acceleration schemes, and aligns deployment workflows with the test results.

Syllabus

AI Inference Performance Acceleration: Methods, Tools, and Deployment Workflows - Yifei Zhang & 磊 钱


Taught by

CNCF [Cloud Native Computing Foundation]

Related Courses

3D-печать для всех и каждого
Tomsk State University via Coursera
Developing a Multidimensional Data Model
Microsoft via edX
Launching into Machine Learning 日本語版
Google Cloud via Coursera
Art and Science of Machine Learning 日本語版
Google Cloud via Coursera
Launching into Machine Learning auf Deutsch
Google Cloud via Coursera