YoVDO

Optimizing LLM Inference with AWS Trainium, Ray, vLLM, and Anyscale

Offered By: Anyscale via YouTube

Tags

AWS Inferentia Courses
Cloud Computing Courses
Amazon Elastic Kubernetes Service (EKS) Courses
Generative AI Courses
vLLM Courses
Anyscale Courses

Course Description

Overview

Discover how to optimize large language model (LLM) inference using AWS Trainium, Ray, vLLM, and Anyscale in this 46-minute webinar. Learn to scale and productionize LLM workloads cost-effectively with AWS accelerator instances, including AWS Inferentia, for reliable LLM serving at scale. Explore how to build a complete LLM inference stack with vLLM and Ray on Amazon EKS, and see how Anyscale's performance and enterprise capabilities support demanding LLM and GenAI inference workloads.

Topics include using AWS Inferentia accelerators for leading price-performance, running AWS compute instances on Anyscale for optimized LLM inference, and Anyscale's managed enterprise LLM inference offering with advanced cluster-management optimizations. Ideal for AI engineers seeking to operationalize generative AI models at scale cost-efficiently, and for infrastructure engineers planning to support GenAI use cases and LLM inference in their organizations.

Syllabus

Optimizing LLM Inference with AWS Trainium, Ray, vLLM, and Anyscale


Taught by

Anyscale

Related Courses

Software as a Service
University of California, Berkeley via Coursera
Software Defined Networking
Georgia Institute of Technology via Coursera
Pattern-Oriented Software Architectures: Programming Mobile Services for Android Handheld Systems
Vanderbilt University via Coursera
Web-Technologien
openHPI
Données et services numériques, dans le nuage et ailleurs
Certificat informatique et internet via France Université Numérique