YoVDO

LLMOps: Accelerate LLM Inference in GPU Using TensorRT-LLM

Offered By: The Machine Learning Engineer via YouTube

Tags

Data Science Courses Machine Learning Courses Deep Learning Courses Quantization Courses Model Optimization Courses GPU Acceleration Courses LLMOps Courses

Course Description

Overview

Discover how to accelerate Large Language Model (LLM) generation and inference using TensorRT-LLM in this 17-minute tutorial. Learn to leverage the TensorRT-LLM runtime to optimize LLM performance on GPUs. Access the accompanying Jupyter notebook for hands-on practice and implementation, and gain insights into LLMOps, data science, and machine learning techniques to enhance your AI development skills.
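One of the optimizations behind TensorRT-LLM's speedups (reflected in the course tags above) is weight quantization. As a rough, framework-free illustration of that idea only, and not code from the tutorial itself, here is a minimal NumPy sketch of symmetric per-tensor INT8 quantization; the function names and values are assumptions for illustration:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor INT8 quantization: map floats into [-127, 127]."""
    scale = np.abs(w).max() / 127.0  # one scale factor for the whole tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the INT8 representation."""
    return q.astype(np.float32) * scale

# Round-trip a small random weight matrix and check the reconstruction error.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)
max_err = float(np.abs(w - w_hat).max())  # bounded by half a quantization step
```

Storing weights as INT8 with a single float scale cuts memory traffic roughly 4x versus FP32, which is a large part of why quantized inference runs faster on GPUs.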

Syllabus

LLMOps: Accelerate LLM Inference in GPU using TensorRT-LLM #datascience #machinelearning


Taught by

The Machine Learning Engineer

Related Courses

Neural Networks for Machine Learning
University of Toronto via Coursera
Machine Learning Techniques (機器學習技法)
National Taiwan University via Coursera
Machine Learning Capstone: An Intelligent Application with Deep Learning
University of Washington via Coursera
Applied Problems of Data Analysis (Прикладные задачи анализа данных)
Moscow Institute of Physics and Technology via Coursera
Leading Ambitious Teaching and Learning
Microsoft via edX