YoVDO

LLMOps: Accelerate LLM Inference in GPU Using TensorRT-LLM

Offered By: The Machine Learning Engineer via YouTube

Tags

Data Science Courses
Machine Learning Courses
Deep Learning Courses
Quantization Courses
Model Optimization Courses
GPU Acceleration Courses
LLMOps Courses

Course Description

Overview

Learn how to accelerate Large Language Model (LLM) generation and inference with TensorRT-LLM in this 17-minute tutorial. The video shows how to use the TensorRT-LLM runtime to optimize LLM performance on GPUs, and an accompanying Jupyter notebook is available for hands-on practice. The material touches on LLMOps, data science, and machine learning techniques.

Syllabus

LLMOps: Accelerate LLM Inference in GPU using TensorRT-LLM #datascience #machinelearning


Taught by

The Machine Learning Engineer

Related Courses

Fundamentals of Accelerated Computing with CUDA C/C++
Nvidia via Independent
Using GPUs to Scale and Speed-up Deep Learning
IBM via edX
Deep Learning
IBM via edX
Deep Learning with IBM
IBM via edX
Accelerating Deep Learning with GPUs
IBM via Cognitive Class