YoVDO

LLMOps: Accelerate LLM Inference in GPU Using TensorRT-LLM

Offered By: The Machine Learning Engineer via YouTube

Tags

Data Science Courses, Machine Learning Courses, Deep Learning Courses, Quantization Courses, Model Optimization Courses, GPU Acceleration Courses, LLMOps Courses

Course Description

Overview

Discover how to accelerate Large Language Model (LLM) generation and inference on GPUs using TensorRT-LLM in this 17-minute tutorial. Learn to use the TensorRT-LLM runtime to optimize LLM performance, and work through the accompanying Jupyter notebook for hands-on practice. The tutorial also touches on LLMOps, data science, and machine learning techniques relevant to AI development.

Syllabus

LLMOps: Accelerate LLM Inference in GPU using TensorRT-LLM #datascience #machinelearning


Taught by

The Machine Learning Engineer

Related Courses

Large Language Models: Application through Production
Databricks via edX
LLMOps - LLM Bootcamp
The Full Stack via YouTube
MLOps: Why DevOps Solutions Fall Short in the Machine Learning World
Linux Foundation via YouTube
Quick Wins Across the Enterprise with Responsible AI
Microsoft via YouTube
End-to-End AI App Development: Prompt Engineering to LLMOps
Microsoft via YouTube