YoVDO

LLMOps: Accelerate LLM Inference in GPU Using TensorRT-LLM

Offered By: The Machine Learning Engineer via YouTube

Tags

Data Science Courses, Machine Learning Courses, Deep Learning Courses, Quantization Courses, Model Optimization Courses, GPU Acceleration Courses, LLMOps Courses

Course Description

Overview

Discover how to accelerate Large Language Model (LLM) text generation and inference on GPUs with TensorRT-LLM in this 17-minute tutorial. Learn to use the TensorRT-LLM runtime to optimize LLM performance, and work through the accompanying Jupyter notebook for hands-on practice and implementation. The tutorial sits within the broader topics of LLMOps, data science, and machine learning.

Syllabus

LLMOps: Accelerate LLM Inference in GPU using TensorRT-LLM #datascience #machinelearning


Taught by

The Machine Learning Engineer

Related Courses

3D Printing for Everyone
Tomsk State University via Coursera
Developing a Multidimensional Data Model
Microsoft via edX
Launching into Machine Learning (Japanese Edition)
Google Cloud via Coursera
Art and Science of Machine Learning (Japanese Edition)
Google Cloud via Coursera
Launching into Machine Learning (in German)
Google Cloud via Coursera