LLMOps: Accelerate LLM Inference in GPU Using TensorRT-LLM
Offered By: The Machine Learning Engineer via YouTube
Course Description
Overview
Learn how to accelerate Large Language Model (LLM) generation and inference with TensorRT-LLM in this 17-minute tutorial. The video shows how to use the TensorRT-LLM runtime to optimize LLM performance on GPUs. An accompanying Jupyter notebook is available for hands-on practice and implementation. The tutorial is part of a broader set of material on LLMOps, data science, and machine learning.
Syllabus
LLMOps: Accelerate LLM Inference in GPU using TensorRT-LLM #datascience #machinelearning
Taught by
The Machine Learning Engineer
Related Courses
3D-печать для всех и каждого (Tomsk State University via Coursera)
Developing a Multidimensional Data Model (Microsoft via edX)
Launching into Machine Learning 日本語版 (Google Cloud via Coursera)
Art and Science of Machine Learning 日本語版 (Google Cloud via Coursera)
Launching into Machine Learning auf Deutsch (Google Cloud via Coursera)