LLMOps: Accelerate LLM Inference in GPU Using TensorRT-LLM
Offered By: The Machine Learning Engineer via YouTube
Course Description
Overview
Discover how to accelerate Large Language Model (LLM) generation and inference using TensorRT-LLM in this 17-minute tutorial. Learn to use the TensorRT-LLM runtime to optimize LLM performance on GPUs. Access the accompanying Jupyter notebook for hands-on practice and implementation. Gain practical insights into LLMOps, data science, and machine learning techniques to enhance your AI development skills.
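For context, a minimal sketch of what GPU inference with TensorRT-LLM's high-level Python LLM API can look like (the model checkpoint and prompts here are illustrative assumptions, not necessarily those used in the video's notebook):

from tensorrt_llm import LLM, SamplingParams

# Sampling configuration for generation.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Loading a Hugging Face checkpoint; TensorRT-LLM builds an optimized
# engine for the GPU under the hood. TinyLlama is an assumed example model.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

prompts = ["What does TensorRT-LLM optimize during LLM inference?"]

# Run batched generation on the GPU and print the decoded text.
for output in llm.generate(prompts, sampling_params):
    print(output.outputs[0].text)

The video's notebook may instead use the lower-level workflow of building an engine with trtllm-build and running it through the runtime directly; the snippet above only sketches the simpler API path.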
Syllabus
LLMOps: Accelerate LLM Inference in GPU using TensorRT-LLM #datascience #machinelearning
Taught by
The Machine Learning Engineer
Related Courses
Fundamentals of Accelerated Computing with CUDA C/C++ (Nvidia via Independent)
Using GPUs to Scale and Speed-up Deep Learning (IBM via edX)
Deep Learning (IBM via edX)
Deep Learning with IBM (IBM via edX)
Accelerating Deep Learning with GPUs (IBM via Cognitive Class)