LLMOps: Accelerate LLM Inference in GPU Using TensorRT-LLM
Offered By: The Machine Learning Engineer via YouTube
Course Description
Overview
Discover how to accelerate Large Language Model (LLM) generation and inference using TensorRT-LLM in this 17-minute tutorial. Learn to leverage the TensorRT-LLM runtime to optimize LLM performance on GPUs. Access the accompanying Jupyter notebook for hands-on practice and implementation. Gain valuable insights into LLMOps, data science, and machine learning techniques to enhance your AI development skills.
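To give a flavor of what the tutorial covers, here is a minimal sketch of running inference through TensorRT-LLM's high-level Python `LLM` API. This is an illustrative assumption, not the notebook's actual code: the model name (`TinyLlama/TinyLlama-1.1B-Chat-v1.0`) and sampling settings are placeholders, and running it requires an NVIDIA GPU with TensorRT-LLM installed.

```python
# Hedged sketch: TensorRT-LLM high-level inference API.
# Model name and sampling parameters are illustrative assumptions,
# not taken from the tutorial's notebook. Requires an NVIDIA GPU
# and the tensorrt_llm package.
from tensorrt_llm import LLM, SamplingParams

# Build (or load a cached) TensorRT engine for the model.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

prompts = ["What is GPU-accelerated inference?"]
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    # Each result carries the prompt and the generated continuation.
    print(output.outputs[0].text)
```

The high-level API hides the engine-build step (quantization, kernel fusion, in-flight batching) that gives TensorRT-LLM its speedup over a plain PyTorch/Transformers generate loop; the notebook linked from the video walks through this in more detail.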
Syllabus
LLMOps: Accelerate LLM Inference in GPU using TensorRT-LLM #datascience #machinelearning
Taught by
The Machine Learning Engineer
Related Courses
Data Analysis
Johns Hopkins University via Coursera
Computing for Data Analysis
Johns Hopkins University via Coursera
Scientific Computing
University of Washington via Coursera
Introduction to Data Science
University of Washington via Coursera
Web Intelligence and Big Data
Indian Institute of Technology Delhi via Coursera