YoVDO

Hardware, Software, Performance and Costs for Llama-2 70b and Mixtral 8x7b LLM Inference with Low Concurrency

Offered By: Linux Foundation via YouTube

Tags

Inference Courses, Cloud Computing Courses, Cost Analysis Courses

Course Description

Overview

This 52-minute conference talk covers the hardware requirements, software configurations, performance metrics, and costs of running unquantized Llama-2 70b and Mixtral 8x7b inference at low concurrency. It uses benchmark data from GitHub to compare frameworks and hardware setups, both on-premises and cloud-based, in the USA, and examines the speed and cost advantages of open-source software (OSS) LLMs over closed APIs such as OpenAI's. Practical code examples show how to configure servers and replicate the benchmarks. The talk is aimed at small organizations that want to start running high-quality OSS LLM inference in 2024 and need to choose hardware and software that perform efficiently.

Syllabus

HW, SW, Performance and Costs for Llama-2 70b and Mixtral 8x7b LLM Inference with Low... - Ivan Baldo


Taught by

Linux Foundation

Related Courses

Software as a Service
University of California, Berkeley via Coursera
Software Defined Networking
Georgia Institute of Technology via Coursera
Pattern-Oriented Software Architectures: Programming Mobile Services for Android Handheld Systems
Vanderbilt University via Coursera
Web-Technologien
openHPI
Données et services numériques, dans le nuage et ailleurs
Certificat informatique et internet via France Université Numerique