YoVDO

Running Open Large Language Models in Production with Ollama and Serverless GPUs

Offered By: Devoxx via YouTube

Tags

Ollama Courses
LLaMA (Large Language Model Meta AI) Courses
GPU Computing Courses
Model Deployment Courses
Cloud Run Courses
Gemma Courses

Course Description

Overview

Explore the deployment of open large language models in production environments using Ollama and serverless GPUs. Learn why companies are increasingly interested in running open models like Gemma and Llama, which offer full control over deployment, model upgrades, and data privacy. Discover how to use Ollama, a popular open-source LLM inference server, in both local and containerized environments. Gain practical insights into deploying an application that uses an open model with Ollama on Cloud Run, featuring scale-to-zero capabilities and serverless GPUs. This 43-minute talk from Devoxx provides valuable knowledge for organizations looking to harness the power of open LLMs while maintaining control over their AI infrastructure.
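The talk's deployment pattern can be sketched as a single `gcloud run deploy` invocation. This is a minimal, illustrative sketch, not the speaker's exact commands: the service name, region, and resource sizes are assumptions, and the GPU flags require a recent gcloud release and a region where Cloud Run offers NVIDIA L4 GPUs.

```shell
# Sketch: deploy the official Ollama container to Cloud Run with one
# serverless GPU. Service name, region, and sizes are illustrative.
gcloud run deploy ollama-gemma \
  --image ollama/ollama \
  --region europe-west1 \
  --gpu 1 \
  --gpu-type nvidia-l4 \
  --no-cpu-throttling \
  --min-instances 0 \
  --max-instances 1 \
  --memory 16Gi \
  --cpu 4 \
  --port 11434
```

With `--min-instances 0` the service scales to zero when idle, so the GPU is only billed while requests are being served; clients then talk to Ollama's standard HTTP API (port 11434) through the Cloud Run URL.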

Syllabus

Running open large language models in production with Ollama and serverless GPUs by Wietse Venema


Taught by

Devoxx

Related Courses

Google Certified Associate Cloud Engineer
A Cloud Guru
Introduction to Serverless on Google Cloud
A Cloud Guru
Build a Google Workspace Add-on with Node.js and Cloud Run
Google via Google Cloud Skills Boost
Build a Serverless App with Cloud Run that Creates PDF Files
Google via Google Cloud Skills Boost
Cloud Run Canary Deployments
Google via Google Cloud Skills Boost