Running Open Large Language Models in Production with Ollama and Serverless GPUs
Offered By: Devoxx via YouTube
Course Description
Overview
Explore the deployment of open large language models in production environments using Ollama and serverless GPUs. Learn why companies are increasingly interested in running open models like Gemma and Llama, which offer full control over deployment, model upgrades, and data privacy. Discover how to use Ollama, a popular open-source LLM inference server, in both local and containerized environments. Gain practical insight into deploying an application that uses an open model with Ollama on Cloud Run, with scale-to-zero and serverless GPUs. This 43-minute Devoxx talk offers valuable guidance for organizations looking to harness open LLMs while keeping control of their AI infrastructure.
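The deployment described above can be sketched roughly as follows. This is a minimal illustration, not the talk's actual setup: the model choice, project, region, resource sizes, and image path are all assumptions, and the flag values should be checked against current Cloud Run documentation.

```shell
# Sketch (assumed names throughout): run Ollama with a pre-pulled open
# model on Cloud Run with an attached GPU and scale-to-zero.

# Dockerfile: bake the model weights into the image so a cold start
# does not have to download them first.
cat > Dockerfile <<'EOF'
FROM ollama/ollama
# Listen on Cloud Run's port and keep weights inside the image layer.
ENV OLLAMA_HOST=0.0.0.0:8080
ENV OLLAMA_MODELS=/models
# Start the server briefly at build time just to pull the model.
RUN ollama serve & sleep 5 && ollama pull gemma2:9b
EOF

# Build the image and deploy with an NVIDIA L4 GPU.
# PROJECT, repo, service name, and region are placeholders.
gcloud builds submit --tag europe-west1-docker.pkg.dev/PROJECT/repo/ollama
gcloud run deploy ollama-service \
  --image europe-west1-docker.pkg.dev/PROJECT/repo/ollama \
  --region europe-west1 \
  --gpu 1 --gpu-type nvidia-l4 \
  --no-cpu-throttling \
  --memory 16Gi --cpu 4 \
  --min-instances 0 --max-instances 1
```

With `--min-instances 0`, the service scales to zero when idle, so the GPU is billed only while requests are being served; the tradeoff is a cold start, which baking the weights into the image helps shorten.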
Syllabus
Running open large language models in production with Ollama and serverless GPUs by Wietse Venema
Taught by
Devoxx
Related Courses
Running Gemma Using HuggingFace Transformers and Ollama
Sam Witteveen via YouTube
Machine Learning News: Gemma, Gemini, Groq, Sora, and AI Developments
Yannic Kilcher via YouTube
AI News Roundup: Grok-1, Nvidia GTC, OpenAI Leaks, and EU AI Act
Yannic Kilcher via YouTube
Creating, Building, and Releasing Gemma - Google's Open Model Family
TensorFlow via YouTube
Claude 3 vs ChatGPT in Street Fighter - AI Model Tournament
All About AI via YouTube