Taming the Beast - Managing the Day 2 Operational Complexity of Kubeflow
Offered By: CNCF [Cloud Native Computing Foundation] via YouTube
Course Description
Overview
Explore strategies for managing the operational complexity of Kubeflow in this 25-minute conference talk from KubeCon + CloudNativeCon North America 2021. Gain insights into deploying, configuring, and maintaining Kubeflow, a machine learning toolkit for Kubernetes. Learn tips for navigating the platform's many components, including notebooks, service meshes, and pipelines. Discover lessons from experienced practitioners on managing Kubeflow deployments and contributing to upstream development. Delve into topics such as deployment options, operators, day 2 operations, component updates, security measures, integration with Istio, upgrades, external databases, and troubleshooting techniques. Equip yourself with practical knowledge to effectively tame the operational challenges of Kubeflow and optimize your machine learning workflows on Kubernetes.
Syllabus
Intro
Overview
Challenges
Taming the Beast
Kubeflow Deployment
Kubeflow Manifests Restructure
How This Can Help
Another Deployment Option
Operators
Kubeflow Operator
Day 2
Kubeflow Component Update
Stale Webhooks
Securing Your Deployment
Integration with Different Istio
Updates / Upgrades
External DB
Kubernetes Version Update
Troubleshoot Platform
Monitoring
The Future
References
Taught by
CNCF [Cloud Native Computing Foundation]
Related Courses
Introduction to Cloud Infrastructure Technologies - Linux Foundation via edX
Scalable Microservices with Kubernetes - Google via Udacity
Google Cloud Fundamentals: Core Infrastructure - Google via Coursera
Introduction to Kubernetes - Linux Foundation via edX
Fundamentals of Containers, Kubernetes, and Red Hat OpenShift - Red Hat via edX