YoVDO

Building and Managing a Centralized ML Platform with Kubeflow at CERN

Offered By: CNCF [Cloud Native Computing Foundation] via YouTube

Tags

Conference Talks Courses, Machine Learning Courses, Cloud Computing Courses, Kubernetes Courses, Data Preparation Courses, Cluster Management Courses, Model Training Courses, Distributed Training Courses, Kubeflow Courses

Course Description

Overview

This 31-minute conference talk describes how CERN built and now manages a centralized machine learning platform using Kubeflow. CERN applies ML to challenges such as particle classification, simulation data generation, and beam calibration. The recently introduced centralized service handles data preparation, model training, and serving while optimizing resource usage across different types of accelerators. The talk covers CERN's experience with Kubeflow on Kubernetes, the integration of on-premises resources, and potential extensions to public clouds. Topics include cluster layout, deployment strategies, integrations, and the automation of distributed training. It also explains the motivations behind the platform and includes a demo of job submission and results.

Syllabus

Introduction
Introductions
What is CERN
Motivation for our service
Reconstruction
Simulations
Goals
Platform
Cluster Layout
Deployment
Integrations
Issues
Burst to Public Clouds
Automating Distributed Training
Service Dashboard
Demo
Submitting jobs
Results
Closing remarks


Taught by

CNCF [Cloud Native Computing Foundation]
