Building and Managing a Centralized ML Platform with Kubeflow at CERN

Offered By: CNCF [Cloud Native Computing Foundation] via YouTube

Tags

Conference Talks Courses, Machine Learning Courses, Cloud Computing Courses, Kubernetes Courses, Data Preparation Courses, Cluster Management Courses, Model Training Courses, Distributed Training Courses, Kubeflow Courses

Course Description

Overview

Explore the journey of building and managing a centralized machine learning platform using Kubeflow at CERN in this 31-minute conference talk. Discover how CERN leverages ML solutions for various challenges, including particle classification, simulation data generation, and beam calibration. Learn about the recently introduced centralized service that handles data preparation, model training, and serving while optimizing resource usage for different types of accelerators. Gain insights into CERN's experience with Kubeflow on Kubernetes, their integration of on-premises resources, and potential extensions to public clouds. Delve into topics such as cluster layout, deployment strategies, integrations, and automation of distributed training. Witness a demo of job submission and results, and understand the motivations behind CERN's ML platform development.

Syllabus

Introduction
Introductions
What is CERN
Motivation for our service
Reconstruction
Simulations
Goals
Platform
Cluster Layout
Deployment
Integrations
Issues
Burst to Public Clouds
Automating Distributed Training
Service Dashboard
Demo
Submitting jobs
Results
Closing remarks


Taught by

CNCF [Cloud Native Computing Foundation]

Related Courses

How Google does Machine Learning en Español
Google Cloud via Coursera
Creating Custom Callbacks in Keras
Coursera Project Network via Coursera
Automatic Machine Learning with H2O AutoML and Python
Coursera Project Network via Coursera
AI in Healthcare Capstone
Stanford University via Coursera
AutoML con Pycaret y TPOT
Coursera Project Network via Coursera