Machine Learning at Scale with MLflow and Apache Spark

Offered By: Databricks via YouTube

Tags

MLflow Courses
Machine Learning Courses
Hadoop Courses
Apache Spark Courses
CI/CD Courses
Data Lakes Courses
Spark Streaming Courses

Course Description

Overview

Explore the challenges of deploying machine learning projects at scale at a major French bank, and the solutions adopted, in this conference talk. Learn about the difficulties of productionizing ML applications, including the lack of a model registry and recurring deployment issues. Discover how MLflow was introduced as a key component of the production Hadoop environment despite security constraints. Examine how a CI/CD pipeline was built to deploy ML applications automatically, with MLflow playing a central role. Gain insights from a concrete production project that combines MLflow, Spark Streaming, scikit-learn, and CI/CD. Understand why clear collaboration processes, a model registry, and a CI/CD pipeline are important for successful machine learning productionization in large organizations like Société Générale.

Syllabus

Intro
About me
Agenda
Statistics
Common issues
Painful journey
Machine learning ecosystem
Data lake
Python Libraries
MLflow Tracking Server
Prediction
Real Project
Picking the news
Python Notebooks vs Spark
Versioning Data
Spark vs MLflow
QA
Regulation
Question


Taught by

Databricks

Related Courses

Intro to Hadoop and MapReduce
Cloudera via Udacity
Processing Big Data with Hadoop in Azure HDInsight
Microsoft via edX
Implementing Real-Time Analytics with Hadoop in Azure HDInsight
Microsoft via edX
Hadoop Platform and Application Framework
University of California, San Diego via Coursera
Data Manipulation at Scale: Systems and Algorithms
University of Washington via Coursera