Machine Learning at Scale with MLflow and Apache Spark
Offered By: Databricks via YouTube
Course Description
Overview
Explore the challenges of deploying machine learning projects at scale in a major French bank, and the solutions adopted, in this conference talk. Learn about the difficulties of productionizing ML applications, including the lack of a model registry and recurring deployment issues. Discover how MLflow was implemented as a key component of the production Hadoop environment despite security constraints. Examine how a CI/CD pipeline was built to deploy ML applications automatically, with MLflow playing a central role. Gain insights from a concrete production project built with MLflow, Spark Streaming, scikit-learn, and CI/CD. Understand why defining clear collaboration processes, implementing a model registry, and establishing a CI/CD pipeline matter for successful machine learning productionization in large organizations such as Société Générale.
Syllabus
Intro
About me
Agenda
Statistics
Common issues
Painful journey
Machine learning ecosystem
Data lake
Python Libraries
MLflow Tracking Server
Prediction
Real Project
Picking the news
Python Notebooks vs Spark
Versioning Data
Spark vs MLflow
Q&A
Regulation
Question
Taught by
Databricks
Related Courses
Advanced Big Data Systems | 高级大数据系统 (Tsinghua University via edX)
Data Streaming (Udacity)
数据科学 | Data Science (Tsinghua University via edX)
Apache Spark with Scala - Hands On with Big Data! (Udemy)
Streaming Big Data with Spark Streaming and Scala (Udemy)