Machine Learning at Scale with MLflow and Apache Spark
Offered By: Databricks via YouTube
Course Description
Overview
Explore the challenges of deploying machine learning projects at scale at Société Générale, a major French bank, and the solutions adopted there, in this conference talk. Learn about the difficulties of productionizing ML applications, including the lack of a model registry and deployment issues. Discover how MLflow became a key component of the production Hadoop environment despite security constraints. Examine how a CI/CD pipeline was built for automatic ML application deployment, with MLflow playing a central role. Gain insights from a concrete production project that uses MLflow, Spark Streaming, scikit-learn, and CI/CD. Understand why defining clear collaboration processes, implementing a model registry, and establishing a CI/CD pipeline matter for successful machine learning productionization in large organizations.
Syllabus
Intro
About me
Agenda
Statistics
Common issues
Painful journey
Machine learning ecosystem
Data lake
Python Libraries
MLflow Tracking Server
Prediction
Real Project
Picking the news
Python Notebooks vs Spark
Versioning Data
Spark vs MLflow
Q&A
Regulation
Question
Taught by
Databricks
Related Courses
Intro to Hadoop and MapReduce (Cloudera via Udacity)
Processing Big Data with Hadoop in Azure HDInsight (Microsoft via edX)
Implementing Real-Time Analytics with Hadoop in Azure HDInsight (Microsoft via edX)
Hadoop Platform and Application Framework (University of California, San Diego via Coursera)
Data Manipulation at Scale: Systems and Algorithms (University of Washington via Coursera)