The Birth and Growth of Spark: An Open Source Success Story
Offered By: MLOps.community via YouTube
Course Description
Overview
Explore the origins and evolution of Apache Spark in this insightful podcast episode featuring Matei Zaharia, the creator of Spark and Chief Technologist at Databricks. Delve into the development of other open-source projects like MLflow and Delta Lake, and gain valuable insights into the concept of "lakehouse ML." Discover Zaharia's perspectives on combining academia and industry experience, his approach to problem-solving in machine learning, and the development of the "DSP" (Demonstrate-Search-Predict) project for integrating LLMs with other text-returning systems. Learn about Spark's inception, the similarities and differences between Spark and MLflow, and the unique culture of Stanford's Computer Science Department. Gain advice for grad students, explore the impact of LLMs on the tech industry, and hear Zaharia's thoughts on corporate research labs and respected companies in the field.
Syllabus
Matei's preferred coffee
Takeaways
Please subscribe to our newsletters, join our Slack, and subscribe to our podcast channels!
Getting to know Matei as a person
Spark
Open and freewheeling cross-pollination
Actual formation of Spark
Spark and MLflow similarities and differences
Concepts in MLflow
DJ Khaled of the ML world
Data Lakehouse
The unique culture of Stanford's Computer Science Department
Starting a company
Unique advice to grad students
Open source project
LLMs in the New Revolution
Type of company to start with
Emergence of Corporate Research Labs
LLMs size context
Companies to respect
Wrap up
Taught by
MLOps.community
Related Courses
CS115x: Advanced Apache Spark for Data Science and Data Engineering (University of California, Berkeley via edX)
Big Data Analytics (University of Adelaide via edX)
Big Data Essentials: HDFS, MapReduce and Spark RDD (Yandex via Coursera)
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames (Yandex via Coursera)
Introduction to Apache Spark and AWS (University of London International Programmes via Coursera)