Different Streaming Methods with Apache Spark and Kafka
Offered By: Databricks via YouTube
Course Description
Overview
Explore different streaming methods using Apache Spark and Kafka in this 34-minute conference talk by Itai Yaffe from Nielsen. Learn how Nielsen Marketing Cloud (NMC) transformed its data infrastructure to support real-time analytics for marketers and publishers. Discover the journey from CSV files and standalone Java applications to multiple Kafka and Spark clusters, handling a mixture of streaming and batch ETLs while supporting 10x data growth. Gain insights into early adoption experiences with Spark Streaming and Spark Structured Streaming, including overcoming technical challenges. Examine a unique solution using Kafka to simulate streaming over a Data Lake, reducing cloud service costs. Cover topics such as Kafka and Spark Streaming for stateless and stateful use cases, Spark Structured Streaming as an alternative, combining Spark Streaming with batch ETLs, and "streaming" over a Data Lake using Kafka.
Syllabus
Intro
Problems
What's Next
Local Aggregation
Weaknesses
Kafka
Summary
Recap
Big Data for Women
Questions
Taught by
Databricks
Related Courses
Web Intelligence and Big Data
Indian Institute of Technology Delhi via Coursera
Big Data for Better Performance
Open2Study
Big Data and Education
Columbia University via edX
Big Data Analytics in Healthcare
Georgia Institute of Technology via Udacity
Data Mining with Weka
University of Waikato via Independent