Data Streaming
Offered By: Udacity
Course Description
Overview
Learn to process data in real time by building fluency in modern data engineering tools such as Apache Spark, Kafka, Spark Streaming, and Kafka Streaming.
Syllabus
- Welcome to the Data Streaming Nanodegree Program
- Data Ingestion with Kafka and Kafka Streaming
- Learn to use REST Proxy, Kafka Connect, KSQL, and Faust (Python stream processing), then apply them with Kafka and its ecosystem to build a stream processing application that shows the status of public transit trains in real time.
- Streaming API Development and Documentation
- In this course you will grow your expertise in the components of streaming data systems and build a real-time analytics application. Specifically, you will be able to identify components of Spark Streaming (architecture and API), build a continuous application with Structured Streaming, consume and process data from Apache Kafka with Spark Structured Streaming (including setting up and running a Spark cluster), create a DataFrame as an aggregation of source DataFrames, sink a composite DataFrame to Kafka, and visually inspect a data sink for accuracy.
- Career Services
Taught by
Ben Goldberg, Judit Lantos, David Drummond and Jillian Kim
Related Courses
- CS115x: Advanced Apache Spark for Data Science and Data Engineering (University of California, Berkeley via edX)
- Big Data Analytics (University of Adelaide via edX)
- Big Data Essentials: HDFS, MapReduce and Spark RDD (Yandex via Coursera)
- Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames (Yandex via Coursera)
- Introduction to Apache Spark and AWS (University of London International Programmes via Coursera)