Spark 2.0
Offered By: Scala Days Conferences via YouTube
Syllabus
Intro
What is Apache Spark?
A Large Community
Apache Spark Users
Original Spark Vision
Motivation: Unification
Motivation: Concise API
How Did the Vision Hold Up?
Libraries Built on Spark
Which Libraries Do People Use?
Top Applications
Main Challenge: Functional API
Which API Call Causes Most Tickets?
Example Problem
Challenge: Data Representation
Why Structure?
DataFrames and Datasets
Execution Steps
DataFrame API
Why DataFrames?
What Structured APIs Enable
Performance
Dataset API Details
Data Sources
Data Source API
Examples
Hardware Trends
Project Tungsten
Tungsten's Compact Encoding
Space Efficiency
Runtime Code Generation
Long-Term Vision
Versioning in Spark
Major Features in 2.0
Background
Structured Streaming: High-level streaming API built on DataFrames/Datasets
Structured Streaming API
Example: Batch Aggregation
Example: Continuous Aggregation
Incrementalized By Spark
Release Timeline
Conclusion
Want to Learn Apache Spark?
Taught by
Scala Days Conferences
Related Courses
CS115x: Advanced Apache Spark for Data Science and Data Engineering (University of California, Berkeley via edX)
Big Data Analytics (University of Adelaide via edX)
Big Data Essentials: HDFS, MapReduce and Spark RDD (Yandex via Coursera)
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames (Yandex via Coursera)
Introduction to Apache Spark and AWS (University of London International Programmes via Coursera)