Introduction to Spark Datasets

Offered By: Scala Days Conferences via YouTube

Tags

Scala Days Courses, Scala Courses, Functional Programming Courses

Course Description

Overview

Explore Apache Spark's Dataset API in this 43-minute conference talk from Scala Days Copenhagen 2017. Dive into the basics of working with Spark Datasets, a hybrid approach that combines functional and relational programming concepts. Learn about Spark's components, including machine learning and streaming, and how they're being rewritten to support Dataset-compatible APIs. Discover the performance benefits and space efficiency of Spark SQL, and gain hands-on experience loading JSON data, applying schemas, and performing relational transformations. Understand how the optimizer works and how to mix functional and relational styles effectively. Examine windowed operations and window specifications, and grasp why Datasets are becoming increasingly important in the Spark ecosystem. No prior Spark knowledge is required, but a basic understanding of Scala is recommended.

Syllabus

Intro
What is Spark?
The different pieces of Spark
Why should we consider Spark SQL?
What is the performance like?
How is it so fast?
How much more space efficient?
Getting started
Loading some simple JSON data
Sample case class for schema
Then apply some type magic
What do relational transforms look like?
Writing a relational transformation
What can the optimizer do now?
Using Datasets to mix functional & relational style
And functional style maps
What is DS functional perf like?
Build the recipe for each query
Windowed operations
Window specs
Summary: Why to use Datasets
The next book.....


Taught by

Scala Days Conferences

Related Courses

Functional Programming Principles in Scala
École Polytechnique Fédérale de Lausanne via Coursera
Functional Program Design in Scala
École Polytechnique Fédérale de Lausanne via Coursera
Parallel programming
École Polytechnique Fédérale de Lausanne via Coursera
Big Data Analysis with Scala and Spark
École Polytechnique Fédérale de Lausanne via Coursera
Functional Programming in Scala Capstone
École Polytechnique Fédérale de Lausanne via Coursera