Gallia - Practical Data Transformation in Scala
Offered By: Scala Days Conferences via YouTube
Course Description
Overview
Explore the practical applications of Gallia, a schema-aware data transformation library for Scala, in this 40-minute conference talk from Scala Days 2023 Seattle. Discover how Gallia emphasizes practicality, readability, and scalability, making it easier to transform data, create code readable by domain experts, and process big data using Apache Spark RDDs. Learn about the library's internal workings, including its two Directed Acyclic Graphs for schema and data processing. Watch live coding demonstrations showcasing use cases for both small and large datasets. Gain insights into Gallia's strengths, weaknesses, latest features like Avro/Parquet support, and future development plans. Understand how Gallia compares to alternative tools such as Pandas and Apache Spark, and determine when it might be the right choice for your data transformation needs.
Syllabus
Introduction
What is Gallia
Getting started
Goals
When to use it
Team behind Gallia
Case classes
IO support
Target selection
Live coding
Adding additional data
Transformation
Basic processing
Spark Context
Writing RDDs
Scaling
Summary
Feedback
Taught by
Scala Days Conferences
Related Courses
CS115x: Advanced Apache Spark for Data Science and Data Engineering - University of California, Berkeley via edX
Big Data Analytics - University of Adelaide via edX
Big Data Essentials: HDFS, MapReduce and Spark RDD - Yandex via Coursera
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames - Yandex via Coursera
Introduction to Apache Spark and AWS - University of London International Programmes via Coursera