Re-Imagining Apache Spark Development - Tools for Productivity and Standardization
Offered By: Databricks via YouTube
Course Description
Overview
Explore a 25-minute conference talk that challenges traditional ETL tools and proposes a new approach to Apache Spark development. Delve into the evolution of data engineering practices, from ETL tools to code-based solutions, and discover why current methods may be falling short. Learn about innovative tools designed to enhance Spark development, focusing on productivity, code standardization, metadata management, lineage tracking, and agile CI/CD processes. Gain insights into the potential of a new generation of development tools that combine the benefits of code-based approaches with the standardization and productivity features of traditional ETL tools. Witness a demonstration of Prophecy, a tool embodying these new principles, and understand how it aims to revolutionize Apache Spark development for modern data engineering needs.
Syllabus
Introduction
Data Engineering vs ETL
How to become successful with ETL
It's bad for Spark
This is 2020
What should it look like
Engineering Tools
Visual ETL
Standardized Components
Metadata
Continuous Deployment
Compilers
Demo
Prophecy
Taught by
Databricks
Related Courses
In-Memory Data Management
openHPI

CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX

Processing Big Data with Azure Data Lake Analytics
Microsoft via edX

Google Cloud Big Data and Machine Learning Fundamentals en Español
Google Cloud via Coursera

Google Cloud Big Data and Machine Learning Fundamentals 日本語版
Google Cloud via Coursera