Ten Years of Building Open Source Standards: From Parquet to Arrow to OpenLineage
Offered By: Data Council via YouTube
Course Description
Overview
Explore the journey of building successful open source projects in the data ecosystem through this 35-minute conference talk by Julien Le Dem, Chief Architect at Astronomer and Co-Founder of Datakin. Gain insights into the ideation process and early growth of the Apache Parquet columnar format, and discover how it led to the creation of Apache Arrow. Learn about the development of OpenLineage, an LF AI & Data project bringing observability to the data ecosystem. Understand the factors that contributed to the success of these projects and how they have shaped the data landscape over the past decade. Benefit from Le Dem's extensive experience in data processing tools and content platforms, including his work at Twitter, WeWork, Dremio, and Yahoo.
Syllabus
Ten years of building open source standards: From Parquet to Arrow to OpenLineage | Astronomer
Taught by
Data Council
Related Courses
Using Pandas and Dask to Work with Large Columnar Datasets in Apache Parquet (EuroPython Conference via YouTube)
Fast Copy-On-Write in Apache Parquet for Data Lakehouse Upserts (Databricks via YouTube)
Building InfluxDB 3.0 with Apache Arrow, DataFusion, Flight and Parquet (Data Council via YouTube)
Ten Years of Building Open Source Standards in Data Engineering (Data Council via YouTube)
Time Series Analytics with Apache Arrow, Pandas, and Parquet - A 101 Introduction (Data Council via YouTube)