Mastering a Data Pipeline with Python - 6 Years of Learned Lessons from Mistakes
Offered By: EuroPython Conference via YouTube
Course Description
Overview
Explore a talk from EuroPython 2020 on building data pipelines with Python. It draws on six years of experience, and on lessons learned from mistakes, in creating reliable data pipelines and managing large volumes of data. Discover how to use Python as the core technology for data pipeline development. Gain insight into the components of a data pipeline: data acquisition, ingestion, transformation, storage, workflow management, and serving. Compare the merits of PySpark versus Dask and Pandas, understand the role of Airflow in workflow management, and explore Apache Arrow as a novel approach to data processing. Learn best practices for anticipating and addressing potential issues in data pipeline development.
Syllabus
Robson Junior - Mastering a data pipeline with Python: 6 years of learned lessons from mistakes
Taught by
EuroPython Conference
Related Courses
Internet of Things: Sensing and Actuation From Devices - University of California, San Diego via Coursera
用Python玩转数据 Data Processing Using Python - Nanjing University via Coursera
Enabling Technologies for Data Science and Analytics: The Internet of Things - Columbia University via edX
Data Journalism Fundamentals - Google via Independent
Data Science Essentials - Microsoft via edX