YoVDO

Dask-SQL - Empowering Pythonistas for Scalable End-to-End Data Engineering

Offered By: PyCon US via YouTube

Tags

PyCon US Courses
Machine Learning Courses
Python Courses
Data Engineering Courses

Course Description

Overview

This 26-minute PyCon US talk shows how to use dask-sql for scalable, end-to-end data engineering in Python. It addresses a common obstacle: data that is trapped in Hive/Spark-based data lakes or reachable only through complex SQL queries. With dask-sql, Python developers can build complete data projects without extensive knowledge of the JVM/Hadoop ecosystem. The talk covers extracting data from data lakes and Hive tables with SQL issued from Python, then refining that data for machine learning, analytics, or transformation workloads using popular PyData tools. It also explains the design of dask-sql, which combines SQL query optimization from Apache Calcite, scalable dataframe operations from Dask, and integration with the Hive metastore data catalog. A demo and walkthrough cover the enterprise data processing pipeline and a practical implementation. Accompanying slides are available for further reference.

Syllabus

Introduction
Enterprise Data Processing Pipeline
DaskSQL
Demo
DaskSQL Walkthrough


Taught by

PyCon US

Related Courses

Intro to Python for Brand New Programmers
PyCon US via YouTube
Comprehending Comprehensions
PyCon US via YouTube
Data Analysis with SQLite and Python
PyCon US via YouTube
Build a Production Ready GraphQL API Using Python
PyCon US via YouTube
Web Development With A Python-backed Frontend - Featuring HTMX and Tailwind
PyCon US via YouTube