Streaming Featurization with Ibis, Substrait and Apache Arrow

Offered By: Open Data Science via YouTube

Tags

- Data Engineering Courses
- Big Data Courses
- Machine Learning Courses
- Real-Time Data Processing Courses
- Streaming Data Processing Courses
- Apache Arrow Courses

Course Description

Overview

This 31-minute conference talk presents a collaboration between Two Sigma and Voltron Data to improve the performance of featurization workflows using Ibis, Substrait, and Apache Arrow. It covers the evolution of open-source data science at Two Sigma, the challenges of featurization, and the key components of the software stack: Apache Arrow's columnar in-memory data representation, Ibis's high-level Python APIs for data processing and analysis, and Substrait's cross-language specification for relational query plans. The talk shows how the integrated stack supports real-time streaming data processing, and closes with a look at the future of data science interfaces that can work with multiple data engines.

Syllabus

- Introductions
- How I Met Wes McKinney
- Timeline of Open Source Data Science at Two Sigma
- Featurization Challenges
- About Wes McKinney
- Apache Arrow
- Ibis
- Substrait
- One Data Science Interface; Many Data Engines
- Look Ahead


Taught by

Open Data Science

Related Courses

- Processing Real-Time Data Streams in Azure (Microsoft via edX)
- Gérez des flux de données temps réel (CentraleSupélec via OpenClassrooms)
- Data Streaming (Udacity)
- Taming Big Data with Apache Spark and Python - Hands On! (Udemy)
- Python & Cryptocurrency API: Build 5 Real World Applications (Udemy)