
Streaming Featurization with Ibis, Substrait and Apache Arrow

Offered By: Open Data Science via YouTube

Tags

- Data Engineering Courses
- Big Data Courses
- Machine Learning Courses
- Real-Time Data Processing Courses
- Streaming Data Processing Courses
- Apache Arrow Courses

Course Description

Overview

Explore a collaborative effort between Two Sigma and Voltron Data to improve the performance of featurization workflows using Ibis, Substrait, and Apache Arrow in this 31-minute conference talk. Learn about the evolution of open-source data science at Two Sigma, the featurization challenges that motivated the work, and the key components of the software stack. Dive into Apache Arrow's columnar in-memory data format for high-performance data representation, Ibis' high-level Python APIs for data processing and analysis, and Substrait's cross-language specification for representing query plans. Discover how these pieces combine to process streaming data in real time and deliver timely results for decision-making. Gain insight into the future of data science interfaces, where a single interface can target multiple data engines.

Syllabus

- Introductions
- How I Met Wes McKinney
- Timeline of Open Source Data Science at Two Sigma
- Featurization Challenges
- About Wes McKinney
- Apache Arrow
- Ibis
- Substrait
- One Data Science Interface; Many Data Engines
- Look Ahead


Taught by

Open Data Science

Related Courses

In-Memory Database Management
openHPI
CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX
Processing Big Data with Azure Data Lake Analytics
Microsoft via edX
Google Cloud Big Data and Machine Learning Fundamentals en Español
Google Cloud via Coursera
Google Cloud Big Data and Machine Learning Fundamentals 日本語版
Google Cloud via Coursera