SQL for Efficient Data Organization in Machine Learning
Offered By: Snorkel AI via YouTube
Course Description
Overview
In this 11-minute video presentation, Columbia PhD student Zachary Huang explains how SQL can improve data organization for machine learning. The talk introduces JoinBoost, a lightweight Python library that rewrites tree training algorithms over normalized databases as pure SQL queries. This approach addresses the mismatch between the data organization that ML requires and the way traditional databases are structured, and it allows a simpler, all-in-one data stack. The presentation also covers JoinBoost's compatibility with various DBMSs and data stacks, and its speed and scalability, which are presented as exceeding those of specialized ML libraries such as LightGBM for random forests and gradient boosting.
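To make the idea of training trees with SQL more concrete, the sketch below shows how the statistics needed to pick one regression-tree split can be computed by an aggregate query over a join of two normalized tables. It is an illustration only and does not use or reproduce JoinBoost's API: the tables `stores` and `sales`, the helper names `build_demo_db` and `best_split`, and the use of Python's built-in `sqlite3` module are all assumptions made for this example. The final threshold search is done in Python for brevity, whereas the library is described as producing pure SQL.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory normalized database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE stores (store_id INTEGER PRIMARY KEY, size REAL);
        CREATE TABLE sales (store_id INTEGER, amount REAL);
        INSERT INTO stores VALUES (1, 10), (2, 20), (3, 30), (4, 40);
        INSERT INTO sales VALUES
            (1, 1.0), (1, 1.0), (2, 1.0), (3, 5.0), (4, 5.0), (4, 5.0);
        """
    )
    return conn


def best_split(conn):
    """Return (threshold, sse) for the best split 'size <= threshold'.

    The database computes count, sum, and sum of squares of the target
    per feature value; the join result is never materialized in Python.
    """
    rows = conn.execute(
        """
        SELECT st.size, COUNT(*), SUM(sa.amount), SUM(sa.amount * sa.amount)
        FROM sales AS sa
        JOIN stores AS st ON sa.store_id = st.store_id
        GROUP BY st.size
        ORDER BY st.size
        """
    ).fetchall()

    total_c = sum(r[1] for r in rows)
    total_s = sum(r[2] for r in rows)
    total_q = sum(r[3] for r in rows)

    best = None
    c = s = q = 0.0
    # Try a split after every distinct value except the last one,
    # so that both sides of the split are non-empty.
    for size, cnt, sm, sq in rows[:-1]:
        c, s, q = c + cnt, s + sm, q + sq
        rc, rs, rq = total_c - c, total_s - s, total_q - q
        # Sum of squared errors around the mean: sum(y^2) - sum(y)^2 / n.
        sse = (q - s * s / c) + (rq - rs * rs / rc)
        if best is None or sse < best[1]:
            best = (size, sse)
    return best


if __name__ == "__main__":
    print(best_split(build_demo_db()))
```

On this toy data the best split falls at `size <= 20`, which separates the low-sales stores from the high-sales ones with zero squared error. The point of the sketch is that only a handful of aggregated rows leave the database, which is why pushing the computation into SQL can scale with the DBMS rather than with an in-memory copy of the joined table.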
Syllabus
Introduction
Background
Example
Problem Statement
Taught by
Snorkel AI
Related Courses
Getting and Cleaning Data - Johns Hopkins University via Coursera
Data Structures and Algorithms Part 2 - Peking University via edX
Methodologies in Social Research (Part 2) - Peking University via Coursera
Statistics I: Fundamentals of Data Analysis (ga014) - University of Tokyo via gacco
Google Fundamentals for Teaching - Fundação Lemann via Coursera