PandasUDFs - Scaling Ensembles for Improved Predictions

Offered By: Databricks via YouTube

Tags

Big Data Courses, Apache Spark Courses, Databricks Courses, Data Processing Courses, Predictive Modeling Courses, Distributed Computing Courses, Ensemble Models Courses

Course Description

Overview

Discover how to use PandasUDFs to scale ensemble models in this 38-minute Databricks talk. Learn how to turn development code into a scalable solution for category-specific predictions, cutting runtime from hours to minutes. The talk covers general usage, the types of PandasUDFs, strategies for working around data limits, and equivalent approaches in R and Koalas. It also shows how to apply the method to scale from a single model to an entire ensemble, improving prediction accuracy across diverse categories.
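To make the idea of category-specific predictions concrete, here is a minimal sketch of the grouped pattern, not the code from the talk. It assumes Spark 3.0 or later, where `groupBy(...).applyInPandas(...)` provides the grouped-map behaviour; the column names (`category`, `x`, `y`), the one-feature least-squares model, and both function names are hypothetical choices made for illustration.

```python
import numpy as np
import pandas as pd


def fit_and_predict(pdf: pd.DataFrame) -> pd.DataFrame:
    """Fit a straight line to one category's rows and predict on the same rows.

    Spark calls this once per category, passing that category's rows as a
    plain pandas DataFrame. The model is a placeholder; any per-group model
    that fits in one worker's memory could take its place.
    """
    slope, intercept = np.polyfit(pdf["x"], pdf["y"], deg=1)
    return pd.DataFrame({
        "category": pdf["category"],
        "x": pdf["x"],
        "prediction": slope * pdf["x"] + intercept,
    })


def predict_per_category(sdf):
    """Run fit_and_predict on every category of a Spark DataFrame in parallel.

    `sdf` is assumed to be a pyspark.sql.DataFrame with columns
    category (string), x (double) and y (double).
    """
    schema = "category string, x double, prediction double"
    return sdf.groupBy("category").applyInPandas(fit_and_predict, schema=schema)
```

Because `fit_and_predict` only touches pandas and NumPy, it can be developed and checked locally with `df.groupby("category")` before being handed to Spark. Each group must fit in a single worker's memory, which is one form of the data limits the talk addresses.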

Syllabus

Introduction
The Problem
PandasUDFs
Use Cases
Data Limits
Other Frameworks


Taught by

Databricks

Related Courses

Cloud Computing Concepts, Part 1
University of Illinois at Urbana-Champaign via Coursera
Cloud Computing Concepts: Part 2
University of Illinois at Urbana-Champaign via Coursera
Reliable Distributed Algorithms - Part 1
KTH Royal Institute of Technology via edX
Introduction to Apache Spark and AWS
University of London International Programmes via Coursera
Réalisez des calculs distribués sur des données massives
CentraleSupélec via OpenClassrooms