YoVDO

Scaling XGBoost for Thousands of Features with Databricks

Offered By: Databricks via YouTube

Tags

XGBoost Courses
Data Science Courses
Big Data Courses
Machine Learning Courses
Python Courses
Scala Courses
Apache Spark Courses
Databricks Courses
Online Advertising Courses
Feature Engineering Courses

Course Description

Overview

Explore scaling techniques for XGBoost models with thousands of features in this 51-minute conference talk from Databricks. Dive into an online advertising use case that enables marketers to target users based on demographic information. Learn about the challenges faced, mistakes made, and valuable insights gained during the process of scaling XGBoost model training. Discover common pitfalls to avoid and notable differences between Python and Scala implementations of XGBoost in Spark. Gain practical knowledge from experts Phan Chuong and Eric Yatskowitz as they share their experiences in scaling machine learning models for production environments and supporting marketing decisions with data insights.

Syllabus

Intro
Welcome
Recording
Boulder Denver Group
Databricks Summit 2022
Phan and Eric
Introduction
Agenda
T-Mobile Marketing Solutions
Magenta Marketing Platform
Why don't we just use this data directly?
How are demographic insights used?
Pandas
UDF
Improving XGBoost
Data set
Why XGBoost
What we did
How did we achieve that?
Parallelizations
Autoscaling
Normal transformation
Pivot vs Vector
RDD
Questions


Taught by

Databricks

Related Courses

Data Science at Scale - Capstone Project
University of Washington via Coursera
Feature Engineering for Improving Learning Environments
University of Texas Arlington via edX
How to Win a Data Science Competition: Learn from Top Kagglers
Higher School of Economics via Coursera
Advanced Machine Learning
The Open University via FutureLearn
Feature Engineering
Google Cloud via Coursera