Rapid PySpark Custom Processing on Time Series Big Data in Databricks

Offered By: Databricks via YouTube

Tags

PySpark Courses, Time Series Analysis Courses, Databricks Courses, Distributed Computing Courses

Course Description

Overview

Discover how Sleep Number used PySpark and Databricks to process massive time series data from smart bed sensors in this 28-minute conference talk. Learn about the challenges of analyzing noisy sensor readings and how custom entropy calculations were implemented on rolling windows. Explore the transition from a memory-constrained Pandas approach to a scalable PySpark solution that processed 50 million records in just 0.3 seconds. Gain insights into optimizing big data processing so that run time stays roughly constant regardless of data size. Presented by Gary Garcia Molina and Megha Rajam Rao of Sleep Number, the talk demonstrates advanced techniques for complex time series analysis in a distributed computing environment.
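
To make "entropy on rolling windows" concrete, here is a minimal sketch in plain Python. It is not the speakers' implementation: this description does not say which entropy definition the talk uses, so the sketch assumes Shannon entropy over fixed-width bins. The names `shannon_entropy`, `rolling_entropy`, `bin_width`, and the sample `readings` are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values, bin_width=1.0):
    """Shannon entropy in bits of `values` after fixed-width binning.

    Assumption: binning by floor(value / bin_width); the talk may use a
    different entropy definition or discretization.
    """
    counts = Counter(math.floor(v / bin_width) for v in values)
    total = len(values)
    # sum of p * log2(1/p) over the occupied bins
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def rolling_entropy(series, window, bin_width=1.0):
    """Entropy of each full trailing window; None until the window fills."""
    result = []
    for i in range(len(series)):
        if i + 1 < window:
            result.append(None)
        else:
            chunk = series[i + 1 - window : i + 1]
            result.append(shannon_entropy(chunk, bin_width))
    return result


# Hypothetical sensor readings: a flat stretch followed by varied values.
readings = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 0.0]
entropies = rolling_entropy(readings, window=4)
print(entropies)
```

A flat window yields an entropy of zero, and a window whose values all fall in different bins yields the maximum for that window size. This single-machine loop is the kind of computation that becomes memory-bound on large data; a distributed version would partition the readings (for example, per device) and apply the same per-window function to each partition.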

Syllabus

Rapid PySpark Custom Processing on Time Series Big Data in Databricks


Taught by

Databricks

Related Courses

Data Processing with Azure
LearnQuest via Coursera
Mejores prácticas para el procesamiento de datos en Big Data
Coursera Project Network via Coursera
Data Science with Databricks for Data Analysts
Databricks via Coursera
Azure Data Engineer con Databricks y Azure Data Factory
Coursera Project Network via Coursera
Curso Completo de Spark con Databricks (Big Data)
Coursera Project Network via Coursera