YoVDO

Dependency Management in Spark Connect - Simple, Isolated, Powerful

Offered By: Databricks via YouTube

Tags

Apache Spark Courses, Python Courses, Scala Courses, Distributed Computing Courses

Course Description

Overview

Explore the new session-based dependency management system in Spark Connect, introduced in Apache Spark™ 3.5.0, through this 20-minute conference talk by Databricks engineers Akhil Gudesa and Hyukjin Kwon. Dive into the challenges of managing application environments in distributed computing and learn how Spark Connect addresses the limitations of static dependency setups. Discover the power of the Artifact API for dynamic dependency updates during runtime while maintaining strict isolation across sessions. Through practical examples, gain insights on creating, packaging, utilizing, and updating custom isolated environments for seamless execution of both Python and Scala applications. Enhance your understanding of flexible dependency management in distributed computing environments and explore additional resources on Data Lakehouse architecture and Lakehouse Fundamentals Training.

Syllabus

Dependency Management in Spark Connect: Simple, Isolated, Powerful


Taught by

Databricks

Related Courses

CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX
Big Data Analytics
University of Adelaide via edX
Big Data Essentials: HDFS, MapReduce and Spark RDD
Yandex via Coursera
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames
Yandex via Coursera
Introduction to Apache Spark and AWS
University of London International Programmes via Coursera