Large Scale Geospatial Indexing and Analysis on Apache Spark
Offered By: Databricks via YouTube
Course Description
Overview
Explore large-scale geospatial indexing and analysis on Apache Spark in this 23-minute conference talk from Databricks. The talk examines the challenges of processing geospatial data at scale and looks at open-source frameworks such as Apache Sedona and how they improve on conventional technology. It covers spatial data structures, formats, and indexing techniques such as H3, and shows how these components fit into a cloud-first architecture built on Databricks, Delta, MLflow, and AWS. Practical examples demonstrate geospatial analysis with complex geometries and spatial queries. The talk also discusses augmenting analysis with machine learning modeling, human-in-the-loop annotation, and quality validation. Topics include spatial indexing, use cases, SQL queries, spatial joins, geometry overlap, and overall architecture.
Syllabus
Introduction
About SafeGraph
Processing
Spatial Indexing
Use Cases
SafeGraph Approach
SQL Query
Spatial Join
Geometry Overlap
Architecture
Blog
Taught by
Databricks
Related Courses
Distributed Computing with Spark SQL (University of California, Davis via Coursera)
Apache Spark (TM) SQL for Data Analysts (Databricks via Coursera)
Building Your First ETL Pipeline Using Azure Databricks (Pluralsight)
Implement a data lakehouse analytics solution with Azure Databricks (Microsoft via Microsoft Learn)
Perform data science with Azure Databricks (Microsoft via Microsoft Learn)