YoVDO

Perform data engineering with Azure Synapse Apache Spark Pools

Offered By: Microsoft via Microsoft Learn

Tags

  • Microsoft Azure Courses
  • Big Data Courses
  • Apache Spark Courses
  • Azure Synapse Analytics Courses
  • Data Transformation Courses
  • Data Engineering Courses
  • DataFrames Courses
  • Data Ingestion Courses

Course Description

Overview

  • Module 1: Understand big data engineering with Apache Spark in Azure Synapse Analytics
  • After completing this module, you will be able to:

    • Differentiate between Apache Spark and Spark pools
    • Differentiate between Azure Databricks and Spark pools
    • Differentiate between HDInsight and Spark Pools
    • Differentiate between Spark Pools and SQL Pools
    • Understand the use-cases for data engineering with Apache Spark in Azure Synapse Analytics
    • Create a Spark pool in Azure Synapse Analytics
  • Module 2: Ingest data with Apache Spark notebooks in Azure Synapse Analytics
  • After completing this module, you will be able to:

    • Understand the use-cases for Spark Notebooks
    • Create a Spark Notebook in Azure Synapse Analytics
    • Understand the supported languages in Spark Notebooks
    • Develop Spark Notebooks
    • Run Spark Notebooks
    • Load data in Spark Notebooks
    • Save Spark Notebooks
  • Module 3: Transform data with DataFrames in Apache Spark Pools in Azure Synapse Analytics
  • After completing this module, you will be able to:

    • Understand DataFrames in Spark Pools in Azure Synapse Analytics
    • Load data into a Spark DataFrame
    • Create a Spark table
    • Write data to and read data from a storage account
    • Load a streaming DataFrame into Apache Spark
    • Flatten nested structures and explode arrays with Apache Spark
  • Module 4: Integrate SQL and Apache Spark pools in Azure Synapse Analytics
  • After completing this module, you will be able to:

    • Describe the integration methods between SQL and Spark Pools in Azure Synapse Analytics
    • Understand the use-cases for SQL and Spark Pools integration
    • Authenticate in Azure Synapse Analytics
    • Transfer data between SQL and Spark Pools in Azure Synapse Analytics
    • Authenticate between Spark and SQL Pools in Azure Synapse Analytics
    • Integrate SQL and Spark Pools in Azure Synapse Analytics
    • Externalize the use of Spark Pools within the Azure Synapse workspace
    • Transfer data outside the Synapse workspace using SQL Authentication
    • Transfer data outside the Synapse workspace using the PySpark Connector
    • Transform data in Apache Spark and write back to SQL Pool in Azure Synapse Analytics
  • Module 5: Monitor and manage data engineering workloads with Apache Spark in Azure Synapse Analytics
  • After completing this module, you will be able to:

    • Monitor Spark Pools in Azure Synapse Analytics
    • Understand Resource Utilization of Spark Pools in Azure Synapse Analytics
    • Monitor Query activity of Spark Pools in Azure Synapse Analytics
    • Baseline Apache Spark performance with Apache Spark History Server in Azure Synapse Analytics
    • Optimize Apache Spark jobs in Azure Synapse Analytics
    • Automate scaling of Apache Spark pools in Azure Synapse Analytics

Syllabus

  • Module 1: Understand big data engineering with Apache Spark in Azure Synapse Analytics
    • Introduction
    • What is an Apache Spark pool in Azure Synapse Analytics
    • How do Apache Spark pools work in Azure Synapse Analytics
    • When do you use Apache Spark pools in Azure Synapse Analytics
    • Knowledge check
    • Summary
  • Module 2: Ingest data with Apache Spark notebooks in Azure Synapse Analytics
    • Introduction
    • Introduction to Spark notebooks
    • Understand the use-cases for Spark notebooks
    • Exercise: Create a Spark notebook in Azure Synapse Analytics
    • Discover supported languages in Spark notebooks
    • Develop Spark notebooks
    • Exercise: Develop Spark notebooks
    • Run Spark notebooks
    • Exercise: Run Spark notebooks
    • Load data in Spark notebooks
    • Exercise: Load data in Spark notebooks
    • Save Spark notebooks
    • Knowledge check
    • Summary
  • Module 3: Transform data with DataFrames in Apache Spark Pools in Azure Synapse Analytics
    • Introduction
    • Introduction to DataFrames in Spark pools in Azure Synapse Analytics
    • Load data into a Spark DataFrame
    • Exercise: Load data into a Spark DataFrame
    • Exercise: Create a Spark table
    • Flatten nested structures and explode arrays with Apache Spark
    • Exercise: Flatten nested structures and explode arrays with Apache Spark in Synapse
    • Knowledge check
    • Summary
  • Module 4: Integrate SQL and Apache Spark pools in Azure Synapse Analytics
    • Introduction
    • Describe the integration methods between SQL and Spark pools in Azure Synapse Analytics
    • Understand the use-cases for SQL and Spark pools integration
    • Authenticate in Azure Synapse Analytics
    • Transfer data between SQL and Spark pools in Azure Synapse Analytics
    • Authenticate between Spark and SQL pools in Azure Synapse Analytics
    • Exercise: Integrate SQL and Spark pools in Azure Synapse Analytics
    • Externalize the use of Spark pools within the Azure Synapse workspace
    • Transfer data outside the Synapse workspace using the PySpark connector
    • Knowledge check
    • Summary
  • Module 5: Monitor and manage data engineering workloads with Apache Spark in Azure Synapse Analytics
    • Introduction
    • Monitor Spark pools in Azure Synapse Analytics
    • Baseline Apache Spark performance with Apache Spark History Server in Azure Synapse Analytics
    • Optimize Apache Spark jobs in Azure Synapse Analytics
    • Automate scaling of Apache Spark pools in Azure Synapse Analytics
    • Knowledge check
    • Summary

Related Courses

  • ETL and ELT Basics, offered by A Cloud Guru
  • Programming Use Cases with Python, offered by A Cloud Guru
  • Microsoft Power BI: Advanced Data Analysis and Visualisation, offered by Cloudswyft via FutureLearn
  • Amazon Connect Data Streaming Intermediate, offered by Amazon Web Services via AWS Skill Builder
  • Lab - Analyze and Prepare Data with Amazon SageMaker Data Wrangler and Amazon EMR (Portuguese (Brazil)), offered by Amazon Web Services via AWS Skill Builder