Apache Spark Essential Training: Big Data Engineering

Offered By: LinkedIn Learning

Tags

Apache Spark Courses, Data Engineering Courses, Stream Processing Courses, Parallel Processing Courses, Batch Processing Courses

Course Description

Overview

Learn how to integrate Apache Spark with other big data technologies and build an end-to-end project that solves a real-world business problem.

Syllabus

Introduction
  • Driving big data engineering with Apache Spark
  • Course prerequisites
  • Setting up the exercise files
1. Data Engineering Concepts
  • What is data engineering?
  • Data engineering vs. data analytics vs. data science
  • Data engineering functions
  • Batch vs. real-time processing
  • Data engineering with Spark
2. Spark Capabilities for ETL
  • Spark architecture review
  • Parallel processing with Spark
  • Spark execution plan
  • Stateful stream processing
  • Spark analytics and ML
3. Batch Processing Pipelines
  • Batch processing use case: Problem statement
  • Batch processing use case: Design
  • Setting up the local DB
  • Uploading stock to a central store
  • Aggregating stock across warehouses
4. Real-Time Processing Pipelines
  • Real-time use case: Problem
  • Real-time use case: Design
  • Generating a visits data stream
  • Building a website analytics job
  • Executing the real-time pipeline
5. Data Engineering with Spark: Best Practices
  • Batch vs. real-time options
  • Scaling extraction and loading operations
  • Scaling processing operations
  • Building resiliency
6. End-to-End Exercise Project
  • Project exercise requirements
  • Solution design
  • Extracting long last actions
  • Building a scorecard
Conclusion
  • More about Apache Spark

Taught by

Kumaran Ponnambalam

Related Courses

  • Apache Kafka Deep Dive (offered by A Cloud Guru)
  • Microsoft Certified: Azure Data Engineer Associate (DP-203) (offered by A Cloud Guru)
  • Approfondimento sui concetti e gli strumenti per analizzare i dati in streaming (Italiano) | Deep Dive into Concepts and Tools for Analyzing Streaming Data (Italian) (offered by Amazon Web Services via AWS Skill Builder)
  • Cloud Computing Concepts: Part 2 (offered by University of Illinois at Urbana-Champaign via Coursera)
  • Apache Kafka (offered by LearnKartS via Coursera)