Apache Spark Essential Training: Big Data Engineering
Offered By: LinkedIn Learning
Course Description
Overview
Learn how to make Apache Spark work with other Big Data technologies and put together an end-to-end project that can solve a real-world business problem.
Syllabus
Introduction
- Driving big data engineering with Apache Spark
- Course prerequisites
- Setting up the exercise files
- What is data engineering?
- Data engineering vs. data analytics vs. data science
- Data engineering functions
- Batch vs. real-time processing
- Data engineering with Spark
- Spark architecture review
- Parallel processing with Spark
- Spark execution plan
- Stateful stream processing
- Spark analytics and ML
- Batch processing use case: Problem statement
- Batch processing use case: Design
- Setting up the local DB
- Uploading stock to a central store
- Aggregating stock across warehouses
- Real-time use case: Problem
- Real-time use case: Design
- Generating a visits data stream
- Building a website analytics job
- Executing the real-time pipeline
- Batch vs. real-time options
- Scaling extraction and loading operations
- Scaling processing operations
- Building resiliency
- Project exercise requirements
- Solution design
- Extracting long last actions
- Building a scorecard
- More about Apache Spark
Taught by
Kumaran Ponnambalam
Related Courses
- Apache Kafka Deep Dive (offered by A Cloud Guru)
- Microsoft Certified: Azure Data Engineer Associate (DP-203) (offered by A Cloud Guru)
- Approfondimento sui concetti e gli strumenti per analizzare i dati in streaming (Italiano) | Deep Dive into Concepts and Tools for Analyzing Streaming Data (Italian) (offered by Amazon Web Services via AWS Skill Builder)
- Cloud Computing Concepts: Part 2 (offered by University of Illinois at Urbana-Champaign via Coursera)
- Apache Kafka (offered by LearnKartS via Coursera)