YoVDO

Introduction to Designing Data Lakes on AWS

Offered By: Amazon Web Services via edX

Tags

Amazon Web Services (AWS) Courses, AWS Glue Courses, Amazon Kinesis Courses, Data Lakes Courses, Amazon QuickSight Courses, Data Processing Courses, Data Organization Courses, Data Ingestion Courses

Course Description

Overview

Designing a data lake is challenging because of the scale and growth of data. Developers need to understand best practices to avoid common mistakes that can be hard to rectify later. This course covers the foundations of what a data lake is, how to ingest and organize data into the data lake, and the data processing that can be done to optimize performance and costs when consuming the data at scale. The course is for professionals (architects, system administrators, and DevOps engineers) who need to design and build an architecture for secure and scalable data lake components. Students will learn about the use cases for a data lake and contrast them with a traditional infrastructure of servers and storage.


Syllabus

Week 1: Hello World, I mean, Hello Data Lakes!

  • Video: Meet the Instructors
  • Video: Introduction to Week 1
  • Video: Why Data Lakes?
  • Video: Characteristics of a Data Lake
  • Video: Data Lake Components
  • Reading: Data Lake Characteristics and Components
  • Video: Comparison of a Data Lake to a Data Warehouse
  • Reading: Data Lakes and Data Warehouses
  • Video: Discussing sample Data Lake Architectures
  • Quiz/Assessment: Week 1 quiz

Week 2: AWS data related services

  • Video: Introduction to Week 2
  • Video: AWS Data Lake related services
  • Video: Amazon S3
  • Video: AWS Glue Data Catalog
  • Reading: S3 and Glue Data Catalog
  • Video: AWS Services used for data movement
  • Reading: Kinesis, API Gateway, etc.
  • Video: AWS Services for Data processing
  • Video: AWS Services for Analytics
  • Video: AWS Services used for Predictive Analytics and Machine Learning
  • Reading: EMR, Glue Jobs, Lambda, Kinesis Analytics, Redshift
  • Video: Introduction to AWS Lake Formation
  • Reading: Lake Formation
  • Lab: Get familiar with AWS Services and create your first simple data lake

Week 3: Ingesting the rivers

  • Video: Introduction to Week 3
  • Video: Use the right tool for the job
  • Video: Understanding Data Structure and when to process data
  • Video: Data Streaming ingestion with Amazon Kinesis Services
  • Video: Diving Deep on Amazon Kinesis
  • Demo: Batch Data Ingestion with AWS Transfer Family
  • Reading: Batch Data Ingestion with AWS Services
  • Video: Data Cataloging
  • Demo: Using Glue Crawlers
  • Reading: The importance of data cataloging
  • Video: Reviewing the ingestion part of some Data Lake architectures
  • Lab: Ingesting Web Logs

Week 4: Processing and Analyzing data that sits in the Data Lake

  • Video: Introduction to Week 4
  • Video: Data prep and AWS Glue jobs
  • Video: File optimizations
  • Demo: Using S3, Glue and Athena to get insights about NYC Taxi data
  • Reading: Glue Jobs, Data Prep, Athena, Columnar Data Formats and Amazon Athena Optimizations
  • Video: Introduction to Data Lake security
  • Reading: Security and compliance
  • Video: The power of data visualization
  • Video: Introduction to Amazon QuickSight
  • Demo: Amazon QuickSight
  • Reading: Data visualization, Amazon QuickSight
  • Video: Registry of Open Data on AWS
  • Lab: Create an end-to-end Data Lake with AWS Services
  • Video: Course wrap-up!

Taught by

Rafael Lopes and Morgan Willis

Related Courses

Getting Started with Data Analytics on AWS
Amazon Web Services via edX
Getting Started with Data Analytics on AWS
Amazon Web Services via Coursera
Introduction to Amazon Quicksight
Pluralsight
Amazon (AWS) QuickSight - Getting Started
Udemy
Mastering AWS Glue, QuickSight, Athena & Redshift Spectrum
Udemy