Building Data Lakes on AWS
Offered By: Amazon Web Services via Coursera
Course Description
Overview
This fundamental-level course is designed for individuals with a basic understanding of data storage and processing concepts but little to no prior experience building data lakes on AWS. After a brief introduction to data lakes, we'll cover data ingestion, cataloging, and preparation, concluding with an overview of querying data with Amazon Athena. The course continues with an AWS Lake Formation overview, including a hands-on lab where you'll build a data lake. We'll then introduce data processing and analytics using AWS Glue before diving into automated data lake creation using Lake Formation blueprints. Finally, we'll close with modern data architectures on AWS, with a lab that covers publishing and consuming data products as a service.
Syllabus
- Module 1: Introduction to Data Lakes
- This module provides an overview of data lakes, their purpose, and how they differ from data warehouses. It also covers the components and architectures involved in data lakes.
- Module 2: Data ingestion, cataloging, and preparation
- This module focuses on the processes of ingesting data into a data lake, cataloging the data, and preparing it for analysis. It covers topics such as data lake storage, data ingestion methods, crawling and cataloging data, data formatting, partitioning, compression, and querying data with Amazon Athena.
- Module 3: Building a data lake with AWS Lake Formation
- This module introduces AWS Lake Formation, a service that helps build and manage data lakes on AWS. It covers the basic permission model and provides an overview of the service’s features and capabilities.
- Module 4: Data processing and analytics
- This module covers data transformation techniques and tools like AWS Glue for processing and analyzing data in the data lake. It includes hands-on demos and a technical talk on Glue and Athena Federated Queries.
- Module 5: AWS Lake Formation additional configurations and capabilities
- This module explores advanced features and configurations of AWS Lake Formation, including blueprints, workflows, and fine-grained access control. It also covers data visualization with Amazon QuickSight.
- Module 6: Modern data architecture on AWS
- This module introduces the concept of modern data architecture and its implementation on AWS. It covers data movement scenarios, data sharing models, and relevant readings.
Taught by
Morgan Willis and Rafael Lopes
Related Courses
- Data Lakes for Big Data (EdCast)
- Distributed Computing with Spark SQL (University of California, Davis via Coursera)
- Modernizing Data Lakes and Data Warehouses with Google Cloud (Google Cloud via Coursera)
- Data Engineering with AWS (Udacity)
- Preparing for Google Cloud Certification: Cloud Data Engineer (Google Cloud via Coursera)
