Building a Cloud Data Lake with Databricks and AWS - Best Practices and Implementation
Offered By: Databricks via YouTube
Course Description
Overview
Syllabus
Intro
What is a data lake?
A data lake architecture enables data science
Data lakes and analytics from AWS
Amazon Simple Storage Service (S3): secure, highly scalable, durable object storage with millisecond latency for data access
Most ways to transfer data into the data lake: open and comprehensive
Most comprehensive and open
Cloud data lakes are great for data storage: a data lake is a file system that supports
Organizations want to operationalize data lakes: this requires the features you expect from a database, such as transactions
A new standard for building data lakes
Data reliability challenges with data lakes
Performance challenges with data lakes
Delta Lake: Adds Reliability & Performance
Delta Lake
Integration with Glue
Integration with Redshift
Cloud native enterprise solution
Best practices for building a cloud data lake
Databricks & AWS data lake implementation
Taught by
Databricks
Related Courses
Distributed Computing with Spark SQL
University of California, Davis via Coursera
Apache Spark (TM) SQL for Data Analysts
Databricks via Coursera
Building Your First ETL Pipeline Using Azure Databricks
Pluralsight
Implement a data lakehouse analytics solution with Azure Databricks
Microsoft via Microsoft Learn
Perform data science with Azure Databricks
Microsoft via Microsoft Learn