Realtime Streaming with Data Lakehouse - End-to-End Data Engineering Project
Offered By: CodeWithYu via YouTube
Course Description
Overview
Learn to design, implement, and maintain secure, scalable, and cost-effective lakehouse architectures in this comprehensive video tutorial. Explore techniques that combine Apache Spark, Apache Kafka, Apache Flink, Delta Lake, AWS, and open-source tools to support analytics and machine learning on streaming data. Follow step-by-step instructions to set up a Kafka broker in KRaft mode, configure MinIO, produce data into Kafka, acquire S3 access credentials, create an S3 bucket event listener for the lakehouse, and preview the resulting data. Along the way, pick up real-time streaming and data engineering best practices for building robust, scalable data pipelines.
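The tutorial builds these steps interactively on screen rather than providing a reference implementation. As a rough illustration of the kind of producer code involved, the sketch below streams JSON records into Kafka using the confluent-kafka Python client; the broker address and the topic name lakehouse-events are placeholder assumptions, not values taken from the course.

    # Minimal sketch: stream JSON records into a Kafka topic.
    # Assumes a broker running locally in KRaft mode and the
    # confluent-kafka package (pip install confluent-kafka).
    import json
    import time

    from confluent_kafka import Producer

    producer = Producer({"bootstrap.servers": "localhost:9092"})

    def delivery_report(err, msg):
        # Called once per message to confirm delivery or surface errors.
        if err is not None:
            print(f"Delivery failed: {err}")
        else:
            print(f"Delivered to {msg.topic()} [partition {msg.partition()}]")

    for i in range(10):
        record = {"event_id": i, "ts": time.time(), "payload": "example"}
        producer.produce(
            "lakehouse-events",                      # hypothetical topic name
            value=json.dumps(record).encode("utf-8"),
            callback=delivery_report,
        )
        producer.poll(0)                             # serve delivery callbacks

    producer.flush()                                 # wait for outstanding messages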
Syllabus
Setting up Kafka Broker in KRaft Mode
Setting up MinIO
Producing data into Kafka
Acquiring Secret and Access Key for S3
Creating S3 Bucket Event Listener for Lakehouse (see the sketch after this syllabus)
Data Preview and Results
Outro
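As a companion to the "Creating S3 Bucket Event Listener for Lakehouse" chapter, here is a minimal sketch of watching for object-created events on a MinIO bucket with the minio Python SDK. The endpoint, credentials, and bucket name are placeholders, not values shown in the video.

    # Minimal sketch: react to new objects landing in a MinIO (S3-compatible) bucket.
    # Assumes MinIO running locally and the minio package (pip install minio).
    from minio import Minio

    client = Minio(
        "localhost:9000",              # placeholder MinIO endpoint
        access_key="YOUR_ACCESS_KEY",  # acquired from the MinIO console
        secret_key="YOUR_SECRET_KEY",
        secure=False,
    )

    # Block and iterate over bucket notifications as objects are created.
    with client.listen_bucket_notification(
        "lakehouse",                   # hypothetical bucket name
        events=["s3:ObjectCreated:*"],
    ) as events:
        for event in events:
            for record in event.get("Records", []):
                key = record["s3"]["object"]["key"]
                print(f"New object in lakehouse bucket: {key}")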
Taught by
CodeWithYu
Related Courses
内存数据库管理 (In-Memory Database Management)
openHPI
CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX
Processing Big Data with Azure Data Lake Analytics
Microsoft via edX
Google Cloud Big Data and Machine Learning Fundamentals en Español
Google Cloud via Coursera
Google Cloud Big Data and Machine Learning Fundamentals 日本語版
Google Cloud via Coursera