Introduction to Data Engineering
Offered By: DeepLearning.AI via Coursera
Course Description
Overview
In this course, you will be introduced to the data engineering lifecycle, from data generation in source systems through ingestion, transformation, and storage to serving data to downstream stakeholders. You’ll study the key undercurrents that affect all stages of the lifecycle and start developing a framework for how to think like a data engineer. To gain hands-on practice, you’ll gather stakeholder needs, translate those needs into system requirements, and choose tools and technologies to build systems that provide business value. By the end of this course, you’ll be spinning up batch and streaming data pipelines to serve product recommendations on the AWS cloud!
Syllabus
- Introduction to Data Engineering
  - Gain a high-level overview of the data engineering lifecycle and its key undercurrents to understand how data engineers add business value to organizations. Begin developing a mental framework for thinking like a data engineer, starting with gathering stakeholder needs and translating them into system requirements. Learn the basics of working in the cloud from an AWS expert.
- The Data Engineering Lifecycle and Undercurrents
  - Dive deeper into the stages of the data engineering lifecycle and its key undercurrents. Build an end-to-end data pipeline on AWS that encompasses all the stages of the data engineering lifecycle.
- Data Architecture
  - Define data architecture and how it fits within the larger enterprise architecture. Examine the principles of good data architecture and how these principles inform tool and technology choices. Evaluate and optimize the security, performance, reliability, cost-efficiency, and scalability of a web application hosted on AWS.
- Translating Requirements to Architecture
  - Practice gathering stakeholder needs and translating them into system requirements. Choose the appropriate tools and technologies based on the system requirements, then build an end-to-end data system with a batch component and a streaming component that trains a product recommendation system and serves product recommendations to a sales platform.
Taught by
Joe Reis
Related Courses
- Google Cloud Big Data and Machine Learning Fundamentals en Español (Google Cloud via Coursera)
- Data Analysis with Python (IBM via Coursera)
- Intro to TensorFlow 日本語版 (Google Cloud via Coursera)
- TensorFlow on Google Cloud - Français (Google Cloud via Coursera)
- Freedom of Data with SAP Data Hub (SAP Learning)