Building Realtime Pipelines in Cloud Data Fusion
Offered By: Google via Google Cloud Skills Boost
Course Description
Overview
In addition to batch pipelines, Data Fusion also lets you create realtime pipelines that process events as they are generated. Currently, realtime pipelines execute using Apache Spark Streaming on Cloud Dataproc clusters. In this lab you will learn how to build a streaming pipeline using Data Fusion.
Syllabus
- GSP808
- Overview
- Setup and requirements
- Task 1. Project permissions
- Task 2. Ensure that the Dataflow API is successfully enabled
- Task 3. Load the data
- Task 4. Setting up Pub/Sub Topic
- Task 5. Add a Pub/Sub subscription
- Task 6. Add necessary permissions for your Cloud Data Fusion instance
- Task 7. Navigate the Cloud Data Fusion UI
- Task 8. Build a realtime pipeline
- Task 9. Send messages into Cloud Pub/Sub
- Task 10. Viewing your pipeline metrics
- Congratulations!
Related Courses
- Google Cloud Fundamentals: Core Infrastructure (Google via Coursera)
- Google Cloud Big Data and Machine Learning Fundamentals (Google Cloud via Coursera)
- Serverless Data Analysis with Google BigQuery and Cloud Dataflow en Français (Google Cloud via Coursera)
- Essential Google Cloud Infrastructure: Foundation (Google Cloud via Coursera)
- Elastic Google Cloud Infrastructure: Scaling and Automation (Google Cloud via Coursera)