Conceptualizing the Processing Model for Apache Flink
Offered By: Pluralsight
Course Description
Overview
Flink is a stateful, fault-tolerant, large-scale stream processing system with low latency and high throughput. It processes both bounded (batch) and unbounded (streaming) datasets on the same stream-first architecture, with unbounded data as its primary focus.
Apache Flink is built on the concept of stream-first architecture, where the stream is the source of truth. Flink offers extensive APIs to process both batch and streaming data in an easy and intuitive manner. In this course, Conceptualizing the Processing Model for Apache Flink, you’ll be introduced to Flink’s architecture and processing APIs to get started on your data analysis journey.
First, you’ll explore the differences between processing batch and streaming data, and understand how stream-first architecture works. You’ll study the stream-first processing model that Flink uses to process data at scale, and Flink’s architecture, which uses a JobManager, TaskManagers, and task slots to execute the operators and streams of a Flink application in a data-parallel manner.
Next, you’ll understand the difference between stateless and stateful stream transformations and apply these concepts hands-on in your own Flink stream processing. You’ll process data statelessly using the map(), flatMap(), and filter() transformations, and use keyed streams and rich functions to work with Flink state.
Finally, you’ll round off your understanding of Flink’s state persistence and fault-tolerance mechanism by exploring its checkpointing architecture. You’ll enable checkpoints and savepoints in your streaming application, see how state can be restored from a snapshot after a failure, and configure your Flink application to support different restart strategies.
When you’re finished with this course, you’ll have the skills and knowledge to design Flink pipelines that perform stateless and stateful transformations, and you’ll be able to build fault-tolerant applications using checkpoints and savepoints.
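The ideas named in the description (stateless map, flatMap, and filter steps, per-key state, and restoring state from a snapshot) can be illustrated without Flink itself. The sketch below is plain Python, not the Flink API: the names tokenize, normalize, drop_short, KeyedCounter, snapshot, and restore are hypothetical, chosen only for illustration, and the "failure" is simulated by discarding one operator object and building another from the saved state.

```python
from collections import defaultdict
from copy import deepcopy


def tokenize(lines):
    """Stateless flatMap-style step: each line yields zero or more words."""
    for line in lines:
        for word in line.split():
            yield word


def normalize(words):
    """Stateless map-style step: exactly one output per input."""
    for word in words:
        yield word.lower()


def drop_short(words, min_len=3):
    """Stateless filter-style step: keep or drop each element independently."""
    for word in words:
        if len(word) >= min_len:
            yield word


class KeyedCounter:
    """Stateful step: keeps one running count per key (hypothetical class)."""

    def __init__(self):
        self.counts = defaultdict(int)

    def process(self, key):
        self.counts[key] += 1
        return key, self.counts[key]

    def snapshot(self):
        # Copy the state so later updates do not change the snapshot.
        return deepcopy(dict(self.counts))

    def restore(self, snap):
        self.counts = defaultdict(int, snap)


def run(lines, counter):
    """Chain the stateless steps, then feed each word to the keyed counter."""
    return [counter.process(w) for w in drop_short(normalize(tokenize(lines)))]


counter = KeyedCounter()
first = run(["Flink streams data", "a stream of data"], counter)
checkpoint = counter.snapshot()

# Simulated failure: throw the operator away, rebuild it from the snapshot,
# and replay the input that arrived after the snapshot was taken.
recovered = KeyedCounter()
recovered.restore(checkpoint)
replayed = run(["more data"], recovered)

print(first)
print(replayed)
```

The stateless steps need nothing beyond the current element, so they can be re-run anywhere; only KeyedCounter has to be snapshotted. The replay step assumes the input after the snapshot can be read again, which in a real deployment depends on the source being able to rewind.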
Taught by
Janani Ravi
Related Courses
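Coding the Matrix: Linear Algebra through Computer Science Applications (Brown University via Coursera)
How Machines Think: An Introduction to Computing Technologies [كيف تفكر الآلات - مقدمة في تقنيات الحوسبة] (King Fahd University of Petroleum and Minerals via Rwaq (رواق))
Data Science and Situational Analysis: Behind the Scenes of Big Data [Datascience et Analyse situationnelle : dans les coulisses du Big Data] (IONIS via IONIS)
Data Lakes for Big Data (EdCast)
Statistics I: Fundamentals of Data Analysis [統計学Ⅰ:データ分析の基礎] (ga014) (University of Tokyo via gacco)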