Pattern Matching at Scale Using Finite State Machine
Offered By: Strange Loop Conference via YouTube
Course Description
Overview
Explore pattern matching at scale using finite state machines in this conference talk from Strange Loop. Dive into the challenges of locating data that fits patterns within big data from non-homogeneous sources, focusing on Netflix's approach to improving the sign-up experience through experimentation. Learn about a framework for expressing user journey patterns translated into a Non-Deterministic Finite State Machine, inspired by Ken Thompson's 1968 CACM paper. Discover how this state machine is applied across billions of events using Spark, and how it's made accessible to Data Engineers, Scientists, and Analysts. Gain insights into the development of the "Conduit" framework, including design decisions and challenges encountered. The talk covers topics such as graph data models, wildcards, events in sequence, abstract syntax trees, regular expressions, Apache Spark optimizations, and matching multiple patterns simultaneously. Presented by Ajit Koti and Rashmi Shamprasad, experienced engineers from Netflix's Growth Data Engineering team, this session offers valuable knowledge for those interested in large-scale distributed systems, big data solutions, and data engineering.
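As a rough illustration of the idea described above, here is a minimal Python sketch of a Thompson-style non-deterministic state machine matched against sequences of user events. It is not the Conduit framework's code. The helper names (event, seq, alt, star, compile_pattern, matches), the "*" wildcard symbol, and the event names are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Optional

ANY = "*"  # assumed wildcard symbol: matches any single event


@dataclass(eq=False)  # eq=False keeps identity hashing, so states can live in sets
class State:
    # label is None for an epsilon state (consumes nothing), ANY for a
    # wildcard, or an event name that must match exactly.
    label: Optional[str] = None
    out: list = field(default_factory=list)
    is_match: bool = False


@dataclass
class Fragment:
    start: State
    end: State  # dangling epsilon state, wired up by the enclosing operator


def event(name):
    """Fragment that consumes exactly one event called `name` (or any, if ANY)."""
    s, e = State(label=name), State()
    s.out.append(e)
    return Fragment(s, e)


def seq(*frags):
    """Events in sequence: link each fragment's end to the next one's start."""
    for a, b in zip(frags, frags[1:]):
        a.end.out.append(b.start)
    return Fragment(frags[0].start, frags[-1].end)


def alt(*frags):
    """Alternation: any one of the fragments may match."""
    s, e = State(), State()
    for f in frags:
        s.out.append(f.start)
        f.end.out.append(e)
    return Fragment(s, e)


def star(frag):
    """Unbounded repeat: zero or more occurrences of the fragment."""
    s, e = State(), State()
    s.out += [frag.start, e]
    frag.end.out += [frag.start, e]
    return Fragment(s, e)


def compile_pattern(frag):
    """Mark the fragment's end as the match state and return the start state."""
    frag.end.is_match = True
    return frag.start


def closure(states):
    """Follow epsilon transitions to collect every state reachable for free."""
    stack, seen = list(states), set()
    while stack:
        s = stack.pop()
        if s in seen:
            continue
        seen.add(s)
        if s.label is None:
            stack.extend(s.out)
    return seen


def matches(start, events):
    """Track all active states at once; True if the whole sequence matches."""
    current = closure({start})
    for ev in events:
        nxt = set()
        for s in current:
            if s.label is not None and (s.label == ANY or s.label == ev):
                nxt.update(s.out)
        current = closure(nxt)
        if not current:
            return False
    return any(s.is_match for s in current)


# Hypothetical journey: a plan selection eventually followed by a payment.
pattern = seq(
    star(event(ANY)),
    event("plan_selection"),
    star(event(ANY)),
    event("provide_payment"),
)
start = compile_pattern(pattern)

journeys = {
    "user_a": ["login", "plan_selection", "help_page", "provide_payment"],
    "user_b": ["login", "plan_selection"],
}
matching_users = [u for u, evs in journeys.items() if matches(start, evs)]
```

Because the simulation keeps a set of active states rather than backtracking, each event is examined once per active state. In a distributed setting, a function like matches could presumably be applied to each user's event sequence within a partition, though how the talk's framework does this is not specified here.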
Syllabus
Introduction
Example
Challenges
Common Solutions
Graph Data Models
Requirements
Demo
Questions
Wildcard
Events
Events in Sequence
Results
Who did that
Changing the expression
Summary statistics
Conclusion
Ajit Koti
Guiding Principles
Building Blocks
Abstract Syntax Trees
Finite State Machine
Regular Expressions
Syntax Tree
State Machine
Bounded Repeat
Methodology
Unbounded Repeat
Match state
Evaluation
Plan Selection
Provide Payment
Login Event
Apache Spark
Map Partition
Optimizations
Matching multiple patterns simultaneously
Taught by
Strange Loop Conference