Pattern Matching at Scale Using Finite State Machine

Offered By: Strange Loop Conference via YouTube

Tags

Strange Loop Conference, Data Analysis, Regular Expressions, Finite State Machine, Abstract Syntax Tree, Syntax Trees

Course Description

Overview

This Strange Loop conference talk covers pattern matching at scale using finite state machines. It addresses the problem of locating data that fits a pattern within big data from non-homogeneous sources, using Netflix's experimentation on the sign-up experience as the motivating case. The speakers describe a framework in which user journey patterns are expressed and then translated into a non-deterministic finite state machine, inspired by Ken Thompson's 1968 CACM paper. The state machine is applied across billions of events using Spark and is made accessible to data engineers, data scientists, and analysts. The talk also covers the development of the "Conduit" framework, including design decisions and challenges encountered. Topics include graph data models, wildcards, events in sequence, abstract syntax trees, regular expressions, Apache Spark optimizations, and matching multiple patterns simultaneously. The presenters, Ajit Koti and Rashmi Shamprasad, are engineers on Netflix's Growth Data Engineering team.
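
To make the idea of translating a journey pattern into a non-deterministic finite state machine concrete, here is a minimal Python sketch in the spirit of Thompson's construction. It is not the Conduit framework's API: the node classes (`Event`, `Wildcard`, `Seq`, `Star`), the `matches` function, and the event names are all hypothetical and chosen only for illustration. The sketch builds a small NFA from a pattern tree and tracks the set of active states while reading one user's events in order.

```python
from dataclasses import dataclass


# Hypothetical pattern nodes; these are illustrative, not Conduit's real types.
@dataclass(frozen=True)
class Event:
    name: str  # matches exactly one event with this name


@dataclass(frozen=True)
class Wildcard:
    pass  # matches any single event


@dataclass(frozen=True)
class Seq:
    parts: tuple  # sub-patterns that must match one after another


@dataclass(frozen=True)
class Star:
    inner: object  # zero or more repetitions (an unbounded repeat)


class NFA:
    def __init__(self):
        self.eps = []   # eps[s]: states reachable from s without reading an event
        self.step = []  # step[s]: (label, target) or None; label None = wildcard

    def new_state(self):
        self.eps.append([])
        self.step.append(None)
        return len(self.eps) - 1


def build(node, nfa):
    """Add the fragment for `node` to `nfa` and return its (start, end) states."""
    if isinstance(node, (Event, Wildcard)):
        start, end = nfa.new_state(), nfa.new_state()
        label = node.name if isinstance(node, Event) else None
        nfa.step[start] = (label, end)
        return start, end
    if isinstance(node, Seq):
        start = end = nfa.new_state()
        for part in node.parts:
            s, e = build(part, nfa)
            nfa.eps[end].append(s)  # chain the fragments together
            end = e
        return start, end
    if isinstance(node, Star):
        hub = nfa.new_state()
        s, e = build(node.inner, nfa)
        nfa.eps[hub].append(s)  # enter the loop body
        nfa.eps[e].append(hub)  # come back to repeat or leave
        return hub, hub
    raise TypeError(f"unknown pattern node: {node!r}")


def closure(nfa, states):
    """Expand a set of states along epsilon edges."""
    stack, seen = list(states), set(states)
    while stack:
        for t in nfa.eps[stack.pop()]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def matches(pattern, events):
    """Return True if the whole event sequence matches the pattern."""
    nfa = NFA()
    start, accept = build(pattern, nfa)
    current = closure(nfa, {start})
    for ev in events:
        moved = set()
        for s in current:
            edge = nfa.step[s]
            if edge is not None and (edge[0] is None or edge[0] == ev):
                moved.add(edge[1])
        current = closure(nfa, moved)
        if not current:
            return False  # no active state survives this event
    return accept in current


# Hypothetical journey: sign-up start, any events in between, then payment.
pattern = Seq((Event("signup_start"), Star(Wildcard()), Event("provide_payment")))
print(matches(pattern, ["signup_start", "plan_selection", "provide_payment"]))
print(matches(pattern, ["signup_start", "plan_selection"]))
```

Because the matcher keeps a set of active states instead of backtracking, each event is read once, and the work per event is bounded by the number of states in the pattern.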

Syllabus

Introduction
Example
Challenges
Common Solutions
Graph Data Models
Requirements
Demo
Questions
Wildcard
Events
Events in Sequence
Results
Who did that
Changing the expression
Summary statistics
Conclusion
Ajit Koti
Guiding Principles
Building Blocks
Abstract Syntax Trees
Finite State Machine
Regular Expressions
Syntax Tree
State Machine
Bounded Repeat
Methodology
Unbounded Repeat
Match state
Evaluation
Plan Selection
Provide Payment
Login Event
Apache Spark
Map Partition
Optimizations
Matching multiple patterns simultaneously
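
As a rough illustration of the last few syllabus topics (per-partition processing and checking several patterns in one pass), the following Python sketch evaluates a set of patterns against each user's ordered events. It is an assumption-laden stand-in, not the talk's implementation: it uses Python's built-in `re` module (a backtracking engine, not a Thompson-style NFA), encodes each event type as a single character, and every name in it (`EVENT_CODES`, `PATTERNS`, `match_partition`, the event and pattern names) is hypothetical. The function takes an iterator of rows and yields results lazily, which is the general shape of function that PySpark's `RDD.mapPartitions` accepts.

```python
import re

# Hypothetical alphabet: one character per event type.
EVENT_CODES = {
    "signup_start": "s",
    "plan_selection": "p",
    "provide_payment": "y",
    "login": "l",
}

# Hypothetical patterns written over the encoded journey string.
PATTERNS = {
    # sign-up start, later a plan selection, later a payment
    "paid_after_plan": re.compile(r"s.*p.*y"),
    # two or more consecutive logins (a repeat with a lower bound)
    "repeat_login": re.compile(r"l{2,}"),
}


def match_partition(rows):
    """Yield (user_id, matched pattern names) for every row in one partition.

    rows: iterable of (user_id, list of event names in time order).
    Each journey is encoded once and then checked against all patterns.
    """
    for user_id, events in rows:
        # Unknown event types become "?", which only "." can match.
        encoded = "".join(EVENT_CODES.get(e, "?") for e in events)
        hits = sorted(name for name, rx in PATTERNS.items() if rx.search(encoded))
        yield user_id, hits


rows = [
    ("u1", ["signup_start", "plan_selection", "provide_payment"]),
    ("u2", ["login", "login", "signup_start"]),
    ("u3", ["login"]),
]
for user_id, hits in match_partition(iter(rows)):
    print(user_id, hits)
```

Encoding each journey once and testing every pattern against it avoids rereading the events per pattern. A combined automaton that advances all patterns together on each event would be a further step that this sketch does not attempt.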


Taught by

Strange Loop Conference
