Immutable Data Science with Datomic, Spark and Kafka
Offered By: Strange Loop Conference via YouTube
Course Description
Overview
Explore an innovative approach to data science architecture in this conference talk that leverages Datomic, Spark, and Kafka for scalable real-time analysis of production data without traditional ETL techniques. Discover how immutability, consistent timelines, and multi-database querying enable machine learning models with full traceability in a microservices architecture. Learn about modern stored procedures, pass-by-reference queries, horizontal read scalability, and an immutable messaging substrate. Gain insights into an alternative to lambda and kappa architectures, addressing sensitive data encryption and information security concerns. Understand how this solution eliminates the need for ETL and database synchronization pipelines while maintaining scalability and isolation for both transactional and analytical use cases.
Syllabus
Intro
Microservices
Board
How is it stored?
How is it queried?
How do we get ?
Enriched entity
Entity from cursor and id
Multiple DBs
No interference
Using it
Sample message
Sample query
Model service
Output
Scoring time
Training time
RDDs: our use case
Sharding queries
Data access
Learning curve
Testimonials
Taught by
Strange Loop Conference
Related Courses
Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent