Leveraging Apache Spark and Delta Lake for Efficient Data Encryption at Scale

Offered By: Databricks via YouTube

Tags

Apache Spark Courses
Data Engineering Courses
Data Security Courses
Data Privacy Courses
Data Encryption Courses
Delta Lake Courses

Course Description

Overview

This 25-minute conference talk from Databricks describes how the data engineering team at Mars Petcare built Gecko, a California Consumer Privacy Act (CCPA) compliance ecosystem designed for Apache Spark and Delta Lake. Gecko automates consumer deletion requests, strengthens the security of personally identifiable information (PII), preserves the integrity of non-PII data, and keeps PII accessible when it is needed. The talk covers row-level encryption of PII tables, where the encryption keys are stored, and how Spark and Delta Lake support encryption at scale and automated privacy rights requests. It also covers the design's benefits and future work, including using the labeled dataset that the process generates to train machine learning models for automatic PII detection. The material is aimed at organizations working toward consumer data privacy compliance.
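
The description mentions row-level encryption of PII with separately stored keys, and automated deletion requests. The sketch below shows one common way those ideas can fit together, and is not taken from the talk: each consumer gets a key held in a separate key store, PII columns are encrypted per row, and removing a consumer's key makes that consumer's PII unreadable while non-PII columns stay intact. Every name here (`KeyStore`, `encrypt_row`, `decrypt_row`, the `consumer_id` column) is hypothetical, plain Python dictionaries stand in for Spark rows, and the HMAC-based keystream is a standard-library stand-in so the example runs anywhere; a real pipeline would use an authenticated cipher such as AES-GCM from a vetted library.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with an HMAC-SHA256 counter keystream.

    Illustration only: a stand-in for a real cipher such as AES-GCM.
    Applying it twice with the same key and nonce restores the input.
    """
    out = bytearray()
    for i in range(0, len(data), 32):
        counter = (i // 32).to_bytes(8, "big")
        block = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)


class KeyStore:
    """Hypothetical key table: one random key per consumer, kept apart from the data."""

    def __init__(self):
        self._keys = {}

    def get_or_create(self, consumer_id):
        if consumer_id not in self._keys:
            self._keys[consumer_id] = secrets.token_bytes(32)
        return self._keys[consumer_id]

    def get(self, consumer_id):
        return self._keys.get(consumer_id)

    def delete(self, consumer_id):
        # Dropping the key is what makes the consumer's encrypted PII unreadable.
        self._keys.pop(consumer_id, None)


def encrypt_row(row, pii_columns, store, id_column="consumer_id"):
    """Return a copy of row with PII columns replaced by hex(nonce + ciphertext)."""
    key = store.get_or_create(row[id_column])
    encrypted = dict(row)
    for col in pii_columns:
        nonce = secrets.token_bytes(16)  # fresh nonce per encrypted value
        cipher = _keystream_xor(key, nonce, str(row[col]).encode("utf-8"))
        encrypted[col] = (nonce + cipher).hex()
    return encrypted


def decrypt_row(row, pii_columns, store, id_column="consumer_id"):
    """Return a copy of row with PII decrypted, or None where the key is gone."""
    key = store.get(row[id_column])
    decrypted = dict(row)
    for col in pii_columns:
        if key is None:
            decrypted[col] = None
            continue
        blob = bytes.fromhex(row[col])
        decrypted[col] = _keystream_xor(key, blob[:16], blob[16:]).decode("utf-8")
    return decrypted
```

With this layout, a deletion request comes down to `store.delete(consumer_id)`: the encrypted rows need no rewrite, and a later `decrypt_row` call returns `None` for that consumer's PII columns.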

Syllabus

Intro
Agenda
Authors
The Petcare Data Platform
Our Mission
Gecko Ecosystem
Key Generation
Data Encryption
Optimizing Parquet Encryption
Master Table Generation
Gecko Delete
Benefits
Future Work


Taught by

Databricks

Related Courses

In-Memory Database Management
openHPI
CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX
Processing Big Data with Azure Data Lake Analytics
Microsoft via edX
Google Cloud Big Data and Machine Learning Fundamentals en Español
Google Cloud via Coursera
Google Cloud Big Data and Machine Learning Fundamentals 日本語版
Google Cloud via Coursera