YoVDO

Data Privacy Techniques with Apache Spark - Defensive and Offensive Approaches

Offered By: Databricks via YouTube

Tags

Data Privacy Courses, Apache Spark Courses, Encryption Courses

Course Description

Overview

Explore data privacy techniques for protecting personally identifiable information (PII) in this 27-minute talk from Databricks. Compare offensive and defensive approaches, and learn about k-anonymity, quasi-identifiers, and methods such as suppression, perturbation, obfuscation, encryption, tokenization, and watermarking. See elementary code examples for implementing these techniques when third-party products are unavailable. Examine ways to reduce data exfiltration risk and see how Databricks Delta can help make datasets privacy-ready. Understand the long-term implications of each privacy method for statistical usefulness, re-identification risk, data schema, format preservation, and read/write performance.
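
To make the k-anonymity idea mentioned above concrete, here is a minimal sketch in plain Python. It is not code from the talk. The function name `k_anonymity`, the sample records, and the choice of `zip` and `age_band` as quasi-identifiers are all assumptions made for illustration. A dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k rows, so the sketch simply reports the size of the smallest such group.

```python
from collections import Counter


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group of rows that share the same
    quasi-identifier values (0 for an empty dataset)."""
    groups = Counter(
        tuple(row[column] for column in quasi_identifiers) for row in rows
    )
    return min(groups.values()) if groups else 0


# Hypothetical, already-generalised records used only for illustration.
records = [
    {"zip": "021**", "age_band": "30-39", "diagnosis": "flu"},
    {"zip": "021**", "age_band": "30-39", "diagnosis": "asthma"},
    {"zip": "021**", "age_band": "40-49", "diagnosis": "flu"},
    {"zip": "021**", "age_band": "40-49", "diagnosis": "diabetes"},
]

# Each (zip, age_band) combination appears twice, so k is 2.
k = k_anonymity(records, ["zip", "age_band"])
print(k)
```

On a Spark DataFrame the same check would typically be a group-by on the quasi-identifier columns followed by a minimum over the counts. The pure-Python version is used here only to keep the sketch self-contained.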

Syllabus

Intro
Data Privacy
Offensive techniques
Technique comparison dimensions
Pseudonymization
Hashing
Making hash cracking a bit more difficult
Credit card numbers
Token Vault with Databricks Delta
Synthetic data
Generalisation
Binning
Truncating: IP addresses
Rounding
Auditing
Remote desktop
Screenshot prevention
Feedback
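
Several syllabus topics (hashing, binning, truncating IP addresses, rounding) can be sketched in a few lines of standard-library Python. The code below is an illustrative assumption, not the talk's implementation. The helper names, the `SECRET_KEY` value, and the default widths are hypothetical. Keyed hashing with HMAC-SHA256 is used here as one common way of making hash cracking harder; the talk may use a different approach.

```python
import hashlib
import hmac
import ipaddress

# Hypothetical key for illustration only; a real key belongs in a secret store.
SECRET_KEY = b"example-only-key"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace a value with a keyed hash (HMAC-SHA256, hex encoded).
    Without the key, precomputed hash tables are of no use to an attacker."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def truncate_ipv4(address: str, prefix_len: int = 24) -> str:
    """Zero the host bits of an IPv4 address, keeping the first prefix_len bits."""
    network = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
    return str(network.network_address)


def bin_age(age: int, width: int = 10) -> str:
    """Generalise an exact age into a fixed-width band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def round_amount(amount: float, step: int = 100) -> int:
    """Round a numeric value to a multiple of step to hide exact figures."""
    return int(round(amount / step) * step)


print(pseudonymize("alice@example.com"))
print(truncate_ipv4("192.168.17.42"))
print(bin_age(37))
print(round_amount(1234.56))
```

Each helper trades precision for privacy in a different way: the keyed hash keeps values joinable but unreadable, while truncation, binning, and rounding keep values readable but coarser. This is the kind of trade-off the talk's comparison dimensions (statistical usefulness, re-identification risk, format preservation) are meant to capture.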


Taught by

Databricks

Related Courses

Introduction to Data Analytics for Business
University of Colorado Boulder via Coursera
Digital and the Everyday: from codes to cloud
NPTEL via Swayam
Systems and Application Security
(ISC)² via Coursera
Protecting Health Data in the Modern Age: Getting to Grips with the GDPR
University of Groningen via FutureLearn
Teaching Impacts of Technology: Data Collection, Use, and Privacy
University of California, San Diego via Coursera