Data Privacy Techniques with Apache Spark - Defensive and Offensive Approaches
Offered By: Databricks via YouTube
Course Description
Overview
Explore data privacy techniques and protection of personally identifiable information in this 27-minute talk from Databricks. Compare offensive and defensive approaches, learning about k-anonymity, quasi-identifiers, and various methods like suppression, perturbation, obfuscation, encryption, tokenization, and watermarking. Discover elementary code examples for implementing these techniques when third-party products are unavailable. Examine approaches to minimize data exfiltration risks and understand how Databricks Delta can assist in making datasets privacy-ready. Gain insights into the long-term implications of different privacy methods and their effects on statistical usefulness, re-identification risks, data schema, format preservation, and read/write performance.
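As an illustration of the k-anonymity idea mentioned above, here is a minimal sketch in plain Python. It is not taken from the talk. The record fields (`zip`, `age_band`, `diagnosis`) and both helper names are hypothetical, chosen only for this example. A dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k rows; suppression is shown here as simply dropping rows from groups that are too small.

```python
from collections import Counter


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group of rows that share
    the same values for all quasi-identifier columns (0 if empty)."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values()) if groups else 0


def suppress_small_groups(rows, quasi_identifiers, k):
    """Suppression: drop every row whose quasi-identifier group has fewer than k members."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return [
        row for row in rows
        if groups[tuple(row[q] for q in quasi_identifiers)] >= k
    ]


# Hypothetical toy records; field names are assumptions for illustration only.
records = [
    {"zip": "94105", "age_band": "30-39", "diagnosis": "A"},
    {"zip": "94105", "age_band": "30-39", "diagnosis": "B"},
    {"zip": "94107", "age_band": "40-49", "diagnosis": "A"},
]

quasi = ["zip", "age_band"]
k_before = k_anonymity(records, quasi)        # the 94107 group has a single row
safe_records = suppress_small_groups(records, quasi, k=2)
k_after = k_anonymity(safe_records, quasi)
```

Suppression raises k at the cost of losing rows, which is one instance of the trade-off between re-identification risk and statistical usefulness that the talk compares across techniques.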
Syllabus
Intro
Data Privacy
Offensive techniques
Technique comparison dimensions
Pseudonymization
Hashing
Making hash cracking a bit more difficult
Credit card numbers
Token Vault with Databricks Delta
Synthetic data
Generalisation
Binning
Truncating: IP addresses
Rounding
Auditing
Remote desktop
Screenshot prevention
Feedback
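Several of the syllabus topics above (hashing, binning, truncating IP addresses, rounding) can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the code shown in the talk. All function names and default parameters are hypothetical. In particular, a keyed hash (HMAC-SHA256) is used here as one common way to make hash cracking harder; the talk may use a different approach.

```python
import hashlib
import hmac
import ipaddress


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Keyed hash (HMAC-SHA256): deterministic for a given key, but an
    attacker without the key cannot precompute a lookup table of hashes."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def bin_age(age: int, width: int = 10) -> str:
    """Binning: replace an exact age with a fixed-width range label."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def truncate_ipv4(address: str, prefix_len: int = 24) -> str:
    """Truncating: keep only the network part of an IPv4 address."""
    network = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
    return str(network.network_address)


def round_amount(amount: float, step: int = 100) -> int:
    """Rounding: snap a numeric value to the nearest multiple of step."""
    return int(round(amount / step) * step)


# Hypothetical inputs, for illustration only.
key = b"example-secret-key"          # assumption: in practice, load from a secret store
token = pseudonymize("alice@example.com", key)
age_band = bin_age(37)               # "30-39"
masked_ip = truncate_ipv4("192.168.17.42")   # "192.168.17.0"
rounded = round_amount(1234.56)      # 1200
```

Each function trades precision for privacy in a different way: the keyed hash preserves joinability but not readability, while binning, truncating, and rounding preserve approximate statistical meaning and change the stored value's format or granularity.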
Taught by
Databricks
Related Courses
Internet History, Technology, and Security (University of Michigan via Coursera)
Sicherheit im Internet [Internet Security] (openHPI)
أساسيات التشفير [Fundamentals of Encryption] (Rwaq (رواق))
Desarrollo de Aplicaciones Web: Seguridad [Web Application Development: Security] (University of New Mexico via Coursera)
Web Application Development: Security (University of New Mexico via Coursera)