3rd Party Data Burns
Offered By: YouTube
Course Description
Overview
Explore a conference talk that delves into the complexities of handling third-party data sets, focusing on normalization techniques, wildcard usage versus domain:key approaches, and lessons learned from red-teaming experiences. Learn about the growth of a data analysis tool, measured using Chris Roberts' metric, and discover intriguing insights about data quality issues, including the prevalence of unusual entries in large-scale SQL dumps. Gain valuable knowledge on inheriting and managing complex data sets, cleaning processes, and the importance of data integrity in cybersecurity and information analysis.
Syllabus
3RD PARTY DATA BURNS
the traditional about me moment!
you inherit the complexity of the data-set by virtue. so the more of a mess it is, the more cleaning up you need to do.
normalization, wild-cards vs. domain:key, and the 'Big Lesson' learned
when we've been used in red-teaming, the tool has kicked ass!
I've been using the Chris Roberts' metric to track the growth of the tool.
the elephant in the room
just check the email-field of any recent SQL dump, to verify that.
731,308,683 Unique Documents Indexed
of course 42 people used the zip code for the State of Michigan Department of Treasury.
Related Courses
Cryptography I
Stanford University via Coursera
MongoDB Advanced Deployment and Operations
MongoDB University
Developing SQL Databases
Microsoft via edX
Six Sigma Tools for Define and Measure
University System of Georgia via Coursera
Using clinical health data for better healthcare
The University of Sydney via Coursera