What is Your Data Worth? Equitable Data Valuation in Machine Learning
Offered By: Simons Institute via YouTube
Course Description
Overview
Explore equitable data valuation in machine learning in this 44-minute lecture by James Zou of Stanford University. The lecture explains why data value needs to be measured and what determines it in a machine learning setting. It introduces the Leave One Out Method, the properties a fair valuation should satisfy, and the Data Shapley Value. Case studies include UK Biobank lung cancer prediction, domain adaptation for face recognition, dermatology classification, and clinical notes NLP. See how removing low-value data and adding high-value data each improve prediction accuracy, and how negative Shapley values can identify mislabeled data. The lecture closes with efficient methods for approximating Data Shapley and new frontiers in data valuation.
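As a rough illustration of the two valuation methods named above, the following Python sketch compares leave-one-out scores with exact Shapley values on a toy problem. Everything in it is an assumption made for illustration and is not taken from the lecture: the three training points, the four validation points, the 1-nearest-neighbour model, and the chance-level utility of 0.5 for an empty training set. The Shapley values are computed by enumerating every ordering of the training points, which is only feasible for a handful of points; larger datasets need an approximation such as sampling orderings.

```python
from itertools import permutations

# Hypothetical toy data: (feature, label) pairs.
# The third training point is deliberately mislabeled: x = 0.1 lies
# next to the class-0 region but carries label 1.
TRAIN = [(0.0, 0), (1.0, 1), (0.1, 1)]
VALID = [(0.08, 0), (0.12, 0), (0.9, 1), (1.1, 1)]


def utility(indices):
    """Validation accuracy of a 1-nearest-neighbour model trained on TRAIN[indices]."""
    subset = [TRAIN[i] for i in indices]
    if not subset:
        return 0.5  # assumed chance-level baseline for an empty training set
    correct = 0
    for x, y in VALID:
        # Predict the label of the closest training point.
        _, pred = min(subset, key=lambda p: abs(p[0] - x))
        correct += pred == y
    return correct / len(VALID)


def leave_one_out(n):
    """Value of point i = utility of the full set minus utility without i."""
    full = utility(range(n))
    return [full - utility([j for j in range(n) if j != i]) for i in range(n)]


def exact_shapley(n):
    """Average marginal contribution of each point over all n! orderings."""
    values = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        seen = []
        prev = utility(seen)
        for i in order:
            seen.append(i)
            cur = utility(seen)
            values[i] += cur - prev  # marginal gain from adding point i
            prev = cur
    return [v / len(orderings) for v in values]


loo = leave_one_out(len(TRAIN))
shap = exact_shapley(len(TRAIN))
print("leave-one-out:", loo)
print("shapley:      ", [round(v, 4) for v in shap])
```

On this toy data the mislabeled point gets a negative value under both methods. Leave-one-out assigns zero to both correctly labeled points, while the Shapley values give them a small positive score and sum to the difference between the utility of the full training set and that of the empty set.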
Syllabus
Intro
If data is fuel, then we need to measure its value
Data value in the context of ML
Ingredients of Data Value in ML
Leave One Out Method
Desirable properties
Data Shapley Value
Applications of Data Shapley
UK Biobank Lung Cancer prediction
Removing low value data improves prediction
Adding high value data improves prediction
Negative Shapley identifies mislabeled data
Domain adaptation: face recognition
Dermatology classification
Clinical notes NLP
Efficiently approximating data Shapley
New frontiers of data valuation
Discussion
Taught by
Simons Institute
Related Courses
Data Science Basics (A Cloud Guru)
Introduction to Machine Learning (A Cloud Guru)
Address Business Issues with Data Science (CertNexus via Coursera)
Advanced Clinical Data Science (University of Colorado System via Coursera)
Advanced Data Science Capstone (IBM via Coursera)