YoVDO

Leakage and the Reproducibility Crisis in ML-based Science

Offered By: Inside Livermore Lab via YouTube

Tags

Machine Learning Courses, Data Science Courses, Research Ethics Courses, Logistic Regression Courses

Course Description

Overview

Explore data leakage and the reproducibility crisis in machine learning-based science in this 48-minute talk. The talk surveys reproducibility failures caused by leakage across 17 scientific fields, which collectively affect 329 papers and in some cases led to overly optimistic conclusions. It presents a taxonomy of 8 types of leakage, ranging from textbook errors to open research problems, and proposes methodological changes, including model info sheets, for catching leakage before publication. It also covers a reproducibility study in civil war prediction, where the reported advantage of complex ML models over older statistical methods such as logistic regression turns out to stem from data leakage; once the leakage is corrected, the complex models do not perform substantively better. The speaker is Sayash Kapoor, a Ph.D. candidate at Princeton University, whose research on ML methods in science has received recognition and been featured in prominent media outlets.

Syllabus

DSI | Leakage and the Reproducibility Crisis in ML-based Science


Taught by

Inside Livermore Lab

Related Courses

Understanding Research: An Overview for Health Professionals
University of California, San Francisco via Coursera
Solid Science: Research Methods
University of Amsterdam via Coursera
Introduction to Research in the Social Sciences and Humanities
Rwaq (رواق)
Qualitative Research Methods
University of Amsterdam via Coursera
Ethics in University Research
University of the Basque Country via Miríadax