Four Rare Machine Learning Skills All Data Scientists Need
Offered By: SAS via Coursera
Course Description
Overview
This course covers four of the most neglected yet critical skills in machine learning: vital techniques that most courses and books omit entirely.
1) UPLIFT MODELING (AKA PERSUASION MODELING): When you're modeling, are you even predicting the right thing?
2) THE ACCURACY FALLACY: When evaluating how well a model works, are you even reporting on the right thing?
3) P-HACKING: Are your simplest discoveries from data even real?
4) THE PARADOX OF ENSEMBLE MODELS: Do you understand how they work, even though they seem to defy Occam's Razor?
>> WHY THESE ADVANCED METHODS ARE ESSENTIAL: Each one addresses a question that is fundamental to machine learning (above). For many projects, success hinges on these particular skills.
>> NO HANDS-ON – BUT FOR TECHNICAL LEARNERS: This course has no coding and no use of machine learning software. Instead, it lays the conceptual groundwork before you take on the hands-on practice. When it comes to these state-of-the-art techniques and prevalent pitfalls, there's a foundation of conceptual knowledge to build before going hands-on – and you'll be glad you did.
>> VENDOR-NEUTRAL: This course includes illuminating software demos of machine learning in action using SAS products. However, the curriculum is vendor-neutral and universally applicable. The contents and learning objectives apply regardless of which machine learning software tools you end up choosing to work with.
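As a rough illustration of the accuracy fallacy named in item 2 of the overview, the sketch below scores a "model" that always predicts the negative class on an imbalanced data set. The data set (1,000 cases, 2% positive) and the helper names `accuracy` and `recall` are assumptions made up for this example; they are not taken from the course.

```python
# Hypothetical illustration: high accuracy from a model that detects nothing.

def accuracy(y_true, y_pred):
    """Fraction of cases where the prediction matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    """Fraction of true positives that the model actually flags."""
    flagged = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(flagged) / len(flagged)

# Assumed data: 20 positive cases out of 1,000 (a 2% base rate).
y_true = [1] * 20 + [0] * 980

# A "model" that always predicts the negative class.
always_negative = [0] * len(y_true)

acc = accuracy(y_true, always_negative)
rec = recall(y_true, always_negative)

print(f"accuracy: {acc:.2%}")  # looks impressive
print(f"recall:   {rec:.2%}")  # yet no positive case is found
```

Under these assumptions the always-negative model scores well on accuracy while finding none of the positive cases, which is why accuracy alone can mislead when one class is rare.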
        
Syllabus
- Four Rare Machine Learning Skills All Data Scientists Need
- This one-week course has a single module, which covers the course's four rare yet vital topics:
(1) UPLIFT MODELING: How do you optimize marketing – which is meant to persuade – if you cannot generally establish causal relationships? Put another way, how do you model and predict influence when you cannot measure influence? Uplift modeling (aka persuasion modeling) is an advanced method that goes beyond predicting an outcome to predicting the influence that a treatment decision would have on that outcome. We'll explore the marketing applications of uplift modeling and see success stories from the likes of US Bank and President Obama's 2012 reelection campaign.
(2) THE ACCURACY FALLACY: For many machine learning projects, high accuracy is unattainable – and, besides, accuracy isn't the right metric in the first place. But many projects are falsely advertised as "highly accurate." Learn to identify occurrences of the accuracy fallacy, a common misstep by which researchers spread misinformation about predictive model performance.
(3) P-HACKING: In what way is bigger data more dangerous? How do we avoid being fooled by random noise and ensure scientific discoveries are trustworthy? This prevalent pitfall is a huge gotcha!
(4) THE PARADOX OF ENSEMBLE MODELS: Is there a way to advance model capability and performance that's elegant and simple, without involving the complexity of neural networks? Why yes, there is.
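To make the uplift idea in topic (1) concrete, here is a minimal sketch that estimates uplift per customer segment as the response rate of treated customers minus the response rate of a control group. The segments, counts, and the names `uplift_by_segment` and `make_group` are hypothetical and chosen only for illustration; this is one simple way to estimate uplift, not necessarily the approach taught in the course.

```python
from collections import defaultdict

def uplift_by_segment(records):
    """Estimate uplift per segment from (segment, treated, responded) records.

    Uplift = response rate among treated - response rate among control.
    """
    # counts[segment][treated] = [number of customers, number of responders]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, responded in records:
        cell = counts[segment][treated]
        cell[0] += 1
        cell[1] += int(responded)

    result = {}
    for segment, groups in counts.items():
        treated_n, treated_resp = groups[True]
        control_n, control_resp = groups[False]
        if treated_n == 0 or control_n == 0:
            continue  # uplift cannot be estimated without both groups
        result[segment] = treated_resp / treated_n - control_resp / control_n
    return result

def make_group(segment, treated, n, responders):
    """Build n hypothetical records, the first `responders` of which respond."""
    return [(segment, treated, i < responders) for i in range(n)]

# Assumed experiment data (invented for this example).
records = (
    make_group("A", True, 100, 30) + make_group("A", False, 100, 10)
    + make_group("B", True, 100, 40) + make_group("B", False, 100, 45)
)

uplift = uplift_by_segment(records)
for segment in sorted(uplift):
    print(segment, round(uplift[segment], 2))
```

In this made-up data, segment B responds more often than segment A when contacted, but its estimated uplift is negative because its control group responds even more often. A model that predicts only the outcome would favor B, whereas an uplift estimate points to A as the segment where the treatment appears to change behavior.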
Taught by
Eric Siegel
Related Courses
Data Analysis, Johns Hopkins University via Coursera
Computing for Data Analysis, Johns Hopkins University via Coursera
Scientific Computing, University of Washington via Coursera
Introduction to Data Science, University of Washington via Coursera
Web Intelligence and Big Data, Indian Institute of Technology Delhi via Coursera
