Introduction to Data Science
Offered By: University of Washington via Coursera
Course Description
Overview
Commerce and research are being transformed by data-driven discovery and
prediction. The skills required for data analytics at massive scale (scalable
data management on and off the cloud, parallel algorithms, statistical
modeling, and proficiency with a complex ecosystem of tools and platforms)
span a variety of disciplines and are not easy to obtain through conventional
curricula. This course tours the basic techniques of data science, including
both SQL and NoSQL solutions for massive data management (e.g., MapReduce and
contemporaries), algorithms for data mining (e.g., clustering and association
rule mining), and basic statistical modeling (e.g., linear and non-linear
regression).
Syllabus
Part 0: Introduction 
- Examples, data science articulated, history and context, technology landscape
- Databases and the relational algebra

- Parallel databases, parallel query processing, in-database analytics
- MapReduce, Hadoop, relationship to databases, algorithms, extensions, languages
- Key-value stores and NoSQL; tradeoffs of SQL and NoSQL
- Topics in statistical modeling: basic concepts, experiment design, pitfalls
 
- Topics in machine learning: supervised learning (rules, trees, forests, nearest neighbor, regression), optimization (gradient descent and variants), unsupervised learning
- Visualization, data products, visual data analytics

- Provenance, privacy, ethics, governance
- Graph analytics: structure, traversals, PageRank, community detection, recursive queries, semantic web
- Guest Lectures
Taught by
Bill Howe
Related Courses
- Introduction to Databases (Meta via Coursera)
- Web Development (Udacity)
- Datenmanagement mit SQL (openHPI)
- Sabermetrics 101: Introduction to Baseball Analytics (Boston University via edX)
- Intro to Relational Databases (Udacity)
