Introduction to Data Science

Offered By: University of Washington via Coursera

Course Description

Overview

Commerce and research are being transformed by data-driven discovery and prediction. The skills required for data analytics at massive scale (scalable data management on and off the cloud, parallel algorithms, statistical modeling, and proficiency with a complex ecosystem of tools and platforms) span a variety of disciplines and are not easy to obtain through conventional curricula. This course tours the basic techniques of data science, including both SQL and NoSQL solutions for massive data management (e.g., MapReduce and contemporaries), algorithms for data mining (e.g., clustering and association rule mining), and basic statistical modeling (e.g., linear and non-linear regression).

Syllabus

Part 0: Introduction

Examples, data science articulated, history and context, technology landscape

Part 1: Data Manipulation at Scale

Databases and the relational algebra
Parallel databases, parallel query processing, in-database analytics
MapReduce, Hadoop, relationship to databases, algorithms, extensions, languages
Key-value stores and NoSQL; tradeoffs of SQL and NoSQL

Part 2: Analytics

Topics in statistical modeling: basic concepts, experiment design, pitfalls
Topics in machine learning: supervised learning (rules, trees, forests, nearest neighbor, regression), optimization (gradient descent and variants), unsupervised learning

Part 3: Communicating Results

Visualization, data products, visual data analytics
Provenance, privacy, ethics, governance

Part 4: Special Topics

Graph Analytics: structure, traversals, analytics, PageRank, community detection, recursive queries, semantic web
Guest Lectures

Taught by

Bill Howe
