Data Wrangling with Python Project
Offered By: University of Colorado Boulder via Coursera
Course Description
Overview
The "Data Wrangling Project" course gives students an opportunity to apply the knowledge gained throughout the specialization to a real-life data wrangling project of their own choosing. Participants follow the data wrangling pipeline step by step, from identifying data sources to processing and integrating data, to produce a clean dataset that is ready for analysis.
Students work on their project throughout the course, applying the skills from each module as they go. The course offers hands-on experience with the full data wrangling process and prepares participants to handle complex data challenges in real-world scenarios across diverse domains.
Syllabus
- Data Wrangling Pipeline
  - In this introductory week, you will gain an understanding of the data wrangling pipeline, a structured approach to transforming raw data into a cleaned and organized dataset for analysis. You will learn the key stages of the pipeline, which sets the foundation for the rest of the course.
- Identify Your Data
  - This week, you will learn how to define the scope and objectives of your data wrangling project. You will explore various data sources, understand their structure, and assess the suitability of each source for the project.
- Data Collection and Integration
  - This week covers the data collection and integration stage of the data wrangling process. You will learn techniques for collecting data, validating the collected data, and integrating data from multiple sources.
- Data Understanding and Visualization
  - This week focuses on gaining a comprehensive understanding of the dataset through statistical analysis and data visualization. You will learn how to compute descriptive statistics, create informative visualizations, and conduct exploratory data analysis (EDA).
- Data Processing and Manipulation
  - This week, you will delve into essential data processing and manipulation techniques. You will learn how to handle missing values, detect and handle outliers, perform data sampling and dimensionality reduction, apply data scaling and discretization, and explore data cubes and pivot tables.
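As a rough illustration of several techniques named in the syllabus, the sketch below walks a toy dataset through integration, descriptive statistics, missing-value handling, outlier detection, scaling, discretization, and a pivot table. It is not taken from the course materials: the use of pandas, the data, the column names (`store_id`, `revenue`, `region`), and the specific choices (median imputation, the 1.5 × IQR rule, min-max scaling, three equal-width bins) are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical toy data: two sources that share a "store_id" key.
sales = pd.DataFrame({
    "store_id": [1, 2, 3, 4, 5, 6],
    "revenue": [100.0, 120.0, None, 110.0, 105.0, 900.0],
})
stores = pd.DataFrame({
    "store_id": [1, 2, 3, 4, 5, 6],
    "region": ["east", "east", "west", "west", "east", "west"],
})

# Integration: join the two sources on their shared key.
df = sales.merge(stores, on="store_id", how="left")

# Understanding: descriptive statistics for the numeric column.
summary = df["revenue"].describe()

# Missing values: fill gaps with the median (one option among several).
df["revenue"] = df["revenue"].fillna(df["revenue"].median())

# Outliers: flag values more than 1.5 * IQR beyond the quartiles.
q1, q3 = df["revenue"].quantile([0.25, 0.75])
iqr = q3 - q1
df["is_outlier"] = ~df["revenue"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

# Scaling: min-max scale revenue to the range [0, 1].
rev = df["revenue"]
df["revenue_scaled"] = (rev - rev.min()) / (rev.max() - rev.min())

# Discretization: split revenue into three equal-width bins.
df["revenue_bin"] = pd.cut(rev, bins=3, labels=["low", "mid", "high"])

# Pivot table: mean revenue per region.
pivot = df.pivot_table(index="region", values="revenue", aggfunc="mean")

print(summary)
print(df)
print(pivot)
```

In a real project the order of these steps matters: here the outlier is only flagged, not removed, so it still stretches the min-max scale and the bin edges. Whether to drop, cap, or keep such values depends on the dataset and the analysis goal.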
Taught by
Di Wu
Related Courses
- Intro to Statistics: Stanford University via Udacity
- Introduction to Data Science: University of Washington via Coursera
- Passion Driven Statistics: Wesleyan University via Coursera
- Information Visualization: Indiana University via Independent
- DCO042 - Python For Informatics: University of Michigan via Independent