YoVDO

Generating Mock Data with Python - NumPy, Pandas, and Datetime Libraries

Offered By: Keith Galli via YouTube

Tags

Python Courses, Data Science Courses, pandas Courses, NumPy Courses, Code Efficiency Courses

Course Description

Overview

Learn how to generate realistic mock data for sales analysis with Python in this tutorial video. Using the NumPy, Pandas, and datetime libraries, you build a sales dataset from scratch: simulating product purchases, applying normal and geometric distributions, generating realistic timestamps, and creating random addresses. The video also covers improving code efficiency, making certain items more likely to be sold together, and producing 12 months of data in separate CSV files. The result is practical experience in data manipulation and synthetic dataset creation for data science projects.
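As a rough illustration of the techniques named above, the following sketch picks products with unequal probabilities, draws quantities from a geometric distribution, and draws the number of orders from a normal distribution. It is not the code from the video: the product catalog, the weights, the column names, and the distribution parameters are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical catalog: product name -> (price, relative popularity weight).
PRODUCTS = {
    "USB-C Cable": (11.95, 30),
    "Headphones": (99.99, 10),
    "Monitor": (149.99, 5),
}


def make_orders(n_orders, seed=0):
    """Build a small mock sales DataFrame with n_orders rows."""
    rng = np.random.default_rng(seed)
    names = list(PRODUCTS)

    # Turn the relative weights into probabilities that sum to 1.
    weights = np.array([PRODUCTS[name][1] for name in names], dtype=float)
    probs = weights / weights.sum()

    # Popular products are chosen more often than expensive, rare ones.
    chosen = rng.choice(names, size=n_orders, p=probs)

    # A geometric distribution starts at 1 and decays quickly, so most
    # orders have quantity 1 and only a few have 2 or more.
    quantities = rng.geometric(p=0.9, size=n_orders)

    return pd.DataFrame({
        "Order ID": np.arange(1, n_orders + 1),
        "Product": chosen,
        "Quantity Ordered": quantities,
        "Price Each": [PRODUCTS[name][0] for name in chosen],
    })


# Draw the order count from a normal distribution (assumed mean 1000,
# standard deviation 100) so that repeated runs vary in size.
rng = np.random.default_rng(1)
n_orders = max(1, int(rng.normal(loc=1000, scale=100)))
orders = make_orders(n_orders)
print(orders.head())
```

Calling `orders.to_csv("orders.csv", index=False)` would write the result to a CSV file; the filename here is only a placeholder.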

Syllabus

- Intro & Background Info
- What we're creating in this video!
- Start writing code: generating a simple DataFrame & CSV
- Task: Making our data more realistic, selecting some products with higher probability than others
- Task: Generate 12 months' worth of data in 12 CSVs (calendar library, f-strings)
- Make some months have more purchases than others
- Normal distributions in NumPy
- Improving the speed of our code (making testing easier)
- Task: Generate random addresses for our data
- Task: Generate order times for purchases (datetime library overview)
- Using timedelta objects to add & subtract time from dates
- Generate a realistic quantity ordered for each product using NumPy's geometric distribution
- Make certain items more likely to be sold together, and clean up the code a bit
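
The calendar, f-string, datetime, and timedelta steps in the syllabus might look something like the sketch below. It uses only the standard library. The year 2019, the `Sales_<Month>_<Year>.csv` naming pattern, and the helper `random_order_time` are assumptions made for illustration, not details confirmed by the video. Order times are drawn uniformly here for brevity; a more realistic version would weight them toward busier hours.

```python
import calendar
import random
from datetime import datetime, timedelta

YEAR = 2019  # assumed year, chosen only for the example


def random_order_time(year, month, rng):
    """Return a random timestamp that falls inside the given month."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(year, month)[1]
    start = datetime(year, month, 1)

    # Adding a timedelta to a datetime shifts it forward in time.
    offset = timedelta(
        days=rng.randrange(days_in_month),
        hours=rng.randrange(24),
        minutes=rng.randrange(60),
    )
    return start + offset


rng = random.Random(42)

# One filename per month, built with an f-string and calendar.month_name.
filenames = [
    f"Sales_{calendar.month_name[month]}_{YEAR}.csv" for month in range(1, 13)
]

for month, filename in enumerate(filenames, start=1):
    stamp = random_order_time(YEAR, month, rng)
    print(filename, stamp.strftime("%m/%d/%y %H:%M"))
```

Because the offset never exceeds the number of days in the month, every generated timestamp stays inside the month whose file it belongs to.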


Taught by

Keith Galli

Related Courses

Computational Investing, Part I
Georgia Institute of Technology via Coursera
Introduction to Machine Learning
Higher School of Economics via Coursera
Mathematics and Python for Data Analysis
Moscow Institute of Physics and Technology via Coursera
Introduction to Python for Data Science
Microsoft via edX
Python for Data Science
University of California, San Diego via edX