Python for Data Engineering: from Beginner to Advanced
Offered By: LinkedIn Learning
Course Description
Overview
Practice fundamental skills using Python for data engineering in this hands-on, interactive course with coding challenges in CoderPad.
Syllabus
Introduction
- Welcome to the course
- What you should know
- CoderPad tour
- Introduction to Python and data engineering
- Setting up your Python environment
- Explore a Google Colab worksheet
- Variables and data types
- Operators and expressions
- Control structures
- Functions
- Modules and packages
- String manipulation
- Error handling
- Solution: String Manipulation
- Collection overview
- Python collections: Tuples
- Python collections: Lists
- Python collections: Sets
- Python collections: Dictionaries
- Solution: Analyze list
- File I/O overview
- Working with CSV files
- Working with JSON files
- Solution: Read/Write text to file
- Introduction to pandas
- Read files as DataFrames
- Data cleaning and preprocessing
- Data manipulation and aggregation
- Data visualization
- Write DataFrames as files
- Solution: Play with pandas
- Introduction to NumPy
- Array creation and attributes
- Array operations
- Indexing and slicing
- Linear algebra and statistics
- Write DataFrames as files
- Solution: NumPy Array Operation
- Understanding classes and objects
- Implementation: Classes and objects in Python
- Understand OOP features: Abstraction, inheritance, and more
- Solution: Accessing Object attributes
- Tips to write efficient Python code
- What is ETL in the data engineering world?
- What is Hadoop?
- Understand PySpark for data engineering
- Importance of visualization tools in DE
- On-prem vs. cloud data engineering
- Capstone project: Retail sales analysis
- Solution: Capstone project
- Next steps
Taught by
Deepak Goyal
Related Courses
- Intro to Statistics (Stanford University via Udacity)
- Introduction to Data Science (University of Washington via Coursera)
- Passion Driven Statistics (Wesleyan University via Coursera)
- Information Visualization (Indiana University via Independent)
- DCO042 - Python For Informatics (University of Michigan via Independent)