PySpark - Python Spark Hadoop coding framework & testing
Offered By: Udemy
Course Description
Overview
Big data Python Spark (PySpark) coding framework: logging, error handling, unit testing, PyCharm, PostgreSQL, Hive, and a data pipeline
What you'll learn:
- Python Spark (PySpark) industry-standard coding practices: logging, error handling, reading configuration, unit testing
- Building a data pipeline using Hive, Spark and PostgreSQL
- Python Spark Hadoop development using PyCharm
This course will bridge the gap between your academic and real-world knowledge and prepare you for an entry-level Big Data Python Spark developer role. You will learn the following:
- Python Spark coding best practices
- Logging
- Error handling
- Reading configuration from a properties file
- Doing development work using PyCharm
- Using your local environment as a Hadoop Hive environment
- Reading and writing to a Postgres database using Spark
- Python unit testing framework
- Building a data pipeline using Hadoop, Spark and Postgres
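As a rough illustration of three of the topics listed above (logging, error handling, and reading configuration from a properties file), here is a minimal sketch that uses only the Python standard library. It is not taken from the course material; the logger name, the helper functions, and the sample keys and values are all hypothetical.

```python
import configparser
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("pipeline_sketch")


def read_properties(text: str) -> dict:
    """Parse key=value lines into a dict.

    configparser requires at least one section header, so a dummy
    "[root]" section is prepended to the properties text.
    """
    parser = configparser.ConfigParser()
    parser.read_string("[root]\n" + text)
    return dict(parser["root"])


def get_required(props: dict, key: str) -> str:
    """Return a required setting; log an error and re-raise if it is missing."""
    try:
        return props[key]
    except KeyError:
        logger.error("Missing required configuration key: %s", key)
        raise


# Hypothetical settings, not taken from the course.
SAMPLE = """
# connection details for an imagined Postgres target
pg.host = localhost
pg.table = course_catalog
"""

props = read_properties(SAMPLE)
table = get_required(props, "pg.table")
```

In a real pipeline the same pattern would typically read the text from a file on disk and pass the resulting values on to the Spark job, but those details depend on the project layout.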
Prerequisites:
- Basic programming skills
- Basic database knowledge
- Entry-level Hadoop knowledge
Taught by
FutureX Skills
Related Courses
- Big Data, University of Adelaide via edX
- Advanced Data Science with IBM, IBM via Coursera
- Analysing Unstructured Data using MongoDB and PySpark, Coursera Project Network via Coursera
- Apache Spark for Data Engineering and Machine Learning, IBM via edX
- Apache Spark (TM) SQL for Data Analysts, Databricks via Coursera