Data Engineering Master Course: Spark/Hadoop/Kafka/MongoDB
Offered By: Udemy
Course Description
Overview
What you'll learn:
- Hadoop Ecosystem, Sqoop, Flume, Hive
- Expertise in writing code with Apache Spark
- Learn Kafka fundamentals and how to use Kafka connectors
- Learn to write queries and clients in MongoDB
- Learn Data Engineering technologies
In this course, you will start by learning what the Hadoop Distributed File System (HDFS) is and the most common Hadoop commands required to work with it.
Then you will be introduced to Sqoop Import.
Understand the lifecycle of a Sqoop command.
Use the Sqoop import command to migrate data from MySQL to HDFS.
Use the Sqoop import command to migrate data from MySQL to Hive.
Use various file formats, compressions, file delimiters, where clauses, and queries while importing the data.
Understand split-by and boundary queries.
Use incremental mode to migrate data from MySQL to HDFS.
Further, you will learn to use Sqoop Export to migrate data.
What Sqoop export is.
Using Sqoop export, migrate data from HDFS to MySQL.
Using Sqoop export, migrate data from Hive to MySQL.
Further, you will learn about Apache Flume.
Understand the Flume architecture.
Using Flume, ingest data from Twitter and save it to HDFS.
Using Flume, ingest data from a netcat source and save it to HDFS.
Using Flume, ingest data from an exec source and show it on the console.
Describe Flume interceptors and see examples of using interceptors.
Flume multiple agents
Flume Consolidation.
In the next section, we will learn about Apache Hive
Hive Intro
External & Managed Tables
Working with Different Files - Parquet, Avro
Compressions
Hive Analysis
Hive String Functions
Hive Date Functions
Partitioning
Bucketing
You will learn about Apache Spark
Spark Intro
Cluster Overview
RDD
DAG/Stages/Tasks
Actions & Transformations
Transformation & Action Examples
Spark DataFrames
Spark DataFrames - Working with Different File Formats & Compression
DataFrame APIs
Spark SQL
DataFrame Examples
Spark with Cassandra Integration
Running Spark on IntelliJ IDE
Running Spark on EMR
You will learn about Apache Kafka
Kafka Architecture
Partitions and offsets
Kafka Producers and Consumers
Kafka SerDes
Kafka Messages
Kafka Connector
Ingesting Data using Kafka Connector
You will learn about MongoDB
MongoDB Use Cases
CRUD Operations
MongoDB Operators
Working with Arrays
MongoDB with Spark
Data Engineering Interview Preparation
Sqoop Interview Questions
Hive Interview Questions
Spark Interview Questions
Common Data Engineering Questions
Data Engineering Real-Project Questions
Taught by
Navdeep Kaur
Related Courses
- Amazon EMR Getting Started (Indonesian), Amazon Web Services via AWS Skill Builder
- Lab - Analyze and Prepare Data with Amazon SageMaker Data Wrangler and Amazon EMR (Portuguese (Brazil)), Amazon Web Services via AWS Skill Builder
- Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames, Yandex via Coursera
- Managing Big Data in Clusters and Cloud Storage, Cloudera via Coursera
- Analyzing Big Data with SQL, Cloudera via Coursera