YoVDO

Data Engineering Master Course: Spark/Hadoop/Kafka/MongoDB

Offered By: Udemy

Tags

Big Data Courses, MongoDB Courses, Apache Spark Courses, Apache Hive Courses

Course Description

Overview

A fully hands-on data engineering course for becoming a Big Data Engineer: Spark, Kafka, Hadoop, Flume, Hive, Sqoop, and MongoDB.

What you'll learn:
  • The Hadoop ecosystem: Sqoop, Flume, Hive
  • Expertise in writing code with Apache Spark
  • Kafka fundamentals and how to use Kafka connectors
  • How to write queries and clients in MongoDB
  • Data engineering technologies

In this course, you will start by learning what the Hadoop Distributed File System (HDFS) is and the most common Hadoop commands required to work with it.


Then you will be introduced to Sqoop import:

  • Understand the lifecycle of a Sqoop command.

  • Use the sqoop import command to migrate data from MySQL to HDFS.

  • Use the sqoop import command to migrate data from MySQL to Hive.

  • Use various file formats, compression codecs, delimiters, where clauses, and queries while importing the data.

  • Understand split-by and boundary queries.

  • Use incremental mode to migrate data from MySQL to HDFS.


Next, you will learn to migrate data with Sqoop export:

  • Understand what Sqoop export is.

  • Use sqoop export to migrate data from HDFS to MySQL.

  • Use sqoop export to migrate data from Hive to MySQL.



After that, you will learn about Apache Flume:

  • Understand the Flume architecture.

  • Use Flume to ingest data from Twitter and save it to HDFS.

  • Use Flume to ingest data from netcat and save it to HDFS.

  • Use Flume to ingest data from an exec source and show it on the console.

  • Describe Flume interceptors and see examples of using them.

  • Work with multiple Flume agents.

  • Flume consolidation.


In the next section, we will learn about Apache Hive

  • Hive Intro

  • External & Managed Tables

  • Working with Different File Formats - Parquet, Avro

  • Compressions

  • Hive Analysis

  • Hive String Functions

  • Hive Date Functions

  • Partitioning

  • Bucketing


You will learn about Apache Spark

  • Spark Intro

  • Cluster Overview

  • RDD

  • DAG/Stages/Tasks

  • Actions & Transformations

  • Transformation & Action Examples

  • Spark DataFrames

  • Spark DataFrames - Working with Different File Formats & Compression

  • DataFrame APIs

  • Spark SQL

  • DataFrame Examples

  • Spark with Cassandra Integration

  • Running Spark in the IntelliJ IDE

  • Running Spark on EMR


You will learn about Apache Kafka

  • Kafka Architecture

  • Partitions and offsets

  • Kafka Producers and Consumers

  • Kafka SerDes

  • Kafka Messages

  • Kafka Connector

  • Ingesting Data using Kafka Connector

You will learn about MongoDB

  • MongoDB Use Cases

  • CRUD Operations

  • MongoDB Operators

  • Working with Arrays

  • MongoDB with Spark


Data Engineering Interview Preparation

  • Sqoop Interview Questions

  • Hive Interview Questions

  • Spark Interview Questions

  • Common Data Engineering Questions

  • Data Engineering Real-Project Questions





Taught by

Navdeep Kaur

Related Courses

Amazon EMR Getting Started (Indonesian)
Amazon Web Services via AWS Skill Builder
Analisar e preparar dados com o Amazon SageMaker Data Wrangler e o Amazon EMR (Português (Brasil)) | Lab - Analyze and Prepare Data with Amazon SageMaker Data Wrangler and Amazon EMR (Portuguese (Brazil))
Amazon Web Services via AWS Skill Builder
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames
Yandex via Coursera
Managing Big Data in Clusters and Cloud Storage
Cloudera via Coursera
Analyzing Big Data with SQL
Cloudera via Coursera