YoVDO

From 0 to 1: The Oozie Orchestration Framework

Offered By: Udemy

Tags

Hadoop Courses, MapReduce Courses, Data Pipelines Courses

Course Description

Overview

A first-principles guide to working with Workflows, Coordinators and Bundles in Oozie

What you'll learn:
  • Install and set up Oozie
  • Configure Workflows to run jobs on Hadoop
  • Configure time-triggered and data-triggered Workflows
  • Configure data pipelines using Bundles

Prerequisites: Working with Oozie requires some basic knowledge of the Hadoop ecosystem and of running MapReduce jobs.

Taught by a team that includes 2 Stanford-educated ex-Googlers and 2 ex-Flipkart Lead Analysts. This team has decades of practical experience in working with large-scale data processing jobs.

Oozie is like the formidable, yet super-efficient admin assistant who can get things done for you, if you know how to ask.

Let's parse that

formidable, yet super-efficient: Oozie is formidable because its jobs are specified entirely in XML, which is hard to debug when things go wrong. However, once you've figured out how to work with it, it's like magic. Complex dependencies, a multitude of jobs on different time schedules and entire data pipelines are all made easy to manage with Oozie.

get things done for you: Oozie allows you to manage Hadoop jobs as well as Java programs, scripts and any other executable with the same basic setup. It manages your dependencies cleanly and logically.

if you know how to ask: Knowing the right configuration parameters to get the job done is the key to mastering Oozie.

What's Covered:

Workflow Management: Workflow specifications, Action nodes, Control nodes, Global configuration, real examples with MapReduce and Shell actions which you can run and tweak

Time-based and data-based triggers for Workflows: Coordinator specification, mimicking simple cron jobs, specifying time and data availability triggers for Workflows, dealing with backlog, running time-triggered and data-triggered coordinator actions

Data Pipelines using Bundles: Bundle specification, the kick-off time for bundles, running a bundle on Oozie
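
To give a feel for what a Workflow specification with Control nodes and a Shell action might look like, here is a minimal sketch. It is not taken from the course: the workflow name, the node names (say-hello, fail) and the schema versions are assumptions chosen for illustration, and ${jobTracker} and ${nameNode} are placeholders that would normally be supplied in a properties file. The small Python helper (dangling_transitions, a hypothetical name) only checks that every transition points at a node that exists, which is one of the easier mistakes to make in hand-written XML.

```python
import xml.etree.ElementTree as ET

# Illustrative workflow: control nodes (start, kill, end) around one Shell action.
# Names and schema versions are assumptions, not the course's own example.
WORKFLOW_XML = """\
<workflow-app name="hello-shell-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="say-hello"/>
    <action name="say-hello">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>echo</exec>
            <argument>hello</argument>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
"""


def dangling_transitions(xml_text):
    """Return transition targets ("to" attributes) that name no top-level node."""
    root = ET.fromstring(xml_text)
    # Top-level nodes that can be the target of a transition.
    names = {el.get("name") for el in root if el.get("name")}
    # Every "to" attribute anywhere in the document (start, ok, error, ...).
    targets = {el.get("to") for el in root.iter() if el.get("to")}
    return sorted(targets - names)


if __name__ == "__main__":
    print(dangling_transitions(WORKFLOW_XML))
```

This check is only a local sanity test of the XML; it does not replace validating the file against Oozie itself before submitting a job.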


Taught by

Loony Corn
