Building Open Source Software (OSS) Analytics Solutions with Azure HDInsight
Offered By: Microsoft via Microsoft Learn
Course Description
Overview
- Module 1: Introduction to the Open Source Analytics Offering
- What HDInsight is
- How HDInsight works
- When to use HDInsight
- Module 2: Choose the correct HDInsight Configuration to build open source analytics solutions
- The correct HDInsight configuration options
- Decision criteria for selecting the correct HDInsight configuration option
- Analyze a scenario and map it to an HDInsight configuration option
- Cost Optimization strategies for HDInsight clusters
- Module 3: Creating and configuring an HDInsight cluster
- Create an HDInsight Spark Cluster
- Execute queries on an HDInsight Spark Cluster
- Monitor an HDInsight Spark Cluster
- Learn how to fix common provisioning issues
- Module 4: Run Petabyte level OSS NoSQL databases with HDInsight HBase
- Introduction
- Use HDInsight HBase clusters
- Describe HBase Architecture Patterns
- Exercise - Provisioning an HDInsight HBase cluster
- Exercise - Run benchmarks in HBase
- Understand HBase Best Practices
- Summary
- Knowledge Check
- Module 5: Perform advanced streaming data transformations with Apache Spark and Kafka in Azure HDInsight
- When to use Apache Spark and Kafka with HDInsight
- How Spark Structured Streaming works
- The architecture of a Kafka and Spark solution
- How to provision HDInsight, create a Kafka producer, and stream Kafka data to a Jupyter notebook
- How to replicate data to a secondary cluster
- Module 6: Perform Zero ETL analytics with HDInsight Interactive Query
- Appropriate scenarios to deploy HDInsight Interactive Query clusters
- Learn about architectural patterns
- Deploy a cluster for your real-estate app and query the data
- Learn how to integrate Apache Spark and Hive LLAP queries using the Hive Warehouse Connector
- Create a large-scale interactive query dashboard to evaluate real estate values and locations
- Module 7: Manage enterprise security in HDInsight
- Introduction
- Describe HDInsight security areas
- Implement Network Security
- Understand Operating system security
- Manage Application/Middleware security
- Implement Data Access security
- Knowledge Check
- Summary
Syllabus
- Module 1: Introduction to the Open Source Analytics Offering
- Introduction
- What is HDInsight?
- How does HDInsight work?
- When to use HDInsight
- Knowledge check
- Summary
- Module 2: Choose the correct HDInsight Configuration to build open source analytics solutions
- Introduction
- HDInsight configuration options
- Decision criteria for selecting the correct HDInsight configuration option
- Analyze a scenario and map it to an HDInsight configuration option
- Cost optimization strategies for HDInsight clusters
- Knowledge check
- Summary
- Module 3: Creating and configuring an HDInsight cluster
- Introduction
- Creating an HDInsight cluster
- Exercise - Create an HDInsight cluster via the Azure portal
- Opening a Jupyter Notebook on an HDInsight Spark cluster
- Exercise - Execute queries on an HDInsight Spark cluster
- Enable monitoring of HDInsight jobs
- Common provisioning issues
- Exercise - Monitor an HDInsight cluster
- Summary
- Knowledge check
- Module 4: Run Petabyte level OSS NoSQL databases with HDInsight HBase
- Introduction
- Describe Apache HBase
- Explain HDInsight HBase clusters architecture and application patterns
- Improve the write and read performance of HBase clusters
- Determine migration and high availability strategies in HDInsight HBase
- Use Apache Phoenix on HDInsight HBase
- Determine HDInsight HBase cluster performance
- Perform benchmarking in HBase
- Knowledge check
- Summary
- Module 5: Perform advanced streaming data transformations with Apache Spark and Kafka in Azure HDInsight
- Introduction
- Use HDInsight Spark and Kafka
- Stream data with Apache Kafka
- Describe Spark structured streaming
- Create a Kafka and Spark architecture
- Exercise - Provision HDInsight to perform advanced streaming data transformations
- Exercise - Create the Kafka producer
- Exercise - Stream Kafka data to a Jupyter notebook and window the data
- Replicate data to a secondary cluster
- Knowledge check
- Summary
- Module 6: Perform Zero ETL analytics with HDInsight Interactive Query
- Introduction
- When should you use HDInsight Interactive Query?
- HDInsight interactive queries
- Exercise - Provision HDInsight to perform ad hoc analytics
- Exercise - Upload and query data in HDInsight
- Integrate Apache Spark and Hive LLAP queries
- Create a large-scale interactive query dashboard for evaluating real estate trends
- Summary
- Knowledge check
- Module 7: Manage enterprise security in HDInsight
- Introduction
- Describe HDInsight security areas
- Implement Network security
- Understand operating system security
- Manage application/middleware security
- Implement data access security
- Knowledge check
- Summary
Related Courses
- Advanced AI on Microsoft Azure: Ethics and Laws, Research Methods and Machine Learning (Cloudswyft via FutureLearn)
- Ethics, Laws and Implementing an AI Solution on Microsoft Azure (Cloudswyft via FutureLearn)
- Deep Learning and Python Programming for AI with Microsoft Azure (Cloudswyft via FutureLearn)
- Advanced Artificial Intelligence on Microsoft Azure: Deep Learning, Reinforcement Learning and Applied AI (Cloudswyft via FutureLearn)
- AI Design and Engineering with Microsoft Azure (Cloudswyft via FutureLearn)