YoVDO

AWS Data Architect Bootcamp - 43 Services 500 FAQs 20+ Tools

Offered By: Udemy

Tags

Amazon Web Services (AWS) Courses, Big Data Courses, Machine Learning Courses, Data Storage Courses, Data Migration Courses, Data Streaming Courses, Data Ingestion Courses

Course Description

Overview

AWS Databases, EMR, SageMaker, IoT, Redshift, Glue, QuickSight, RDS, Aurora, DynamoDB, Kinesis, Rekognition & much more

What you'll learn:
  • Confidently architect AWS solutions for Ingestion, Migration, Streaming, Storage, Big Data, Analytics, Machine Learning, Cognitive Solutions and more
  • Learn the use-cases, integration and cost of 40+ AWS Services to design cost-effective and efficient solutions for a variety of requirements
  • Answer detailed technical questions from your design and development teams regarding implementation and build
  • Practice hands-on labs on complex AWS services like IoT, EMR, SageMaker, Redshift, Glue, Comprehend and many more

Hi! Welcome to the AWS Data Architect Bootcamp course, the only course you need to learn everything about data architecture on AWS and play the role of an Enterprise Data Architect. This is the most comprehensive course on AWS data architecture on the market. Here's why:

  • This is the only online course taught by an Enterprise Cloud Architect who leads large teams of junior architects in the real world, has close to two decades of experience in the IT industry, is a published author, and leads the technology architecture of XXX million dollar cloud projects for multi-national clients. Data Architects draw a salary in the range of $150K - $250K on average. This course trains you for that job! This is my 10th course on Udemy and my 3rd on AWS topics (the previous 2 are best-sellers).


  • Typical AWS classroom trainings on data architecture, which contain a fraction of the topics covered in this course, cost $3000 - $5000. This course teaches you 5 to 7 times more topics than AWS Training (40+ AWS Services) at a fraction of the cost.


  • Everything covered in this course is kept up to date. Services which are in Beta and were launched at re:Invent (last Nov) are already covered in the course. AWS innovates and adds features to its stack very fast, and I keep my course constantly updated with those changes. Think of this course as an Architecture Updates subscription.


  • Developers have questions, Architects have questions, Clients have questions - all technically curious minds have questions. This course has 500+ questions and answers (FAQs) curated from AWS FAQs, to equip you with as many ready-to-use answers as you would need in your architect role.


The entire course covers 40+ services. Every service is composed of the sections listed below, each shown with its approximate share of the service's content.


  • Architecture (12%) – Diagrams, Integration, Terminology

  • Use-Cases (6%) – Whether and When to use the AWS Service

  • Pricing (2%) – Cost estimation methods to assess overall solution cost

  • Labs (75%) – To-the-point labs for architectural understanding covering all major and important features

  • Frequently Asked Questions (5%) – Selected questions from AWS FAQs explained concisely. (Total 500+)


Apart from AWS Services, we will use a number of client tools to operate on AWS Services, databases and other parts of the technology stack. Here is a list of the tools that we will be using:

  1. EC2
  2. PuTTY
  3. Cloud9
  4. HeidiSQL
  5. MySQL Workbench
  6. pgAdmin
  7. SSMS
  8. Oracle SQL Developer
  9. Aginity Workbench for Redshift
  10. SQL Workbench/J
  11. WinSCP
  12. AWS CLI
  13. FoxyProxy
  14. Oracle VirtualBox
  15. Linux Shell Commands
  16. FastGlacier
  17. RStudio
  18. Redis Client
  19. Telnet
  20. S3 Browser
  21. Jupyter Notebooks


Below is a detailed description of the curriculum: the AWS Services we will be learning, to understand how they fit in the overall cloud data architecture on AWS and address various use-cases. If you have any questions, please don't hesitate to contact me.


  1. AWS Transfer for SFTP (Nov 2018 Release) - We will start our journey in this course with this service and learn how to ingest file-based data into AWS in a self-service manner, using an SFTP server on AWS and SFTP tools on-premises.


  2. AWS Snowball - Large data volumes spanning hundreds of TBs are not ideal for ingestion over the network. Using this service, we will learn how to ingest very large data volumes into the AWS cloud using a device-based offline data transport mechanism.


  3. AWS Kinesis Data Firehose - One of the data ingestion mechanisms is streaming. Using this service, we will learn how to channel streamed data from Kinesis Data Streams to AWS Data Storage & Analytics repositories like S3, Redshift, Elasticsearch and more.


  4. AWS Kinesis Data Streams - Clients can have streaming infrastructure or even devices (IoT) which may stream data continuously. Using this service we will learn how to collect streaming data and store it on AWS.


  5. AWS Managed Streaming for Kafka (MSK) (Nov 2018 Release) - AWS recently added Kafka to their technology stack, which has a lot of similarities with Kinesis. Learn how the two compare, as well as how to stand up a Kafka cluster on AWS to accept streaming data.


  6. AWS Schema Conversion Tool - Database migration is a complex process and can be homogeneous (e.g. SQL Server on-premises to SQL Server on AWS) or heterogeneous (e.g. MySQL to PostgreSQL). We will use this offline tool to learn how to assess migration complexities, generate migration assessment reports, and even perform schema migration.


  7. AWS Database Migration Service (DMS) - Database Migration / Replication is a very common need for any federated data solution. We will use this service to learn how to migrate and/or replicate data from on-premises databases to relational databases hosted on AWS RDS.


  8. AWS DataSync (Nov 2018 Release) - Continuous synchronization of data from on-premises to cloud-hosted data repositories becomes a key requirement in environments where data is generated or changes very fast. We will use this service to learn how it can solve this requirement.


  9. AWS Storage Gateway - This service has a striking resemblance to AWS DataSync, and is one of the alternatives for standing up cached volumes and stored volumes on AWS to build a bridge between on-premises data storage and AWS. We will briefly learn the similarities between AWS DataSync and AWS Storage Gateway.


  10. AWS ElastiCache (Memcached) - After covering most of the mechanisms of data ingestion, we will shift focus to caching data before moving on to databases. We will start learning about caching with the Memcached flavor of this service, which offers powerful caching capabilities for simpler data types.


  11. AWS ElastiCache (Redis) - We will learn the differences between Memcached and Redis for caching, and learn how to use the Redis flavor of this service, which can build cache clusters and host complex data types.


  12. AWS S3 (Advanced) - AWS S3 is the basis of data storage and data lakes in AWS. We will learn advanced tactics like locking data for legal compliance, cross-region replication, querying data with the S3 Select feature, life-cycle management to move data to cold storage, etc.


  13. AWS Glacier - Data keeps accumulating on the cloud and can increase storage costs dramatically. Infrequently used data is suitable for cold storage, which is where this service comes into play. We will learn archival, archive retrieval and archive querying using this service.


  14. AWS Relational Database Service (MariaDB) - We will be focusing heavily on the AWS RDS service, which offers 6 different types of databases. We will learn the basic concepts of AWS RDS using MariaDB, stand up an instance and query it with a client tool.


  15. AWS Relational Database Service (SQL Server) - Data needs to be imported and exported between data-centers and cloud-hosted database instances. We will learn tactics for dealing with backups and restores across the cloud using a SQL Server database on RDS with a client tool.


  16. AWS Relational Database Service (Oracle) - We will spend some time learning how to stand up Oracle on AWS RDS, especially for Oracle professionals.


  17. AWS Relational Database Service (MySQL) - After practicing basic concepts with a MySQL database on AWS RDS, we will start practicing advanced concepts for High-Availability and Performance, like the Read Replicas and Performance Insights features.


  18. AWS Relational Database Service (PostgreSQL) - There can be use-cases where one database needs to be converted to another on the cloud, for example converting PostgreSQL to MySQL. We will learn about some compatibility features where we can create a MySQL read replica from a PostgreSQL instance and promote a read replica to an independent database.


  19. AWS Relational Database Service (Aurora) - Aurora on AWS RDS is a native database service from AWS. It comes in two flavors, cluster-hosted and serverless, which are suitable for different use-cases. The storage architecture of Aurora is also shared by various other AWS services like AWS Neptune and DocumentDB. We will learn this service in depth.


  20. AWS Neptune - Relational databases are just one of the types of databases in the industry as well as on AWS. Graph databases serve a special use-case: very densely connected data, where the value of relationships is much higher than normal. We will learn the graph theory of RDF vs Property Graph, learn how Neptune fits in this picture, stand up a Neptune server as well as a client, and operate on it with query languages like Gremlin (TinkerPop) and SPARQL.


  21. AWS DocumentDB (Nov 2018 Release) - MongoDB is one of the industry leaders in NoSQL Document Databases. AWS has recently introduced this new service, a native AWS implementation that provides an equivalent database with MongoDB compatibility. We will learn the details of the same.


  22. AWS DynamoDB - Key-value databases are important for housing voluminous data, typically logs, tokens, etc. We will learn this key-value and document database in depth, with advanced features like streaming, caching, data expiration and more.


  23. AWS API Gateway - REST APIs are today's standard mechanism of data ingestion. We will learn how to build a data ingestion and access pipeline with APIs using this service with AWS DynamoDB.


  24. AWS Lambda - Microservices are often tied to APIs, and are the cornerstone of any programmatic integration with AWS Services, typically AWS's Artificial Intelligence and Machine Learning Services. We will learn how to develop Lambda functions.


  25. AWS CloudWatch - System logging is at the center of all programmatic logic execution, and it ties very closely with microservices and metrics logging for a variety of AWS Services. We will learn how to access and log data from microservices in CloudWatch Logs.


  26. AWS Internet of Things (IoT) - Today IoT is one of the fastest growing areas, and from a data perspective, it is one of the most valued sources of data. The first challenge enterprises face is the mechanism of ingesting data from devices and then processing it. With a prime focus on ingestion, we will learn how to solve this using an end-to-end practical example which reads data from a device and sends text messages to your cell phone.


  27. AWS Data Pipeline - With Data Lakes already overflowing with data, moving data within cloud repositories and from on-premises to AWS requires an orchestration engine which can move the data around with some processing. We will learn how to solve this use-case with this service.


  28. Amazon Redshift and Redshift Spectrum - All stored data in relational or non-relational format needs to be analyzed and warehoused. We will learn how to cater to the requirement for a petabyte-scale, massively parallel data warehouse using this service.


  29. AWS Elasticsearch - Elasticsearch is one of the market leaders among search frameworks, along with its alternative Apache Solr. AWS provides its own managed implementation of Elasticsearch, which can be used as one of the options to search data from different repositories. We will learn how to use this service for addressing search use-cases, and understand how tools like Logstash and Kibana fit in the overall solution.


  30. AWS CloudSearch - Standing up an AWS Elasticsearch domain needs some Elasticsearch-specific understanding. For use-cases which need a more managed solution, AWS provides an alternative packaged solution for search based on Apache Solr. We will learn how to stand up this service and use it for standing up search solutions in an express manner.


  31. AWS Elastic MapReduce (EMR) - After spending sufficient time on Ingestion, Migration, Storage, Databases, Search and Processing, we will now enter the world of Big Data Analytics, where we will spend a significant amount of time learning how to stand up a Hadoop-based cluster and process data with frameworks like Spark, Hive, Oozie and Tez. We will also cover EMRFS, Jupyter Notebooks, EMR Notebooks, Dynamic Port Forwarding, RStudio on EMR, reading and processing data from S3 in EMR, integrating Glue with Hive, integrating DynamoDB with Hive and much more.


  32. AWS Backup (Nov 2018 Release) - Creating backup routines for various data repositories is a Standard Operating Procedure in production environments. AWS made this job easier for support teams with this brand new service. We will learn about the details of this service.


  33. AWS Glue - AWS has centralized Data Cataloging and ETL for any and every data repository in AWS with this service. We will learn how to use features like crawlers, the data catalog, SerDe (serialization / de-serialization libraries), Extract-Transform-Load (ETL) jobs and many more features that address a variety of use-cases.


  34. AWS Athena - A serverless data lake is formed using five major services: S3, Glue, Redshift, Athena and QuickSight. This service is at the tail end of the process, and acts like a query engine for the data lake. We will learn how it serves that purpose and completes the picture.


  35. AWS QuickSight - AWS filled the gap of a cloud-native reporting service in 2017 with the launch of this service. We will learn how it fits in the Serverless Data Lake picture and lets you create reports and dashboards.


  36. AWS Rekognition - We will start our journey into the world of cognitive services powered by Artificial Intelligence with this service. Images and video are vital sources of data, and extracting information from these data sources and processing that data in a programmatic manner has various applications. We will learn how to perform this integration with Rekognition.


  37. AWS Textract (Nov 2018 Release) - Optical Character Recognition opens up another vital source of data; for example, we are very much used to scanning bar codes, tax forms, ebooks, etc. We will learn how to extract text from documents using this AI-powered brand new service from AWS.


  38. AWS Comprehend - Natural Language Processing (NLP) is a very big practice area of data analytics, typically performed using data science languages like R and Python. AWS makes the job of NLP easier by packaging it as an AI-powered NLP service. We will learn the use of this service and understand how it complements services like Textract and Rekognition.


  39. AWS Transcribe - One major source of data that we have not touched so far is Speech to Text. We will learn how to use this AI-powered service to extract text from speech, and how it can be effectively used for a number of use-cases.


  40. AWS Polly - By this point we will have covered many use-cases of processing textual data from one form to another. With this AI-powered service from AWS we will learn to convert text to speech, which is the exact opposite function of Transcribe. We will also learn the use of Speech Synthesis Markup Language (SSML) to control the details of the speech that gets generated.


  41. AWS SageMaker - After comfortably using AI-powered services, which abstract the complexity of machine learning models from end-users, we will now venture into the world of machine learning with this service. We will execute a machine learning model end-to-end and learn how to access data from S3, create a model, create notebooks for executing code to explore and process data, train, build and deploy a machine learning model, tune hyper-parameters, and finally access it from a load-balanced infrastructure using API endpoints.


  42. AWS Personalize - Recommendation Engines require building a reinforced deep learning neural network. Amazon has been in the business of recommending products to customers for decades. They have packaged their method of recommendation as a product and launched it as a service, which is making its debut in the form of Personalize. We will perform an end-to-end exercise to understand how to use this service for generating recommendations.


  43. AWS Lake Formation (Nov 2018 Release) - As forming data lakes is a tedious process, AWS has introduced a set of orchestration steps in the form of a service to expedite the creation of Data Lakes. As this service is in early preview (Beta) and is subject to change, we will look at a preview of the GUI of this service before concluding the curriculum of this course.


If you are not sure whether this course is right for you, feel free to drop me a message and I will be happy to answer your questions about the suitability of this course for you. I hope you will enroll in the course and I hope to see you soon in the class!


Taught by

Siddharth Mehta

Related Courses

Azure Data Lake Storage Gen2 and Data Streaming Solution
Microsoft via Coursera
Big Data Emerging Technologies
Yonsei University via Coursera
Building Resilient Streaming Systems on GCP em Português Brasileiro
Google Cloud via Coursera
Building Resilient Streaming Systems on Google Cloud Platform en Español
Google Cloud via Coursera
Deep Dive into Concepts and Tools for Analyzing Streaming Data (Traditional Chinese)
Amazon Web Services via AWS Skill Builder