AWS ML Engineer Associate 1.3 Validate Data and Prepare for Modeling
Offered By: Amazon Web Services via AWS Skill Builder
Course Description
Overview
This course covers part of the data preparation phase of the machine learning (ML) lifecycle. You will learn about data validation strategies, including strategies for bias mitigation and data security, and review Amazon Web Services (AWS) services that can assist with data validation, including AWS Glue DataBrew and AWS Glue Data Quality. You will also learn about the final steps of data preparation, such as splitting, shuffling, and augmenting datasets, and configuring data to load into your model training resource.
- Course level: 300
- Duration: 45 minutes
Activities
- Online materials
- A demonstration
- Knowledge check questions
- A course assessment
Course objectives
- Explain the importance of ensuring data integrity.
- Identify fundamental pre-training bias metrics.
- Describe strategies to address class imbalance in datasets.
- Describe key AWS services for validating data quality.
- Use AWS tools to identify and mitigate sources of bias in data.
- Describe techniques for using AWS services to encrypt data.
- Identify implications of compliance requirements.
- Describe the value and technique of splitting, shuffling, and augmenting datasets.
- Identify data formats used in model training.
- Identify AWS tools and services for model training data configuration.
- Describe how to configure data to load it into a model training resource.
Intended audience
- Cloud architects
- Machine learning engineers
Recommended skills
- At least 1 year of experience using Amazon SageMaker and other AWS services for ML engineering.
- At least 1 year of experience in a related role such as backend software developer, DevOps developer, data engineer, or data scientist.
- A fundamental understanding of programming languages such as Python.
- Preceding courses in the AWS ML Engineer Associate Learning Plan.
Course outline
- Section 1: Introduction
  - Lesson 1: How to Use This Course
  - Lesson 2: Course Overview
  - Lesson 3: Fundamentals of Data Validation
- Section 2: Validate Data
  - Lesson 4: Addressing Class Imbalance
  - Lesson 5: AWS Tools and Services for Data Validation and Bias Mitigation
  - Lesson 6: Identifying and Mitigating Bias with Amazon SageMaker Clarify
  - Lesson 7: Data Security and Compliance
- Section 3: Final Steps of Data Preparation
  - Lesson 8: Dataset Splitting, Shuffling, and Augmentation
  - Lesson 9: Configure Data for Model Training
- Section 4: Conclusion
  - Lesson 10: Course Summary
  - Lesson 11: Assessment
  - Lesson 12: Contact Us
Related Courses
- Rails with Active Record and Action Pack (Johns Hopkins University via Coursera)
- Excel Skills for Business: Intermediate II (Macquarie University via Coursera)
- Programming 103: Saving and Structuring Data (Raspberry Pi Foundation via FutureLearn)
- Everyday Excel, Part 1 (University of Colorado Boulder via Coursera)
- Creating Dashboards in Google Spreadsheets (Coursera Project Network via Coursera)