Observing, Troubleshooting, and Optimizing Workloads Running on Amazon ECS
Offered By: Amazon Web Services via AWS Skill Builder
Course Description
Overview
In this course, you learn how to gain observability into your applications running on Amazon Elastic Container Service (Amazon ECS). You learn how to collect metrics, logs, and traces at the system and application levels. You also learn how to apply the information you gather to identify and correct problems.
- Course level: Advanced
- Duration: 150 minutes
Activities
This course includes presentations and demonstrations.
Course objectives
In this course, you learn to:
- Describe the components of a robust monitoring pattern
- Collect, visualize, and alert on metrics from systems and applications
- Collect, aggregate, and visualize application and system logs
- Perform traces on distributed microservices
- Use collected data to troubleshoot applications running in Amazon ECS
Intended audience
This course is intended for:
- Cloud architects
- DevOps engineers
- Operations staff
- Developers
Prerequisites
We recommend that attendees of this course have:
- A working knowledge of containers and Amazon ECS
- Previous experience deploying simple applications on Amazon ECS
We also recommend completing the following courses:
- Amazon Elastic Container Service (ECS) Primer
- Building Enterprise Architectures in Amazon ECS
- Managing Application Lifecycle in Amazon ECS
- Managing Applications at Scale with Amazon ECS
Course outline
Module 1: The Value of Observability
- Gaining Observability
Module 2: Collecting Metrics
- Sources of Metrics
- Task and Service Metrics
- Custom Metrics
- Demonstration: Collecting and Visualizing Metrics
- Alerting on Metrics
Module 3: Collecting Logs
- Working with Logs
- Working with AWS Log Routing
- Working with Custom Log Routing
- Working with Network Logs
Module 4: Observing Connections
- Introduction to Distributed Tracing
- Implementing Distributed Tracing with AWS X-Ray and Amazon ECS
- Demonstration: Using AWS X-Ray to Visualize
- Identifying Communication Problems with a Service Mesh
Module 5: Troubleshooting
- Troubleshooting Methodology
- Common Issues: Stopped Tasks
- Common Issues: Invalid CPU and Memory Errors
- Common Issues: Load Balancer Configuration Errors