CloudWatch Monitoring Setup on AL2023
Goal
Set up comprehensive monitoring for an EC2 instance running nginx by implementing CloudWatch Agent to collect system metrics and application logs.
Prerequisites
Before beginning this exercise, you should:
- Have an EC2 instance running Amazon Linux 2023
- Understand basic IAM concepts (roles and policies)
- Be familiar with SSH and Linux command line
- Have basic knowledge of web servers (nginx)
Learning Objectives
By the end of this exercise, you will:
- Configure IAM roles with proper permissions for CloudWatch and Systems Manager
- Install and configure CloudWatch Agent to collect metrics and logs
- Use Systems Manager Parameter Store to centrally manage agent configuration
- Create a CloudWatch Dashboard to visualize system and application metrics
- Understand the difference between default EC2 metrics and custom CloudWatch Agent metrics
Why This Matters
In real-world applications, monitoring is crucial because:
- It enables proactive issue detection before customers are affected
- It’s essential for troubleshooting production incidents
- CloudWatch is AWS’s native monitoring solution, making it the standard for AWS workloads
- Understanding IAM permissions for monitoring is critical for security
Step-by-Step Instructions
Step 1: Configure IAM Role for Your EC2 Instance
First, create an IAM role with the necessary permissions for CloudWatch Agent.
- Navigate to IAM Console → Roles → Create role
- Select AWS service → EC2 → Next
- Search for and attach this AWS managed policy: CloudWatchAgentAdminPolicy
- Name the role: CloudWatch-Agent-Role
- Attach the role to your EC2 instance:
  - EC2 Console → Select instance → Actions → Security → Modify IAM role
Information
- CloudWatchAgentAdminPolicy: Allows the agent to write metrics and logs to CloudWatch
- Without SSM permissions, you cannot store configurations centrally or use the ssm: prefix
Common Mistakes
- Forgetting SSM permissions causes “Access Denied” when saving to Parameter Store
- Not attaching the role to EC2 means no metrics will appear in CloudWatch
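The console steps in Step 1 can also be scripted. The following is a minimal sketch using the AWS CLI, assuming the CLI is installed and configured with permission to manage IAM. The instance ID, the temporary trust-policy file, and the choice to reuse the role name as the instance profile name are placeholders for illustration, not part of the exercise. By default the commands are only printed (DRY_RUN=1) so you can review them before running anything.

```shell
#!/usr/bin/env bash
# Sketch: create the CloudWatch agent role from the command line.
# Assumptions (not from the exercise): the INSTANCE_ID value, the temp file
# for the trust policy, and reusing the role name as instance profile name.

ROLE_NAME="CloudWatch-Agent-Role"
INSTANCE_ID="${INSTANCE_ID:-i-0123456789abcdef0}"   # placeholder, replace with yours
DRY_RUN="${DRY_RUN:-1}"                             # 1 = print commands only
TRUST_FILE="$(mktemp)"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Trust policy that lets the EC2 service assume the role.
cat > "$TRUST_FILE" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

run aws iam create-role --role-name "$ROLE_NAME" \
  --assume-role-policy-document "file://$TRUST_FILE"
run aws iam attach-role-policy --role-name "$ROLE_NAME" \
  --policy-arn arn:aws:iam::aws:policy/CloudWatchAgentAdminPolicy

# An EC2 instance uses a role through an instance profile.
run aws iam create-instance-profile --instance-profile-name "$ROLE_NAME"
run aws iam add-role-to-instance-profile \
  --instance-profile-name "$ROLE_NAME" --role-name "$ROLE_NAME"
run aws ec2 associate-iam-instance-profile --instance-id "$INSTANCE_ID" \
  --iam-instance-profile "Name=$ROLE_NAME"
```

Set DRY_RUN=0 and INSTANCE_ID to your own instance to actually execute the commands.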
You might need to set up your own policy to be able to write to Systems Manager:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "AllowWriteCloudWatchAgentConfigToSSM",
          "Effect": "Allow",
          "Action": "ssm:PutParameter",
          "Resource": "arn:aws:ssm:*:*:parameter/AmazonCloudWatch-*"
        }
      ]
    }
Step 2: Install Nginx and CloudWatch Agent
SSH into your EC2 instance.
Install nginx as our sample application:
    # Update system and install nginx
    sudo dnf update -y
    sudo dnf install nginx -y
    # Start nginx
    sudo systemctl start nginx
    sudo systemctl enable nginx
    # Create a test page
    echo "<h1>CloudWatch Test</h1>" | sudo tee /usr/share/nginx/html/index.html
Download and install CloudWatch Agent:
    # Download the agent
    wget https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
    # Install it
    sudo rpm -U ./amazon-cloudwatch-agent.rpm
Install CollectD:
    # Install CollectD
    sudo dnf install collectd -y
Amazon Linux 2023 uses journald for logging. In order to get the logs to the traditional files you need to install RSyslog.
    # Install and activate rsyslog
    sudo dnf install rsyslog -y
    sudo systemctl start rsyslog
    sudo systemctl enable rsyslog
    # Configure SSH to log to traditional files
    echo "SyslogFacility AUTH" | sudo tee -a /etc/ssh/sshd_config
    echo "LogLevel INFO" | sudo tee -a /etc/ssh/sshd_config
    # Restart sshd
    sudo systemctl restart sshd
Information
- Package Installation: The RPM creates a cwagent user and installs files in /opt/aws/amazon-cloudwatch-agent/
- CollectD must be installed because the wizard configuration in Step 3 enables CollectD collection.
- The agent can collect both system-level metrics and custom application metrics (for example via StatsD or CollectD)
Now you will have the logs here:
- Nginx: /var/log/nginx/access.log
- SSH: /var/log/secure
- System: /var/log/messages
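Before configuring the agent, it can help to confirm that these files actually exist and have content. The following is a small sketch; the helper name check_logs is made up for this example. Note that the nginx access log stays empty until the first request arrives, and that some log directories may only be accessible as root.

```shell
# Sketch: confirm the expected log files exist and are non-empty.
# check_logs is a helper name invented for this example.
check_logs() {
  missing=0
  for f in "$@"; do
    if [ -s "$f" ]; then
      echo "ok       $f"
    else
      echo "missing  $f"
      missing=1
    fi
  done
  return "$missing"
}

# Paths taken from the list above.
check_logs /var/log/nginx/access.log /var/log/secure /var/log/messages \
  || echo "At least one log file is missing or empty"
```

If a file is reported as missing, check that the corresponding service (nginx, rsyslog, sshd) is running and has produced at least one log entry.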
Step 3: Configure CloudWatch Agent Using the Wizard
Run the configuration wizard:
    sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
Select these options in the wizard:
- OS: 1 (linux)
- EC2 or On-Premises: 1 (EC2)
- User: 2 (root)
- StatsD: 2 (no)
- CollectD: 1 (yes)
- Host metrics: 1 (yes)
- CPU per core: 2 (no)
- EC2 dimensions: 1 (yes)
- Collection interval: 4 (60s)
- Metrics level: 2 (Standard)
- Satisfied: 1 (yes)
- Import existing config: 2 (no)
- Monitor log files: 1 (yes)
Configure Nginx access log as source:
- File path: /var/log/nginx/access.log
- Group name: access.log
- Stream name: {instance_id}
- Add another: 2 (no)
Save to Parameter Store:
- Store in Parameter Store: 1 (yes)
- Parameter name: AmazonCloudWatch-linux
- Region: [press Enter]
- Credentials: 1 (use IAM role)
Information
- Standard Metrics: Includes CPU, memory, disk, and swap usage
- Log Groups: Organize different log types for easier querying
- Parameter Store: Enables centralized configuration management across multiple instances
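To make the wizard answers less opaque, the sketch below writes a hand-written configuration in the agent's JSON format to a scratch file. It approximates the choices above and is not the wizard's exact output: the scratch path, the reduced metric list, and the extra /var/log/messages entry (with its log group name "messages") are assumptions added for illustration, and the CollectD section is omitted.

```shell
# Sketch: a hand-written agent configuration resembling the wizard answers.
# CONFIG_OUT is a scratch path chosen for this example, not the agent's own file.
CONFIG_OUT="${CONFIG_OUT:-/tmp/cwagent-config-example.json}"

# The quoted 'EOF' keeps ${aws:InstanceId} literal; the agent expands it itself.
cat > "$CONFIG_OUT" <<'EOF'
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "root"
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/nginx/access.log",
            "log_group_name": "access.log",
            "log_stream_name": "{instance_id}"
          },
          {
            "file_path": "/var/log/messages",
            "log_group_name": "messages",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  },
  "metrics": {
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    },
    "metrics_collected": {
      "mem": { "measurement": ["mem_used_percent"] },
      "disk": { "measurement": ["used_percent"], "resources": ["*"] },
      "swap": { "measurement": ["swap_used_percent"] }
    }
  }
}
EOF

echo "Wrote example configuration to $CONFIG_OUT"
```

Each entry in collect_list corresponds to one pass through the wizard's "Monitor log files" questions, so adding a second log source by hand is a matter of appending another object with its own file path and log group name.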
Step 4: Start the CloudWatch Agent
Alt 1: Start the agent using the Parameter Store configuration (the parameter name must match the one saved in Step 3):
    sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
        -a fetch-config \
        -m ec2 \
        -s \
        -c ssm:AmazonCloudWatch-linux
Alt 2: Start the agent using the local configuration:
    sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
        -a fetch-config \
        -m ec2 \
        -s \
        -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json
Verify it’s running:
    # Check status
    sudo systemctl status amazon-cloudwatch-agent
    # View logs if there are issues
    sudo tail -f /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log
Information
- fetch-config: Downloads and applies the configuration
- -m ec2: Uses EC2 metadata for instance information
- -s: Starts the agent after configuration
- ssm: prefix: Fetches config from Parameter Store (requires SSM permissions)
Common Mistakes
- Using ssm: without SSM permissions will fail with “Access Denied”
- The agent needs 2-5 minutes before metrics appear in CloudWatch
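Besides systemctl, the agent's control script has a status action that prints a short JSON document. The sketch below wraps it in a yes/no check. It assumes that output contains a "status" field whose value is "running" when the agent is up; the helper name agent_is_running is made up for this example.

```shell
# Sketch: report whether the agent says it is running.
# agent_is_running is a helper name invented for this example; it reads the
# JSON printed by "amazon-cloudwatch-agent-ctl -a status" from stdin.
agent_is_running() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"running"'
}

CTL=/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl
if [ -x "$CTL" ]; then
  if sudo "$CTL" -m ec2 -a status | agent_is_running; then
    echo "CloudWatch Agent is running"
  else
    echo "CloudWatch Agent is not running; check the agent log"
  fi
else
  echo "amazon-cloudwatch-agent-ctl not found at $CTL"
fi
```

A check like this is handy in scripts, where the multi-line output of systemctl status is awkward to parse.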
Step 5: Create CloudWatch Dashboard
Navigate to CloudWatch Console → Dashboards → Create dashboard
Name: test
Add four widgets:
Widget 1 - CPU:
- Type: Number
- Metrics → EC2 → Per-Instance Metrics
- Select: CPUUtilization
Widget 2 - Memory:
- Type: Number
- Metrics → CWAgent → Select your instance
- Select: mem_used_percent
Widget 3 - Disk:
- Type: Number
- Metrics → CWAgent → Select your instance
- Select: disk_used_percent
Widget 4 - Logs (nginx access log):
- Type: Logs table
- Log group: access.log
- Query:
fields @timestamp, @message | sort @timestamp desc | limit 10
Save the dashboard
Information
- EC2 Namespace (AWS/EC2): Contains default metrics such as CPU, network, and disk I/O (no agent needed); memory and disk-space usage are not included
- CWAgent Namespace: Contains custom metrics from CloudWatch Agent
- Metrics may take 5 minutes to appear after agent startup
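The same dashboard can be described as a JSON body and created from the command line. The sketch below generates such a body for the CPU, memory, and log widgets and prints the AWS CLI commands you could run afterwards. The instance ID and region are placeholders, and the memory widget assumes mem_used_percent is published with an InstanceId-only dimension set. The disk widget is left out because disk_used_percent carries additional dimensions (path, device, fstype) that are easier to pick in the console.

```shell
# Sketch: build a dashboard body similar to the widgets above.
# INSTANCE_ID and AWS_REGION are placeholders, replace them with your values.
INSTANCE_ID="${INSTANCE_ID:-i-0123456789abcdef0}"
AWS_REGION="${AWS_REGION:-us-east-1}"
BODY_FILE="$(mktemp)"

cat > "$BODY_FILE" <<EOF
{
  "widgets": [
    {
      "type": "metric", "x": 0, "y": 0, "width": 6, "height": 4,
      "properties": {
        "title": "CPU",
        "view": "singleValue",
        "region": "$AWS_REGION",
        "stat": "Average",
        "period": 300,
        "metrics": [["AWS/EC2", "CPUUtilization", "InstanceId", "$INSTANCE_ID"]]
      }
    },
    {
      "type": "metric", "x": 6, "y": 0, "width": 6, "height": 4,
      "properties": {
        "title": "Memory",
        "view": "singleValue",
        "region": "$AWS_REGION",
        "stat": "Average",
        "period": 300,
        "metrics": [["CWAgent", "mem_used_percent", "InstanceId", "$INSTANCE_ID"]]
      }
    },
    {
      "type": "log", "x": 0, "y": 4, "width": 12, "height": 6,
      "properties": {
        "title": "Logs",
        "view": "table",
        "region": "$AWS_REGION",
        "query": "SOURCE 'access.log' | fields @timestamp, @message | sort @timestamp desc | limit 10"
      }
    }
  ]
}
EOF

echo "Dashboard body written to $BODY_FILE"
echo "Create it with:   aws cloudwatch put-dashboard --dashboard-name test --dashboard-body file://$BODY_FILE"
echo "List agent metrics: aws cloudwatch list-metrics --namespace CWAgent"
```

Listing the CWAgent namespace first is a quick way to see which dimensions your metrics actually carry before wiring them into widgets.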
Final Tests
Run the Application and Validate Your Work
Generate test data:
    # Install stress tool
    sudo dnf install -y stress-ng
    # Generate CPU load
    sudo stress-ng --cpu 2 --timeout 30s
    # Generate nginx traffic
    for i in {1..20}; do curl localhost; done
    # Create log entries
    logger -t TEST "CloudWatch monitoring test"
Open CloudWatch Dashboard and verify:
- CPU utilization increases during stress test
- Memory and disk percentages are displayed
- System logs show your test message (only if /var/log/messages was added as a log source in Step 3)
- Nginx access logs appear (if configured)
Expected Results
- All three number widgets display percentage values
- CPU shows spike during stress test (may take 1-2 minutes)
- Logs table shows recent log entries
- No error messages in agent logs
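To check locally that the test traffic reached nginx before looking for it in CloudWatch, you can summarize the access log on the instance. The sketch below counts requests per HTTP status code. It assumes the default nginx log format, in which the status code is the ninth whitespace-separated field; count_status is a helper name made up for this example.

```shell
# Sketch: count requests per HTTP status code in an nginx access log.
# Assumes the default log format, where the status code is the 9th
# whitespace-separated field. count_status is an invented helper name.
count_status() {
  awk '{ counts[$9]++ } END { for (code in counts) print code, counts[code] }' "$1" | sort
}

LOG=/var/log/nginx/access.log
if [ -r "$LOG" ]; then
  count_status "$LOG"
else
  echo "Cannot read $LOG (try running as root)"
fi
```

After the curl loop above, the count for status 200 should have grown by the number of requests sent.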
Troubleshooting
If you encounter issues:
- No metrics appearing: Check that the IAM role has the required policies and is attached to the instance
- SSM errors: Ensure the AmazonSSMManagedInstanceCore policy is attached
- Agent won’t start: Review logs at /opt/aws/amazon-cloudwatch-agent/logs/
- Missing CWAgent namespace: Wait 5 minutes, then restart agent
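When reviewing the agent log, filtering for error-like lines is usually faster than reading the whole file. The following is a minimal sketch; the helper name recent_errors and the search terms are choices made for this example, not an official diagnostic.

```shell
# Sketch: show the most recent error-looking lines from the agent log.
# recent_errors is a helper name invented for this example.
recent_errors() {
  # $1 = log file, $2 = maximum number of lines to show (default 20)
  grep -iE 'error|denied|failed' "$1" | tail -n "${2:-20}"
}

AGENT_LOG=/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log
if [ -r "$AGENT_LOG" ]; then
  recent_errors "$AGENT_LOG"
else
  echo "Cannot read $AGENT_LOG (try running as root)"
fi
```

Lines mentioning access being denied typically point back to the IAM role from Step 1.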
Optional Challenge
Want to take your learning further? Try:
- Adding an alarm when CPU exceeds 80%
- Creating a CloudWatch Logs Insights query to analyze nginx response codes
- Setting up SNS notifications for critical metrics
- Configuring log retention policies to manage costs
Further Reading
- CloudWatch Agent Configuration Reference
- CloudWatch Pricing - Understand monitoring costs
Done!
Great job! You’ve successfully implemented CloudWatch monitoring and learned how to use Systems Manager Parameter Store for configuration management. This setup provides comprehensive visibility into your EC2 instances and applications!