Systems Monitoring

CloudWatch Dashboards
If you have any EC2 instances deployed to your AWS environment, a custom CloudWatch Dashboard named "EC2-Instances" is available under CloudWatch / Dashboards. This dashboard displays various metrics that correspond to system performance and resource utilization. For additional information, please refer to this page.

Additional metrics can be discovered by drilling down into the CloudWatch / Metrics section as well as the "Monitoring" tab for individual EC2 instances in the EC2 console.
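The same discovery can also be scripted. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured; the function name `list_instance_metric_names` and the instance ID in the usage comment are hypothetical and not part of the Healthcare Blocks tooling.

```python
def list_instance_metric_names(cloudwatch, instance_id, namespace="AWS/EC2"):
    """Return the sorted metric names CloudWatch holds for one EC2 instance.

    `cloudwatch` is a boto3 CloudWatch client. The default namespace covers the
    standard EC2 metrics; custom metrics may live in another namespace.
    """
    paginator = cloudwatch.get_paginator("list_metrics")
    pages = paginator.paginate(
        Namespace=namespace,
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
    )
    names = set()
    for page in pages:
        for metric in page.get("Metrics", []):
            names.add(metric["MetricName"])
    return sorted(names)


# Usage (placeholder instance ID):
#   import boto3
#   print(list_instance_metric_names(boto3.client("cloudwatch"), "i-0123456789abcdef0"))
```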

Subscribing to System Alerts
Custom CloudWatch Alarms are used to detect system issues and depleted resources. To view these alarms, open the CloudWatch service and browse the Alarms section. To subscribe to these alerts, go to the Simple Notification Service (SNS), select a specific topic, and create an "email" subscription to an external address. We recommend using an email alias or a Slack email-to-channel address so that the alerts can easily be shared with your team.

The following SNS topics relate to system events:

  • high-cpu-alert
  • low-disk-space-alert
  • server-down-alert
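
For teams that prefer to script the subscription rather than use the console, the steps above can be sketched with boto3. This is an illustrative sketch, not the Healthcare Blocks procedure: it assumes boto3 and AWS credentials are available and that the topic names listed above are the final segment of each topic ARN. The helper names `find_topic_arns` and `subscribe_email` and the address in the usage comment are hypothetical.

```python
ALERT_TOPICS = ("high-cpu-alert", "low-disk-space-alert", "server-down-alert")


def find_topic_arns(sns, topic_names=ALERT_TOPICS):
    """Map each requested topic name to its ARN by scanning the account's topics."""
    wanted = set(topic_names)
    found = {}
    for page in sns.get_paginator("list_topics").paginate():
        for topic in page.get("Topics", []):
            arn = topic["TopicArn"]
            name = arn.rsplit(":", 1)[-1]  # assumption: topic name is the last ARN segment
            if name in wanted:
                found[name] = arn
    return found


def subscribe_email(sns, email, topic_names=ALERT_TOPICS):
    """Create an email subscription on every matching topic; return the names found."""
    arns = find_topic_arns(sns, topic_names)
    for arn in arns.values():
        sns.subscribe(TopicArn=arn, Protocol="email", Endpoint=email)
    return sorted(arns)


# Usage (placeholder address):
#   import boto3
#   subscribe_email(boto3.client("sns"), "alerts@example.com")
```

SNS sends a confirmation message to the address for each email subscription, and alerts are delivered only after that confirmation link is followed.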

Auto-Scaling Storage
When CloudWatch detects that disk utilization for an EBS volume has reached 90% (and the 15-minute threshold has been met), it publishes a message to the resize-ebs-volume-request SNS topic. This alarm integrates with a custom Lambda function that adds another 100 GiB of space to the existing EBS volume and resizes the filesystem. When the action is completed, a message is published to another SNS topic, ebs-volume-resized.
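
To illustrate the volume-growth step only, here is a minimal sketch of what such a function might do with boto3. It is not the actual Lambda code, which is not shown in this document. The function name `grow_volume` is hypothetical, the 100 GiB increment comes from the description above, and the filesystem resize and the SNS notifications are deliberately left out.

```python
GROWTH_GIB = 100  # increment described in the text above


def grow_volume(ec2, volume_id, growth_gib=GROWTH_GIB):
    """Request a larger size for an EBS volume and return the new size in GiB.

    `ec2` is a boto3 EC2 client. This only asks EBS to enlarge the volume;
    the filesystem on the instance still has to be extended afterwards.
    """
    volume = ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"][0]
    new_size = volume["Size"] + growth_gib  # EBS reports Size in GiB
    ec2.modify_volume(VolumeId=volume_id, Size=new_size)
    return new_size


# Usage (placeholder volume ID):
#   import boto3
#   grow_volume(boto3.client("ec2"), "vol-0123456789abcdef0")
```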

Healthcare Blocks monitors the above alerts via an internal Slack account.