Monitor the operational status of your EC2 instances with Site24x7 alerts.
Two months back, we introduced our unified Amazon EC2 monitoring capabilities, which give modern IT and cloud-first teams improved visibility into the workloads running on their EC2 instances. How, you ask? By integrating the instance metadata from our native CloudWatch integration with our dynamic server monitoring capabilities, we not only removed the inefficiencies of relying on the CloudWatch data source alone (which provides only hypervisor-level data) or on standalone server agents, but also tackled complexities like AutoScaling and instance life cycle changes associated with your dynamic AWS infrastructure.
Today, I am excited to say that we are adding another feature to our growing AWS monitoring capabilities: support for EC2 status checks. With this new addition, you can now monitor the status of all your running instances every minute and get alerts based on instance check failures, all within Site24x7 itself.
Before we get into the crux of it, you need to know a bit about two things: EC2 instance creation and status checks.
What is an EC2 instance?
AWS uses a highly customized version of the Xen hypervisor to virtualize the underlying bare-metal physical host and create virtual computing environments called instances. These instances are simply virtual servers that provide highly scalable, flexible compute capacity to run your application. Multiple virtual servers, each running a different guest OS, can all run on the same physical host.
What are EC2 status checks?
These are default automated tests that Amazon EC2 performs every minute to identify hardware and software issues that might plague your EC2 environment. There are two types of EC2 status checks: system status checks and instance status checks.
During a system status check, network packets are transmitted to your instance to verify whether it is accepting traffic or not. This test will fail if the physical hardware underlying your EC2 instance is experiencing unexpected issues like a loss in power or network connectivity. Frequent system status check failures could mean that you are running an older generation instance and it's time for you to upgrade to the latest generation.
An instance status check validates whether the OS is accepting traffic. Bootloader issues, EBS or instance store failure, memory exhaustion, root volume mounting issues, and filesystem corruption are some of the possible reasons that could cause an instance status check to fail.
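To make the two checks concrete, here is a minimal Python sketch of how a pair of results might be condensed into one label. It assumes the status strings that the EC2 API reports for each check (such as "ok" and "impaired"). The function name and the label wording are illustrative only and are not part of Site24x7 or AWS.

```python
def summarize_checks(system_status: str, instance_status: str) -> str:
    """Condense the two EC2 status check results into a single label.

    Each argument is the status string reported for that check,
    e.g. "ok", "impaired", "initializing", or "insufficient-data".
    """
    statuses = {"system": system_status, "instance": instance_status}
    passed = sum(1 for status in statuses.values() if status == "ok")
    failed = [name for name, status in statuses.items() if status == "impaired"]

    if failed:
        # One or both checks failed; name the failing check(s).
        return f"{passed}/2 checks passed (failed: {', '.join(failed)})"
    if passed < 2:
        # Nothing failed, but at least one check has no verdict yet.
        return "checks pending"
    return "2/2 checks passed"
```

A failed system check generally points at the underlying host, while a failed instance check points at something inside the instance that you can usually fix yourself.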
Wait, hold on a second. If AWS is offering this status check functionality, then where does Site24x7 fit into all of this, and what additional value does it provide? Well, we've got the answers to those questions. Just keep reading.
Site24x7 continuously monitors the status checks performed by Amazon EC2. When one or more of the checks fail, an automated alert will get triggered so you can quickly respond and resolve the issue yourself instead of waiting for AWS to take action.
Reduce your dependency on CloudWatch alarms and SNS notifications
Imagine a scenario where Site24x7 doesn't support alerting for status checks. In that case, you would have to log in to the EC2 management console to discern the operational status of your EC2 instance. But spending a significant amount of time staring at a console and looking for updates isn't practical.
To tackle this, you could configure status check alarms. As you may know, users starting out with a free AWS account are limited to 10 free alarms. To configure more, you have to shell out $0.10 per alarm per month. While that might not seem like much, if you take into account the number of alarms you need to configure for EC2 instances, other AWS resources, and custom metrics (and add the SNS publish and delivery charges for email notifications), you might be looking at a pretty solid figure.
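As a rough illustration of how those charges add up, the sketch below applies the two figures quoted above (10 free alarms, then $0.10 per alarm per month). The fleet size and the number of alarms per instance are assumptions made up for the example, and SNS charges are left out because no figures are given for them here.

```python
FREE_ALARMS = 10             # free allowance quoted above
PRICE_CENTS_PER_ALARM = 10   # $0.10 per alarm per month, quoted above


def monthly_alarm_cost(total_alarms: int) -> float:
    """Return the monthly alarm bill in dollars, excluding SNS charges."""
    billable = max(0, total_alarms - FREE_ALARMS)
    # Work in whole cents to avoid floating-point rounding surprises.
    return billable * PRICE_CENTS_PER_ALARM / 100


# Hypothetical fleet: 100 instances with 2 alarms each (assumed numbers).
print(monthly_alarm_cost(100 * 2))  # 19.0
```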
With Site24x7, the cost of setting up alarms (or thresholds, as we like to call them) for instance checks or any other metric data will always be zero, even if you start to monitor hundreds of production server instances.
More visibility into your EC2 environment
The performance of your business-critical application depends largely on the health of your backend web application servers. Now that Site24x7 supports status checks, you can augment the CloudWatch data and the agent-driven system and application metrics with information on operational status.
We are cloud-native
With Site24x7's comprehensive monitoring suite, you not only get the ability to monitor your cloud-based applications and infrastructure, but also the various native AWS service components that are part of your application architecture. Our out-of-the-box integration with AWS resources like Elastic Load Balancers, RDS instances, DynamoDB tables, S3 buckets, EBS volumes, and SNS topics provides immediate insight into performance and resource utilization.
How Site24x7 collects information on status checks
Every minute, Site24x7 issues a single EC2 describe-instance-status API call to retrieve data sets for three components. One, information regarding the instance state, including which part of the instance life cycle the instance is currently in. Two, scheduled events, including data about any ongoing operational activity like system maintenance or software updates. And three, status checks, which tell you whether the instance passed both status checks, failed one, or failed both. In short, this information will tell you whether the instance is capable of running your application.
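For readers who want to see what such a call returns, here is a minimal sketch in Python. It is not Site24x7's implementation. It assumes a boto3 EC2 client is passed in and relies on the documented shape of the DescribeInstanceStatus response; the function names and the IncludeAllInstances choice are assumptions made for illustration.

```python
def extract_components(entry: dict) -> dict:
    """Pull the three pieces of data out of one InstanceStatuses entry."""
    return {
        "instance_id": entry["InstanceId"],
        # 1. Instance state: where the instance is in its life cycle.
        "state": entry["InstanceState"]["Name"],
        # 2. Scheduled events, such as maintenance planned by AWS.
        "events": [event["Code"] for event in entry.get("Events", [])],
        # 3. Status checks: the system and instance check results.
        "system_check": entry["SystemStatus"]["Status"],
        "instance_check": entry["InstanceStatus"]["Status"],
    }


def fetch_status(ec2_client, instance_ids: list) -> list:
    """Issue one DescribeInstanceStatus call and summarize each instance."""
    response = ec2_client.describe_instance_status(
        InstanceIds=instance_ids,
        # Assumption: also report instances that are not running, so the
        # life cycle state is visible for every requested instance.
        IncludeAllInstances=True,
    )
    return [extract_components(entry) for entry in response["InstanceStatuses"]]
```

With boto3 installed and credentials configured, the client would typically come from boto3.client("ec2"). Passing the client in as an argument keeps the parsing logic easy to exercise without touching AWS.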
Our developers have optimized the code in such a way that a single API call can retrieve all of the information mentioned above for 50 running EC2 instances in one go. So you'll never have to worry about throttling again.
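The batching idea can be sketched in a few lines of Python. This is only an illustration of grouping instance IDs into sets of 50, the figure given above; the helper name and the made-up fleet of instance IDs are hypothetical.

```python
from typing import Iterator, List

BATCH_SIZE = 50  # instances per API call, the figure given above


def batches(instance_ids: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` instance IDs."""
    for start in range(0, len(instance_ids), size):
        yield instance_ids[start:start + size]


# A hypothetical fleet of 120 instances needs only 3 calls per polling cycle.
fleet = [f"i-{n:04d}" for n in range(120)]
print([len(batch) for batch in batches(fleet)])  # [50, 50, 20]
```

Fewer, larger requests keep the call rate low, which is what reduces the risk of hitting API rate limits.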
Alarms for both system and instance status check failures are enabled by default for all your running EC2 instances, including both regular and AutoScaling instances. You can take this one step further by configuring how you want the monitor status to change in the event of a status check failure. By using the toggle button in the EC2 instance threshold profile, you can set it to either Trouble or Down, as needed. If you feel status check failure alerts are not really that important, you can turn them off in the same threshold profile.
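The behavior described above can be modeled roughly as follows. This is a hypothetical sketch, not Site24x7's code or API: the ThresholdProfile class, its field names, the "Trouble" default, and the "Up" status for a healthy monitor are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ThresholdProfile:
    """Hypothetical stand-in for the EC2 instance threshold profile."""
    alert_on_check_failure: bool = True  # alerts are enabled by default
    failure_status: str = "Trouble"      # the toggle: "Trouble" or "Down"


def monitor_status(profile: ThresholdProfile,
                   system_check: str, instance_check: str) -> str:
    """Return the monitor status implied by the two status check results."""
    failed = "impaired" in (system_check, instance_check)
    if failed and profile.alert_on_check_failure:
        return profile.failure_status
    # Either both checks are fine or alerting has been switched off.
    return "Up"


print(monitor_status(ThresholdProfile(failure_status="Down"), "ok", "impaired"))  # Down
```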
In the works
Today's release, alert support for EC2 status checks, is only the beginning. A number of other enhancements, including dashboards, support for scheduled events, EC2 automation tasks, and a whole lot more, are in the pipeline. So stay tuned. If you want to get a taste of what Site24x7 can do in the meantime, sign up for our 30-day, no-strings-attached free trial and try our AWS monitoring capabilities firsthand. If you are a cloud or managed service provider, we have a specially tailored subscription edition for you: our MSP plan! So get on board and get the most out of your cloud deployment.