
EC2 System and Instance Checks

In CloudWatch, you have System and Instance status checks. The first checks the health of the physical host your EC2 instance is running on; the second checks the health of the EC2 instance itself.


Last night we had an instance that failed its instance status check, and we were alerted by CloudWatch itself. We did get an alert that the agent was disconnected, but none for the instance check failure. Are these metrics being monitored? They really are the basis of EC2 monitoring.
Replies (8)

Hi Tom,

Both of these metrics are already on our feature roadmap. We will keep you updated.

Thanks,
Yamini

Thanks for the quick reply Yamini!

Is the roadmap public? I would understand if it's not, but it might prevent feature requests that are already on the map. And it might allow us to upvote ;-)

Hi Tom,

Good that you got alerted via the agent disconnection; that's one problem an integrated monitor (EC2 + Server) can solve for you.

If you rely on EC2 instance monitoring alone, our poll frequency is every 5 minutes, so you may not learn of a failure for up to 5 minutes, even if you prefer status-based instance monitoring. That's where an agent installed on the server helps a lot.

In our case, if the instance is shut down or in the process of shutting down, we currently report it as down after 5 minutes.

To get instant alerts of this kind, we suggest using the integrated monitor. In the integrated EC2 version, we have decided to alert after one minute to avoid false alarms from network flaps. We may bring this down based on user requests.
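
As a rough illustration of the one-minute hold described above, here is a minimal Python sketch of a flap guard. The class name `FlapGuard`, its fields, and the 60-second default are assumptions made for illustration only; this is not how the product is implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlapGuard:
    """Signal an alert only once a failure has persisted for hold_seconds."""

    hold_seconds: float = 60.0  # assumed value, taken from "alert after one minute"
    _first_failure: Optional[float] = None  # timestamp of the first failed observation

    def observe(self, healthy: bool, now: float) -> bool:
        """Record one observation; return True when an alert should fire."""
        if healthy:
            # Any healthy observation resets the hold window (a flap).
            self._first_failure = None
            return False
        if self._first_failure is None:
            self._first_failure = now
        return now - self._first_failure >= self.hold_seconds
```

A short flap (fail, recover, fail) restarts the window, so only a failure that lasts the full hold period produces an alert.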

Regards,

Ananthkumar K S

Hi Ananthkumar,

That would indeed be OK when we integrate with the agent. In most cases we do, but we shouldn't be forced to. An EC2 instance without an agent that is in a running state (and thus not flagged as having an issue by your current checks) but has an instance check failure would not be reported as DOWN or TROUBLE. That is basic monitoring, and it should be reported regardless of whether an agent is used.

System checks, on the other hand, can't be checked with an agent; it's an AWS-level metric that is also vital.

So while the presence of the agent solves things for some cases, for others it doesn't.
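
To make the gap concrete, here is a minimal Python sketch of an agentless check that combines the instance state with both status checks. The `classify` function and its UP/TROUBLE/DOWN mapping are assumptions for illustration, not the product's actual logic; `fetch_status` assumes `boto3` is installed and AWS credentials are configured, and uses the EC2 `describe_instance_status` call.

```python
def classify(state: str, system_status: str, instance_status: str) -> str:
    """Map EC2 state plus status checks to a monitor status (hypothetical mapping)."""
    if state != "running":
        return "DOWN"
    # A running instance can still fail either check; this is the case
    # that a state-only monitor misses.
    if system_status == "impaired" or instance_status == "impaired":
        return "TROUBLE"
    return "UP"


def fetch_status(instance_id: str) -> str:
    """Query AWS for one instance and classify it (needs boto3 and credentials)."""
    import boto3  # imported lazily so classify() works without boto3

    ec2 = boto3.client("ec2")
    resp = ec2.describe_instance_status(
        InstanceIds=[instance_id],
        IncludeAllInstances=True,  # also return instances that are not running
    )
    item = resp["InstanceStatuses"][0]
    return classify(
        item["InstanceState"]["Name"],
        item["SystemStatus"]["Status"],
        item["InstanceStatus"]["Status"],
    )
```

With this mapping, a running instance whose instance check is `impaired` comes back as TROUBLE even when no agent is installed.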

Hi Tom,

Thanks for giving us some details on your requirement.

We will take it up ASAP.

In the meantime, we would like to understand the kinds of system checks you currently do with the AWS-level metrics. It would be great if we could find a way to support those for you.

Regards,

Ananthkumar K S



Hi Ananthkumar,

There really is nothing more you can do in CloudWatch than set alerts on the AWS-provided metrics (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-system-instance-status-check.html#types-of-instance-status-checks).

AWS will alert if either of these checks fails.
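
For reference, a minimal Python sketch of such a CloudWatch alarm on one of the status check metrics. The helper names, the alarm naming scheme, the two-period evaluation window, and the SNS topic ARN are assumptions for illustration; `create_alarm` assumes `boto3` is installed and AWS credentials are configured.

```python
def build_status_alarm(instance_id: str, metric: str, topic_arn: str) -> dict:
    """Build put_metric_alarm parameters for an EC2 status check metric."""
    allowed = {"StatusCheckFailed_Instance", "StatusCheckFailed_System"}
    if metric not in allowed:
        raise ValueError(f"unsupported metric: {metric}")
    return {
        "AlarmName": f"{instance_id}-{metric}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": metric,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,              # seconds per evaluation period
        "EvaluationPeriods": 2,    # assumed: two failing periods before alarming
        "Threshold": 1.0,          # the metric is 1 when the check fails
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [topic_arn],  # e.g. an SNS topic that notifies the team
    }


def create_alarm(instance_id: str, metric: str, topic_arn: str) -> None:
    """Create the alarm in CloudWatch (needs boto3 and credentials)."""
    import boto3  # imported lazily so build_status_alarm() works without boto3

    params = build_status_alarm(instance_id, metric, topic_arn)
    boto3.client("cloudwatch").put_metric_alarm(**params)
```

One alarm per metric per instance is needed, so covering both checks means calling `create_alarm` twice for each instance.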

Kind regards,
Tom

Hi Tom, 

 

Status check updates, and alerting based on them, have been released.

Kindly check the following update:

https://www.site24x7.com/community/what-s-new-in-aws-monitoring-dashboards-actions-status-checks-and-a-whole-lot-more

Regards,

Ananthkumar K S


Perfect, Ananthkumar! Now we can clean up some CloudWatch alerts :)
