Right now a monitor is either in error (TROUBLE or DOWN) or not;
there is no in-between. This makes it very hard to limit the number of
nightly alerts. For each breach, we get an alert, even for
things that could wait until the morning.
For example, if a disk space threshold is reached, we get an alert,
regardless of the time of day. We could create policies in Opsgenie to
delay these alerts, but what if the file system then really fills up
to 100%? We wouldn't know until it causes actual downtime.
It would be very nice to have a WARNING status with a separate
threshold, so that we can delay WARNING messages (85% usage) during the
night, but still get alerted if usage climbs further to TROUBLE (95%).
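
To make the request concrete, here is a minimal sketch of the two-tier idea in Python. It is not how the product works today; the 85% and 95% thresholds come from the example above, while the quiet hours (22:00 to 07:00), the "UP" status name, and all function names are assumptions made up for illustration.

```python
from datetime import time

# Thresholds from the example above (percent disk usage).
WARNING_THRESHOLD = 85
TROUBLE_THRESHOLD = 95

# Assumed quiet hours, 22:00 to 07:00; purely illustrative.
NIGHT_START = time(22, 0)
NIGHT_END = time(7, 0)


def classify(usage_percent: float) -> str:
    """Map disk usage to a status; "UP" is a hypothetical healthy state."""
    if usage_percent >= TROUBLE_THRESHOLD:
        return "TROUBLE"
    if usage_percent >= WARNING_THRESHOLD:
        return "WARNING"
    return "UP"


def is_night(now: time) -> bool:
    """True inside the quiet window, which wraps around midnight."""
    return now >= NIGHT_START or now < NIGHT_END


def should_alert_now(usage_percent: float, now: time) -> bool:
    """TROUBLE always alerts; WARNING is held back during the night."""
    status = classify(usage_percent)
    if status == "TROUBLE":
        return True
    if status == "WARNING":
        return not is_night(now)
    return False
```

With this logic, 90% usage at 03:00 would wait until morning, while 96% usage at 03:00 would alert immediately.
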
I realize this might be a high-impact change, but it would be a very
beneficial one. Compare it to Nagios for example, where such a status