How to monitor Kubernetes in DigitalOcean
Site24x7 monitors your DigitalOcean Kubernetes (DOKS) cluster availability and operational state, providing real-time visibility into cluster health and uptime. This monitor tracks the cluster's availability status and surfaces error states such as provisioning failures or unknown conditions.
Tracking availability and error states together gives you early visibility into cluster problems, so your team is notified promptly when a cluster enters an unhealthy state.
Use cases
Cluster health: The Summary tab gives you a real-time view of cluster availability and downtime, so you can quickly spot when a Kubernetes cluster goes offline and take immediate action.
SLA validation: Availability and Downtime metrics help you track cluster uptime against your SLAs, making it easier to support both internal reports and customer commitments.
Incident correlation: By comparing cluster downtime with application issues, you can quickly determine if problems are coming from the Kubernetes layer, helping speed up root cause analysis.
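For SLA validation, the arithmetic behind an availability figure can be sketched as below. This is a minimal illustration, not Site24x7's implementation: the function names, the 30-day window, and the 99.9% target are all hypothetical, and the downtime value is assumed to be read manually from the Summary tab.

```python
from datetime import timedelta


def availability_percent(period: timedelta, downtime: timedelta) -> float:
    """Return the share of the period the cluster was up, as a percentage."""
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    if downtime < timedelta(0) or downtime > period:
        raise ValueError("downtime must be between zero and the period length")
    # Dividing one timedelta by another yields a plain float ratio.
    return 100.0 * (1 - downtime / period)


def meets_sla(period: timedelta, downtime: timedelta, target_percent: float) -> bool:
    """Compare measured availability against a (hypothetical) SLA target."""
    return availability_percent(period, downtime) >= target_percent


if __name__ == "__main__":
    # Hypothetical example: a 30-day window with 43 minutes of downtime.
    window = timedelta(days=30)
    downtime = timedelta(minutes=43)
    print(f"Availability: {availability_percent(window, downtime):.3f}%")
    print(f"Meets 99.9% target: {meets_sla(window, downtime, 99.9)}")
```

The same calculation applies to any reporting window, provided the downtime and the window cover the same timeframe.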
Setup and configuration
Kubernetes resources are auto-discovered and monitored during the DigitalOcean integration. To enable monitoring, follow the steps below:
- Navigate to Cloud > DigitalOcean > Add DigitalOcean Monitor, and complete the steps to add a DigitalOcean monitor.
- While adding or editing a DigitalOcean monitor, select Kubernetes from the Service/Resource Types drop-down and click Save.
- Go to Cloud > DigitalOcean, select the created DigitalOcean monitor, then click Kubernetes.
Kubernetes clusters will be discovered in the next discovery cycle, based on the discovery frequency you selected when creating the DigitalOcean monitor.
Data collection frequency
Performance metrics for your DigitalOcean Kubernetes clusters are collected every two minutes and, by default, updated in the Site24x7 portal every five minutes, depending on the selected poll interval.
Supported metrics
Summary
The Summary report gives you a complete picture of your DigitalOcean Kubernetes clusters, including uptime, outage frequency, and downtime for any chosen timeframe. It shows each cluster's status: Up when the cluster is running and reachable, and Down when it is in an error or unknown state, along with how long the cluster has been down. This gives operations teams a bird's-eye view of cluster health across all configured monitors.
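To illustrate how uptime, outage frequency, and downtime relate to one another, the sketch below derives all three from a series of status samples. It is an assumption-laden example rather than a description of how Site24x7 computes its report: the `summarize` function, the sample format, and the rule that each status holds until the next sample are all hypothetical.

```python
from datetime import datetime, timedelta


def summarize(samples, end):
    """Summarize (timestamp, status) samples that are sorted by time.

    Each status is assumed to hold until the next sample; the last one
    holds until `end`. Returns (uptime, downtime, outage_count).
    """
    uptime = timedelta(0)
    downtime = timedelta(0)
    outages = 0
    previous_status = None
    for index, (start, status) in enumerate(samples):
        if status not in ("Up", "Down"):
            raise ValueError(f"unexpected status: {status!r}")
        stop = samples[index + 1][0] if index + 1 < len(samples) else end
        if status == "Up":
            uptime += stop - start
        else:
            downtime += stop - start
            # Count an outage only on the transition into Down,
            # so consecutive Down samples form a single outage.
            if previous_status != "Down":
                outages += 1
        previous_status = status
    return uptime, downtime, outages


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 0, 0)  # hypothetical start of the timeframe
    samples = [
        (t0, "Up"),
        (t0 + timedelta(minutes=10), "Down"),
        (t0 + timedelta(minutes=15), "Down"),
        (t0 + timedelta(minutes=20), "Up"),
        (t0 + timedelta(minutes=50), "Down"),
    ]
    up, down, count = summarize(samples, end=t0 + timedelta(minutes=60))
    print(f"Uptime: {up}, downtime: {down}, outages: {count}")
```

Counting outages on the Up-to-Down transition keeps a long outage that spans several polls from being reported as several separate outages.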
