Alibaba Cloud Elastic Compute Service (ECS) Monitoring Integration

Site24x7 offers out-of-the-box monitoring for Elastic Compute Service (ECS) instances in your Alibaba Cloud environment. Monitor system-level performance using real-time metrics for CPU usage, memory consumption, disk I/O, network activity, GPU utilization, and process behavior. Once your Alibaba Cloud account is integrated with Site24x7, all associated ECS instances are auto-discovered and monitored.

Use cases

  • Instance-level health tracking: Monitor CPU, memory, and disk usage to prevent resource exhaustion.
  • Disk and network I/O visibility: Identify bottlenecks in storage and data transfer throughput.
  • GPU monitoring for ML workloads: Track GPU utilization and temperature to manage compute-intensive applications.
  • Proactive alerting: Detect anomalies like packet drops, high system load, or excessive process counts in real time.

Setup and configuration

  1. Log in to your Site24x7 account and navigate to Cloud > Alibaba Cloud > Add Monitor.
  2. In the Edit Alibaba Cloud Monitor page, select ECS from the Service Types list.
  3. Once added, go to Cloud > Alibaba > ECS to view dashboards and performance metrics.

Supported metrics

CPU Metrics

Metric name | Description | Unit
CPU Utilization | The percentage of total CPU capacity in use. | Percentage
CPU User Time | The percentage of CPU used by user processes. | Percentage
CPU System Time | The percentage of CPU used by system/kernel processes. | Percentage
CPU Idle Time | The percentage of idle CPU time. | Percentage
CPU Wait Time | The percentage of time the CPU spends waiting on I/O. | Percentage
Total CPU Usage | The total CPU usage across all cores. | Percentage
Load Average (1 Minute) | The average system load over the last 1 minute. | Load
Load Average (5 Minutes) | The average system load over the last 5 minutes. | Load
Load Average (15 Minutes) | The average system load over the last 15 minutes. | Load
Load Average Per Core (1 Minute) | The 1-minute load average per CPU core. | Load
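
The per-core load figure can be read as the load average divided by the number of CPU cores. The sketch below illustrates that relationship only. It is an assumption about how the value is normalized, not a documented Site24x7 or Alibaba Cloud formula, and `load_per_core` is a hypothetical helper name.

```python
def load_per_core(load_average: float, core_count: int) -> float:
    """Normalize a load average by the number of CPU cores (assumed formula)."""
    if core_count <= 0:
        raise ValueError("core_count must be positive")
    return load_average / core_count


# Example: a 1-minute load average of 2.0 on a 4-core instance.
print(load_per_core(2.0, 4))  # 0.5
```

A sustained value near or above 1.0 per core usually suggests that runnable work is queuing for CPU time, which makes the per-core figure easier to compare across instance sizes than the raw load average.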

Memory Metrics

Metric name | Description | Unit
VM Memory Utilization | The percentage of memory in use. | Percentage
Memory Used Utilization | The percentage of used memory relative to total. | Percentage
Memory Used Space | The amount of used memory. | MB
Memory Free Utilization | The percentage of free memory available. | Percentage
Memory Free Space | The amount of free memory. | MB
Total Memory Space | The total memory available on the instance. | MB
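
The space and utilization metrics above are related: the percentages can be derived from the used and total values in MB. The following is a minimal sketch of that arithmetic, assuming free memory is simply total minus used (buffers and cache are ignored). `memory_percentages` is a hypothetical helper, not part of any Site24x7 API.

```python
def memory_percentages(used_mb: float, total_mb: float) -> tuple[float, float]:
    """Return (used %, free %) for the given used and total memory in MB."""
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    used_pct = 100.0 * used_mb / total_mb
    # Assumption: everything that is not used counts as free.
    return used_pct, 100.0 - used_pct


# Example: 2048 MB used on an instance with 8192 MB of memory.
print(memory_percentages(2048, 8192))  # (25.0, 75.0)
```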

Disk Metrics

Metric name | Description | Unit
Disk Read Throughput (Bps) | The rate of data read from disk. | Bytes/second
Disk Write Throughput (Bps) | The rate of data written to disk. | Bytes/second
Disk Read IOPS | The number of read operations per second. | Ops/second
Disk Write IOPS | The number of write operations per second. | Ops/second
Disk Usage Utilization | The percentage of disk space used. | Percentage
Disk Usage (Used) | The amount of disk space used. | GB
Disk I/O Queue Size | The number of disk I/O requests waiting in queue. | Count
Disk Read Throughput Utilization | The percentage of read throughput used. | Percentage
Disk Write Throughput Utilization | The percentage of write throughput used. | Percentage

Network Metrics

Metric name | Description | Unit
Network In Rate | The rate of incoming network traffic. | Bytes/second
Network Out Rate | The rate of outgoing network traffic. | Bytes/second
Network In Packets | The number of incoming packets per second. | Packets/second
Network Out Packets | The number of outgoing packets per second. | Packets/second
Dropped Packets Percentage (In) | The percentage of incoming packets dropped. | Percentage
Dropped Packets Percentage (Out) | The percentage of outgoing packets dropped. | Percentage
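
A dropped-packet percentage is typically the share of dropped packets among all packets seen in one direction. The sketch below shows that calculation under the assumption that the total is dropped plus delivered packets. The exact definition used by the underlying Alibaba Cloud metric is not stated here, and `dropped_packet_percentage` is a hypothetical helper name.

```python
def dropped_packet_percentage(dropped: int, delivered: int) -> float:
    """Return dropped packets as a percentage of all packets (assumed definition)."""
    if dropped < 0 or delivered < 0:
        raise ValueError("packet counts must be non-negative")
    total = dropped + delivered
    if total == 0:
        # No traffic in the interval, so nothing was dropped.
        return 0.0
    return 100.0 * dropped / total


# Example: 25 packets dropped and 975 delivered in one interval.
print(dropped_packet_percentage(25, 975))  # 2.5
```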

System and Process Metrics

Metric name | Description | Unit
Status Check | The overall system health check result. | Text
Status Check (Instance) | The number of system-level health check attempts. | Count
Process Count | The number of processes running. | Count
VM Process Count | The number of virtual machine processes. | Count
Concurrent Connections | The number of concurrent network connections. | Count

GPU Metrics

Metric name | Description | Unit
GPU Memory Used Utilization | The percentage of GPU memory in use. | Percentage
GPU Utilization | The percentage of GPU compute usage. | Percentage
Instance GPU Temperature | The current GPU temperature. | Celsius
Instance GPU Memory Used Utilization | The percentage of memory used by the GPU on the instance. | Percentage

Threshold configuration

  1. Go to Admin > Configuration Profiles > Threshold and Availability.
  2. Create or edit a threshold profile for ECS.
  3. Assign the profile to the respective monitors to trigger alerts.
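
Conceptually, a threshold profile maps metrics to limits, and an alert is raised when a collected value crosses its limit. The sketch below illustrates that idea only. It is not Site24x7's alarm engine or API, and the metric keys, limit values, and the `find_breaches` helper are all hypothetical.

```python
from typing import Mapping

# Hypothetical upper limits in percent; these are not Site24x7 defaults.
EXAMPLE_THRESHOLDS = {
    "cpu_utilization": 90.0,
    "memory_used_utilization": 85.0,
    "disk_usage_utilization": 80.0,
}


def find_breaches(
    sample: Mapping[str, float], thresholds: Mapping[str, float]
) -> list[str]:
    """Return the metrics in `sample` whose value exceeds its upper limit."""
    return sorted(
        name
        for name, limit in thresholds.items()
        if name in sample and sample[name] > limit
    )


sample = {
    "cpu_utilization": 96.5,
    "memory_used_utilization": 70.0,
    "disk_usage_utilization": 80.0,  # equal to the limit, so not a breach
}
print(find_breaches(sample, EXAMPLE_THRESHOLDS))  # ['cpu_utilization']
```

Using a strict greater-than comparison is a design choice in this sketch; in the product, the comparison operator is whatever you select for each condition in the threshold profile.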

IT automation

Site24x7's IT Automation tools help resolve performance degradation issues automatically. The alarm engine continuously evaluates the system events for which thresholds have been defined and, when a breach occurs, performs the mapped automation.

  1. Go to Admin > IT Automation Templates.
  2. Create a new automation rule.
  3. Map the rule to the monitor for proactive resolution.

How to configure IT Automation for a monitor

Configuration rules

With Site24x7's Configuration Rules, you can set parameters such as Threshold Profile, Notification Profile, Tags, and Monitor Group for multiple monitors at once, and have these settings assigned automatically when new ECS monitors are added.

How to add a Configuration Rule
