How to monitor the Document Database Service in Huawei Cloud
Site24x7 gives your team complete observability into Huawei Cloud Document Database Service (DDS), delivering real-time visibility into operation throughput, connection health, CPU and memory utilization, cache behavior, replication lag, and disk I/O performance.
Use cases
Queue control: Growing write queues combined with high CPU usage indicate that write throughput is exceeding capacity.
Cache resilience: WiredTiger Cache Used Percent and Tracked Dirty Bytes in WiredTiger Cache reveal cache saturation risk.
Replication assurance: Replication Lag and Replication Headroom show how far replica nodes (secondaries) are behind the primary database and how much operation log time remains.
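The three use cases above can be expressed as simple checks on a single poll of metric values. The following is a minimal sketch, not Site24x7 code: the dictionary keys, the `assess_dds_sample` function, and every limit value are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical limits for illustration only; tune them to your own workload.
WRITE_QUEUE_LIMIT = 10            # operations waiting for a write lock
CPU_LIMIT_PERCENT = 80.0
CACHE_USED_LIMIT_PERCENT = 90.0
CACHE_DIRTY_LIMIT_PERCENT = 15.0
HEADROOM_MIN_SECONDS = 3600.0


def assess_dds_sample(sample: Dict[str, float]) -> List[str]:
    """Return findings for one poll of DDS metrics.

    The keys in `sample` are made up; map them to the metric names
    listed in the tables of this document.
    """
    findings = []

    # Queue control: write queue growth together with high CPU usage.
    if (sample["write_lock_queue"] > WRITE_QUEUE_LIMIT
            and sample["cpu_usage_percent"] > CPU_LIMIT_PERCENT):
        findings.append("write throughput may be exceeding capacity")

    # Cache resilience: a nearly full cache or a large share of dirty pages.
    if (sample["cache_used_percent"] > CACHE_USED_LIMIT_PERCENT
            or sample["cache_dirty_percent"] > CACHE_DIRTY_LIMIT_PERCENT):
        findings.append("WiredTiger cache is at risk of saturation")

    # Replication assurance: little operation log time remains.
    if sample["replication_headroom_s"] < HEADROOM_MIN_SECONDS:
        findings.append("replication headroom is running low")

    return findings
```

Returning a list of findings rather than a single status keeps the three checks independent, so one poll can report several problems at once.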
Setup and configuration
DDS resources are auto-discovered and monitored during the Huawei Cloud integration. To enable monitoring, follow the steps below:
- Navigate to Cloud > Huawei > Add Huawei Monitor. Follow the steps to add a Huawei Cloud monitor.
- While adding or editing a Huawei Cloud monitor, select DDS from the Service/Resource Types drop-down and click Save.
- Navigate to Cloud > Huawei, select the created Huawei monitor, and then click Document Database Service.
Supported metrics
Operations
| Metric name | Description | Units |
| --- | --- | --- |
| COMMAND Statements per Second | The rate of command operations executed on the DDS instance per second. | Count/second |
| DELETE Statements per Second | The rate of delete operations executed on the DDS instance per second. | Count/second |
| INSERT Statements per Second | The rate of insert operations executed on the DDS instance per second. | Count/second |
| QUERY Statements per Second | The rate of query operations executed on the DDS instance per second. | Count/second |
| UPDATE Statements per Second | The rate of update operations executed on the DDS instance per second. | Count/second |
| GETMORE Statements per Second | The rate of getMore cursor operations executed on the DDS instance per second. | Count/second |
Connections
| Metric name | Description | Units |
| --- | --- | --- |
| Active Node Connections | The number of active client connections currently established to the DDS node. | Count |
| Active Node Connections Usage | The percentage of available connection capacity currently in use. | Percentage |
| Active Sessions | The number of active sessions currently open on the DDS instance. | Count |
CPU and memory
| Metric name | Description | Units |
| --- | --- | --- |
| CPU Usage | The percentage of CPU capacity currently consumed by the DDS instance. | Percentage |
| Memory Usage | The percentage of memory capacity currently consumed by the DDS instance. | Percentage |
| Resident Memory | The amount of physical memory currently resident in RAM for the DDS process. | MB |
| Virtual Memory | The total virtual memory currently allocated to the DDS process. | MB |
| SWAP Usage | The percentage of swap space currently consumed by the DDS instance. | Percentage |
Asserts
| Metric name | Description | Units |
| --- | --- | --- |
| Regular Asserts per Second | The rate of regular assertion errors raised by the DDS instance per second. | Count/second |
| Warning Asserts per Second | The rate of warning-level assertion errors raised by the DDS instance per second. | Count/second |
| Message Asserts per Second | The rate of message-level assertion errors raised by the DDS instance per second. | Count/second |
| User Asserts per Second | The rate of user-generated assertion errors raised by the DDS instance per second. | Count/second |
Queues and cursors
| Metric name | Description | Units |
| --- | --- | --- |
| Queued Operations Waiting for Lock | The total number of operations currently queued waiting to acquire a lock. | Count |
| Queued Operations Waiting for Read Lock | The number of read operations currently queued waiting to acquire a read lock. | Count |
| Queued Operations Waiting for Write Lock | The number of write operations currently queued waiting to acquire a write lock. | Count |
| Page Faults | The number of page fault exceptions raised when requested data is not in memory. | Count |
| Slow Queries | The number of queries currently exceeding the configured slow query threshold. | Count |
| Maintained Cursors | The number of open cursors currently maintained by the DDS instance. | Count |
| Timeout Cursors | The number of cursors that have timed out since the last server restart. | Count |
WiredTiger
| Metric name | Description | Units |
| --- | --- | --- |
| Bytes in WiredTiger Cache | The total amount of data currently held in the WiredTiger storage engine cache. | MB |
| Tracked Dirty Bytes in WiredTiger Cache | The amount of dirty data in the WiredTiger cache that has not yet been flushed to disk. | MB |
| Bytes Written Into Cache per Second | The rate at which data is being written into the WiredTiger cache per second. | Bytes/second |
| Bytes Written From Cache per Second | The rate at which data is being written from the WiredTiger cache to the disk per second. | Bytes/second |
| WiredTiger Cache Used Percent | The percentage of the total WiredTiger cache currently in use. | Percentage |
| WiredTiger Cache Dirty Percent | The percentage of the WiredTiger cache currently occupied by dirty pages. | Percentage |
| Checkpoint Triggers | The number of WiredTiger checkpoint flushes triggered within the monitoring period. | Count |
| Collection Total Time | The total time spent on all collection-level operations within the monitoring period. | Millisecond |
| Collection Read Time | The total time spent on collection-level read operations within the monitoring period. | Millisecond |
| Collection Write Time | The total time spent on collection-level write operations within the monitoring period. | Millisecond |
Replication
| Metric name | Description | Units |
| --- | --- | --- |
| Replication Headroom | The remaining time before the secondary node's replication position is overwritten by the primary operation log. | Second |
| Oplog Window | The total time window covered by the current operation log on the primary node. | Hours |
| Replication Lag | The delay between the primary node and the secondary node in applying replication operations. | Second |
| Replicated COMMAND Statements per Second | The rate of replicated command operations applied on the secondary node per second. | Count/second |
| Replicated UPDATE Statements per Second | The rate of replicated update operations applied on the secondary node per second. | Count/second |
| Replicated DELETE Statements per Second | The rate of replicated delete operations applied on the secondary node per second. | Count/second |
| Replicated INSERT Statements per Second | The rate of replicated insert operations applied on the secondary node per second. | Count/second |
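Oplog Window is reported in hours, while Replication Lag and Replication Headroom are reported in seconds, so convert units before comparing them. As a rough sketch, assuming headroom behaves approximately like the oplog window minus the replication lag (a common definition for MongoDB-compatible replica sets; this document does not state how DDS computes it), and using a hypothetical helper name:

```python
SECONDS_PER_HOUR = 3600


def estimate_headroom_seconds(oplog_window_hours: float,
                              replication_lag_seconds: float) -> float:
    """Approximate headroom as oplog window minus lag, both in seconds.

    Assumption: this mirrors a common definition of replication headroom.
    The Replication Headroom metric reported for DDS stays authoritative.
    """
    return oplog_window_hours * SECONDS_PER_HOUR - replication_lag_seconds


# A 24-hour oplog window with 90 seconds of lag.
print(estimate_headroom_seconds(24, 90))  # 86310
```

If the estimate and the reported Replication Headroom diverge widely, trust the reported metric and treat the estimate only as a unit-conversion aid.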
Network
| Metric name | Description | Units |
| --- | --- | --- |
| Network Output Throughput | The rate of data transmitted out of the DDS instance over the network per second. | Bytes/second |
| Network Input Throughput | The rate of data received by the DDS instance over the network per second. | Bytes/second |
| Received Packet Error Rate | The percentage of inbound network packets that contained errors. | Percentage |
| Received Packet Loss Rate | The percentage of inbound network packets that were dropped. | Percentage |
| Sent Packet Error Rate | The percentage of outbound network packets that contained errors. | Percentage |
| Sent Packet Loss Rate | The percentage of outbound network packets that were dropped. | Percentage |
| Retransmitted Packets | The number of TCP packets that were retransmitted due to loss or timeout. | Count |
| Retransmission Ratio | The percentage of TCP packets that required retransmission. | Percentage |
| Sent RST Packets | The number of TCP RST packets sent by the DDS instance to forcibly close connections. | Count |
Disk and IO
| Metric name | Description | Units |
| --- | --- | --- |
| Storage Space Usage | The percentage of total disk storage currently consumed by the DDS instance. | Percentage |
| IOPS | The number of read and write I/O operations processed by the disk per second. | Count/second |
| Disk Read Throughput | The rate of data read from the disk by the DDS instance per second. | Bytes/second |
| Disk Write Throughput | The rate of data written to the disk by the DDS instance per second. | Bytes/second |
| Average Time per Disk Read | The average time taken to complete a single disk read operation. | Second |
| Average Time per Disk Write | The average time taken to complete a single disk write operation. | Second |
| Total Storage Space | The total disk storage capacity provisioned for the DDS instance. | GB |
| Used Storage Space | The total disk storage currently consumed by the DDS instance. | GB |
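Several of the disk metrics relate to one another, which can help when sanity-checking a dashboard. The sketch below assumes that Storage Space Usage equals Used Storage Space divided by Total Storage Space, and that IOPS counts reads and writes together as described in the table; the function names are hypothetical.

```python
def storage_usage_percent(used_gb: float, total_gb: float) -> float:
    """Used storage as a percentage of provisioned storage."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    return used_gb / total_gb * 100.0


def average_io_size_bytes(read_bps: float, write_bps: float, iops: float) -> float:
    """Average bytes moved per disk operation; 0.0 when the disk is idle."""
    if iops <= 0:
        return 0.0
    return (read_bps + write_bps) / iops


print(storage_usage_percent(50, 200))                     # 25.0
print(average_io_size_bytes(4_096_000, 4_096_000, 500))   # 16384.0
```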
Threshold configuration
You can configure thresholds and alerts for all DDS metrics to proactively detect performance degradation or connection issues.
- Go to Admin > Configuration Profiles > Threshold and Availability.
- Create or edit your Threshold Profile for DDS.
- Assign the profile to the respective monitors to trigger alerts.
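Conceptually, a threshold profile maps each metric to limits that decide when an alert fires. The sketch below is illustrative only and does not reflect Site24x7's internal format or API; the profile structure, metric keys, limit values, and `evaluate` function are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical profile: per-metric "warning" and "critical" limits.
PROFILE: Dict[str, Dict[str, float]] = {
    "cpu_usage_percent": {"warning": 80.0, "critical": 95.0},
    "replication_lag_s": {"warning": 30.0, "critical": 120.0},
}


def evaluate(metric: str, value: float) -> Optional[str]:
    """Return "critical", "warning", or None when the value is within limits."""
    limits = PROFILE.get(metric)
    if limits is None:
        return None  # no threshold configured for this metric
    if value >= limits["critical"]:
        return "critical"
    if value >= limits["warning"]:
        return "warning"
    return None


print(evaluate("cpu_usage_percent", 88.0))   # warning
print(evaluate("replication_lag_s", 300.0))  # critical
```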
IT Automation
Use Site24x7's IT Automation to resolve common issues with DDS performance:
- Go to Admin > IT Automation Templates. Then, click Add Automation Templates.
- Create an automation rule by selecting the automation type (e.g., server reboot, clear queue).
- Map the created rules to the DDS monitors for automatic execution when alerts are triggered.
Configuration rules
Use Configuration Rules to simplify bulk setup across DDS instances. Automatically assign Threshold Profiles, Notification Profiles, Tags, and Monitor Groups when new monitors are discovered.
