Alibaba Cloud HBase Monitoring Integration
Site24x7 offers out-of-the-box monitoring for HBase deployed in your Alibaba Cloud environment. Get deep visibility into request processing, store file performance, block cache efficiency, and server health, so you can maintain fast, reliable, and scalable NoSQL operations. Once your Alibaba Cloud account is integrated with Site24x7, all associated HBase instances are automatically discovered and monitored.
Use cases
- Query latency tracking: Monitor read/write request counts and their average latency to identify bottlenecks.
- Storage insights: Understand store file sizes, memory consumption, and flush behaviors to optimize storage.
- Cache efficiency: Measure block cache hit/miss counts to improve read performance.
- Server health: Monitor region server counts, queue size, and exceptions to ensure HBase availability.
- GC and memory profiling: Analyze heap memory usage and garbage collection times to detect memory leaks or spikes.
Setup and configuration
- Log in to your Site24x7 account and navigate to Cloud > Alibaba Cloud > Add Monitor.
- On the Edit Alibaba Cloud Monitor page, select HBase from the Service Types list.
- Once added, go to Cloud > Alibaba > HBase to view dashboards and performance metrics.
Supported metrics
Request Metrics
| Metric name | Description | Unit |
|---|---|---|
| Read Requests (Operations) | The number of read operations processed by HBase. | Count |
| Write Requests (Operations) | The number of write operations processed by HBase. | Count |
| Put Request Latency (Mean) | The average latency of put (write) requests. | Milliseconds |
| Get Request Latency (Mean) | The average latency of get (read) requests. | Milliseconds |
| Append Operations | The number of append operations performed. | Count |
| Slow Get Count | The number of get requests considered slow. | Count |
| Slow Put Count | The number of put requests considered slow. | Count |
| Slow Append Count | The number of append requests considered slow. | Count |
| Flush Time (Average) | The average time taken for flush operations. | Nanoseconds |
| Flush Time | The time taken for the most recent flush operation. | Milliseconds |
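
The two flush metrics above are reported in different units (nanoseconds for the average, milliseconds for the most recent flush), so they need to be normalized before being compared. The following is a minimal sketch of that conversion; the function names are hypothetical and not part of any Site24x7 API.

```python
NANOSECONDS_PER_MILLISECOND = 1_000_000


def flush_avg_ms(flush_time_avg_ns: float) -> float:
    """Convert Flush Time (Average) from nanoseconds to milliseconds."""
    return flush_time_avg_ns / NANOSECONDS_PER_MILLISECOND


def last_flush_slower_than_average(flush_time_ms: float,
                                   flush_time_avg_ns: float) -> bool:
    """Return True if the most recent flush took longer than the average flush."""
    return flush_time_ms > flush_avg_ms(flush_time_avg_ns)
```

For example, an average of 2,500,000 ns corresponds to 2.5 ms, so a most recent flush of 4 ms would be slower than average.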
Store Metrics
| Metric name | Description | Unit |
|---|---|---|
| Store Files | The number of store files in HBase. | Count |
| Store File Size | The total size of all store files. | Bytes |
| Store File Index Size | The total size of store file indexes. | Bytes |
| Memstore Size | The amount of data currently in the MemStore. | Bytes |
| Flush Queue Size | The size of the flush queue. | Count |
| Store File Uncompressed Size | The total uncompressed size of store files. | Bytes |
Block Cache Metrics
| Metric name | Description | Unit |
|---|---|---|
| Block Cache Hit Count | The number of successful block cache hits. | Count |
| Block Cache Miss Count | The number of block cache misses during reads. | Count |
| Block Cache Count | The total number of blocks in cache. | Count |
| Block Cache Size | The total size of the block cache. | Bytes |
| Block Cache Free | The amount of free space in the block cache. | Bytes |
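
The hit and miss counts above are often easier to interpret as a single hit ratio. The sketch below shows one common way to derive it, assuming both counts cover the same collection interval; the function name is hypothetical and the ratio is not a metric listed in the table.

```python
from typing import Optional


def block_cache_hit_ratio(hit_count: int, miss_count: int) -> Optional[float]:
    """Return hits / (hits + misses), or None when there were no lookups."""
    total_lookups = hit_count + miss_count
    if total_lookups == 0:
        # No cache lookups in this interval, so the ratio is undefined.
        return None
    return hit_count / total_lookups
```

A ratio that trends downward over time may suggest that the working set no longer fits in the block cache.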
System — Region Servers
| Metric name | Description | Unit |
|---|---|---|
| Number of Region Servers | The number of live region servers. | Count |
| Number of Dead Region Servers | The number of dead region servers. | Count |
| Regions | The total number of regions managed by the cluster. | Count |
| Number of Open RegionServer Connections | The number of open connections on the region server. | Count |
| Handler Queue Size | The size of the handler queue. | Count |
| Not Serving Region Exception | The number of NotServingRegion exceptions. | Count |
| Region Too Busy Exception | The number of RegionTooBusy exceptions. | Count |
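
The live and dead region server counts above can be combined into a single availability indicator. This is a minimal sketch, assuming both counts come from the same data point; the function name is hypothetical.

```python
def dead_server_fraction(live_servers: int, dead_servers: int) -> float:
    """Return the fraction of known region servers that are reported dead."""
    total_servers = live_servers + dead_servers
    if total_servers == 0:
        # No region servers reported at all; treat as nothing dead.
        return 0.0
    return dead_servers / total_servers
```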
System — GC and Memory
| Metric name | Description | Unit |
|---|---|---|
| GC Time (ms) | The total time spent on garbage collection. | Milliseconds |
| Heap Memory Used | The amount of heap memory currently used. | MB |
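
GC time is easier to judge relative to the length of the interval it was measured over. The sketch below expresses it as a percentage of wall-clock time, assuming the GC Time value covers exactly one collection interval (if the value is cumulative, subtract the previous sample first). The function name and the interval parameter are hypothetical.

```python
def gc_overhead_percent(gc_time_ms: float, interval_seconds: float) -> float:
    """Return the percentage of the interval that was spent in garbage collection."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    interval_ms = interval_seconds * 1000
    return gc_time_ms * 100 / interval_ms
```

For example, 600 ms of GC time in a 60-second interval is a 1% overhead.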
Threshold configuration
- Go to Admin > Configuration Profiles > Threshold and Availability.
- Create or edit a threshold profile for HBase.
- Assign the profile to the respective monitors to trigger alerts.
IT automation
Site24x7's IT Automation tools help resolve performance degradation issues automatically. The alarm engine continuously examines the system events for which thresholds have been defined and, when a breach occurs, performs the mapped automation.
- Go to Admin > IT Automation Templates.
- Create a new automation rule.
- Map the rule to the monitor for proactive resolution.
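
Conceptually, the alarm engine compares each metric against its threshold and runs whatever automation is mapped to a breached metric. The sketch below illustrates that idea only; it is not Site24x7's implementation or API, and the function name, the dictionary layout, and the "greater than" breach condition are all assumptions.

```python
from typing import Callable, Dict, List


def run_mapped_automations(metrics: Dict[str, float],
                           thresholds: Dict[str, float],
                           automations: Dict[str, Callable[[], None]]) -> List[str]:
    """Run the automation mapped to each breached metric; return the breached names."""
    triggered = []
    for metric_name, limit in thresholds.items():
        value = metrics.get(metric_name)
        # Assumed breach condition: the value exceeds the upper limit.
        if value is None or value <= limit:
            continue
        action = automations.get(metric_name)
        if action is not None:
            action()
            triggered.append(metric_name)
    return triggered
```

With a limit of 100 on Handler Queue Size and a reported value of 120, only the automation mapped to Handler Queue Size would run.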
How to configure IT Automation for a monitor
Configuration rules
With Site24x7's Configuration Rules, you can set parameters like Threshold Profile, Notification Profile, Tags, and Monitor Group for multiple monitors and automate the configuration settings of your monitoring resources. These settings are then assigned automatically when new HBase monitors are added.
How to add a Configuration Rule
