Alibaba Cloud Kafka Monitoring Integration

Site24x7 provides comprehensive out-of-the-box monitoring support for Alibaba Cloud Kafka. By observing metrics such as message input/output, accumulation, latency, network utilization, and disk usage, you can gain real-time insights into Kafka cluster performance, client behavior, and broker efficiency. Once you integrate your Alibaba Cloud account with Site24x7, all Kafka instances are auto-discovered and monitored.

Use cases

  • Track producer/consumer throughput: Monitor input/output rates at the instance, topic, and group levels to analyze data flow.
  • Detect message backlog: Get alerted when message accumulation grows abnormally across clusters or specific topics.
  • Monitor latency and throttling: Identify slowdowns in request processing or broker-side throttling.
  • Ensure resource utilization health: Analyze disk usage, batch sizes, and connection load to avoid system saturation.
  • Spot networking bottlenecks: Track network I/O rates and utilization by node for smoother data transport.

Setup and configuration

  1. Log in to your Site24x7 account and navigate to Cloud > Alibaba Cloud > Add Monitor.
  2. On the Edit Alibaba Cloud Monitor page, select Kafka from the Service Types list.
  3. Once the monitor is added, go to Cloud > Alibaba > Kafka to view dashboards and performance metrics.

Supported metrics

Message Input and Output

Metric name | Description | Unit
Instance Message Input (v3) | The number of messages produced to the instance. | Count/second
Instance Message Output (v3) | The number of messages consumed from the instance. | Count/second
Instance Message Input Ratio (v3) | The rate of message input for the instance. | Percentage
Instance Message Output Ratio (v3) | The rate of message output for the instance. | Percentage
Cluster Message Input (v3) | The total message input across the Kafka cluster. | Count/second
Group Message Output Count (v3) | The number of messages consumed by a specific group. | Count/second
Topic Message Input Count (v3) | The number of messages produced to a topic. | Count/second
Topic Message Output Count (v3) | The number of messages consumed from a topic. | Count/second

Message Accumulation

Metric name | Description | Unit
Message Accumulation (v3) | The number of unconsumed messages in the cluster. | Count
Message Accumulation | Total backlog of messages awaiting consumption. | Count
Message Accumulation (Single Topic) | Message backlog for a single topic. | Count
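
As a rough illustration of how the accumulation and throughput metrics relate, the Python sketch below estimates how long a backlog would take to clear. The function name and sample values are hypothetical and not part of any Site24x7 or Alibaba Cloud API. It assumes you have read a backlog count and the message input and output rates (Count/second) from the dashboard.

```python
from typing import Optional


def estimated_drain_seconds(
    backlog: int, input_rate: float, output_rate: float
) -> Optional[float]:
    """Estimate the seconds needed to clear a message backlog.

    backlog      -- unconsumed messages (a Message Accumulation value)
    input_rate   -- messages produced per second
    output_rate  -- messages consumed per second

    Returns None when consumers are not outpacing producers, because
    the backlog would then never shrink at the current rates.
    """
    if backlog <= 0:
        return 0.0
    net_rate = output_rate - input_rate
    if net_rate <= 0:
        return None
    return backlog / net_rate


# Hypothetical readings: 12,000 queued messages, 400 msg/s in, 1,000 msg/s out.
print(estimated_drain_seconds(12000, 400.0, 1000.0))
```

A result of None is the case worth alerting on: the backlog is growing or flat, so consumers need more capacity or producers need to slow down.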

Request and Processing

Metric name | Description | Unit
Instance Requests Input (v3) | Number of incoming requests to the instance. | Count/second
Instance Requests Output (v3) | Number of outgoing responses from the instance. | Count/second
Topic Requests Input (v3) | Number of incoming requests targeting a specific topic. | Count/second
Topic Requests Output (v3) | Number of outgoing responses from a specific topic. | Count/second

Latency and Throttling

Metric name | Description | Unit
Instance Throttle Time P99 (Input, v3) | 99th percentile of input throttling time. | Milliseconds
Instance Throttle Time P99 (Output, v3) | 99th percentile of output throttling time. | Milliseconds
Instance Fetch Throttle Queue Size (v2) | Fetch throttle queue size on the instance. | Count
Instance Produce Throttle Queue Size (v2) | Produce throttle queue size on the instance. | Count
Instance Batch Size (TP50, v2) | Median batch size of producer messages. | Bytes
Instance Batch Size (TP999, v2) | 99.9th percentile batch size of producer messages. | Bytes

Network

Metric name | Description | Unit
Instance Internet Receive Rate (v3) | Rate of incoming network traffic to the instance. | Bytes/second
Instance Internet Transmit Rate (v3) | Rate of outgoing network traffic from the instance. | Bytes/second
Instance Internet Receive Utilization (By Node) | Network receive utilization by node. | Percentage
Instance Internet Transmit Utilization (By Node) | Network transmit utilization by node. | Percentage

Disk and Storage

Metric name | Description | Unit
Instance Disk Capacity | The total disk capacity allocated to the instance. | GB
Instance Disk Log Size (v3) | The size of Kafka log files on disk. | GB

Connections

Metric name | Description | Unit
Instance Maximum Connection Count (v3) | The maximum number of concurrent connections allowed. | Count
Instance Total Connection Count (v3) | The total number of active client connections. | Count
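
The disk and connection metrics are reported as absolute values, so it can help to express them as a share of their limit when judging how close an instance is to saturation. The following Python sketch shows one way to do that. The helper name and the sample readings are assumptions for illustration, not values or functions provided by Site24x7.

```python
def utilization_percent(used: float, limit: float) -> float:
    """Return `used` as a percentage of `limit`, rounded to two decimals."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return round(100.0 * used / limit, 2)


# Hypothetical readings taken from the metrics listed above.
disk_log_size_gb = 450.0      # Instance Disk Log Size
disk_capacity_gb = 900.0      # Instance Disk Capacity
total_connections = 1500      # Instance Total Connection Count
max_connections = 2000        # Instance Maximum Connection Count

disk_pct = utilization_percent(disk_log_size_gb, disk_capacity_gb)
conn_pct = utilization_percent(total_connections, max_connections)

print(f"Disk used by logs: {disk_pct}%")
print(f"Connections in use: {conn_pct}%")
```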

Threshold configuration

  1. Go to Admin > Configuration Profiles > Threshold and Availability.
  2. Create or edit a threshold profile for Kafka.
  3. Assign the profile to the respective monitors to trigger alerts.
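
To make the idea of a threshold concrete, here is a minimal Python sketch of a check that flags a breach only after several consecutive samples exceed a limit, which avoids alerting on a single spike. This is a conceptual illustration under assumed names and values. Site24x7's actual evaluation behavior is determined by the threshold profile you configure and may differ.

```python
from typing import Iterable


def breaches_threshold(
    samples: Iterable[float], limit: float, consecutive: int = 3
) -> bool:
    """Return True if `consecutive` samples in a row exceed `limit`."""
    streak = 0
    for value in samples:
        # Extend the streak on a breach; reset it on a healthy sample.
        streak = streak + 1 if value > limit else 0
        if streak >= consecutive:
            return True
    return False


# Hypothetical disk utilization samples (percent) checked against an 80% limit.
print(breaches_threshold([70, 85, 90, 95], limit=80))
print(breaches_threshold([85, 90, 70, 95], limit=80))
```

The first call reports a breach because the last three samples all exceed the limit. The second does not, because the healthy sample in the middle resets the streak.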

IT automation

Site24x7's IT Automation tools help resolve performance degradation issues automatically. The alarm engine continuously evaluates system events against the thresholds you have defined and, when a breach occurs, runs the mapped automation.

  1. Go to Admin > IT Automation Templates.
  2. Create a new automation rule.
  3. Map the rule to the monitor for proactive resolution.

How to configure IT Automation for a monitor

Configuration rules

With Site24x7's Configuration Rules, you can set parameters such as Threshold Profile, Notification Profile, Tags, and Monitor Group for multiple monitors at once. The rules can also assign these settings automatically whenever new Kafka monitors are added.

How to add a Configuration Rule
