Riak monitoring made easy with critical metrics including total allocated memory, number of active GET/PUT FSMs and more.
Riak is an open source NoSQL database designed for high availability, fault tolerance and scalability. Install and configure the Riak plugin to make informed troubleshooting decisions by keeping track of critical metrics.
This document explains how to configure the Riak plugin and describes the monitoring metrics that provide in-depth visibility into the performance, availability, and usage stats of Riak servers.
Riak performance monitoring metrics:
Use our wide array of metrics to get notified of errors that require your attention. Keep track of unexpected trends through our metric graphs and troubleshoot as quickly as possible. The out-of-the-box metrics we support include:
Number of protocol buffer connections
The metrics pbc_connects and pbc_active give the number of protocol buffer connections (PBCs) made in the last minute and the number of currently active PBCs respectively
Memory allocated for atom storage
The metrics memory_atom and memory_atom_used give the total amount of memory currently allocated and used for atom storage respectively
Memory allocated for Binaries
Riak is a key/value store and the values are simply stored on disk as binary. The total amount of memory used for binaries is given by memory_binary
Memory allocated for Erlang
Riak runs on an Erlang virtual machine. Stats on the total memory allocated for Erlang code (memory_code), Erlang Term Storage (memory_ets) and Erlang processes (memory_processes) are critical for tuning the Erlang VM and optimizing Riak performance
Number of GET FSMs
GET FSM sibling stats count the siblings this node encounters while serving GET requests. The metrics node_get_fsm_in_rate and node_get_fsm_out_rate give the average number of GET FSMs enqueued and dequeued by Sidejob respectively
Number of PUT FSMs
FSM time stats represent the amount of time (in microseconds) required to traverse the PUT Finite State Machine (FSM) code, offering a picture of general node health. The number of PUT FSMs active in the last minute is represented by node_put_fsm_active_60s. The metrics node_put_fsm_in_rate and node_put_fsm_out_rate give the average number of PUT FSMs enqueued/dequeued by Sidejob respectively. The number of PUT FSMs actively being rejected by Sidejob’s overload protection in the last minute is given by node_put_fsm_rejected_60s
Vnode index operations
Virtual nodes (vnodes) are processes that manage partitions in the Riak ring. Each Riak node contains multiple vnodes. The metrics vnode_gets and vnode_puts give the number of GET and PUT operations coordinated by vnodes on a particular node respectively. The metrics vnode_index_writes, vnode_index_reads and vnode_index_deletes give the number of local replicas participating in secondary index writes, reads and deletes in the last minute
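As a rough illustration of how the metrics listed above could be picked out of a Riak stats document, the sketch below selects them by name from an already-parsed dictionary. The helper name select_metrics and the sample values are assumptions made for this example; they are not taken from the riak.py plugin.

```python
# Metric names exactly as they are listed in this document.
METRICS = [
    "pbc_connects", "pbc_active",
    "memory_atom", "memory_atom_used", "memory_binary",
    "memory_code", "memory_ets", "memory_processes",
    "node_get_fsm_in_rate", "node_get_fsm_out_rate",
    "node_put_fsm_active_60s", "node_put_fsm_in_rate",
    "node_put_fsm_out_rate", "node_put_fsm_rejected_60s",
    "vnode_gets", "vnode_puts",
    "vnode_index_writes", "vnode_index_reads", "vnode_index_deletes",
]


def select_metrics(stats, names=METRICS):
    """Split the wanted metric names into those present in stats and those missing.

    select_metrics is a hypothetical helper, not part of the plugin.
    """
    found = {name: stats[name] for name in names if name in stats}
    missing = [name for name in names if name not in stats]
    return found, missing


# Made-up sample values, for illustration only.
sample = {"pbc_active": 3, "memory_atom": 1000, "unrelated_stat": 1}
found, missing = select_metrics(sample)
```

Reporting the missing names separately makes it easier to notice when a node does not expose one of the expected stats.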
How it works:
Log in to your Site24x7 account. Sign up here if you don't have one.
The agent will execute the Riak plugin and push the data to the Site24x7 server
Ensure Riak is installed on the server and is up and running
The Riak plugin uses the '/stats' URL ('http://127.0.0.1:8098/stats') to fetch the performance metrics. This endpoint is configured by default when the Riak server is installed. If it is not, please configure it
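To check that the '/stats' URL responds before installing the plugin, a minimal sketch along the following lines may help. It assumes the endpoint returns a JSON object. The function names parse_stats and fetch_stats and the 5-second timeout are hypothetical choices for this example, not the plugin's own code.

```python
import json
import urllib.request

# URL given in this document; adjust it if your node listens elsewhere.
STATS_URL = "http://127.0.0.1:8098/stats"


def parse_stats(body):
    """Decode a /stats response body (bytes or str) into a dict."""
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object from /stats")
    return data


def fetch_stats(url=STATS_URL, timeout=5):
    """Fetch and parse the stats document. Needs a running Riak node."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_stats(response.read())


# Example use on a host where Riak is running:
#     stats = fetch_stats()
#     print(stats.get("vnode_gets"))
```

Keeping the parsing separate from the HTTP call means the decoding step can be exercised without a live node.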
Riak plugin installation:
Create a directory named "riak" under the Site24x7 Linux agent plugin directory, /opt/site24x7/monagent/plugins/
sudo mkdir /opt/site24x7/monagent/plugins/riak
Download "riak.py" from our GitHub repository and place it under the "riak" directory