Plugin for Hadoop monitoring.

Monitor your Hadoop setup using Site24x7 and gain in-depth visibility into critical performance metrics.

Hadoop is an open source, Java-based programming framework that allows you to store and process extremely large sets of data. Use Site24x7 plugins and continually collect Hadoop statistics, events, and metrics. Identify both recent and long-term performance trends, and quickly resolve issues when they arise.

This document details how to configure the Hadoop plugin and describes the monitoring metrics that provide in-depth visibility into the performance, availability, and usage of your Hadoop servers.

Hadoop performance monitoring metrics:

Troubleshoot your Hadoop environment with ease by keeping track of critical metrics including:

Total Load

"total_load" gives us the measure of file access across all data nodes in your Hadoop setup.

Used space

"used_space" gives us the total amount of space that has been consumed and is unavailable for further use in your Hadoop configured system.

Free space

"free_space" gives us the total amount of space that has not been consumed yet and is available for further use in your Hadoop configured system.

Missing blocks

The metric "missing_blocks" gives the number of missing memory blocks in your Hadoop setup.

Corrupt blocks

"corrupt_blocks" gives the number of corrupt memory blocks in your Hadoop setup.

Configured capacity

The metric "configured_capacity" lists down the total amount of space configured for name nodes in your Hadoop setup.

Percentage remaining

The metric "percent_remaining" gives us the percentage of free space remaining for use in your Hadoop setup name node.

Total blocks

"total_blocks" lists down the number of memory blocks that have been created in your Hadoop setup name node.

Total files

The metric "total_files" lists down the total number of files in your Hadoop setup name node.

Number of threads

"number_of_threads" lists down the number of threads currently running in your Hadoop setup name node.

Total space

"total_space" gives the measure of the total space available in your Hadoop setup data node.

Remaining space

"remaining_space" gives the measure of the total unused space available in your Hadoop setup data node.

DFS used space

The metric "dfs_used_space" gives the measure of the total used space in your Hadoop setup due to the data node.

Non DFS used space

"non_dfs_used_space" gives the measure of the total used space in your Hadoop setup due to reasons other than the data node.

Active nodes

"activenodes" and get the number of nodes in your Hadoop setup that are currently active for usage.

Total/allocated MB

Use the metrics "totalMB", "allocatedMB" and get the total amount of memory in your Hadoop setup as well as stats on whether they have already been allocated for some other purpose.

Available/reserved MB

Use the metrics "availableMB", "reservedMB" and get the total amount of memory in your Hadoop setup that is available for usage or is reserved for some other purpose.

Total/allocated virtual cores

Use the metrics "totalMB", "allocatedvirtualcores" and get total number of virtual cores as well as stats on whether they have already been allocated for a job in your Hadoop configured system.

Available/reserved virtual cores

Use the metrics "availablevirtualcores", "reservedMB" and get total number of virtual cores that is available for usage or is reserved for some other purpose in your Hadoop configured system.

Applications submitted/completed/failed

Use the metrics "appssubmitted", "appscompleted", "appsfailed" and get count of the total number of applications that have been submitted, completed running or failed in your Hadoop system.

Applications killed/pending/running

Use the metrics "appskilled", "appspending", "appsrunning" and get count of the total number of applications that have been killed, are pending, or are still running in your Hadoop system.

Containers allocated/pending

In Hadoop, a container is the unit in which a piece of work is executed. Use the metrics "containersAllocated" and "containersPending" to get the total number of containers that have been allocated and the number still pending allocation in your Hadoop setup.

Containers reserved/running

Use the metrics "containersReserved" and "runningContainers" to get the total number of containers that are reserved and the number that are currently running in your Hadoop setup.

Total/decommissioned nodes

Use the metrics "totalNodes", "decommissionedNodes" and get the total count of all the nodes as well as how many are decommissioned in the system.

Lost/rebooted/unhealthy nodes

Use the metrics "lostNodes", "rebootedNodes", "unhealthyNodes" and get the total count of all the nodes that are lost, rebooted or unhealthy in the system.

Elapsed Time

The metric "elapsedTime" will get the total amount of time it took a cluster for executing.

Memory seconds

The metric "memoryseconds" will get the aggregated amount of memory (in megabytes) the application has allocated times the number of seconds the application has been running.

Progress

The "progress" will record what the status of a job is currently in percentage of completion.

Last health update

The "lastHealthUpdate" will record the amount of time it has been since a health update has taken place in your Hadoop configured system.

How it works:

  • Log in to your Site24x7 account. Sign up here if you don't have one
  • Download and install the latest version of the Site24x7 Linux agent
  • Install the Hadoop plugin
  • The agent will execute the Hadoop plugin and push the data to the Site24x7 server

Prerequisites:

  • Ensure Hadoop is installed on the server and is up and running.

Hadoop plugin installation:

  • Create separate directories for the plugins, named "hadoop", "hadoop_namenode", "hadoop_datanode", "hadoop_resourcemanager_metrics", "hadoop_resourcemanager_appmetrics", and "hadoop_resourcemanager_nodemetrics", under the Site24x7 Linux agent's plugin directory - /opt/site24x7/monagent/plugins/
  • cd /opt/site24x7/monagent/plugins/
    sudo mkdir hadoop
    sudo mkdir hadoop_namenode
    sudo mkdir hadoop_datanode
    sudo mkdir hadoop_resourcemanager_metrics
    sudo mkdir hadoop_resourcemanager_appmetrics
    sudo mkdir hadoop_resourcemanager_nodemetrics
  • Download the file ""hadoop.py" from our GitHub repository and place it under the "hadoop" directory
  • cd hadoop
    sudo wget https://raw.githubusercontent.com/site24x7/plugins/master/hadoop/hadoop.py
  • Download the file "hadoop_namenode.py" from our GitHub repository and place it under the "hadoop_namenode" directory
  • cd hadoop_namenode
    sudo wget https://raw.githubusercontent.com/site24x7/plugins/master/hadoop_namenode/hadoop_namenode.py
  • Download the file "hadoop_datanode.py" from our GitHub repository and place it under the "hadoop_datanode" directory
  • cd hadoop_datanode
    sudo wget https://raw.githubusercontent.com/site24x7/plugins/master/hadoop_datanode/hadoop_datanode.py
  • Download the file "hadoop_resourcemanager_metrics.py" from our GitHub repository and place it under the "hadoop_resourcemanager_metrics" directory
  • cd hadoop_resourcemanager_metrics
    sudo wget https://raw.githubusercontent.com/site24x7/plugins/master/hadoop_resourcemanager_metrics/hadoop_resourcemanager_metrics.py
  • Download the file "hadoop_resourcemanager_appmetrics.py" from our GitHub repository and place it under the "hadoop_resourcemanager_appmetrics" directory
  • cd hadoop_resourcemanager_appmetrics
    sudo wget https://raw.githubusercontent.com/site24x7/plugins/master/hadoop_resourcemanager_appmetrics/hadoop_resourcemanager_appmetrics.py
  • Download the file "hadoop_resourcemanager_nodemetrics.py" from our GitHub repository and place it under the "hadoop_resourcemanager_nodemetrics" directory
  • cd hadoop_resourcemanager_nodemetrics
    sudo wget https://raw.githubusercontent.com/site24x7/plugins/master/hadoop_resourcemanager_nodemetrics/hadoop_resourcemanager_nodemetrics.py

Hadoop plugin configuration:

  • Replace the shebang ("#!") on line 1 of the plugin file with the path of the Python interpreter on your system that has the plugin's dependent modules installed
  • Eg : #!/usr/local/bin/python3
  • Configure the host and port values for the Hadoop plugin so they point to the web UI of the Hadoop daemon the plugin queries (for example, the NameNode or the ResourceManager); a sketch of an edited plugin file follows this list
  • Eg:
    HOST = "localhost"
    ADMINPORT = "4848"
  • Make the same edits in all the other plugin files as well.
  • Save the changes and restart the agent.
  • /etc/init.d/site24x7monagent restart
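
For reference, the first few lines of an edited plugin file might look like the sketch below. The shebang path, the port value, and the comments are examples only; the variable names and defaults in the actual plugin files on GitHub take precedence.

    #!/usr/local/bin/python3
    # Shebang points to the Python interpreter that has the plugin's dependent modules installed.

    # Connection details (example values; use the host and web UI port of the
    # Hadoop daemon this particular plugin queries, e.g. the NameNode or the ResourceManager).
    HOST = "localhost"
    ADMINPORT = "8088"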

Monitoring additional metrics:

  • To monitor additional metrics, edit any one of the plugin files and add the new metrics that need monitoring, as in the sketch after this list
  • Increment the plugin version value in the plugin file to view the newly added metrics (for e.g., change the default plugin version from PLUGIN_VERSION = "1" to PLUGIN_VERSION = "2")
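
As an illustration only (the exact structure of each plugin's result dictionary may differ from the version on GitHub, and the function and key names below are hypothetical), adding one metric and bumping the version could look like this:

    # Hypothetical sketch: expose one extra metric and bump the plugin version so the
    # Site24x7 web client picks up the newly added attribute.
    PLUGIN_VERSION = "2"   # was "1" before the new metric was added

    def add_extra_metrics(jmx_bean, data):
        # 'jmx_bean' is the dict parsed from the Hadoop JMX response and 'data' is the
        # dict of metrics the plugin already reports (both names are hypothetical).
        data["under_replicated_blocks"] = jmx_bean.get("UnderReplicatedBlocks", 0)
        return data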

Related plugins:

  • Redis plugin - Monitor performance metrics of your Redis databases
  • Postgres plugin - Monitor performance metrics of your Postgres databases
  • CouchDB plugin - Analyze performance of your CouchDB server
  • Nagios plugin - Execute thousands of Nagios plugins in Site24x7 without needing to run a Nagios server
  • Out-of-the-box plugins - Monitor your entire app stack with our extensive list of integrations
  • Create custom plugins - Create custom Linux and Windows plugins and monitor custom attributes