Monitoring host hardware health using sensors
Hardware sensors check the functioning of VMware hardware. Sensors are mounted on individual hardware components, and each one periodically collects data that is sent to Site24x7 for monitoring. The On-Premise Poller acts as the probe for data collection, and Site24x7 receives this data and displays it as intuitive charts.
With VMware Monitoring, you can track the performance of your hardware using the following sensors in VMware:
Power
The power supply may not always be uniform: it may be turned off at times, it may fluctuate, or it may run at full capacity.
Fan
“Fan Transition to Critical from less severe” and “Fan Transition to Off Line” are common errors when it comes to host fan health.
Temperature
Hardware performance depends on systems running at the optimum temperature. Temperature sensors frequently report errors like “Temperature Lower Critical going low,” “Temperature Transition to Critical from less severe,” “Temperature Transition to Non-recoverable from less severe,” and “Temperature Upper Critical going high.” IT teams must bring the temperature under control before it goes beyond the appropriate limits.
Processor
Processors are prone to thermal trip errors, configuration errors, machine check exceptions, correctable machine check exceptions, and internal errors (IERR), all of which can affect the performance of the CPU.
Voltage
Voltage has to be monitored at the power supply input and output. Technicians generally keep an eye on errors like “Voltage Limit Exceeded” and “Voltage Transition to Critical from less severe.”
Storage
Storage sensors differ by storage type, and information on disk storage is required for capacity planning.
Watchdog
The Watchdog sensor monitors the system board, so its status should be tracked closely.
Memory
Memory has a great impact on resource allocation and is prone to configuration errors, uncorrectable error-correcting code (ECC) errors, “Memory Transition to Critical,” and “Memory Critical Overtemperature.”
Battery
The status of the battery, battery on array, and battery on controller has to be closely watched. The color code depicts the battery health, and it must never be red.
Other
Any hardware outside of the above categories is grouped under “other” sensors in VMware.
For all the sensors above, any deviation from normal operation should be tracked and reported to ensure uninterrupted service and optimal hardware health.
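To illustrate what tracking deviations across sensor categories can look like, here is a minimal Python sketch. It assumes each sensor reading carries a color-coded health state (green, yellow, red, or unknown), in line with the color code mentioned for batteries above. The `SensorReading` class, the severity ranking, and the sample readings are all hypothetical and are not part of Site24x7 or VMware.

```python
from dataclasses import dataclass

# Hypothetical severity ranking: a higher number means a worse state.
SEVERITY = {"green": 0, "unknown": 1, "yellow": 2, "red": 3}


@dataclass
class SensorReading:
    name: str       # e.g. "Fan 2" (made-up sensor name)
    category: str   # e.g. "fan", "temperature", "power"
    health: str     # one of the keys in SEVERITY


def worst_health_by_category(readings):
    """Return the most severe health state seen in each sensor category."""
    worst = {}
    for reading in readings:
        current = worst.get(reading.category, "green")
        if SEVERITY[reading.health] > SEVERITY[current]:
            current = reading.health
        worst[reading.category] = current
    return worst


def needs_attention(readings):
    """Return every reading that deviates from the healthy (green) state."""
    return [r for r in readings if r.health != "green"]


# Illustrative sample data only.
readings = [
    SensorReading("Fan 1", "fan", "green"),
    SensorReading("Fan 2", "fan", "red"),
    SensorReading("CPU 1 Temp", "temperature", "yellow"),
    SensorReading("PSU 1", "power", "green"),
]

summary = worst_health_by_category(readings)
flagged = needs_attention(readings)

for category, health in summary.items():
    print(f"{category}: {health}")
for reading in flagged:
    print(f"Needs attention: {reading.name} ({reading.health})")
```

Summarizing by the worst state per category keeps a single failing fan from being hidden behind several healthy ones, which is the point of tracking every deviation.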
How does Site24x7 help?
With Site24x7, you can configure threshold limits for all the sensors connected to the hardware described above, so you can diagnose and fix hardware issues and stay on top of your VMware hardware. Site24x7 enables you to:
- Receive threshold breach alerts via email, SMS, voice call, and third-party channels such as Slack, Jira, Microsoft Teams, and ManageEngine ServiceDesk Plus.
- Generate custom reports based on your needs.
- Create a custom dashboard and view all your hardware metrics from a single screen.
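As a rough illustration of how a threshold limit turns a sensor value into an alert condition, the following Python sketch classifies temperature samples against lower and upper limits, mirroring the “Lower Critical going low” and “Upper Critical going high” conditions described earlier. This is not Site24x7's API or configuration format; the `Thresholds` class, the limit values, and the sample data are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical limits in degrees Celsius.
    lower_critical: float
    upper_warning: float
    upper_critical: float


def classify(value, limits):
    """Map a sensor value to "ok", "warning", or "critical"."""
    if value <= limits.lower_critical or value >= limits.upper_critical:
        return "critical"
    if value >= limits.upper_warning:
        return "warning"
    return "ok"


def breaches(samples, limits):
    """Return (name, value, status) for every sample that is not "ok"."""
    results = []
    for name, value in samples:
        status = classify(value, limits)
        if status != "ok":
            results.append((name, value, status))
    return results


# Illustrative limits and readings only.
limits = Thresholds(lower_critical=5.0, upper_warning=70.0, upper_critical=85.0)
samples = [
    ("CPU 1 Temp", 62.0),
    ("CPU 2 Temp", 74.5),
    ("Inlet Temp", 3.0),
    ("PSU Temp", 91.0),
]

alerts = breaches(samples, limits)
for name, value, status in alerts:
    print(f"{status.upper()}: {name} at {value}")
```

Checking the critical limits before the warning limit ensures that a value past both is reported at the higher severity.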
Monitor your virtual environment, and locate the source of potential issues using performance metrics. Sign up now!