Understanding CPU Load: Site24x7

Whether an application is running on a server or a local machine, monitoring CPU utilization and CPU load is essential for optimizing performance. While CPU utilization and load might sound similar, they’re actually quite different.

This article will explain the differences between these two important metrics, how to monitor CPU load with common commands, the impact of high CPU load, and how to bring it under control for improved system performance.

CPU utilization versus CPU load

CPU utilization is the percentage of time the CPU spends doing work rather than sitting idle. CPU load, on the other hand, is a measurement of how many processes are being executed or waiting to be executed by the CPU.

Commands like uptime or top report CPU load averages, i.e., the average number of threads actively using or requesting to use the CPU over the last 1, 5, and 15 minutes. High load averages indicate an overloaded CPU with too many processes. For example, a single-core CPU with a load average of 1 is running at full capacity, whereas a completely idle CPU has a load value of 0.

However, this metric scales with the CPU cores: The more cores installed on the system, the more tasks it can handle in parallel.

For example, a CPU with 4 cores reaches full capacity at a load average of 4, since each core can handle a load value of 1. In this scenario, even if one of the cores is running at 100% capacity, the CPU as a whole is carrying only one quarter of its potential load.

Now, if the load becomes greater than the number of cores installed, then the processes would start to queue up to use the CPU.
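The per-core reasoning above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the "greater than 1.0 per core means queuing" rule is the rule of thumb described in this article, not a hard limit.

```python
import os


def load_per_core(load_average: float, cores: int) -> float:
    """Return the load average divided by the number of cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_average / cores


def is_queuing(load_average: float, cores: int) -> bool:
    """True when more threads want the CPU than there are cores."""
    return load_per_core(load_average, cores) > 1.0


def current_load_per_core() -> tuple:
    """Read the live 1, 5, and 15 minute load averages, scaled per core.

    os.getloadavg() is only available on Unix-like systems.
    """
    cores = os.cpu_count() or 1
    return tuple(load_per_core(value, cores) for value in os.getloadavg())
```

On a 4-core machine, load_per_core(1.0, 4) gives 0.25, which matches the "one quarter of its potential load" example above.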

Monitoring CPU load

As noted above, CPU load is averaged over the previous 1, 5, and 15 minutes. But there are other metrics that also help us assess the CPU load, such as:

Idle time: The idle time is inversely related to the CPU load. This means that when idle time increases, the CPU load decreases and vice versa.
User time and system time: User time is spent running application code, and system time is spent running kernel code on behalf of processes. Together with idle time, I/O wait time, steal time, and a few smaller categories (such as nice and interrupt-handling time), they add up to 100% of the CPU time. Higher user and system time values indicate a busier CPU.
Wait or I/O wait time: The I/O wait time is the share of time the CPU is idle while at least one process is waiting for an I/O operation to complete. On Linux, processes blocked on I/O in uninterruptible sleep are counted in the load average, so heavy I/O wait can raise the load even when the CPU itself has little to do.
Steal time: The percentage of time a virtual CPU involuntarily waits for a physical CPU while the hypervisor is servicing another virtual CPU.
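As a rough sketch of where these percentages come from on Linux: the first line of /proc/stat holds cumulative tick counters per category, and tools derive percentages from the difference between two samples. The helper names below are hypothetical, and the sketch assumes the standard column order (user, nice, system, idle, iowait, irq, softirq, steal).

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> dict:
    """Parse the aggregate 'cpu' line of /proc/stat into named tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Guest columns (if present) are ignored: guest time is already
    # counted inside user time, so adding it would double count.
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Older kernels report fewer columns; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def cpu_percentages(before: dict, after: dict) -> dict:
    """Return each category's share of CPU time between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def read_cpu_line(path: str = "/proc/stat") -> dict:
    """Read the live counters (Linux only)."""
    with open(path) as handle:
        return parse_cpu_line(handle.readline())
```

To use it, call read_cpu_line() twice a few seconds apart and pass both results to cpu_percentages(). The returned values sum to 100, so a rise in iowait or steal shows up as a matching drop in the other categories.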

Effects of high CPU load

Generally, a high CPU load doesn't negatively impact a system’s performance, as long as it isn’t a long-term occurrence. But running a CPU at 100% capacity for extended periods can have mild to severe impacts on system performance.

A few possible issues are listed below:

The system might freeze or stop responding, leading to an unscheduled restart.
Multiple CPU-intensive programs and applications may take more time than expected to start, or may not be able to open at the same time.
The applications (or even the whole system) may become extremely slow and start to lag.
When a system runs at full capacity for a long time, it can overheat, and many CPUs respond by lowering their clock speed (thermal throttling), which reduces performance. Better cooling can help here; for example, on many systems the CPU fan speed can be configured in the BIOS setup.

Identifying and troubleshooting high CPU load

Different commands help monitor the system’s load over different periods. Usually, a smaller number is better, as a higher number indicates an overloaded machine.

The next section will cover some of the commands that make it easy to monitor the CPU load averages.

Using the `top` command

The top command displays the dynamic statistics of a running Linux system in real time. It’s one of the most-used commands for monitoring system performance. The first half of the output of the top command contains important system metrics, while the second part displays statistics about a self-updating list of processes that are currently being managed by the Linux kernel.

Running the top command will create an output similar to the one seen in the figure below:

Fig 1: Output of the top command

The first line of this output displays the uptime, the total number of active users logged into the system, and the load averages of the CPU for the last 1, 5, and 15 minute intervals.

For example, the above output shows the load averages as 0.13, 0.40, and 0.21. As stated earlier, to properly interpret these numbers, it’s important to know how many cores the CPU has. The above output is from a single-core machine, so the load is within the acceptable limit, as all of the load averages are less than 1.0. Even if the 1-minute or 5-minute average spikes, there shouldn’t be an issue as long as the 15-minute average stays within the limit.
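The interpretation rule described here can be written down as a small sketch. The function name and the three labels are hypothetical; the thresholds simply encode this article's rule of thumb (one unit of load per core, with the 15-minute value deciding whether a spike matters).

```python
def assess_load(one: float, five: float, fifteen: float, cores: int = 1) -> str:
    """Classify a load-average triple against the number of cores."""
    limit = float(cores)
    if fifteen > limit:
        # The long-term average is over the limit: not just a spike.
        return "sustained overload"
    if one > limit or five > limit:
        # Recent values are high, but the 15-minute average is still fine.
        return "short spike"
    return "within limit"
```

For the single-core figures quoted above, assess_load(0.13, 0.40, 0.21) returns "within limit".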

Using the `uptime` command

The uptime command is also useful for viewing the load average of the system. This command displays the current system time, the uptime of the machine, the number of users currently logged into the system, and the load averages for the last 1, 5 and 15-minute durations.

Running the uptime command will generate an output similar to the one shown below:

Fig 2: Output of the uptime command

The above output is very similar to the first line of the output of the top command. The load averages are displayed in the same format, and their values are 0.53, 0.56, and 0.24. Since this is the output of a single-core machine, the load averages are still within the limit, as they are under 1.0. The reported values are not adjusted for the number of cores, so on a multi-core system the limit must be scaled accordingly (for example, 4.0 on a 4-core machine).
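For scripting, the load averages can be pulled out of a line like this with a regular expression. The sketch below is an assumption-laden illustration: the sample line is made up to match the values quoted above, the helper name is hypothetical, and the exact uptime format varies between systems (some print "load averages:" with space-separated values, which the pattern also accepts).

```python
import re

LOAD_PATTERN = re.compile(
    r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)"
)


def parse_load_averages(line: str) -> tuple:
    """Extract the 1, 5, and 15 minute load averages from uptime-style text."""
    match = LOAD_PATTERN.search(line)
    if match is None:
        raise ValueError("no load averages found in: " + repr(line))
    return tuple(float(value) for value in match.groups())


# Hypothetical sample line using the values discussed in the text.
SAMPLE = " 10:15:32 up 3 days,  2:41,  1 user,  load average: 0.53, 0.56, 0.24"
one, five, fifteen = parse_load_averages(SAMPLE)
```

The same function works on the first line of the top output, since it ends with the same "load average:" field.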

Using the `ps` command

The ps command is a flexible and widely used tool for identifying the processes running in the system and the amount of resources they’re using. Its output varies according to the options passed to it.

Running the ps command will generate an output like the one shown below:

Fig 3: Output of the ps command

This output displays basic information about the processes running, but it can be customized with options provided by the ps command to yield more details.

For example, we can view and sort which processes are using the most CPU by running the following command:

ps -eo pcpu,pid,user,args | sort -k 1 -n -r | head -10

This will result in an output like the one shown below:

Fig 4: The top 10 most CPU-consuming processes

The ps command doesn’t display the load averages of the system. Instead, it’s used to troubleshoot the cause and find the processes that are causing the high CPU load. For example, if a process is using 100% of the CPU, the other processes will have to wait for the CPU and the load on the CPU will increase.

This command also helps to identify the processes that are being spawned repetitively or are in a zombie state.
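The same triage can be sketched in Python by parsing ps output. This is a minimal illustration, not a complete tool: the helper names are hypothetical, the sample output is invented, and the sketch assumes the procps column set pcpu,pid,stat,user,args, where a state code starting with Z marks a zombie process.

```python
import os
import subprocess
from typing import NamedTuple


class Proc(NamedTuple):
    pcpu: float
    pid: int
    stat: str
    user: str
    args: str


def parse_ps(text: str) -> list:
    """Parse the output of `ps -eo pcpu,pid,stat,user,args`."""
    procs = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 4)
        if len(fields) < 5:
            continue  # ignore rows without a command column
        pcpu, pid, stat, user, args = fields
        procs.append(Proc(float(pcpu), int(pid), stat, user, args))
    return procs


def top_cpu(procs: list, count: int = 10) -> list:
    """Return the processes using the most CPU, compared as numbers."""
    return sorted(procs, key=lambda p: p.pcpu, reverse=True)[:count]


def zombies(procs: list) -> list:
    """Return the processes whose state code marks them as zombies."""
    return [p for p in procs if p.stat.startswith("Z")]


def run_ps() -> list:
    """Run ps and parse its output; LC_ALL=C keeps '.' as the decimal point."""
    result = subprocess.run(
        ["ps", "-eo", "pcpu,pid,stat,user,args"],
        capture_output=True, text=True, check=True,
        env={**os.environ, "LC_ALL": "C"},
    )
    return parse_ps(result.stdout)


# Invented sample output for illustration only.
SAMPLE_PS = """\
%CPU   PID STAT USER     COMMAND
 9.5  1201 R    app      python worker.py
12.0   845 S    www      nginx: worker process
 0.0  1310 Z    app      [worker] <defunct>
 0.3     1 Ss   root     /sbin/init
"""
```

Converting the CPU column to a float before sorting matters: compared as text, 9.5 would rank above 12.0.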

Fixing high CPU load

Below are some common fixes to reduce high CPU load:

Kill or restart processes: Often, just one or two processes are responsible for the high CPU load. For example, a runaway process could be monopolizing the CPU and keeping other processes waiting, or a process could be stuck in an uninterruptible state, which on Linux counts toward the load average. The first thing to do when the CPU becomes overloaded is to identify any processes of this kind and terminate or restart them.
Update system apps and drivers: Outdated drivers and apps can also cause high CPU load because they can’t effectively perform the I/O operations. The best way to avoid this issue is to ensure the entire system is up to date.
Reinstall or downgrade apps: Sometimes, simply reinstalling an app that was causing a high CPU load can resolve the issue. If not, switching the app to a lower or previous version might improve the performance.
Reboot the system: If nothing else seems to work and you can afford it, rebooting the system may solve the problem. This may not always be possible, though, especially if the system is a server that can’t be shut down.

Conclusion

The CPU load is an important metric that needs to be monitored regularly to ensure that the system is running smoothly. This metric is generally measured in load averages, but there are some other measurements that indicate the amount of load on the CPU.

Luckily, there are a number of useful commands that can help identify and monitor the CPU load. The top and uptime commands help to directly monitor the load averages of the CPU, while the ps command is used to identify the processes that are causing the high CPU load.

High CPU load can be an indicator of several problems, and there are various common fixes available to reduce the high load and optimize CPU performance.
