Sunday 17 July 2011

What is the meaning of 1?

Timestamp: 2011-07-17 12:59:11
Title: What does '1' mean?
Body:
In the context of load average on a system, a load avg of 1 is
something meaningful if you are on a single cpu system. It means
the cpu is busy, continuously.

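Now consider multicore/multicpu machines. A load avg of 1 is not
quite so meaningful. On Linux, the load average is an exponentially
damped moving average of the number of processes which are runnable
or blocked in uninterruptible sleep (typically waiting on disk I/O).
It ramps up slowly and ramps down slowly.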
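To make the slow ramp concrete, here is a minimal sketch of an
exponentially damped average in floating point. It assumes a sample
every 5 seconds and a 60 second window, which approximates the
1-minute figure; the real kernel uses fixed-point arithmetic, and the
names next_load and simulate are made up for this illustration.

```python
import math

SAMPLE_INTERVAL = 5.0   # seconds between samples (assumed, roughly what Linux does)
WINDOW = 60.0           # time constant of the "1 minute" load average


def next_load(load, active, interval=SAMPLE_INTERVAL, window=WINDOW):
    """Apply one step of an exponentially damped moving average."""
    decay = math.exp(-interval / window)
    return load * decay + active * (1.0 - decay)


def simulate(active, seconds, start=0.0):
    """Load average after `seconds` with a constant number of active tasks."""
    load = start
    for _ in range(int(seconds // SAMPLE_INTERVAL)):
        load = next_load(load, active)
    return load


if __name__ == "__main__":
    # Eight tasks become runnable at t=0 on a previously idle machine.
    for t in (5, 15, 30, 60, 120):
        print(f"after {t:3d}s: load = {simulate(8, t):.2f}")
```

Under these assumptions, eight suddenly runnable tasks show a load of
about 0.64 after five seconds and only about 5.06 after a full minute,
which is why a short burst of work barely registers.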

During heavy duty work (like parallel compilation), the load average
doesn't give "gmake -j" enough information to determine if the system
is busy.

In the old days, when a source file compilation could take many seconds
or minutes, the load average told us what the system was doing.

On an 8-core (Intel i7) cpu, doing 'gmake -j' can invoke
tens of parallel compilations, yet 'top' can show the system as
being idle, because the load average takes a while to ramp up.

On an 8-core system, with one cpu being busy, should we say 'the system is
busy' (system usage == 100%), or should we say it is idle (system usage == 12.5%)?

The answer depends on what you are measuring and how you want to handle
it. If 1 out of 8 cpus is busy (maybe the application is broken and
stuck, and eating cpu continuously), then that is important. Whether
or not we call the system busy, noticing that rogue application is
useful. If we ignore it until all 8 cores are busy, we may never
notice it, because that point may never come.
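The two readings of "busy" can be put side by side. This is a small
sketch, not how top computes anything: the function name summarize and
the 0.95 saturation threshold are assumptions chosen for illustration.

```python
def summarize(per_cpu_busy, saturated_at=0.95):
    """Summarize per-cpu busy fractions (0.0 to 1.0, one entry per cpu).

    saturated_at is an arbitrary, hypothetical cut-off for "this cpu is pegged".
    """
    n = len(per_cpu_busy)
    return {
        # Averaged over all cpus: one pegged core out of 8 looks like 12.5%.
        "system_wide": sum(per_cpu_busy) / n,
        # The busiest single cpu: the same machine looks 100% busy.
        "busiest_cpu": max(per_cpu_busy),
        # How many cpus are effectively pegged.
        "saturated_cpus": sum(1 for b in per_cpu_busy if b >= saturated_at),
    }


if __name__ == "__main__":
    # One cpu fully busy, seven idle.
    print(summarize([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]))
```

Reporting the busiest cpu alongside the system-wide average keeps the
stuck application visible even when the average stays low.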

An additional complexity is that on an otherwise idle system, a single
cpu can ramp up its clock speed; but if that cpu is not doing useful
work, then a second cpu may not be able to ramp up as high, and will
get worse performance.

In the end, what is useful is to notice one or more processes
'behaving badly', e.g. consuming too much cpu, or too many failed
syscalls, or too much I/O.
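For the cpu case, one possible approach on Linux is to take two
snapshots of /proc/<pid>/stat and flag processes that used close to a
full cpu in between. This is only a sketch and not how top or 'proc'
work; the function names and the 0.9 threshold are assumptions.

```python
import os


def parse_stat(text):
    """Return (pid, comm, cpu_ticks) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    head, _, rest = text.rpartition(")")
    pid_str, _, comm = head.partition("(")
    fields = rest.split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    utime, stime = int(fields[11]), int(fields[12])
    return int(pid_str), comm, utime + stime


def snapshot(proc_root="/proc"):
    """Map pid -> (comm, cpu_ticks) for every process we can read."""
    ticks = {}
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat")) as f:
                pid, comm, cpu = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited between listdir and open, or unreadable
        ticks[pid] = (comm, cpu)
    return ticks


def hogs(before, after, elapsed_seconds, ticks_per_second, threshold=0.9):
    """Processes whose cpu use over the interval is >= threshold of one cpu."""
    result = []
    for pid, (comm, cpu_after) in after.items():
        if pid not in before:
            continue  # started during the interval; skip in this sketch
        used = (cpu_after - before[pid][1]) / ticks_per_second
        share = used / elapsed_seconds
        if share >= threshold:
            result.append((pid, comm, share))
    return sorted(result, key=lambda r: r[2], reverse=True)
```

To use it, call snapshot(), sleep for a couple of seconds, call
snapshot() again, and pass both to hogs() along with the elapsed time
and os.sysconf("SC_CLK_TCK") as ticks_per_second. A multi-threaded
process can report a share above 1.0.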

Today top (or my application, 'proc') does not readily show that, but
that needs to change.
