Thursday, 11 October 2012

Process Groups and fork speed

I was just trying out an experiment, and am surprised that my i7 laptop
CPU (2.0GHz) can only achieve 200 forks/sec on Ubuntu 12.04. I would
expect it to do much better.
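For anyone wanting to repeat the measurement, here is a minimal sketch
in Python. It is not the script used for the figure above; the function
name forks_per_second is made up for illustration, and the result
includes interpreter overhead and the cost of reaping each child, so
treat it as a rough lower bound only.

```python
import os
import time


def forks_per_second(count=200):
    """Fork `count` short-lived children one at a time and return the rate."""
    start = time.perf_counter()
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            # Child: exit immediately, skipping Python's cleanup handlers.
            os._exit(0)
        # Parent: reap the child so zombies never accumulate.
        os.waitpid(pid, 0)
    elapsed = time.perf_counter() - start
    return count / elapsed
```

Calling forks_per_second() returns the measured rate; a larger count
smooths out noise at the cost of a longer run.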

Why do I care? Well, I have been experimenting with process IDs and
process groups - a part of Unix for decades, yet rarely understood,
except by those writing shells or other job-control software.

Run the following command:

$ ps -j
  PID  PGID   SID TTY          TIME CMD
  347   345  3179 pts/4    00:00:00
 1374  1371  3179 pts/4    00:00:00
 3179  3179  3179 pts/4    00:00:00 bash

This shows three processes, one of which is my shell. Note the PGID
column (the second one). What is it?

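Well, the process group mechanism is the thing which ensures that when
you hit Ctrl-C, you kill all the child processes, but not the shell:
the terminal sends SIGINT to every process in the foreground process
group, and the shell is not in that group.

The shell invokes the system call setpgrp() for each command it
launches, so the child and all its children sit in a group of their own.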
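As a rough illustration of what a shell does here, the sketch below
forks a child and moves it into a new process group whose ID is the
child's own PID. It uses os.setpgid(pid, pid) from the parent side,
which has the same effect as the child calling setpgrp() itself. The
function name spawn_in_new_group is hypothetical, and a real shell
does a good deal more (terminal handling, exec, and so on).

```python
import os


def spawn_in_new_group():
    """Fork a child, give it its own process group, and report the IDs.

    Returns (child_pid, child_pgid, parent_pgid).
    """
    release_r, release_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(release_w)
        os.read(release_r, 1)      # block until the parent closes its end
        os._exit(0)
    os.close(release_r)
    os.setpgid(pid, pid)           # new group: PGID := the child's PID
    child_pgid = os.getpgid(pid)   # query while the child is still alive
    os.close(release_w)            # EOF on the pipe lets the child exit
    os.waitpid(pid, 0)
    return pid, child_pgid, os.getpgrp()
```

The child ends up with PGID equal to its own PID, while the parent
stays in whatever group it was already in.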

The wonderful thing about process groups is that they provide a means
of killing every member in one go, without having to do the equivalent
of "ps -aef" to find all the procs in the system. (Imagine you want
to kill all the children and grandchildren, even if these children
are fork-bombing you; in a fork-bomb type scenario, by the time
you have done a "ps" to find the PID, it will have already forked
a copy of itself and the PID may no longer be valid.)
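A minimal sketch of that idea: one killpg() call takes out a group
leader and its workers without ever listing processes. The function
name kill_whole_group and the worker count are made up for
illustration; the sleeping workers stand in for whatever the real
children would be doing.

```python
import os
import signal
import time


def kill_whole_group(n_workers=3):
    """Start a group leader plus workers, then kill them all with killpg().

    Returns True if the leader was terminated by SIGKILL.
    """
    ready_r, ready_w = os.pipe()
    leader = os.fork()
    if leader == 0:
        os.close(ready_r)
        os.setpgid(0, 0)                 # become leader of a fresh group
        for _ in range(n_workers):
            if os.fork() == 0:
                time.sleep(60)           # worker: idle inside the group
                os._exit(0)
        os.write(ready_w, b"x")          # tell the parent the group is built
        time.sleep(60)
        os._exit(0)
    os.close(ready_w)
    os.read(ready_r, 1)                  # wait until all workers exist
    os.close(ready_r)
    os.killpg(leader, signal.SIGKILL)    # one call signals every member
    _, status = os.waitpid(leader, 0)
    return os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGKILL
```

The workers are grandchildren of the caller, so only the leader can be
reaped here; the rest are cleaned up by init once they have been killed.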

The PGID is interesting; normally it is set to the PID of the
process group leader (root of the tree of processes). You can change
it when you like, but a new group can only be given your own PID
(joining an existing group is subject to the session rule described
further down).

If you do this, and then fork, and have the parent process terminate,
you can end up with a situation (such as the procs above)
where the PID != PGID.
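The sequence just described can be sketched as follows: a leader
creates a group, forks a member, and exits, leaving the member with a
PGID that no longer matches any live PID. The function name
orphaned_group_member is hypothetical, and the polling on getppid() is
just one simple way for the member to notice that the leader has gone.

```python
import os
import time


def orphaned_group_member():
    """Return (leader_pid, member_pid, member_pgid) after the leader exits."""
    read_fd, write_fd = os.pipe()
    leader = os.fork()
    if leader == 0:
        os.close(read_fd)
        os.setpgid(0, 0)                  # leader: PGID := own PID
        leader_pid = os.getpid()
        if os.fork() == 0:
            # Member: wait until the leader has exited (we get re-parented).
            while os.getppid() == leader_pid:
                time.sleep(0.01)
            os.write(write_fd, f"{os.getpid()} {os.getpgrp()}".encode())
            os._exit(0)
        os._exit(0)                       # leader exits straight away
    os.close(write_fd)
    os.waitpid(leader, 0)                 # reap the leader
    with os.fdopen(read_fd) as pipe:      # read until the member exits
        pid, pgid = map(int, pipe.read().split())
    return leader, pid, pgid
```

The member reports a PGID equal to the dead leader's PID, which is the
PID != PGID case seen in the ps listing.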

Now the PGID has an important property. Whilst a PGID of value nnn
exists, you cannot fork a new process that gets the same PID. Doing
so would mean the new process is joining an existing process group.
(And this would be a security issue.) (I wrote a script to keep forking
until we hit a specific PID, but it never happened, and debugging showed
this scenario - PGIDs and PIDs exist in the same namespace.)
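When debugging something like this, it helps to ask whether a given
number is currently taken as a PID, as a PGID, or both. The sketch
below does that by sending signal 0, which performs only the existence
and permission checks. The helper names _probe and id_in_use are made
up for illustration, and the ID passed in is assumed to be a positive
integer (zero and negative values have special meanings to kill()).

```python
import os


def _probe(send, ident):
    """Send signal 0 via `send` and report whether any target exists."""
    try:
        send(ident, 0)
    except ProcessLookupError:
        return False        # nothing with this ID
    except PermissionError:
        return True         # it exists, but belongs to another user
    return True


def id_in_use(ident):
    """Return (used_as_pid, used_as_pgid) for a positive numeric ID."""
    return _probe(os.kill, ident), _probe(os.killpg, ident)
```

An ID that reports (False, True) is the interesting case: no process
has that PID, yet the number is still held by a process group.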

So, you could create 10,000 pids, each with a distinct PGID that
differs from its own PID, and steal 20,000 entries of the pid address
space. (Many Linux systems limit you to 10,000 pids per user id.)

I stumbled across this whilst trying to prove a theorem about
process killing - and it is good, because it means the real problem
I am trying to solve is not open to a race condition or attack.

There is a converse issue: the setpgrp() system call *CAN* fail.
We can set our PGID to that of an existing group *only if* that
group's session-id (SID, 3rd column in the ps listing) is the same as
ours. If we are sitting in the same xterm, we can do this; if we are
in a different xterm, we cannot.

SID and PGID are confusing ideas, but effectively the SID is acting
as a kind of policeman over the PGID address space. And this stops
a disparate group of processes merging into the same PGID as another.
Although setpgrp() can be used to set a specific PGID, there is no
syscall to set a specific session-id. The setsid() syscall takes no
arguments.

This potentially leads to trouble, because one could use 10,000
session ids, and then grab 10,000 process-group ids, and sit on 10,000
pids, and the system would (nearly) grind to a halt - Linux actually
allows about 33,000 unique pids (the default pid_max is 32768) before
reusing them. But two userids can collude to eat all the available pids.
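The two limits involved can be read on a given machine with something
like the sketch below. It assumes a Linux-style /proc for the
system-wide ceiling and uses RLIMIT_NPROC for the per-user cap; the
function name pid_limits is made up for illustration, and either value
comes back as None when it cannot be determined or is unlimited.

```python
import resource


def pid_limits():
    """Return (pid_max, per_user_soft_limit); either may be None."""
    try:
        with open("/proc/sys/kernel/pid_max") as fh:   # Linux-specific path
            pid_max = int(fh.read())
    except (OSError, ValueError):
        pid_max = None                                  # not available here
    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    if soft == resource.RLIM_INFINITY:
        soft = None                                     # no per-user cap set
    return pid_max, soft
```

Comparing the two numbers shows how many colluding user IDs it would
take to exhaust the PID space on that particular system.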

Another note on setsid() - it will fail if you are a process group
leader (PID == PGID); typically, a child will do the setsid(), in which
case the SID is set to the PID of the calling process. (So my prior
paragraph doesn't hold true - SIDs are a function of a PID;
if the proc which does a setsid() forks and exits, then you can have
a situation where no PID exists with the same value as a SID, e.g.
when a launcher process terminates.) But in any case, you are
not going to join someone else's process group whilst you have a
distinct SID. This is important - if you are writing forking daemons,
setsid() must be called; otherwise processes you carry on launching
from the same xterm session (technically, the same SID) can interfere
with the daemon in some way.
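Both halves of that can be seen in a short sketch: setsid() is refused
for a process group leader, and succeeds in a freshly forked child,
which is why daemons fork before calling it. The function name
setsid_results is hypothetical, and this is only the session part of
daemonising, not a complete recipe.

```python
import os


def setsid_results():
    """Call setsid() as a group leader, then as a plain forked child.

    Returns (leader_result, child_sid_matches_pid).
    """
    read_fd, write_fd = os.pipe()
    child = os.fork()
    if child == 0:
        os.close(read_fd)
        os.setpgid(0, 0)                  # make ourselves a group leader
        try:
            os.setsid()
            leader_result = "OK"
        except PermissionError:
            leader_result = "EPERM"       # leaders may not start a session
        if os.fork() == 0:
            os.setsid()                   # not a leader, so this succeeds
            same = os.getsid(0) == os.getpid() == os.getpgrp()
            os.write(write_fd, f"{leader_result} {same}".encode())
        os._exit(0)
    os.close(write_fd)
    os.waitpid(child, 0)
    with os.fdopen(read_fd) as pipe:      # read until the grandchild exits
        leader_result, same = pipe.read().split()
    return leader_result, same == "True"
```

After the successful call, the new session's SID and PGID both equal
the PID of the process that called setsid().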

Post created by CRiSP v11.0.12a-b6448
