Sunday, 8 January 2012

More on the impossible

I wrote last time about the worst bug to try and diagnose, namely
one where we lose the page table or GDT or both. Doing so can result
in a double or triple fault, and no way to figure out where you came from.

In user space, a call to a virtual method (which is in essence a call
to a function via a level of indirection) can leave the PC set to zero,
which is similar, in that it is annoying not to know where you came from.
But at least the fact the program counter is zero tells you that you went
through a null pointer.

So, where am I solving the "impossible"? Not much further forward.
I have been using the VirtualBox debugger - but it is very broken and
pathetic. You cannot set breakpoints or hardware breakpoints if
using the VT-x/AMD-V virtualisation acceleration. If you turn these
off, you can, except the semantics of breakpoints break the guest
operating system. In addition, writing to guest CPU registers is not

I took a quick look at qemu, but I didn't like what I found -- I prefer
a CLI in general, but for VM guests, I prefer a GUI to get me comfortable.
The GUIs on Linux are very ugly and amateurish, which didn't instill
confidence in me. (I know, this is unfair of me). I may try again in the
future.

I went back to kgdb - at least this let me set breakpoints and hardware
breakpoints, which is useful. But the process of using kgdb is very
clunky - the guest and remote debugger get out of sync on the comms protocol.
In any case, hitting the bug I am interested in under kgdb didn't help much.
I could break in the doublefault_fn function, but I couldn't really
figure out where I had come from.

I modified dtrace to allow access to the GDT and IDT via
/proc/dtrace/gdt and /proc/dtrace/idt. (Not really needed, but useful
for validating that these data structures are correct).
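As an illustration of the kind of validation this allows, here is a minimal sketch of decoding one IDT entry. It assumes the 8-byte gate layout of a 32-bit x86 kernel (a 64-bit kernel uses 16-byte gates), and the struct and helper names are made up for the example; they are not taken from dtrace or the kernel.

```c
#include <stdint.h>

/* Assumed layout: 8-byte gate descriptor of a 32-bit x86 kernel.
 * The names here are hypothetical, for illustration only. */
struct idt_gate32 {
    uint16_t offset_lo;   /* handler address, bits 0..15  */
    uint16_t selector;    /* code segment selector        */
    uint8_t  reserved;    /* must be zero                 */
    uint8_t  type_attr;   /* P (bit 7), DPL (6..5), type (3..0) */
    uint16_t offset_hi;   /* handler address, bits 16..31 */
};

/* Reassemble the 32-bit handler address from its two halves. */
static uint32_t gate_offset(const struct idt_gate32 *g)
{
    return ((uint32_t)g->offset_hi << 16) | g->offset_lo;
}

/* Present bit: a gate with P=0 faults when the vector is taken. */
static int gate_present(const struct idt_gate32 *g)
{
    return (g->type_attr >> 7) & 1;
}

/* Descriptor privilege level (0 = kernel only). */
static int gate_dpl(const struct idt_gate32 *g)
{
    return (g->type_attr >> 5) & 3;
}

/* Gate type: 0xE is a 32-bit interrupt gate, 0xF a trap gate,
 * 0x5 a task gate. */
static int gate_type(const struct idt_gate32 *g)
{
    return g->type_attr & 0x0f;
}
```

A sane entry for vector 14 would decode to a present gate, DPL 0, with an offset that lands inside kernel text.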

What I am finding is that on a double fault, there is a suggestion
that the original offending kernel stack for a process has been set to
all zeroes. When the kernel tries to dereference an argument on the
stack, or return from the offending function, it generates a GPF, which
in turn generates a double-fault. (I'm not totally sure of this - a GPF
wouldn't normally generate a double-fault, unless the GDT, page table
or IDT were screwed up).
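The doubt in that parenthesis matches the x86 rule as I understand it: a double fault is raised only when a second exception occurs while the CPU is still trying to deliver the first, and only for certain pairings. The sketch below encodes the classic classification (vectors 0, 10, 11, 12 and 13 are "contributory", 14 is the page fault, the rest are benign). Newer CPUs add a few more vectors to these classes, so treat this as an approximation; the function names are made up.

```c
#include <stdbool.h>

enum exc_class { EXC_BENIGN, EXC_CONTRIBUTORY, EXC_PAGE_FAULT };

/* Classic x86 exception classes used to decide on a double fault. */
static enum exc_class classify(int vector)
{
    switch (vector) {
    case 0:   /* divide error        */
    case 10:  /* invalid TSS         */
    case 11:  /* segment not present */
    case 12:  /* stack fault         */
    case 13:  /* general protection  */
        return EXC_CONTRIBUTORY;
    case 14:  /* page fault          */
        return EXC_PAGE_FAULT;
    default:
        return EXC_BENIGN;
    }
}

/* True if 'second', raised while delivering 'first', becomes a
 * double fault instead of being handled as a normal exception. */
static bool escalates_to_double_fault(int first, int second)
{
    enum exc_class a = classify(first);
    enum exc_class b = classify(second);

    if (a == EXC_CONTRIBUTORY)
        return b == EXC_CONTRIBUTORY;
    if (a == EXC_PAGE_FAULT)
        return b != EXC_BENIGN;
    return false;
}
```

So a lone GPF is just a GPF; it only turns into a double fault if delivering an earlier page fault or contributory exception is what raised it, which is what a damaged stack, GDT or IDT would cause.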

Let's just revisit what I am doing: having cut down dtrace to a minimalist
shell, we can override entry IDT[14], which is the page-fault vector
entry. If we put in the actual value which is there already, everything is

If we modify the entry to point to our interrupt routine, and make
our interrupt routine simply jump to the original kernel routine, at
some time after this change (could be instantaneous to a minute later),
we crash the kernel. It feels like a few pages of the kernel got overwritten,
e.g. memset(random-ptr, 0, PAGE_SIZE). But tracking this down is
nearly impossible.
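For reference, retargeting a gate amounts to replacing only the two offset halves and leaving the selector and attribute bits alone. The sketch below does this on the gate viewed as one 64-bit value, again assuming the 8-byte gate format of a 32-bit kernel; the function names are hypothetical and this is not the actual dtrace code.

```c
#include <stdint.h>

/* Assumed 8-byte gate, viewed as a little-endian 64-bit value:
 *   bits  0..15  handler offset, low half
 *   bits 16..31  segment selector
 *   bits 32..47  reserved byte + type/attribute byte
 *   bits 48..63  handler offset, high half
 */

/* Extract the 32-bit handler address from a gate. */
static uint32_t gate_handler(uint64_t gate)
{
    return (uint32_t)(gate & 0xffff) | ((uint32_t)(gate >> 48) << 16);
}

/* Return a copy of 'gate' pointing at 'handler', with the selector
 * and attribute bits preserved exactly. */
static uint64_t gate_retarget(uint64_t gate, uint32_t handler)
{
    const uint64_t offset_mask = 0xffff00000000ffffULL;

    gate &= ~offset_mask;
    gate |= (uint64_t)(handler & 0xffff);
    gate |= (uint64_t)(handler >> 16) << 48;
    return gate;
}
```

Building the complete new value first means it could be stored into the live table in a single write. If the two halves were updated separately, a fault arriving in between would vector to a half-old, half-new address.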

I have been adding debug code to the kernel source to try and do extra
validation (e.g. in the scheduler, just before the context switch occurs),
but this hasn't proven fruitful so far.
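One check of this kind, aimed at the "stack set to all zeroes" symptom, is simply to scan a region and see whether it has been wiped. This is a user-space sketch of the idea only; in the kernel, the base address and length would have to come from the task being switched to, and the function name is made up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if every byte in [base, base + len) is zero.
 * A live kernel stack should never look like this. */
static bool region_is_all_zero(const void *base, size_t len)
{
    const unsigned char *p = base;

    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0)
            return false;
    }
    return true;
}
```

A byte-at-a-time scan is slow, but in a debug build on a single-CPU guest that is probably tolerable.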

It's almost like looking for a rootkit in the kernel - I almost wonder
if the kernel has some tamper-resist code in there (it does, but not like

I need to somehow checksum the entirety of RAM and look for something
unexpected to happen, but doing this isn't viable. RAM and processes are
changing all the time. Process creation complicates things - every fork()
generates a new process with a new kernel stack. I need to keep walking
all processes' kernel stacks to detect corruption, before we switch to the

I am running on a single-CPU guest, to avoid the complexity of multi-CPU
operations. What I cannot determine is if something is being corrupted by
virtue of writing to the IDT, or a long time after.

Alas, Google searching hasn't been helpful - the symptom and problem are
very unusual (I am not writing a rootkit, although dtrace looks an awful
lot like a rootkit in terms of what it does), and I am not booting up a
new operating system. Nobody describes the scenario of modifying an
in-use IDT and the things that can go wrong. (I did find two links,
quoted a couple of posts ago).

Next is to try disabling dtrace's timer code - maybe that is causing
non-deterministic behavior.

Post created by CRiSP v10.0.22a-b6150
