Wednesday, 4 January 2012

Is that the worst you can do?

What's the worst thing you can do to a CPU whilst executing code?

How about a buffer overflow - writing beyond the end of a buffer?
Very soon a segmentation violation (or GPF) will happen, and the application
will terminate, or try to recover.

How about inside the kernel? Well, pretty much the same thing.

The x86 architecture is well thought out. When some form of memory
access goes awry, an interrupt is generated (technically a 'fault' or 'trap'),
and the kernel will attempt to recover from this.

The act of taking an interrupt involves pushing the flags, code segment
and current program counter on the stack, and jumping to a predefined location.

Great. So - whether a GPF occurs in user space or kernel space, something
will happen. This is either recoverable, or a panic/blue-screen can
happen if the kernel doesn't know what to do.

The predefined location is set up in a table called the IDT (Interrupt
Descriptor Table).
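
As a rough sketch of what one IDT entry looks like on 32-bit x86, here is
the 8-byte gate layout packed and unpacked in Python (used purely to show
the bit layout; a real kernel does this in C or assembler). The handler
address 0xC0123456 and selector 0x60 are made-up values for illustration,
and pack_gate, handler_address and is_present are hypothetical helper names.

```python
import struct

# Well-known x86 exception vector numbers (indexes into the IDT).
DOUBLE_FAULT = 8
GENERAL_PROTECTION = 13
PAGE_FAULT = 14

# 0x8E = present bit set, privilege level 0, 32-bit interrupt gate.
INTERRUPT_GATE_32 = 0x8E


def pack_gate(handler, selector, type_attr=INTERRUPT_GATE_32):
    """Build one 8-byte gate: the handler address is split into two halves."""
    return struct.pack(
        "<HHBBH",
        handler & 0xFFFF,          # bits 0..15 of the handler address
        selector,                  # code segment selector to load
        0,                         # reserved byte, always zero
        type_attr,                 # present bit, privilege level, gate type
        (handler >> 16) & 0xFFFF,  # bits 16..31 of the handler address
    )


def handler_address(gate):
    """Recover the 'predefined location' the CPU will jump to."""
    low, _selector, _zero, _attr, high = struct.unpack("<HHBBH", gate)
    return (high << 16) | low


def is_present(gate):
    """A gate with the present bit clear cannot be used to deliver a fault."""
    return bool(gate[5] & 0x80)


if __name__ == "__main__":
    # Hypothetical handler address and selector, for illustration only.
    gate = pack_gate(0xC0123456, 0x60)
    print(hex(handler_address(gate)), is_present(gate))
```

The point to take away is how little it takes to break delivery: a single
flipped bit in the address halves or in the present bit sends the CPU
somewhere useless.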

If the handler for a GPF itself takes a fault, the system
will generate a double-fault. Double-faults are very rare. (Page faults,
by contrast, are very common, and occur under normal circumstances with
memory mapped/anonymous memory, as pages are faulted into existence).

A double fault typically indicates a flaw in a driver and can
be caused by using an invalid pointer or a stack exception in an
existing interrupt.

A triple fault is what can happen if a double fault generates an
exception. This would indicate the double-fault handling code hit
an unexpected condition. On the Intel/AMD architectures, a triple fault
will typically reset and reboot the CPU.
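
The escalation described above can be sketched as a toy model. This is a
deliberate simplification: a real CPU only escalates for certain
combinations of exceptions, whereas here any failure to reach a handler
escalates. The deliver function and the result strings are hypothetical
names; only the vector numbers come from the architecture.

```python
# Well-known x86 exception vector numbers.
DOUBLE_FAULT = 8
GENERAL_PROTECTION = 13
PAGE_FAULT = 14

HANDLED = "handled"
DOUBLE_FAULTED = "double fault"
TRIPLE_FAULTED = "triple fault: CPU shuts down, machine typically resets"


def deliver(vector, reachable):
    """Model delivering a fault.

    'reachable' is the set of vectors whose handlers the CPU can actually
    get to (valid IDT entry, mapped code, usable stack).
    """
    if vector in reachable:
        # Normal case: the handler runs and the kernel recovers or panics.
        return HANDLED
    if vector != DOUBLE_FAULT and DOUBLE_FAULT in reachable:
        # Faulted while delivering the first fault: escalate once.
        return DOUBLE_FAULTED
    # Nothing left to escalate to.
    return TRIPLE_FAULTED


if __name__ == "__main__":
    print(deliver(GENERAL_PROTECTION, {GENERAL_PROTECTION, DOUBLE_FAULT}))
    print(deliver(GENERAL_PROTECTION, {DOUBLE_FAULT}))
    print(deliver(PAGE_FAULT, set()))
```

The last case is the interesting one: if nothing is reachable, no kernel
code runs at all between the original fault and the reset.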

Normally, the kernel and CPU operate together on some very key data
structures. We mentioned the IDT above. There's also the GDT - which describes
how segments of memory map to real memory. And then there's the LDT - which
is a per-process view of memory. Corrupting any of these can
lead to double/triple fault behavior.

But there's another data structure: the page table directory. If the
page table is corrupted then all bets are off. The page table is used to
indicate what blocks of memory are present/not-present in the system and
is the mechanism for virtual memory support. If an application's page table
were corrupt, then the application would generate a page fault interrupt and
the kernel would quickly shut down the offending process.

But what if the kernel version of the page table were corrupt? On an
interrupt, the CPU wouldn't be able to access the code to execute the
interrupt handler, which in turn would lead to a double fault, and thence
to a triple fault.
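
To make that concrete, here is a minimal sketch of the two-level lookup
the CPU does on 32-bit x86 without PAE, ignoring 4MB pages and permission
bits. Physical memory is faked with a Python dict, and all the addresses
are made up; translate and PageFault are hypothetical names. The CR3
register holds the physical address of the page directory, which is where
the walk starts.

```python
PAGE_PRESENT = 0x1        # bit 0 of a directory or table entry
FRAME_MASK = 0xFFFFF000   # top 20 bits hold a 4KB-aligned physical address


class PageFault(Exception):
    """Raised when a directory or table entry is not present."""


def translate(cr3, vaddr, phys_mem):
    """Walk the page directory and page table for one virtual address.

    phys_mem is a dict of physical address -> 32-bit entry, standing in
    for real RAM; addresses missing from the dict read as zero.
    """
    dir_index = (vaddr >> 22) & 0x3FF    # top 10 bits
    table_index = (vaddr >> 12) & 0x3FF  # middle 10 bits
    offset = vaddr & 0xFFF               # low 12 bits

    pde = phys_mem.get((cr3 & FRAME_MASK) + dir_index * 4, 0)
    if not pde & PAGE_PRESENT:
        raise PageFault("directory entry %d not present" % dir_index)

    pte = phys_mem.get((pde & FRAME_MASK) + table_index * 4, 0)
    if not pte & PAGE_PRESENT:
        raise PageFault("table entry %d not present" % table_index)

    return (pte & FRAME_MASK) | offset


if __name__ == "__main__":
    # Made-up layout: directory at 0x1000, one page table at 0x2000.
    mem = {0x1C00: 0x2001, 0x2404: 0x345001}
    print(hex(translate(0x1000, 0xC0101234, mem)))
    try:
        translate(0x5000, 0xC0101234, mem)  # same tables, bad CR3
    except PageFault as fault:
        print("fault:", fault)
```

Note that a bad CR3 and a bad directory entry fail in exactly the same
way: the walk lands on garbage and nothing can be translated, including
the address of the fault handler itself.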

All of this is well documented on the web.

But I am having a hard time with dtrace on i386 architectures. After
loading dtrace, and then removing it from the system, on a subsequent
reload of the driver, the system crashes/hangs. Most of the time there
is no output on the console; when there is output on the console,
it's confused and corrupted. This suggests that one of the
key data structures in the kernel is corrupt (IDT, Page Tables or GDT).

And, because of this, it is nearly impossible to debug. Nothing in the kernel
can help debug this scenario - we cannot print or signal what has happened
or where we were prior to the crash.

At the moment I am using the VirtualBox debugger to poke around after
a crash, but the debugger won't let me examine memory, precisely because the
page table is corrupt (or the CR3 register is corrupt, but I cannot
tell the difference; CR3 is the register which points to the start
of the page table).

So, this is the worst bug to resolve - no kernel debugger, printk statements
or anything else in the kernel will help find the cause of the strange hang.
(Strangely, this problem does not exist in the 64-bit kernel).




2 comments:

  1. QEMU has similar capabilities, including reading physical or virtual memory and acting as a gdb stub, which might be useful.

    http://wiki.qemu.org/download/qemu-doc.html#pcsys_005fmonitor

    describes it. Almost the hardest part is knowing that you access the monitor with Ctrl-Alt-2 (get back to the target system with Ctrl-Alt-1) in the SDL based viewer/VNC viewer.

    Good luck!

  2. Thanks for the details. I did try out kvm and qemu a long time ago, but have forgotten how to use them. Google helps. But I am wary of whether I can use the VBox VDI drives without corrupting them. The qemu "gui"s are a little crude, but the CLI looks like a better option.

    I'll send a blog update when I am ready to reveal some of my findings (still not looking good...looks like a page is being blasted at random...even a while after the offending page fault vector is modified).

    Very strange
