Wednesday 4 January 2012

Debugging with VirtualBox

Earlier, I wrote about the worst type of bug in the world -
one where we smash the internal CPU registers so badly that nothing
recovers - no interrupts, no double/triple faults.

I've been experimenting with the VirtualBox debugger, and it's very
nice, albeit a little basic. Anyone interested in playing with this will
need to read the manual.

But here's an illustration of a CPU-smashing bug.

See

https://www.virtualbox.org/manual/ch08.html#vboxmanage-debugvm

If I run the following command, I can get a complete dump
of all registers in the CPU in the VM guest:


$ VBoxManage debugvm Ubuntu-11.10-i386 getregisters all | tee /tmp/reg
cpu0.rax = 0x0000000000000000
cpu0.rcx = 0x0000000000000000
cpu0.rdx = 0x0000000000000000
cpu0.rbx = 0x00000000c1644000
cpu0.rsp = 0x00000000c1645f80
cpu0.rbp = 0x00000000c1645f98
cpu0.rsi = 0x00000000c1698fb8
cpu0.rdi = 0x000000004fcb43de
cpu0.r8 = 0x0000000000000000
...


Now, I save this to a file, and then cause the guest to crash. I
dump the registers again, and now we can diff the results. We expect
to see lots of differences, but here are some of the key elements:


33,36c33,36
< cpu0.gs = 0x00e0
< cpu0.gs_attr = 0x00004091
< cpu0.gs_base = 0x00000000ecc05c00
< cpu0.gs_lim = 0x00000018
---
> cpu0.gs = 0x0000
> cpu0.gs_attr = 0x00010000
> cpu0.gs_base = 0x0000000000000000
> cpu0.gs_lim = 0x00000000


Note that the GS register is smashed in the diff. There's no base address for
the segment any more, so any code trying to use GS will cause a
double/triple fault. That's not good for the kernel.
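
A check like the one above can be automated. The following is only a sketch: the function name seg_is_null is made up for illustration, and it assumes a single-CPU guest whose dump lines use the "cpu0.gs_base = 0x..." layout shown in this post.

```shell
# seg_is_null DUMP SEG   (hypothetical helper, not part of VirtualBox)
# Exit status 0: SEG_base and SEG_lim are both all-zero in DUMP.
# Exit status 1: at least one of them is non-zero.
# Exit status 2: one of the two fields is missing from DUMP.
seg_is_null() {
    awk -v seg="$2" '
        # Dump lines look like: cpu0.gs_base = 0x00000000ecc05c00
        $1 == ("cpu0." seg "_base") { base = $3; seen_base = 1 }
        $1 == ("cpu0." seg "_lim")  { lim = $3;  seen_lim = 1 }
        END {
            if (!seen_base || !seen_lim) exit 2
            rc = (base ~ /^0x0+$/ && lim ~ /^0x0+$/) ? 0 : 1
            exit rc
        }
    ' "$1"
}
```

With the post-crash dump saved as, say, /tmp/reg2 (a file name chosen here for illustration), "seg_is_null /tmp/reg2 gs && echo GS is smashed" would flag the condition.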


98,99c98,99
< cpu0.cr2 = 0x00000000b78a0000
< cpu0.cr3 = 0x000000002a945000
---
> cpu0.cr2 = 0x00000000c1647040
> cpu0.cr3 = 0x0000000001748000
114c114
< cpu0.tsc = 0x89fd0226
---
> cpu0.tsc = 0x02307c70
119c119
< cpu0.msr_gs_base = 0x00000000ecc05c00
---
> cpu0.msr_gs_base = 0x0000000000000000
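
Plain diff output only gives line numbers, so it can help to compare the two dumps register by register. This is a rough sketch rather than anything VirtualBox provides: regdiff is a made-up name, and it assumes both files hold "name = value" lines as printed by getregisters.

```shell
# regdiff BEFORE AFTER   (hypothetical helper)
# Print "name: old -> new" for every register whose value differs
# between two saved register dumps. Assumes BEFORE is not empty.
regdiff() {
    awk '
        # Split each line at the first "=" and trim whitespace.
        {
            n = index($0, "=")
            if (n == 0) next          # skip lines that are not assignments
            name = substr($0, 1, n - 1)
            value = substr($0, n + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
        }
        # First file: remember each register value.
        NR == FNR { before[name] = value; next }
        # Second file: report registers that changed.
        (name in before) && before[name] != value {
            printf "%s: %s -> %s\n", name, before[name], value
        }
    ' "$1" "$2"
}
```

Piping the result through grep, for example "regdiff before after | grep -e gs -e cr3", narrows it down to the registers discussed here.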


Register CR3 is the page table base address. In the crashed machine, CR3
looks "wrong". And the interactive VirtualBox debugger won't get very far with this
wrong value, as it needs the page tables to map virtual addresses
to physical ones.
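
To show why CR3 matters so much, here is how a virtual address is broken into lookup indexes. This is a sketch under a big assumption: classic 32-bit two-level paging with 4 KiB pages. I have not checked whether this guest kernel uses PAE, which has a different layout. The name split_va is made up.

```shell
# split_va ADDR   (hypothetical helper)
# Split a 32-bit virtual address into page-directory index, page-table
# index and page offset, assuming non-PAE two-level paging, 4 KiB pages.
split_va() {
    va=$(( $1 ))
    printf 'pde=%d pte=%d offset=0x%03x\n' \
        $(( (va >> 22) & 0x3ff )) \
        $(( (va >> 12) & 0x3ff )) \
        $(( va & 0xfff ))
}

# The post-crash CR2 value from the diff above:
split_va 0xc1647040    # prints: pde=773 pte=583 offset=0x040
```

The first index selects an entry in the page directory that CR3 points at, so if CR3 is bogus every later step of the lookup reads garbage.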

Likewise, msr_gs_base (an internal register which holds
the place where the GS register is taken from on a switch into the kernel) seems
corrupt.

This is why my guest is a smashed VM.

But, alas, I don't know what's causing this.

Still investigating....

Post created by CRiSP v10.0.21a-b6145

