Wednesday, 24 October 2012

DTrace and the Art of Xen

Someone reported that DTrace was failing - on an Amazon EC2 instance.
By all accounts, this should work - it's an Ubuntu 12.04 kernel, after all.

Isn't short-sightedness great?! Of course it works - I test on Ubuntu - all
the many Ubuntu kernels, as well as Fedora. How dare they report this
doesn't work!

Of course, as you slowly unravel the detective story, you realise
how right they are (facts don't lie) and how my world is shaped
in some imaginary Universe....

So, the issue is the Xen virtualisation. I know a little..very little..
about Xen - it's paravirtualisation, and it's in the kernel.

But what does that *mean* ? You can read Wikipedia and many web articles
and rarely does the whole picture fit together. And this is where
it gets interesting.

DTrace runs in kernel space. Running inside the Linux kernel is like
running inside MS-DOS - you can execute any and every instruction.
The good thing about every CPU since the 80286 is that the
segmentation and MMU support means that bugs can be trapped when
attempting to access out-of-bounds areas (GPFs, page faults, or core dumps).

DTrace, in many respects, is simple and kernel agnostic (it could
be ported to Windows, for instance. A rainy day project maybe).
DTrace needs to understand the interrupt descriptor table and some
aspects of page tables, and occasionally needs to disable interrupts.
The bulk of DTrace is implementing the virtual machine for
when traps occur.
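
To make "understand the interrupt descriptor table" a little more concrete,
here is a minimal sketch of the 16-byte gate entry used on x86-64, with
helpers to read and write the handler address that is scattered across
three fields. The struct and function names (idt_gate64, gate_handler,
gate_set_handler) are made up for illustration and are not taken from the
DTrace source. It only decodes an in-memory entry; it does not touch the
real IDT.

```c
#include <stdint.h>

/* One x86-64 IDT gate descriptor (16 bytes). Names are illustrative only. */
struct idt_gate64 {
    uint16_t offset_low;   /* handler address bits 0..15            */
    uint16_t selector;     /* code segment selector                 */
    uint8_t  ist;          /* low 3 bits: interrupt stack table idx */
    uint8_t  type_attr;    /* gate type, DPL, present bit           */
    uint16_t offset_mid;   /* handler address bits 16..31           */
    uint32_t offset_high;  /* handler address bits 32..63           */
    uint32_t reserved;     /* must be zero                          */
};

/* Reassemble the 64-bit handler address from its three pieces. */
static uint64_t gate_handler(const struct idt_gate64 *g)
{
    return (uint64_t)g->offset_low
         | ((uint64_t)g->offset_mid << 16)
         | ((uint64_t)g->offset_high << 32);
}

/* Split a 64-bit handler address back into the three fields. */
static void gate_set_handler(struct idt_gate64 *g, uint64_t addr)
{
    g->offset_low  = (uint16_t)(addr & 0xffff);
    g->offset_mid  = (uint16_t)((addr >> 16) & 0xffff);
    g->offset_high = (uint32_t)(addr >> 32);
}
```

A tracer that wants to intercept a trap would, on real hardware, read the
existing handler out of an entry like this and write its own in.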

This applies whether you are on real hardware or inside a VM,
such as VMWare or VirtualBox (and, I believe, KVM/QEMU).

But Xen is different. Xen runs the kernel VM almost as if the VM runs
in user space, and traps the instructions which require privilege.
It's an illusion. Where VMWare and VirtualBox trap privileged instructions,
like STI/CLI and SIDT/LIDT, Xen can do this too, but provides an escape hatch
through which the VM guest has to communicate, asking the hypervisor
to do things for it. There's complexity over things like page
table management - in VMWare/VBox, you can modify page table entries
and 'the right thing happens'. In Xen, you cannot.

All communication with Xen takes place via a special "portal" - the
SYSCALL instruction, sitting in a special page. The Linux kernel
wraps the key instructions and operations in an API. On real iron, those
instructions execute directly; in a Xen guest, the functions translate
to the API calls.
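
The wrapping idea can be sketched in plain C as a table of function
pointers, chosen once at start-up. This is a user-space simulation under
stated assumptions, not the kernel's or Xen's actual code: every name here
(irq_ops, fake_vcpu_info, select_irq_ops and friends) is hypothetical, the
"hardware" interrupt flag is just a variable, and the paravirtual variant
is modelled as setting a mask byte in a structure shared with the
hypervisor rather than executing CLI/STI.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for per-vCPU state shared with a hypervisor. */
struct fake_vcpu_info {
    unsigned char upcall_mask;   /* 1 = event delivery masked */
};

/* Simulated machine state, so the sketch can run in user space. */
static int hw_interrupt_flag = 1;           /* models the CPU's IF bit */
static struct fake_vcpu_info vcpu = { 0 };  /* models the shared page  */

/* The API the rest of the kernel calls instead of raw instructions. */
struct irq_ops {
    const char *name;
    void (*irq_disable)(void);
    void (*irq_enable)(void);
};

/* On real iron these would be the CLI and STI instructions. */
static void native_irq_disable(void) { hw_interrupt_flag = 0; }
static void native_irq_enable(void)  { hw_interrupt_flag = 1; }

/* In a paravirtual guest the hardware flag is left alone. */
static void pv_irq_disable(void) { vcpu.upcall_mask = 1; }
static void pv_irq_enable(void)  { vcpu.upcall_mask = 0; }

static const struct irq_ops native_ops = {
    "native", native_irq_disable, native_irq_enable
};
static const struct irq_ops pv_ops = {
    "paravirt", pv_irq_disable, pv_irq_enable
};

/* Pick the implementation once, based on where we are running. */
static const struct irq_ops *select_irq_ops(int running_as_pv_guest)
{
    return running_as_pv_guest ? &pv_ops : &native_ops;
}

/* Small demonstration: the same call has a different effect per backend. */
static void demo_irq_ops(int running_as_pv_guest)
{
    const struct irq_ops *ops = select_irq_ops(running_as_pv_guest);

    ops->irq_disable();
    printf("%s: IF=%d mask=%d\n", ops->name,
           hw_interrupt_flag, vcpu.upcall_mask);
    ops->irq_enable();
    /* After re-enabling, both models are back in their initial state. */
    assert(hw_interrupt_flag == 1 && vcpu.upcall_mask == 0);
}
```

Calling demo_irq_ops(0) and then demo_irq_ops(1) from a main() shows the
point: code that goes through the table does the right thing in both
environments, while code that hard-codes the native path does not.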

If you attempt to run DTrace (or a guest O/S) without these API wrappers,
the wrong things happen. And that's what happens to DTrace - GPFs where
none are expected.

I am working through the issues, experimenting to find the right thing
to do, and will issue an update to DTrace for Xen when I have concluded
this avenue of research.

For anyone who is interested, here is a link which describes, in
some detail, aspects of page table management in Xen - which helps
reinforce that there is a "right way" for Xen.



1 comment:

  1. Link is invalid: