Wednesday 2 March 2011

dtrace modules: kernel and linux (CTF shadowing)

Spent the last couple of weeks or so trying to fix a problem with
dtrace in handling kernel symbols.

In the original port of dtrace, I modified libdtrace/dt_module.c to
parse out /proc/kallsyms to emulate the /system/object filesystem under
Solaris. On Solaris, /system/object is the way to read the kernel
symbol table. /proc/kallsyms does that on Linux, but there is a fundamental
difference.
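To make the difference concrete, here is a rough sketch of what can be pulled out of /proc/kallsyms. It is written in Python for brevity and is not the code in libdtrace/dt_module.c; the names KSym and parse_kallsyms and the sample addresses are made up for illustration. Each line gives an address, a one-letter symbol class, a name and (for loadable modules) a module name in brackets, and nothing at all about C types.

```python
# Illustrative sketch only: not the parser used in libdtrace/dt_module.c.
# KSym, parse_kallsyms and the SAMPLE data are hypothetical.
from typing import List, NamedTuple, Optional


class KSym(NamedTuple):
    addr: int               # symbol value (address)
    kind: str               # one-letter nm-style class, e.g. 'T', 'D', 't'
    name: str               # symbol name
    module: Optional[str]   # owning module, or None for the core kernel


def parse_kallsyms(text: str) -> List[KSym]:
    """Parse text in /proc/kallsyms format into a list of KSym records."""
    syms = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        addr = int(fields[0], 16)
        # Module symbols carry a trailing "[modname]" field.
        module = fields[3].strip("[]") if len(fields) > 3 else None
        syms.append(KSym(addr, fields[1], fields[2], module))
    return syms


# Invented sample in the same layout as /proc/kallsyms.
SAMPLE = """\
ffffffff81000000 T _text
ffffffff8100b2c0 T sys_open
ffffffff81a0d340 D jiffies
ffffffffa0012000 t dtrace_probe\t[dtracedrv]
"""
```

On a real system the input would come from reading /proc/kallsyms itself; note that many kernels show all-zero addresses to unprivileged readers.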

CTF

Under Solaris, the kernel is built with CTF (Compact C Type Format)
data. For every symbol in the kernel, we know the address of the
symbol, and on top of that the .SUNW_ctf ELF section contains the
struct/typedef definitions. Sun can do this because they build the kernel.

Over in Linux land, we aren't building the kernel, so we cannot
assume the typedef info is available. (Many modern kernels are compiled
with -g and a full debug symbol table for tools like systemtap and
the kernel debugger, but we cannot mandate this for users of
dtrace.)

So, my original code ended up with two "modules" in the dtrace data
structures - one is "kernel" which was pretty much useless, since we
had nothing, and one called "linux" (or maybe I had them the other
way around!). The linux module had all the symbols in /proc/kallsyms.

Consider this:


$ dtrace -n 'syscall::open*:{printf("%p", cur_thread);}'


Previously this would fail. (It would fail anyway because cur_thread isn't a
valid Linux data symbol, but ignore that for now!) It failed because although
we can find the value/address in /proc/kallsyms, we didn't have any type
info for it. We could use a typecast to get the right effect, but this
rapidly gets annoying and messy when dealing with the hundreds of interesting
symbols in the kernel. Worse, we need some of these for correct
emulation of key data structures (like "curcpu").

So, what I am doing at the moment is handling this with a "shadow" module:
the kernel is represented by two modules, "kernel" and "linux". "kernel"
contains a copy of /proc/kallsyms, i.e. the values, and "linux" contains
the CTF datatypes loaded from the build/linux-$version.ctf file (which is
simply an ELF file containing the .SUNW_ctf section).
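The idea of the shadowing can be sketched as a two-step lookup: take the value from one table and the type from the other. This is only a Python illustration of the concept, not how libdtrace does it internally; the names Resolved, KERNEL_SYMS, LINUX_TYPES and resolve, and all the data in the tables, are invented for the example.

```python
# Illustrative sketch only: hypothetical names and made-up data,
# not the data structures used inside libdtrace.
from typing import Dict, NamedTuple, Optional


class Resolved(NamedTuple):
    name: str
    addr: int
    ctype: Optional[str]   # None when no type info is known


# "kernel": symbol values, as would be read from /proc/kallsyms.
KERNEL_SYMS: Dict[str, int] = {
    "jiffies": 0xffffffff81a0d340,
    "init_task": 0xffffffff81a0b020,
}

# "linux": type info, as would be loaded from the .ctf file
# (type names simplified to plain strings here).
LINUX_TYPES: Dict[str, str] = {
    "jiffies": "unsigned long",
    "init_task": "struct task_struct",
}


def resolve(name: str,
            values: Dict[str, int] = KERNEL_SYMS,
            types: Dict[str, str] = LINUX_TYPES) -> Optional[Resolved]:
    """Combine the address from one module with the type from its shadow."""
    addr = values.get(name)
    if addr is None:
        return None  # not a kernel symbol at all
    # A symbol with no matching type still resolves, but untyped,
    # which is the situation that previously forced a typecast.
    return Resolved(name, addr, types.get(name))
```

For example, resolve("init_task") returns the address together with "struct task_struct", while a symbol present only in the values table comes back with ctype set to None.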

This mapping is transparent to end-user D scripts, and lets me concentrate
on fixing up "sched.d" to allow access to key info about CPUs, and move on
to other required data structures.

Hope to have a new release in a few days which fixes this and gives me
a head start in allowing access to the full proc structure (task_struct
under Linux).

Post created by CRiSP v10.0.3b-b5945

