Everything was looking promising for releasing a new dtrace sometime last
week. It was working on the 3.16, 3.8 and 3.4 kernels, and then I moved
on to RH5.6 (2.6.18).
I ran into a lot of issues on 2.6.18 - not surprising, given the code
mutations. Much of the last 2 weeks was spent on the execve() system call.
It would panic the kernel. Despite a lot of experiments and reading
of the assembler and kernel code, I kept doing silly things. It
really doesn't help that the 2.6.18 kernel will hard panic on a stray
GPF - that made it very difficult to figure out what was going on.
Eventually I got every line of assembler working, and fixed the issues
with registers in the C code.
Along the way I had an issue with the "old_rsp" symbol. This is not exposed
in /proc/kallsyms, and not even in the /boot/System.map file. I had
to write a tool to extract this from inside the kernel. But this ran into
complications because /proc/kcore is broken on the RH/CentOS kernels. I
had to create a new device driver ("/proc/dtrace_kmem"), which has to be
loaded into the kernel prior to the build of dtrace. It's a very simple
driver, only designed to handle the scenario of building dtrace.
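
For anyone wanting to check whether a symbol like "old_rsp" is visible on
their own kernel, here is a minimal sketch of a lookup over
/proc/kallsyms or System.map style text. This is not the extraction tool
mentioned above; the function names and the sample addresses are made up
for illustration, and it only assumes the usual "address type name" line
layout.

```python
def find_symbol(lines, name):
    """Return the address of `name` as an int, or None if it is not listed."""
    for line in lines:
        fields = line.split()
        # Expected layout: "address type name", optionally followed by "[module]".
        if len(fields) >= 3 and fields[2] == name:
            return int(fields[0], 16)
    return None


def find_symbol_in_file(path, name):
    """Same lookup, reading from a file such as /proc/kallsyms."""
    with open(path) as f:
        return find_symbol(f, name)


# Made-up sample lines, in the same layout as /proc/kallsyms.
sample = [
    "ffffffff8100b000 T system_call",
    "ffffffff81234560 t helper_fn\t[some_module]",
]

print(find_symbol(sample, "system_call"))  # an int address
print(find_symbol(sample, "old_rsp"))      # None: not in the sample
```

A result of None from find_symbol_in_file("/proc/kallsyms", "old_rsp") is
the situation described above: the symbol has to be found some other way.
On newer kernels the addresses may read as zero unless you are root.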
Having got this to work, the next roadblock was the rt_sigreturn() syscall,
which panicked the kernel. Careful investigation showed a missing line
of assembler (for the 2.6.18 kernel). Now that works.
Now everything is looking good on RH5/CentOS5, but before going on the
trawl of later kernels and proving I didn't break anything, I have an
issue with x_call.c. Either I use the native smp_call_function() interface -
which works great, until we panic the kernel - or I use my implementation,
which doesn't seem to be broadcasting to the CPUs - this means
certain probes get "lost".
So, hopefully a release this week or next weekend - depending on the
xcall issues.