Tuesday 29 March 2011

dtrace pgfault handling

Just spent the last two weeks or so debugging one of those
"but it used to work!" bugs.

Here's the script:


$ dtrace -n syscall::open*:'{printf("%s", stringof(arg0));}'


It doesn't do anything useful: it intercepts all the open system
calls and prints the name of the file to be opened. Because the
last part of the probe description (the probe name) is left blank, it
is wildcarded, so we match both the "entry" and "return" probes of
the function.

On return from a function, arg0 and friends are mostly irrelevant, random
and pointless.

So, by doing this, stringof() is being called on a bogus pointer,
which should lead to a GPF interrupt. This is handled cleanly on the
later kernels.

But on RedHat AS4, it panicked the kernel.

After a lot of investigation, it transpires that on AS4 we are taking
a page fault, not a GPF. And my page fault handler was not handling
the fact that on a page fault, the CPU pushes an extra word (the error
code) on to the stack.

So dtrace is/was dangerously unstable in the face of rogue D scripts,
like this one.

Very difficult to debug - because I would keep panicking the kernel
as I tried all sorts of experiments to locate the area of the problem
(the interrupt code and the C callback code). Having located most of the
problem to the interrupt code, it took quite a few days to work out what
was wrong (I was ignoring the extra word pushed when the fault is taken).
But this was good - I had often stared at the Linux interrupt handlers
to understand the very subtle effect of how the traps are handled
vs the "struct pt_regs" layout. I was having problems with the pt_regs
pointer being "garbage" in the various dtrace pieces of code, and it
was because, even if I survived panicking the kernel, pt_regs was out
by one word.
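
To make the "out by one word" effect concrete, here is a small user-space
C sketch. It is not the driver code: the struct and function names
(hw_frame, fake_stack, cpu_take_trap, saved_ip_naive, saved_ip) are all
made up for illustration, and the stack is just an array. It only models
the fact that some traps leave an extra error-code word on top of the
saved ip/cs/flags/sp/ss frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Words the CPU saves on a 64-bit trap, lowest address first.
 * Hypothetical struct; the field order follows the hardware frame. */
struct hw_frame {
    uint64_t ip;
    uint64_t cs;
    uint64_t flags;
    uint64_t sp;
    uint64_t ss;
};

#define STACK_WORDS 16

/* A pretend kernel stack that grows towards index 0. */
struct fake_stack {
    uint64_t words[STACK_WORDS];
    size_t   top;   /* index of the most recently pushed word */
};

static void push(struct fake_stack *s, uint64_t v)
{
    assert(s->top > 0);          /* do not run off the fake stack */
    s->words[--s->top] = v;
}

/* Simulate the CPU entering a trap handler: push the frame, and for
 * traps that carry an error code, push that one extra word last. */
void cpu_take_trap(struct fake_stack *s, const struct hw_frame *f,
                   int has_error_code, uint64_t error_code)
{
    s->top = STACK_WORDS;
    push(s, f->ss);
    push(s, f->sp);
    push(s, f->flags);
    push(s, f->cs);
    push(s, f->ip);
    if (has_error_code)
        push(s, error_code);
}

/* The mistake: assume the saved ip is always the top word. */
uint64_t saved_ip_naive(const struct fake_stack *s)
{
    return s->words[s->top];
}

/* The fix: step over the error code when the trap pushed one. */
uint64_t saved_ip(const struct fake_stack *s, int has_error_code)
{
    return s->words[s->top + (has_error_code ? 1 : 0)];
}
```

With an error code on the stack, saved_ip_naive() hands back the error
code where the ip was expected, and every field read after it is skewed
by one word in the same way.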

Having exercised (exorcised) the code very hard, I feel much more confident
that a user cannot crash the kernel - just as Solaris had led us to believe.

[I note that in the Solaris kernel, special code is in place in the
interrupt trap code (assembly) to determine if the CPU_DTRACE_NOFAULT
flag is set. This flag is set within the dtrace code to tell the
gpf/pgfault handler not to take the trap but to skip over the offending
instruction (which is most likely a MOV instruction).]
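
The NOFAULT idea can be sketched in plain C as below. This is a user-space
model only, not the Solaris or Linux code: fake_cpu, fake_regs and
dtrace_fault_fixup are hypothetical names, the flag values are for
illustration, and the instruction length is assumed to be supplied by
some decoder. CPU_DTRACE_BADADDR is, as far as I recall, the flag Solaris
uses to report the bad access back to dtrace; treat that as an assumption
too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag names as used by Solaris dtrace; values here are illustrative. */
#define CPU_DTRACE_NOFAULT 0x0001  /* probe code asked not to fault    */
#define CPU_DTRACE_BADADDR 0x0004  /* assumed: records a bad access    */

/* Hypothetical per-CPU dtrace state. */
struct fake_cpu {
    uint16_t dtrace_flags;
    uint64_t dtrace_illval;   /* address that caused the fault */
};

/* Hypothetical saved registers; only the ip matters here. */
struct fake_regs {
    uint64_t ip;
};

/*
 * Called from the gpf/pgfault path. Returns 1 if the fault belonged
 * to dtrace and was swallowed, 0 if the normal handler should run.
 */
int dtrace_fault_fixup(struct fake_cpu *cpu, struct fake_regs *regs,
                       uint64_t fault_addr, unsigned insn_len)
{
    assert(cpu != NULL && regs != NULL);

    if (!(cpu->dtrace_flags & CPU_DTRACE_NOFAULT))
        return 0;                     /* not ours: take the trap */

    /* Note the bad address so the probe can report an error... */
    cpu->dtrace_flags |= CPU_DTRACE_BADADDR;
    cpu->dtrace_illval = fault_addr;

    /* ...and resume after the offending instruction (e.g. the MOV). */
    regs->ip += insn_len;
    return 1;
}
```

The point of the design is that the fixup touches nothing except the
per-CPU flags and the saved ip, so it should not itself fault.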

So, now we have better handling of gpf + pgfault (although I still
worry whether we can take a pgfault while handling a GPF).
Not sure this matters, because if it's *our* pgfault, then we only
skip over the offending instruction; we don't try to read other parts of
memory.

ctfconvert / libdwarf problems



Another fix I hope to have in this release is for the build problems
people are reporting to me, caused by the changes in the
last release (stub dwarf.h added to the ctfconvert utility). AS4 doesn't
have a viable libdwarf.so library - so either I work out what it has
and patch the ctfconvert code, or add in a libdwarf release (which would
bloat the distro). The main problem here is we *need* ctfconvert if the
files in etc/*.d are not to cause a run-time syntax error, as kernel
structs are referred to in the translators.

I may have to patch the dtrace command to ignore such errors when
auto-inclusion is enabled when parsing user scripts.

Testing



Now that I am getting more familiar with dtrace, I hope to include better
tests to avoid problems where things break. I probably won't enable this
in the next release, but definitely for the one after this.


Post created by CRiSP v10.0.3b-b5950

