Well, consider the static DTrace providers (SDT). These are implemented
in Solaris/MacOS X by annotating the code with calls into DTrace.
It is done the same way as USDT (user space DTrace probes): the
location of each function call is turned into a NOP, and converted back
to a real subroutine call (or trap) when the probe is enabled.
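To make the NOP trick concrete, here is a minimal user-space sketch. It
is not the real DTrace code: it simulates a piece of x86 kernel text as
a plain byte array, and struct probe_site, site_init(), site_enable()
and site_disable() are names I made up for illustration. The only real
facts it relies on are the x86 opcodes (0xE8 for "call rel32", 0x90 for
a one-byte NOP).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SITE_LEN 5      /* size of an x86 "call rel32" instruction */
#define OP_CALL  0xE8   /* x86 opcode: call rel32 */
#define OP_NOP   0x90   /* x86 opcode: one-byte nop */

/* Hypothetical bookkeeping for one probe site. */
struct probe_site {
    uint8_t *addr;              /* where the site lives in the simulated text */
    uint8_t  saved[SITE_LEN];   /* the original call bytes */
    int      enabled;           /* 1 while the call is patched back in */
};

/* At load time: remember the call instruction and overwrite it with NOPs. */
static void site_init(struct probe_site *s, uint8_t *addr)
{
    assert(addr[0] == OP_CALL);         /* we only know how to patch calls */
    s->addr = addr;
    memcpy(s->saved, addr, SITE_LEN);
    memset(addr, OP_NOP, SITE_LEN);
    s->enabled = 0;
}

/* Probe enabled: put the real subroutine call back. */
static void site_enable(struct probe_site *s)
{
    memcpy(s->addr, s->saved, SITE_LEN);
    s->enabled = 1;
}

/* Probe disabled: back to NOPs, so the site costs almost nothing. */
static void site_disable(struct probe_site *s)
{
    memset(s->addr, OP_NOP, SITE_LEN);
    s->enabled = 0;
}
```

A real kernel also has to deal with write-protected text pages and other
CPUs executing the site while it is being patched; the sketch ignores
all of that and only shows the save/NOP/restore cycle.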
Now let's turn to Linux. Poor Linux.
We can't touch the kernel source - it's easy to do, but getting the
kernel guys to adopt the changes is fraught with licensing issues. Never mind.
But what is a provider? Many of the core providers, like io::start, are
a bit like "macros": they are a shorthand for placing a probe at a
number of locations in the kernel (along with an argument calling
convention), and then trapping at any of those call sites. For example,
all "io::" is doing is putting traps around the blocking read/write
syscalls (simplified explanation), for each file system type, e.g.
UFS, ZFS, NFS. (Internally in Solaris it's done at the VFS layer, so the
number of places to patch is small).
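The "macro" view of a provider can be sketched as a lookup table: one
probe name expands to a list of places to trap. This is only an
illustration of the idea, not how Solaris lays out its providers. The
structs, find_probe(), and the per-filesystem function names (ufs_read
and friends) are placeholders I picked for the example, and having
io::done fire on function return is my simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NSITES(a) (sizeof(a) / sizeof((a)[0]))

/* One place in the kernel where a probe fires (names are hypothetical). */
struct probe_site {
    const char *function;   /* function to instrument */
    const char *where;      /* "entry" or "return" */
};

/* A provider probe is just a named bundle of sites. */
struct provider_probe {
    const char *provider;               /* e.g. "io" */
    const char *name;                   /* e.g. "start" */
    const struct probe_site *sites;
    size_t nsites;
};

static const struct probe_site io_start_sites[] = {
    { "ufs_read", "entry" }, { "ufs_write", "entry" },
    { "zfs_read", "entry" }, { "zfs_write", "entry" },
    { "nfs_read", "entry" }, { "nfs_write", "entry" },
};

static const struct probe_site io_done_sites[] = {
    { "ufs_read", "return" }, { "ufs_write", "return" },
    { "zfs_read", "return" }, { "zfs_write", "return" },
    { "nfs_read", "return" }, { "nfs_write", "return" },
};

static const struct provider_probe probes[] = {
    { "io", "start", io_start_sites, NSITES(io_start_sites) },
    { "io", "done",  io_done_sites,  NSITES(io_done_sites)  },
};

/* Expand "provider::name" into its list of sites, or NULL if unknown. */
static const struct provider_probe *find_probe(const char *provider,
                                               const char *name)
{
    size_t i;

    assert(provider != NULL && name != NULL);
    for (i = 0; i < NSITES(probes); i++) {
        if (strcmp(probes[i].provider, provider) == 0 &&
            strcmp(probes[i].name, name) == 0)
            return &probes[i];
    }
    return NULL;
}
```

With the trap placed at a common layer like the VFS, each table would
shrink to one or two entries, which is the point of the Solaris remark
above.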
So, how are we going to approach this in Linux?
Well, looking at the kernel source shows the likely places to put the
probes, but we need to place them at run-time (module load). The way to
do this is to write a rule which finds the right probe site in the
compiled code, e.g. "the 3rd function call inside vfs_read(), and all
exits". This is what I would call "formulaic". (Linux DTrace already has
some formulaic code to allow syscall interception).
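A rule like "3rd call, and all exits" could be sketched as a walk over
the function's instructions. The code below is a toy under heavy
assumptions: insn_len() only understands the six x86-64 encodings
listed in its comments, whereas a real implementation needs a full
instruction-length decoder. find_nth_call() and find_returns() are
names invented for this sketch, not functions from Linux DTrace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length of the instruction at p, or 0 if this toy decoder does not
 * recognise it. A real decoder must handle the whole instruction set. */
static size_t insn_len(const uint8_t *p, size_t avail)
{
    if (avail == 0)
        return 0;
    switch (p[0]) {
    case 0x55:  /* push %rbp */
    case 0x5D:  /* pop %rbp */
    case 0x90:  /* nop */
    case 0xC3:  /* ret */
        return 1;
    case 0xE8:  /* call rel32 */
        return avail >= 5 ? 5 : 0;
    case 0x48:  /* only "mov %rsp,%rbp" (48 89 e5) is recognised */
        return (avail >= 3 && p[1] == 0x89 && p[2] == 0xE5) ? 3 : 0;
    default:
        return 0;
    }
}

/* Offset of the n-th (1-based) call instruction, or -1 if not found
 * or if an unknown instruction is hit first. */
static long find_nth_call(const uint8_t *code, size_t len, int n)
{
    size_t off = 0;
    int seen = 0;

    assert(n >= 1);
    while (off < len) {
        size_t l = insn_len(code + off, len - off);
        if (l == 0)
            return -1;
        if (code[off] == 0xE8 && ++seen == n)
            return (long)off;
        off += l;
    }
    return -1;
}

/* Store the offsets of up to max ret instructions in out[]. Returns
 * the number of rets seen, or -1 on an unknown instruction. */
static int find_returns(const uint8_t *code, size_t len,
                        size_t *out, int max)
{
    size_t off = 0;
    int found = 0;

    while (off < len) {
        size_t l = insn_len(code + off, len - off);
        if (l == 0)
            return -1;
        if (code[off] == 0xC3) {
            if (found < max)
                out[found] = off;
            found++;
        }
        off += l;
    }
    return found;
}
```

Stepping instruction by instruction matters: a plain byte search for
0xE8 or 0xC3 would also match bytes inside a call's displacement and
put the probe in the wrong place.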
Given that the kernel can and will change in the future, finding a
fairly high-level way to map and annotate these layers is the key
to adding the static providers.
My first experiment will be the "io::" provider (because io::start and
io::done are very useful in many DTrace scenarios).
I will update the blog when I have something that looks reasonable.
Today's DTrace fix is for the interesting scenario of a 32-bit application
executing the SYSENTER syscall instruction on a 64-bit kernel.