are part of the dtrace package. To date, I have largely ignored
it - a few tweaks and it compiles nicely on Linux.
Recently, I have been playing with SDT probes - specifically, formulating
a plan to get static probes into the kernel. These static probes
act like high-level macros, whereas FBT is about
planting probes on functions. When an FBT probe fires, you have
access to the raw arguments (arg0, arg1, ...), but you don't really
have access to the structures these arguments may represent.
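To make the "raw arguments" point concrete, here is a minimal C sketch. Everything in it (the SKETCH_PROBE3 macro, probe_fire, struct my_buf) is hypothetical and is not the real SDT or FBT machinery. It only illustrates that a probe handler receives integer-sized values and needs type knowledge from somewhere else to interpret them.

```c
#include <stdint.h>

/* Hypothetical stand-in for a kernel structure; not a real Linux or Solaris type. */
struct my_buf {
    int  b_flags;
    long b_bcount;
};

/* What a probe handler receives: untyped, integer-sized arguments. */
struct probe_args {
    uintptr_t arg0, arg1, arg2;
};

static struct probe_args last_fired;

/* Hypothetical probe entry point: it just records the raw values. */
static void probe_fire(uintptr_t a0, uintptr_t a1, uintptr_t a2)
{
    last_fired.arg0 = a0;
    last_fired.arg1 = a1;
    last_fired.arg2 = a2;
}

/* A static probe is a macro dropped into the code at an interesting point. */
#define SKETCH_PROBE3(a0, a1, a2) \
    probe_fire((uintptr_t)(a0), (uintptr_t)(a1), (uintptr_t)(a2))

/* Code being traced: fires the probe just before "doing I/O". */
static void start_io(struct my_buf *bp)
{
    SKETCH_PROBE3(bp, 0, 0);
}

/* Only a consumer that knows the layout of struct my_buf can interpret arg0. */
static long bytes_requested(const struct probe_args *pa)
{
    const struct my_buf *bp = (const struct my_buf *)pa->arg0;
    return bp->b_bcount;
}
```

Calling start_io() on a buffer leaves its address in last_fired.arg0 as a plain number; bytes_requested() only works because it was compiled with the structure definition in view.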
Let's look at "io:::start". In the Solaris kernel, these probes are
placed in various places to indicate when a file system driver is about
to do I/O. This is a high-level probe - you don't care which filesystem is
in effect (UFS, NFS, ZFS), and you don't have to work out which are the
relevant functions to probe - nice and sweet.
But when these probes fire, args[0], args[1] and args[2] are defined to be
pointers to structures representing the buffer, device and file info
of the underlying vnode.
But *how does this work*?
Beats me!
What happens is that the SDT provider knows about these high level probes
and the structures to be passed to the user space dtrace application.
These structures (struct buf *, fileinfo_t, devinfo_t) are "created"
by grabbing fields from relevant internal structures. DTrace has a thing
called a "translator" which is used to map from internal representation
to the D-style structure. This avoids the problem of making the
real structures visible to the D application. (One would need kernel-level
knowledge to get the #includes correct to even make the structures
visible.)
What dtrace does is scan /usr/lib/dtrace/*.d and preload various "include-files"
as your script runs, to make certain constants and structures visible to you.
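As a rough picture of what a translator does, here is a minimal C sketch. It is an analogy only: real translators are written in D, and the structures below (k_device, k_buf, devinfo_view_t, fileinfo_view_t) are made up for illustration, showing just the members used in the io.d example. The idea is that the script only ever sees the stable "view" structures, which are filled in from whatever the internal structures happen to look like.

```c
#include <stdio.h>

/* Hypothetical internal kernel structures; the field names are invented. */
struct k_device {
    char name[32];
    int  major;
    int  minor;
};

struct k_buf {
    int              flags;
    const char      *path;
    struct k_device *dev;
};

/* Stable views handed to the script (hypothetical, heavily trimmed). */
typedef struct {
    char dev_statname[32];
    int  dev_major;
    int  dev_minor;
} devinfo_view_t;

typedef struct {
    char fi_pathname[256];
} fileinfo_view_t;

/* "Translate" the internal buffer into the device view. */
static void translate_devinfo(const struct k_buf *bp, devinfo_view_t *out)
{
    snprintf(out->dev_statname, sizeof(out->dev_statname), "%s", bp->dev->name);
    out->dev_major = bp->dev->major;
    out->dev_minor = bp->dev->minor;
}

/* "Translate" the internal buffer into the file view. */
static void translate_fileinfo(const struct k_buf *bp, fileinfo_view_t *out)
{
    snprintf(out->fi_pathname, sizeof(out->fi_pathname), "%s",
             bp->path ? bp->path : "<none>");
}
```

If the internal layout of k_buf changes, only the two translate functions need to change; anything written against the view structures keeps working.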
But how and where does a fileinfo_t structure get created?
I *think* this is done via the CTF (Compact C Type Format) library. CTF is
a simple way to describe structures and members without the full complexity
of DWARF debugging information. So, what Sun has done is make sure all libraries
in the system have a special ELF section (.SUNW_ctf), and this section is
read from the libraries (for user space apps) or from the kernel (for kernel
probes) to find out what structures exist.
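To show the kind of information such a type description has to carry, here is a minimal C sketch. It is not the CTF on-disk format and does not use the CTF library; the member_desc table and the helper functions are hypothetical. It only shows that a name, an offset and a size per member are enough to pull a field out of a raw pointer without compiling against the real header.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical structure we want to describe. */
struct demo_buf {
    int   b_flags;
    long  b_bcount;
    void *b_addr;
};

/* One row per member: enough for a tracer to find and read a field by name. */
struct member_desc {
    const char *name;
    size_t      offset;
    size_t      size;
};

static const struct member_desc demo_buf_members[] = {
    { "b_flags",  offsetof(struct demo_buf, b_flags),  sizeof(int)    },
    { "b_bcount", offsetof(struct demo_buf, b_bcount), sizeof(long)   },
    { "b_addr",   offsetof(struct demo_buf, b_addr),   sizeof(void *) },
};

/* Look a member up by name; returns NULL if the type has no such member. */
static const struct member_desc *find_member(const char *name)
{
    size_t n = sizeof(demo_buf_members) / sizeof(demo_buf_members[0]);
    for (size_t i = 0; i < n; i++)
        if (strcmp(demo_buf_members[i].name, name) == 0)
            return &demo_buf_members[i];
    return NULL;
}

/* Copy a member's bytes out of a raw pointer using only the description table. */
static int read_member(const void *raw, const char *name, void *out, size_t outsz)
{
    const struct member_desc *m = find_member(name);
    if (m == NULL || m->size > outsz)
        return -1;
    memcpy(out, (const char *)raw + m->offset, m->size);
    return 0;
}
```

In this sketch the table is built by the compiler via offsetof; the point of an embedded type section is that the same table travels with the binary, so a tool can do the lookup long after the headers are gone.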
Alas, we don't have this ELF section in the executables on Linux.
So we are going to have to be a bit more clever to get access to the
To illustrate what I mean, consider this:
$ cat io.d
#pragma D option quiet
BEGIN {
    printf("%10s %58s %2s\n", "DEVICE", "FILE", "RW");
}
io:::start {
    printf("%10s %58s %2s\n", args[1]->dev_statname,
        args[2]->fi_pathname, args[0]->b_flags & B_READ ? "R" : "W");
}
- Where does B_READ come from? (Answer: /usr/lib/dtrace/io.d)
- Where does "dev_statname" come from?
- How does dtrace know that args can be converted to a structure containing dev_statname?
The answer to the last two questions, I believe, belongs to the
And that is where I am heading off to -- to see how we can do this on Linux.