Sunday 22 July 2012

PID Provider: Did you call? #4

Creating a kernel thread in Linux is easy. But I immediately
slammed into some issues.

Much of the "workqueue" API for doing this is GPL protected.
DTrace is a CDDL driver, and attempts to compile or link against
these GPL protected functions caused errors. I found a workaround,
similar to the dynamic symbol lookup already in dtrace. The
implementation of this is slightly ugly, because the functions I wanted are
embedded in #define macros. I didn't want to replicate the macros directly
in order to modify them, as this makes the code fragile and subject to breakage
in future kernels.
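
To illustrate the general idea of dynamic symbol lookup (this is not the
actual dtrace code), here is a minimal user-space sketch in C. The function
is resolved by name at run time and called through a typed function pointer,
so the caller never references it directly. All of the names here
(demo_queue_work, lookup_symbol, call_queue_work, the symbol table) are
hypothetical stand-ins; inside a real kernel the table walk would be
replaced by whatever symbol lookup facility the driver uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic function pointer type used to store any symbol in the table. */
typedef void (*generic_fn_t)(void);

/* Real signature of the function we want to reach indirectly. */
typedef int (*queue_fn_t)(int work_id);

/* Hypothetical stand-in for a function we are not allowed to link against. */
int demo_queue_work(int work_id)
{
    return work_id + 1;
}

struct sym_entry {
    const char  *name;
    generic_fn_t addr;
};

/* Mock symbol table; a real driver would consult the kernel's own table. */
static const struct sym_entry sym_table[] = {
    { "demo_queue_work", (generic_fn_t)demo_queue_work },
};

/* Look a symbol up by name. Returns NULL if it is not present. */
generic_fn_t lookup_symbol(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(sym_table) / sizeof(sym_table[0]); i++) {
        if (strcmp(sym_table[i].name, name) == 0)
            return sym_table[i].addr;
    }
    return NULL;
}

/*
 * Resolve the symbol on first use, cache the pointer, and call through it.
 * Returns -1 if the symbol could not be found, so the caller can degrade
 * gracefully instead of failing to load.
 */
int call_queue_work(int work_id)
{
    static queue_fn_t fn;

    if (fn == NULL)
        fn = (queue_fn_t)lookup_symbol("demo_queue_work");
    if (fn == NULL)
        return -1;
    return fn(work_id);
}
```

The point of the wrapper is that a missing symbol becomes a run-time
condition that can be handled, rather than a compile or link error.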

Additionally, the calling sequence of one of the functions has
changed in recent kernels (3.2 .. 3.4). This means I have to be
really careful. I worked around this with a tiny piece of assembler.

But from what I read, this workqueue API is only a relatively recent
addition to the kernel. It appeared in the 2.5 kernels and changed
substantially in 2.6.20. So it's possible that the code I have,
which compiles for later kernels, will fail abysmally on older kernels.
The community will need to provide feedback, or we will have to disable the
PID provider for older kernels.
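
If the PID provider does have to be disabled on older kernels, one way to
make that decision is to compare the running kernel release against a
cutoff. The sketch below is a user-space illustration only, under stated
assumptions: the 2.6.20 cutoff is taken from the paragraph above purely as
an example policy, and KVER, parse_kernel_release and pid_provider_supported
are hypothetical names, not part of dtrace.

```c
#include <assert.h>
#include <stdio.h>

/* Pack major.minor.patch into one integer so versions compare with < and >=. */
#define KVER(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Parse a release string such as "2.6.18-274.17.1.el5" or "3.4.0".
 * Anything after the third number is ignored. Returns -1 if the string
 * does not start with at least "major.minor".
 */
long parse_kernel_release(const char *release)
{
    int major = 0, minor = 0, patch = 0;

    assert(release != NULL);
    if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) < 2)
        return -1;
    return KVER(major, minor, patch);
}

/*
 * Assumed policy for illustration: only enable the provider on kernels
 * at or after 2.6.20. Unparseable release strings count as unsupported.
 */
int pid_provider_supported(const char *release)
{
    long v = parse_kernel_release(release);

    return v >= 0 && v >= KVER(2, 6, 20);
}
```

With this policy, a release string like "2.6.18-274.17.1.el5" would be
rejected and "3.4.0" accepted. The right cutoff can only come from testing
on the older kernels themselves.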

So, we are done! PID provider works.

Well, I say it works .. it works for a sample app of mine. It needs
a lot more testing, and I daresay reported breakage will be difficult
to debug. The good thing is that I made almost zero changes to the
Solaris code - only fixing some glue code, and making some changes in
libdtrace.

If you have read all of these blog excerpts, and understood them, good
for you. I learnt a lot debugging this, and I feel more confident in
how dtrace works architecturally, and in the code I have written.

There's still a long road ahead to torture test the PID
provider.

And I need to rewrite the libdtrace/ process read/write, to avoid
the ptrace() issue or avoid leaving a process in the stopped state.

I plan to release the code - once I have done a little cleanup,
later today (20120722).


Post created by CRiSP v11.0.10a-b6436


2 comments:

  1. Hi Paul,

    I have a problem compiling DTrace on CentOS 5.x.
    Compilation fails in libdtrace with:

    cd libdtrace ; make --no-print-directory
    gcc -g -I. -I../../common/ctf -I../uts/common -I../linux -I../libproc/common -I../libctf/ -DCTF_OLD_VERSIONS -D_LARGEFILE_SOURCE=1 -D_FILE_OFFSET_BITS=64 -I../build-2.6.18-274.17.1.el5 -c ../build-2.6.18-274.17.1.el5/dt_lex.c
    dt_lex.l: In function 'yylex':
    dt_lex.l:256: warning: incompatible implicit declaration of built-in function 'strndup'
    dt_lex.l:280: error: called object '325' is not a function
    dt_lex.l:310: error: called object '325' is not a function
    dt_lex.l:362: error: called object '325' is not a function
    dt_lex.l:396: warning: incompatible implicit declaration of built-in function 'strndup'
    dt_lex.l:430: error: called object '325' is not a function
    dt_lex.l: In function 'yy_create_buffer':
    dt_lex.l:1219: error: expected expression before 'struct'
    dt_lex.l:1219: error: called object '325' is not a function
    dt_lex.l: In function 'yy_scan_buffer':
    dt_lex.l:1335: error: expected expression before 'struct'
    dt_lex.l:1335: error: called object '325' is not a function
    make[2]: *** [../build-2.6.18-274.17.1.el5/libdtrace.a(dt_lex.o)] Error 1
    make[1]: *** [do_cmds] Error 2
    tools/bug.sh
    make: *** [all] Error 1

    In this system I have installed:

    $ rpm -q glibc-devel gcc flex
    glibc-devel-2.5-81
    glibc-devel-2.5-81
    gcc-4.1.2-51.el5
    flex-2.5.4a-41.fc6

    $ uname -r
    2.6.18-274.17.1.el5

    Dtrace source code version: 20120722a

    Replies
    1. OK I found what it was .. missing bison :)
