Saturday 31 May 2014

Metrics, Statistics and Code Quality

I've been toying with writing a code coverage tool - similar
to cov/gcov, but different. It's based on a piece of work I did a long
time ago. At the time it was a debugger, but in playing with
the concept, and being surrounded by many good tools and scripting
languages, something "new" starts to appear.

The discussion below is about C/C++ - the results may not
be applicable to Java/.NET or Python/Perl.

Consider the following question: For your favorite application,
how many functions are called to start the application?

You can guess if you don't know the answer.

Follow up question: How many function *returns* do you estimate
happen during the program startup?

If you estimated the same number, you are probably wrong! The number
of function returns may not equal the number of function calls, not
even to a close approximation.

I was surprised by this observation. I tested the startup of 'cr',
the console mode CRiSP. It takes about 550,000 function calls
to start up but, surprisingly, only about 447,000 function returns.

The reason is: inlining. An inlined function may show up
as a function entry, but may not show as a function returning.
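
As a rough illustration of how such a mismatch could be spotted offline, here is a minimal Python sketch. The record format ("call NAME" or "ret NAME", one per line) is invented for the example and is not necessarily what the tracing tool writes. The helper names count_events and unbalanced are hypothetical too.

```python
from collections import Counter

def count_events(trace_lines):
    """Count 'call' and 'ret' records per function in a hypothetical trace."""
    calls, returns = Counter(), Counter()
    for line in trace_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed records
        kind, name = fields[0], fields[1]
        if kind == "call":
            calls[name] += 1
        elif kind == "ret":
            returns[name] += 1
    return calls, returns

def unbalanced(calls, returns):
    """Return {function: calls - returns} for functions whose counts differ."""
    names = set(calls) | set(returns)
    return {n: calls[n] - returns[n] for n in names if calls[n] != returns[n]}

# Made-up sample: 'helper' is entered but never logs a return,
# and 'main' is still running when the trace ends.
sample = [
    "call main",
    "call init",
    "call helper",
    "ret init",
]
calls, returns = count_events(sample)
print(sum(calls.values()), "calls,", sum(returns.values()), "returns")
print(unbalanced(calls, returns))
```

Functions that show up in the unbalanced result are the ones worth checking for inlining, or for simply never returning.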

Once we have a trace of the function executions, the possibilities
open up to all sorts of metrics generation.

We can look at the frequency - which functions are called the
most. Or we can look at duration. Duration can be problematic
for those functions which are called once, but never return (e.g.
"main" will get called once, but won't return until the application
exits - or, maybe never, because exit() may be called).
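
A minimal sketch of both metrics, again in Python and again over an assumed record shape: tuples of (timestamp, kind, name), where kind is "call" or "ret". The function name summarize and the sample data are hypothetical. Frames that never log a return are either dropped (when a caller returns past them) or reported as still open at the end of the trace, rather than being given a bogus duration.

```python
from collections import Counter, defaultdict

def summarize(records):
    """Return (call counts, inclusive time per function, frames still open)."""
    counts = Counter()
    totals = defaultdict(float)
    stack = []  # list of (name, entry_timestamp), innermost frame last
    for ts, kind, name in records:
        if kind == "call":
            counts[name] += 1
            stack.append((name, ts))
        elif kind == "ret":
            # Find the innermost open frame for this function.
            idx = next((i for i in range(len(stack) - 1, -1, -1)
                        if stack[i][0] == name), None)
            if idx is None:
                continue  # return with no recorded entry; ignore it
            totals[name] += ts - stack[idx][1]
            # Frames above it never logged a return (e.g. inlined code).
            del stack[idx:]
    still_open = [name for name, _ in stack]
    return counts, dict(totals), still_open

# Made-up sample trace; timestamps are arbitrary units.
records = [
    (0, "call", "main"),
    (1, "call", "init"),
    (2, "call", "helper"),   # never returns in the trace
    (5, "ret", "init"),
    (6, "call", "init"),
    (8, "ret", "init"),
    (9, "call", "run"),      # still running when the trace ends
]
counts, totals, still_open = summarize(records)
print(counts.most_common(1))  # most frequently called function
print(totals)                 # inclusive time for functions that returned
print(still_open)             # e.g. "main", which has no duration yet
```

Reporting the open frames separately keeps "main" and friends out of the duration table instead of guessing an end time for them.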

The tracing tool I have logs every function call, line executed, and
function return, for offline analysis. Execution of CRiSP generates
a 500MB log file - quite hefty for a 'small' application.

There are some other things that can easily be done here - such
as instrumenting certain instructions or functions to gain an insight
into other things. For example, it would be possible to trace
all mutex locks, or file I/O, or log the stack of specific scenarios.

Much of this may sound familiar, because gcov, strace and dtrace can do
variants of these. The point being that each tool excels at a specific
domain of monitoring, but almost none give you programmatic access
to the detailed workings of an app. dtrace comes close, with the D scripting
language, but it's not really very good for user space introspection
(other than trapping function calls and stack traces).

If there's interest, I will publish the tool - it's a simple Perl
script for the annotation recording, and a small C library. The tool
modifies the assembler code of your application, so you really need a
special area to build in - you don't want to test or distribute these
binaries, since there is a size and speed penalty to this instrumentation.
I haven't finished optimising - the execution penalty when the tracing
is disabled is small, but not small enough.


Post created by CRiSP v11.0.32a-b6738

