<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-8336326562741944626</id><updated>2012-03-14T15:54:21.521-07:00</updated><title type='text'>CRiSP, DTrace, and other technobabble</title><subtitle type='html'>Technoblog with some random mutterings and ramblings</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default?start-index=101&amp;max-results=100'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>181</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-4324173951647995856</id><published>2012-03-14T15:54:00.001-07:00</published><updated>2012-03-14T15:54:21.527-07:00</updated><title type='text'>Blowfish</title><content type='html'>I've added blowfish support to CRiSP - mainly because I wanted to&lt;br /&gt;(add some basic 
encryption facilities - reasons are too boring to go&lt;br /&gt;into). Each time I have added encryption, it's been a nuisance - no&lt;br /&gt;matter whether it's my own coding or open source offerings.&lt;br /&gt;&lt;br /&gt;It's important to ensure the code behaves identically across all&lt;br /&gt;platforms - compiler bugs, undefined behavior in the language, or&lt;br /&gt;word size differences can mean the difference between a file or block&lt;br /&gt;being decryptable or checksummable on another platform, or not. This behavior&lt;br /&gt;could go unnoticed for years. E.g. I used to do all my development&lt;br /&gt;on Solaris; after a number of years, I switched to Linux. Linux itself&lt;br /&gt;changes radically over the course of the years. And it would be "not nice"&lt;br /&gt;to find backups of files unreadable due to such a difference in behavior.&lt;br /&gt;&lt;br /&gt;Even high quality open source code can blow up (silently) due to&lt;br /&gt;bugs in compilers or 32-bit vs 64-bit differences. Fortunately, most&lt;br /&gt;modern code is aware of these things, although many RFCs don't reflect&lt;br /&gt;the real world (they provide sample algorithms whose example&lt;br /&gt;implementations may be shown, years later, to be buggy).&lt;br /&gt;&lt;br /&gt;Good code comes with a self-sanity check (e.g. encrypting a block and then&lt;br /&gt;decrypting it should return the original data). But there is usually little&lt;br /&gt;intermediate checking. Blowfish does come with such a check. 
Here's some sample code:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  /* Known-answer check: a fixed key must give this exact output on every platform. */&lt;br /&gt;  Blowfish_Init(ctx, (unsigned char *)"TESTKEY", 7);&lt;br /&gt;  Blowfish_Encrypt(ctx, &amp;L, &amp;R);&lt;br /&gt;  if ((L &amp; 0xffffffffU) != 0xDF333FD2L || (R &amp; 0xffffffffU) != 0x30A71BB4L) {&lt;br /&gt;      printf("blowfish_encrypt: L=%lx R=%lx\n", L, R);&lt;br /&gt;      printf("wanted: 0xDF333FD2L 0x30A71BB4L\n");&lt;br /&gt;      return (-1);&lt;br /&gt;  }&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now, when the assertion fails, tracking down the lines of code which&lt;br /&gt;caused the issue is painful - every line of implementation code will&lt;br /&gt;"look" correct, but subtleties in sign extension, zero filling or&lt;br /&gt;other ISO C behavior mean that obvious-looking code never reveals the true issue.&lt;br /&gt;&lt;br /&gt;All code should ideally have complete test cases, but you don't really&lt;br /&gt;know what to test for until much, much later, and by then you may&lt;br /&gt;have forgotten how the code (even your own) works.&lt;br /&gt;&lt;br /&gt;I've recently been fixing annoyances in CRiSP and adding small/minor features.&lt;br /&gt;&lt;br /&gt;One issue I was recently tracking down was a memory leak in the MacOS port.&lt;br /&gt;MacOS and its development environment are really nice. The GUI (Xcode)&lt;br /&gt;is too heavyweight for my frail Mac, but the command line tools are good.&lt;br /&gt;("leaks" is a very nice tool). I spent a lot of effort tracking down&lt;br /&gt;a leak and fixing a piece of code, only to find that trying to find&lt;br /&gt;a memory leak involves the Mac tools *causing* a memory leak. (Effectively&lt;br /&gt;no memory is freed when one of the malloc debug options is turned on;&lt;br /&gt;it took me ages to realise I had fixed my leak, because monitoring&lt;br /&gt;for leaks just showed the process size growing as malloc was trying to detect&lt;br /&gt;leaks. 
Sometimes, it's the obvious things).&lt;br /&gt;&lt;br /&gt;Whilst on the subject of tools, I finally took the plunge and tried&lt;br /&gt;out Clang (2.9). It's a nice tool for static code analysis - and it managed&lt;br /&gt;to uncover a few latent bugs in CRiSP that I never knew existed. Sometimes&lt;br /&gt;I do like compilers with new features which are helpful. (Many gcc&lt;br /&gt;compiler warnings just generate noise, as does Visual Studio).&lt;br /&gt;&lt;br /&gt;Meanwhile... back to figuring out why blowfish is not doing what&lt;br /&gt;is expected with my bits....&lt;br /&gt;&lt;br /&gt;[Will resume dtrace shortly - I need to forget how it works, before&lt;br /&gt;tackling the next issue on the table - pid provider].&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.29a-b6234&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-4324173951647995856?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/4324173951647995856/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/03/blowfish.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4324173951647995856'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4324173951647995856'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/03/blowfish.html' title='Blowfish'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-1358942782600590710</id><published>2012-02-28T14:30:00.001-08:00</published><updated>2012-02-28T14:30:52.505-08:00</updated><title type='text'>dtrace4linux - support and issues</title><content type='html'>I would like to thank the people using and&lt;br /&gt;trying out the DTrace/Linux port (now hosted at github and&lt;br /&gt;my other tarball downloads).&lt;br /&gt;&lt;br /&gt;Can I ask people - when facing compile issues - to ensure they&lt;br /&gt;have all prerequisites before reporting compilation issues?&lt;br /&gt;&lt;br /&gt;It is my goal to have DTrace work on all releases of&lt;br /&gt;Linux - both newer and older releases, but time and space&lt;br /&gt;don't permit validating each release (of Linux) - especially&lt;br /&gt;older ones. I do attempt to look at Ubuntu and Fedora (usually after&lt;br /&gt;a prod by the community). I try to avoid downloading the latest&lt;br /&gt;Ubuntu releases until at least a week or two has passed, so that&lt;br /&gt;I can feel more comfortable I am not going to suffer suspend/resume or&lt;br /&gt;wifi or video glitches, like I have done in the past.&lt;br /&gt;&lt;br /&gt;One thing I have done brazenly badly is keep track of what I&lt;br /&gt;installed on my system in terms of packages. When I first started&lt;br /&gt;on DTrace, my system was not a virgin Linux distro, but one polluted with my&lt;br /&gt;favorite development packages.&lt;br /&gt;&lt;br /&gt;By the time the first DTrace port went out, I couldn't tell the&lt;br /&gt;difference between a virgin install and my own system. 
Over time, I have&lt;br /&gt;realised it is important for newcomers who download and try out&lt;br /&gt;DTrace that it works "out of the box", and I have attempted to create&lt;br /&gt;scripts (get-deps.pl) to semi-automate updating your system with the&lt;br /&gt;required packages. Even for just Fedora and Ubuntu, in 32- and 64-bit variants,&lt;br /&gt;validating that the script works is nearly impossible.&lt;br /&gt;&lt;br /&gt;I hope to do better in the future, but there can be no guarantee.&lt;br /&gt;&lt;br /&gt;One of the commonest issues reported is missing header files, e.g.&lt;br /&gt;for 32-bit compiles. Even doing a package search using yum or apt-get,&lt;br /&gt;or whatever the package installer of choice is called, is a nuisance - as&lt;br /&gt;you get flooded with possible matching libraries. Most of it makes sense&lt;br /&gt;to me, but it likely confuses newcomers to Linux, or people who&lt;br /&gt;are not programmers. Unfortunately, that is life on Linux.&lt;br /&gt;&lt;br /&gt;(Maybe I should be adopting a standard RPM format so that the&lt;br /&gt;dependencies can be described properly; something for a different&lt;br /&gt;rainy day).&lt;br /&gt;&lt;br /&gt;At the moment, for a short while, I am switching my focus back&lt;br /&gt;to CRiSP - adding features and enhancements; I find it good to switch&lt;br /&gt;back and forth between DTrace and CRiSP, as I sometimes lose focus on&lt;br /&gt;what I am trying to do. 
DTrace for Linux should be in a good state, and&lt;br /&gt;there are a lot of miniprojects to work on (I had started on the CPC&lt;br /&gt;provider, and theres more SDT probes to work through, along with&lt;br /&gt;refinements on the INSTR provider).&lt;br /&gt;&lt;br /&gt;If people find DTrace good for them, feel free to publicise or drop&lt;br /&gt;me a mail, so that I know it is worthwhile.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.26a-b6206&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-1358942782600590710?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/1358942782600590710/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/02/dtrace4linux-support-and-issues.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1358942782600590710'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1358942782600590710'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/02/dtrace4linux-support-and-issues.html' title='dtrace4linux - support and issues'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-4910616937139746947</id><published>2012-02-20T12:34:00.001-08:00</published><updated>2012-02-20T12:34:45.473-08:00</updated><title 
type='text'>Doing a dis-service to your fans</title><content type='html'>I like Amazon - it has lots of nice features - I won't go into them.&lt;br /&gt;Most people agree, some may disagree.&lt;br /&gt;&lt;br /&gt;One feature I like, when I am mentally challenged, is the recommendations.&lt;br /&gt;Based on browsing or purchase history, it learns what you like and suggests&lt;br /&gt;related material. In general, this works really well, and occasionally&lt;br /&gt;music, videos or series I may be interested in pop into view.&lt;br /&gt;&lt;br /&gt;One thing which I find very annoying - and it's not Amazon's fault - is&lt;br /&gt;the way the music industry takes loyal fans and treats them strangely.&lt;br /&gt;&lt;br /&gt;I have listed with Amazon that I have and like Pink Floyd. I am totally&lt;br /&gt;surprised at how many, for example, "Dark Side of the Moon" albums there&lt;br /&gt;are. "Basic", "Advanced", "Intermediate Edition", "With Bells on", "Advanced&lt;br /&gt;sound" (I made these up!). So, my "recommendations" consist of 5 copies&lt;br /&gt;of each of their albums. I have the albums - even a few a couple of times&lt;br /&gt;over.&lt;br /&gt;&lt;br /&gt;But I don't know what to do! I could rate each variant as "I like/5-star",&lt;br /&gt;in which case it might just dig out more versions of the same things or&lt;br /&gt;other music I have, and I end up with no useful recommendations.&lt;br /&gt;&lt;br /&gt;I could say "Not interested" to Amazon, but will that mean it thinks I&lt;br /&gt;dislike "Pink Floyd", or will it stop showing me the variants across all&lt;br /&gt;types of music?&lt;br /&gt;&lt;br /&gt;Oh well. 
The wonders of technology.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.26a-b6199&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-4910616937139746947?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/4910616937139746947/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/02/doing-dis-service-to-your-fans.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4910616937139746947'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4910616937139746947'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/02/doing-dis-service-to-your-fans.html' title='Doing a dis-service to your fans'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-112551030777253074</id><published>2012-02-14T14:23:00.001-08:00</published><updated>2012-02-14T14:23:08.214-08:00</updated><title type='text'>DTrace and the CPC provider</title><content type='html'>I've been looking at profiling systems on Linux and other OS's. Its&lt;br /&gt;an interesting landscape. 
With the advent of ever more powerful&lt;br /&gt;CPUs over the last decade, along with multicore and the impact of&lt;br /&gt;cache misses, it's necessary for people to have low level tools&lt;br /&gt;to do performance analysis.&lt;br /&gt;&lt;br /&gt;Performance measuring is a large topic - I can only cover it briefly&lt;br /&gt;here. Statistical sampling (similar to classic Unix "prof" and "gprof")&lt;br /&gt;is great for weeding out hot spots in code. The first time you profile,&lt;br /&gt;it's easy to quickly find areas to optimise.&lt;br /&gt;&lt;br /&gt;After a while, those tools run out of steam. With multithreaded&lt;br /&gt;applications and multicore CPUs, other factors quickly come into play,&lt;br /&gt;e.g. lock contention, cache misses etc.&lt;br /&gt;&lt;br /&gt;The Intel and AMD chips provide quite sophisticated counters for measuring&lt;br /&gt;all sorts of things you may never have thought about. Unfortunately,&lt;br /&gt;not only are they different between Intel and AMD, but the counters&lt;br /&gt;supported vary by chip family. (I don't even know if every&lt;br /&gt;new CPU supports a superset of all older ones).&lt;br /&gt;&lt;br /&gt;In user space, tools like "oprofile" and "perf" provide a way to gain&lt;br /&gt;access to these counters, and are great for diving deeper into hotspots.&lt;br /&gt;You may know 90% of your time is spent in a matrix multiply, but you&lt;br /&gt;may not realise that 50% of that time is wasted in cache-thrashing.&lt;br /&gt;&lt;br /&gt;Linux has had a varied past of not adopting, and subsequently adopting,&lt;br /&gt;profiling subsystems, and although it should be easy, it isn't. 
The&lt;br /&gt;difficulty of CPU family differences, and complexity due to the&lt;br /&gt;hardware of a system, means that providing a chip-independent API&lt;br /&gt;is difficult.&lt;br /&gt;&lt;br /&gt;In recent years, AMD and Intel have provided new monitoring facilities&lt;br /&gt;which aim to allow instruction-accurate samples to be made of performance.&lt;br /&gt;(Prior facilities relied on counters and interrupts which couldn't&lt;br /&gt;pinpoint the exact instruction, e.g. where a cache miss occurred).&lt;br /&gt;&lt;br /&gt;In Solaris DTrace, they added the CPC provider - which allows&lt;br /&gt;probes to be placed based on the counter interrupts. The documentation&lt;br /&gt;is somewhat vague, because everyone is trying hard not to&lt;br /&gt;replicate the Intel/AMD documents which list the counters, since they&lt;br /&gt;evolve so rapidly. The CPC provider is not (currently) in Linux/Dtrace.&lt;br /&gt;It's been on my TODO list and I am just checking it out. It relies on&lt;br /&gt;Solaris handling user level requests and abstracts the CPU away, but,&lt;br /&gt;from what I read on the web, it appears to suffer from an inability to handle&lt;br /&gt;the "new style" counters from AMD and Intel.&lt;br /&gt;&lt;br /&gt;[I believe that the old style counters are simply counters which can&lt;br /&gt;be set up to generate an interrupt, either on reaching a threshold&lt;br /&gt;or on a periodic basis, i.e. sampling based monitoring. The new counters&lt;br /&gt;likely require an area of RAM to fill up, and the code in Solaris,&lt;br /&gt;and probably Linux, may not be ready to support this, at least not on older&lt;br /&gt;kernels].&lt;br /&gt;&lt;br /&gt;I may experiment with adding a CPC provider, just because I am interested&lt;br /&gt;in seeing these counters and the issues they present. 
&lt;br /&gt;&lt;br /&gt;[I have tried oprofile, and hit problems since it does not work inside&lt;br /&gt;a VM; the newer 'perf' subsystem does appear to work inside a VM, but requires&lt;br /&gt;rebuilding the kernel to enable the subsystem].&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.25a-b6193&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-112551030777253074?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/112551030777253074/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/02/dtrace-and-cpc-provider.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/112551030777253074'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/112551030777253074'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/02/dtrace-and-cpc-provider.html' title='DTrace and the CPC provider'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-5015064767155090856</id><published>2012-02-08T13:30:00.001-08:00</published><updated>2012-02-08T13:30:53.009-08:00</updated><title type='text'>github #2</title><content type='html'>I guess peoples first reaction to the last post is "Great! 
Now, what is the&lt;br /&gt;link?!"&lt;br /&gt;&lt;br /&gt;&lt;a href='https://github.com/dtrace4linux/linux'&gt;https://github.com/dtrace4linux/linux&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I did say I was getting to grips with it!&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.25a-b6181&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-5015064767155090856?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/5015064767155090856/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/02/github-2.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5015064767155090856'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5015064767155090856'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/02/github-2.html' title='github #2'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-8608910702272466079</id><published>2012-02-08T13:29:00.001-08:00</published><updated>2012-02-08T13:29:56.901-08:00</updated><title type='text'>github</title><content type='html'>Ive created a new github repository to host the dtrace code.&lt;br /&gt;I will continue to do the tarballs, and people can (hopefully)&lt;br /&gt;watch the github repository to pick up deltas and see 
change&lt;br /&gt;history.&lt;br /&gt;&lt;br /&gt;I did create it the other day, and have to destroy/recreate a couple&lt;br /&gt;of times, due to the script I used to replay the history from all &lt;br /&gt;the tarballs having some blips in them.&lt;br /&gt;&lt;br /&gt;I can offer no support on this - hopefully it is helpful, and I am&lt;br /&gt;still getting to grips with it, and git, myself.&lt;br /&gt;&lt;br /&gt;This release includes some USDT fixes. But I suspect many more to come!&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.25a-b6181&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-8608910702272466079?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/8608910702272466079/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/02/github.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8608910702272466079'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8608910702272466079'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/02/github.html' title='github'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-5578451336214576980</id><published>2012-02-02T15:12:00.001-08:00</published><updated>2012-02-02T15:12:06.346-08:00</updated><title type='text'>Its raining.</title><content type='html'>No, it's not raining, but it's mighty cold...&lt;br /&gt;&lt;br /&gt;Started playing around with USDT - need to iron out some bugs. Alas,&lt;br /&gt;if you run the simple-c example app, and reload the driver, it will&lt;br /&gt;panic the kernel. C'est la vie.&lt;br /&gt;&lt;br /&gt;Hope to fix in the next day or two.&lt;br /&gt;&lt;br /&gt;It's worth briefly describing "why". When you run an application&lt;br /&gt;which has user space probes (USDT), the application will talk to the&lt;br /&gt;dtrace driver and dynamically create new probes against the PID&lt;br /&gt;of the process. You can see these by&lt;br /&gt;running "dtrace -l" and diffing a before and after scenario.&lt;br /&gt;&lt;br /&gt;Alas, when you terminate the process, the USDT probes aren't removed,&lt;br /&gt;and this tickles a problem (which I need to solve).&lt;br /&gt;&lt;br /&gt;What dtrace is supposed to do is monitor processes as they die, and&lt;br /&gt;remove these stale probes, but it is not doing so.&lt;br /&gt;&lt;br /&gt;Now that my other dtrace problems appear to be over (subject to any&lt;br /&gt;naughty regressions I introduce), I can spend a little more time&lt;br /&gt;on USDT and go into more detail.&lt;br /&gt;&lt;br /&gt;One area to understand is how USDT works. I have written about this&lt;br /&gt;before and there are some good web articles on this. 
The technology is remarkably&lt;br /&gt;simple - but the implementation requires everything to be "just right"&lt;br /&gt;(we are dealing with kernel and user space, after all).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.23a-b6166&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-5578451336214576980?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/5578451336214576980/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/02/its-raining.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5578451336214576980'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5578451336214576980'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/02/its-raining.html' title='Its raining.'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-5496653966297616685</id><published>2012-01-29T14:21:00.001-08:00</published><updated>2012-01-29T14:21:36.005-08:00</updated><title type='text'>vmalloc_sync_all</title><content type='html'>Having "understood" my nested page fault issues, I have been trying&lt;br /&gt;to finalise the code changes. 
However, any attempt to do so leads me&lt;br /&gt;to a lot of pain.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n fbt::page_fault:&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;is a dangerous thing to do - we intercept the page fault handler.&lt;br /&gt;But the page fault handler can be called if a D script tries&lt;br /&gt;to access an unmapped page. We could deter users from putting a probe&lt;br /&gt;on page_fault, but that seems a real shame - that's a very useful&lt;br /&gt;and interesting function to probe.&lt;br /&gt;&lt;br /&gt;This works brilliantly on x86/64 systems but fails abysmally on i386.&lt;br /&gt;I have chased the problem down to kernel page tables&lt;br /&gt;and user process page tables disagreeing about what is "visible". Given&lt;br /&gt;the way the kernel does "lazy page table population", it's very difficult&lt;br /&gt;to prevent a page fault in, for instance, the breakpoint handler.&lt;br /&gt;&lt;br /&gt;(We hit the page_fault function, which generates a breakpoint trap to&lt;br /&gt;execute the probe, but whilst processing the breakpoint trap, we induce&lt;br /&gt;a page_fault trap: BOOM!)&lt;br /&gt;&lt;br /&gt;I've experimented with various mechanisms to avoid these lazy page&lt;br /&gt;faults. There's a function in the kernel, vmalloc_sync_all(), which&lt;br /&gt;ensures all page tables are in sync with the kernel - so that minor&lt;br /&gt;page faults cannot happen. 
If I ensure this is called during the driver&lt;br /&gt;load, then the problem of a nested page fault appears to go away.&lt;br /&gt;&lt;br /&gt;(This is a better job than the code I wrote which does something similar&lt;br /&gt;but only for specific locations in dtrace itself; vmalloc_sync_all is&lt;br /&gt;a generalised function to sync all page tables of all processes).&lt;br /&gt;&lt;br /&gt;So, I will need to recode and remove the cruft from my work-in-progress&lt;br /&gt;dtrace.&lt;br /&gt;&lt;br /&gt;(I am trying to track down if vmalloc_sync_all is called by the x86/64&lt;br /&gt;kernel - but not the i386 one; it would certainly explain why I see such&lt;br /&gt;a difference in behavior when tracing the page_fault code).&lt;br /&gt;&lt;br /&gt;More later this week if I can successfully resolve this issue, once and&lt;br /&gt;for all.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.23a-b6159&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-5496653966297616685?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/5496653966297616685/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/01/vmallocsyncall.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5496653966297616685'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5496653966297616685'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/01/vmallocsyncall.html' title='vmalloc_sync_all'/><author><name>Paul 
Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-457054788206874128</id><published>2012-01-25T12:56:00.001-08:00</published><updated>2012-01-25T12:56:35.135-08:00</updated><title type='text'>Impossible progress</title><content type='html'>Nigel was asking me today why I was bothering to spend so much&lt;br /&gt;time on a bug which is uninteresting. And if this issue happens&lt;br /&gt;on i386, why don't we see it on x64?&lt;br /&gt;&lt;br /&gt;Let's catch up: dtrace, when reloaded on an i386 system, can panic or&lt;br /&gt;hang the system. This doesn't happen on x64.&lt;br /&gt;&lt;br /&gt;As much as I like to dismiss i386 as yesterday's technology, it demonstrates&lt;br /&gt;that *something* is wrong. Ignoring this warning sign is perilous.&lt;br /&gt;Before going into the subject in detail: why after a driver load?&lt;br /&gt;Why not half way through debugging your production system?&lt;br /&gt;&lt;br /&gt;The underlying scenario may be rare, but without a deep understanding of it,&lt;br /&gt;such a tool can never promote itself in the reliability stakes.&lt;br /&gt;&lt;br /&gt;Ok, so let's deep dive. Every process has a page table - which describes&lt;br /&gt;what the process can see. In Linux, process #0 (the 'swapper') also&lt;br /&gt;has a page table, but it's a "master page table". It describes what the&lt;br /&gt;kernel can see.&lt;br /&gt;&lt;br /&gt;A process is most of the time dealing with its own address space, but&lt;br /&gt;on a system call or interrupt, we are dealing with the kernel. 
The CPU&lt;br /&gt;contains the circuitry to allow the kernel space to be visible when&lt;br /&gt;the interrupts or system calls happen.&lt;br /&gt;&lt;br /&gt;But, how does the kernel map and the per-process map keep in sync? When&lt;br /&gt;you load a device driver (or even plug in a USB drive, for instance),&lt;br /&gt;the kernel will allocate space for the code and data. This belongs to&lt;br /&gt;the swapper/kernel. If whilst your process is executing, the&lt;br /&gt;USB drive generates an interrupt, leading to the USB driver executing,&lt;br /&gt;it will do so in the context of your process page table. You cannot&lt;br /&gt;see this (normally). But those pages are *not* in your page table.&lt;br /&gt;&lt;br /&gt;So, as the CPU tries to jump or access this memory, a page fault&lt;br /&gt;will be generated. The page fault handler *IS* in your page table&lt;br /&gt;(as is the whole monolithic kernel). The page fault handler will realise&lt;br /&gt;the page fault happened in kernel space, and will notice that&lt;br /&gt;the swapper page table and your process page table do not agree.&lt;br /&gt;It will copy the offending page table entry from kernel(swapper) to your&lt;br /&gt;process. And the system will continue - as if "by magic". (Function vmalloc_fault()&lt;br /&gt;is the one that does this magic).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Linux/Dtrace is special&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Linux/dtrace is very special compared to all other implementations of&lt;br /&gt;dtrace and all other drivers on Linux: it is not only dynamically loaded,&lt;br /&gt;but it contains a page fault handler. (Why? 
Because when you do silly&lt;br /&gt;things in your D scripts, dtrace wants to prevent you from panicking the&lt;br /&gt;kernel; it has to intercept invalid page faults caused by D scripts;&lt;br /&gt;it doesn't care about normal page faults, and leaves the kernel to do its&lt;br /&gt;stuff).&lt;br /&gt;&lt;br /&gt;If the page fault handler is not in the user page table (why should it be?&lt;br /&gt;after a module load, it won't be), then we are in dangerous territory.&lt;br /&gt;You cannot simply ignore a "page fault" - you *must* process it. So, here's&lt;br /&gt;the scenario: when dtrace is loaded, it only exists in kernel page&lt;br /&gt;tables - not in any process's page table. Under normal use of dtrace,&lt;br /&gt;invoking probes or syscalls, the act of these probes firing would cause&lt;br /&gt;a page fault to ensure the dtrace code is mapped into the page table&lt;br /&gt;of the process.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What is happening...&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;After dtrace is loaded, we have two scenarios to consider: system processes&lt;br /&gt;(especially kernel threads, irqbalance, etc) and user procs. The system&lt;br /&gt;processes run in kernel space and have the page fault handler mapped.&lt;br /&gt;(In theory these system procs shouldn't have page faults, but they&lt;br /&gt;might do). The user procs have no knowledge of dtrace, and as they page fault,&lt;br /&gt;the CPU will try to invoke the page fault handler &lt;b&gt;which is not mapped&lt;br /&gt;into the user proc page table&lt;/b&gt;. This causes another fault and&lt;br /&gt;we eventually have stack overflow, page table corruption and a double&lt;br /&gt;fault.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The solution&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The solution is to ensure that every process has the page fault&lt;br /&gt;handler mapped as the module is loaded. 
I've written/borrowed code&lt;br /&gt;to walk the process table, and ensure the page fault handler&lt;br /&gt;is properly "faulted" into the per-process page table.&lt;br /&gt;&lt;br /&gt;My first experiments were a failure: even the tiniest of coding blips&lt;br /&gt;will show up as a crash/hang/panic. After validating the code very&lt;br /&gt;carefully, it appears to work.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Forking&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;When a process forks, the new process gets a copy of the same page&lt;br /&gt;tables as its parent. So, if a process has the page fault handler&lt;br /&gt;mapped, so will its child. I.e. we just need to "seed" every process&lt;br /&gt;on the module load, and we are done.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Why doesn't this happen on x64?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I believe the reason is: probability. It *can* happen but either has not&lt;br /&gt;been observed, or was assumed to be a different bug. I haven't&lt;br /&gt;directly measured this (yet) on x64, but all it requires is that the&lt;br /&gt;page where the dtrace page fault handler is loaded into memory&lt;br /&gt;be mapped into every process's page table *by accident*. This might&lt;br /&gt;happen due to bootup modprobes and other things, or it could be caused&lt;br /&gt;by the layout of the page table directory structure making it&lt;br /&gt;likely that dtrace lands on a previously mapped page.&lt;br /&gt;(Maybe even the layout of the ELF format module file might "help"). 
&lt;br /&gt;But on a large memory system, it might not be sufficient and it is likely&lt;br /&gt;the same bug would crash - at the least opportune time.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What next?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Next up is to tidy up the horror of code bodges I have in my VM&lt;br /&gt;and push back to the master dtrace source; see if I can prove it could&lt;br /&gt;be a problem in x64, and ensure the new code is x64 palatable.&lt;br /&gt;(The Linux kernel, in arch/x86/mm/fault.c has two implementations of&lt;br /&gt;vmalloc_fault - one for i386 and one for x64, so I cannot assume the&lt;br /&gt;i386 "fix" will work for x64).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.22a-b6154&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-457054788206874128?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/457054788206874128/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/01/impossible-progress.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/457054788206874128'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/457054788206874128'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/01/impossible-progress.html' title='Impossible progress'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-9009155391135938846</id><published>2012-01-19T15:16:00.001-08:00</published><updated>2012-01-19T15:16:27.262-08:00</updated><title type='text'>Free advertising! Get CRiSP Now!</title><content type='html'>I rarely do what I need to do...push CRiSP into your face.&lt;br /&gt;You are all so busy grabbing copies of dtrace, I can't spoil&lt;br /&gt;your party :-)&lt;br /&gt;&lt;br /&gt;After reading about SOPA/PIPA over the last few days and&lt;br /&gt;the current fate of megaupload.com, I thought I would do a search&lt;br /&gt;for "crisp editor crack".&lt;br /&gt;&lt;br /&gt;Quite surprised to see how many sites are hosting the software - even&lt;br /&gt;out of date versions. I'm actually quite proud that it should turn up&lt;br /&gt;on every warez site I have never heard of.&lt;br /&gt;&lt;br /&gt;I'm looking at one site, where the file size is posted as 58.6MB. Given&lt;br /&gt;that my downloads (admittedly, compressed) are less than 9MB, that's&lt;br /&gt;pretty impressive. I wonder what they put into the download.&lt;br /&gt;&lt;br /&gt;It's possible some of these are genuine outlets for the software (no&lt;br /&gt;genuine outlet would accept payment, so I am fairly certain this is fake).&lt;br /&gt;&lt;br /&gt;I am looking at a URL which looks really nice - nice layout, lots of relevant&lt;br /&gt;decoration, and not very good details. I won't post the site link, except&lt;br /&gt;to say that crisp appears in the domain name.&lt;br /&gt;&lt;br /&gt;It's possible this is just a DNS grab and the page is almost certainly&lt;br /&gt;automatically generated.&lt;br /&gt;&lt;br /&gt;It's actually impressive the amount of effort people have put into auto&lt;br /&gt;generating fake sites, and CRiSP is among their targets. Thanks! 
I am impressed.&lt;br /&gt;&lt;br /&gt;In case you are interested, just google "crisp editor crack" to get&lt;br /&gt;a feel. The "lifted" text similarities are interesting.&lt;br /&gt;&lt;br /&gt;I suspect what you see out there is the union of two things: catalogs&lt;br /&gt;of software - maybe from websites or old shareware type listings, along&lt;br /&gt;with web site generators and automation by cheap labour to flood sites&lt;br /&gt;with everything other than the thing they are purporting to sell. I would&lt;br /&gt;expect all of these to be virus/trojan carrying candidates. One site I am&lt;br /&gt;looking at shows a reasonable filesize, but I can't be bothered to download&lt;br /&gt;and verify the version to see if it's real. (The one I am looking at&lt;br /&gt;actually looks really genuine, carrying my partner's web logo).&lt;br /&gt;&lt;br /&gt;So, if you want to get the best programmer's editor ever invented, and&lt;br /&gt;want to change your life, then buy CRiSP N*O*W*! :-)&lt;br /&gt;&lt;br /&gt;(I'll update dtrace over the weekend...I know what the problem is, and&lt;br /&gt;the solution is slightly elusive).&lt;br /&gt;&lt;br /&gt;Ok, boredom set in ... let's google search "dtrace crack" - not so many&lt;br /&gt;links to choose from. One is a very interesting link, regarding an E-Book by&lt;br /&gt;Jon Haslam. (Apologies to Jon - I don't know him). 
Sitting on that page&lt;br /&gt;is a lot of links to Playboy/Penthouse forums.&lt;br /&gt;&lt;br /&gt;So, even dtrace gets the "Web" treatment.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.22a-b6154&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-9009155391135938846?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/9009155391135938846/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/01/free-advertising-get-crisp-now.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/9009155391135938846'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/9009155391135938846'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/01/free-advertising-get-crisp-now.html' title='Free advertising! Get CRiSP Now!'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-2148916676370125325</id><published>2012-01-17T14:30:00.001-08:00</published><updated>2012-01-17T14:34:38.433-08:00</updated><title type='text'>Update on the impossible</title><content type='html'>I've been walking through the scenario of why some kernel&lt;br /&gt;addresses are not visible to some processes. 
Think of a block of&lt;br /&gt;memory allocated by the kernel for internal use, but triggering&lt;br /&gt;a page fault, e.g. because the page is swapped out, or hasn't been touched&lt;br /&gt;yet by user space.&lt;br /&gt;&lt;br /&gt;When the page fault handler is invoked, the address of the buffer&lt;br /&gt;exists only in some process page tables.&lt;br /&gt;&lt;br /&gt;Turns out this is a (nice!) clever trick of the kernel. When things&lt;br /&gt;are allocated in the kernel and should be visible to all processes,&lt;br /&gt;e.g. a driver module or other buffer, and a page fault kicks in,&lt;br /&gt;a check is made whether the page is valid in the "master page table".&lt;br /&gt;Process #0, created on kernel boot up, houses the master table.&lt;br /&gt;The currently running process may not have the mapping in place, and instead&lt;br /&gt;of paying a large cost to update all processes' page tables to represent&lt;br /&gt;these kernel pages, the page fault handler will update the local process&lt;br /&gt;page table when the fault occurs.&lt;br /&gt;&lt;br /&gt;This explains why some processes can see the page in question, and others&lt;br /&gt;cannot.&lt;br /&gt;&lt;br /&gt;Bear in mind we are putting in place a page-fault interrupt handler.&lt;br /&gt;This *must*, repeat *MUST* be visible at the time of a page fault, else&lt;br /&gt;we get a cascade of nested page-faults because the handler isn't mapped&lt;br /&gt;in the process page tables.&lt;br /&gt;&lt;br /&gt;So, we need to arrange this to be true. 
At the moment, the options&lt;br /&gt;include: (a) see if anything in the kernel allows us to propagate the&lt;br /&gt;page-table mapping across all procs (nobody else, other than possibly&lt;br /&gt;a Virtualisation guest, such as Xen/VMWare/VirtualBox, is likely to do this),&lt;br /&gt;(b) do the hard work myself, (c) move the interrupt handler&lt;br /&gt;into an existing page of mapped memory [hard], or (d) don't patch the&lt;br /&gt;IDT, but patch the existing page fault handler [not sure if this doesn't&lt;br /&gt;just put off the problem].&lt;br /&gt;&lt;br /&gt;Let me scrape some cobwebs off my brain...&lt;br /&gt;&lt;span class="post-comment-link"&gt;&lt;br /&gt;BTW - here's the relevant comment in arch/x86/mm/fault.c, function do_page_fault:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;        /*&lt;br /&gt;        * We fault-in kernel-space virtual memory on-demand. The&lt;br /&gt;        * 'reference' page table is init_mm.pgd.&lt;br /&gt;        *&lt;br /&gt;        * NOTE! We MUST NOT take any locks for this case. 
We may&lt;br /&gt;        * be in an interrupt or a critical region, and should&lt;br /&gt;        * only copy the information from the master page table,&lt;br /&gt;        * nothing more.&lt;br /&gt;        *&lt;br /&gt;        * This verifies that the fault happens in kernel space&lt;br /&gt;        * (error_code &amp;amp; 4) == 0, and that the fault was not a&lt;br /&gt;        * protection error (error_code &amp;amp; 9) == 0.&lt;br /&gt;        */&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Post created by CRiSP v10.0.22a-b6154&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-2148916676370125325?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/2148916676370125325/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/01/update-on-impossible.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2148916676370125325'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2148916676370125325'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/01/update-on-impossible.html' title='Update on the impossible'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-673384051383107137</id><published>2012-01-16T14:49:00.001-08:00</published><updated>2012-01-16T14:49:13.299-08:00</updated><title type='text'>Naughty 
naughty bug.</title><content type='html'>I believe I have finally found / confirmed the root cause of&lt;br /&gt;the "impossible" bug.&lt;br /&gt;&lt;br /&gt;Let's go on a journey....&lt;br /&gt;&lt;br /&gt;The i386 virtual memory architecture relies on page tables.&lt;br /&gt;Each process has a (complex) array of descriptions for each page.&lt;br /&gt;Each page in the 4GB address space either has an entry, or is missing&lt;br /&gt;an entry. (Each process does not need the full 4MB to describe the&lt;br /&gt;address space, if the process is not using every page of the 4GB space;&lt;br /&gt;4MB is what is needed to describe each and every page of a full 4GB&lt;br /&gt;address space: 1M pages of 4KB each, at 4 bytes per page table entry).&lt;br /&gt;&lt;br /&gt;Now, typically in that 4GB range is user-space (typically everything&lt;br /&gt;below around 3GB) and the kernel (everything in the last 1GB).&lt;br /&gt;[Details differ from release to release of the kernel].&lt;br /&gt;&lt;br /&gt;Now the kernel can see all the user space pages, but typically, the&lt;br /&gt;user space process *cannot* see the kernel pages. (Would be nice if you&lt;br /&gt;could but that would be a security hole). Physical RAM is&lt;br /&gt;mapped so that the kernel can see every page of memory, but the kernel&lt;br /&gt;pages are marked so you cannot (in user space) "see them". (With root&lt;br /&gt;access and access to /dev/kmem, you can poke around, but that's not normal&lt;br /&gt;behaviour).&lt;br /&gt;&lt;br /&gt;Now, let's consider what happens when a (module) device driver is loaded.&lt;br /&gt;The kernel locates some free memory and loads the image into memory.&lt;br /&gt;The kernel does a lot of housekeeping to link the module into various&lt;br /&gt;lists and expose the /proc, /dev and other entries.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Here it gets interesting...&lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;The driver is loaded into memory - the kernel knows about the memory&lt;br /&gt;before the driver is loaded - it's the physical RAM in your box. 
But the unallocated "free" memory in the kernel may or may not&lt;br /&gt;really be addressable - certainly not by user space, and possibly not&lt;br /&gt;even by the kernel - trying to access free memory would indicate&lt;br /&gt;a rogue pointer or array-out-of-bounds exception. When the kernel&lt;br /&gt;needs a free page, it can ask the kernel allocator for it.&lt;br /&gt;&lt;br /&gt;So, this means, if you were to examine the page table for each process&lt;br /&gt;in the system, these free pages are effectively "not there" and this can&lt;br /&gt;help detect rogue pointers and bugs in the kernel.&lt;br /&gt;&lt;br /&gt;As the driver is loaded, the pages are flipped to "being there" and visible.&lt;br /&gt;E.g. the code for the driver has to be visible to the rest of the kernel,&lt;br /&gt;because you are going to do an open/read/write/close, for example.&lt;br /&gt;&lt;br /&gt;Now, user space doesn't know or care about the physical memory&lt;br /&gt;for a driver. You cannot just blindly execute a subroutine in a driver - you&lt;br /&gt;can only get to it by executing a system call, which takes us from&lt;br /&gt;user space to kernel space, and, once in kernel space, we can see the&lt;br /&gt;code + data for the driver.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;The Bombshell&lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;Now consider a driver which embeds its own interrupt routine. When&lt;br /&gt;an interrupt fires, we normally switch to supervisor mode, and the&lt;br /&gt;page where the interrupt routine resides is visible and executable.&lt;br /&gt;&lt;br /&gt;I have been trying to track down a kernel blow up with dtrace, when&lt;br /&gt;it's loaded one or more times and a page fault fires. (Only observed&lt;br /&gt;in the i386 kernel; I have not seen it in the x64 kernel).&lt;br /&gt;&lt;br /&gt;When the user space fires a page fault, we switch to supervisor mode&lt;br /&gt;and run the page fault handler. 
The first bit of this is in the dtrace&lt;br /&gt;driver. If we decide this is not interesting, we jump to the existing kernel&lt;br /&gt;handler.&lt;br /&gt;&lt;br /&gt;Half way through the kernel handler, it decides to take a context switch.&lt;br /&gt;(I don't know why - maybe it's just being polite, and giving other high&lt;br /&gt;priority tasks a chance to run). As we load the %CR3 register (which points&lt;br /&gt;to the page table directory for the new process), we suddenly lose visibility&lt;br /&gt;of the dtrace driver. It is no longer visible, in the context of that&lt;br /&gt;process, *EVEN FROM SUPERVISOR MODE*.&lt;br /&gt;&lt;br /&gt;That new process which just got the CPU takes a page-fault and *BANG*!&lt;br /&gt;GAME OVER!&lt;br /&gt;&lt;br /&gt;The page fault handler is no longer visible. In fact, trying to take&lt;br /&gt;the interrupt fires a page fault exception, which in turn fires a page&lt;br /&gt;fault exception. The stack overflows and the CPU merrily trundles along&lt;br /&gt;overwriting the entirety of memory until it shoots both its feet off.&lt;br /&gt;(E.g. it starts to overwrite the page table itself or some other important&lt;br /&gt;structure). I strongly suspect that using the *page table* as a *stack*&lt;br /&gt;is what is causing the CPU to triple fault, and VMWare and VirtualBox to&lt;br /&gt;report that an unexpected unrecoverable event has happened and shut down the VM.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Eh? Whaddya say?&lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;The evidence suggests that when a driver is loaded into kernel memory,&lt;br /&gt;ONLY SOME PROCESSES HAVE IT MAPPED INTO THEIR PAGE TABLE.&lt;br /&gt;&lt;br /&gt;I did an experiment: I wrote some kernel code to let me probe&lt;br /&gt;each process on the system, to see if that process could see, in kernel&lt;br /&gt;space, a specific address. I tried a kernel address and that was fine (e.g. sys_open).&lt;br /&gt;I tried the dtrace interrupt (dtrace_page_fault), and it wasn't. 
I&lt;br /&gt;loaded a random other driver, and confirmed the same.&lt;br /&gt;&lt;br /&gt;So, let's revisit. When a driver is loaded into kernel memory - it&lt;br /&gt;is touch and go as to whether the driver should exist in the page&lt;br /&gt;tables of all processes in the system. Loading a driver could cause a lot&lt;br /&gt;of page table updates, as each process's page table would need to be&lt;br /&gt;updated to reflect the mappings. Or, instead the kernel might decide it's&lt;br /&gt;not worth the bother: user space cannot access system space, except via&lt;br /&gt;a trap into the kernel via a syscall or interrupt.&lt;br /&gt;&lt;br /&gt;So, why do half of the procs in the system have the driver mapped and the&lt;br /&gt;others do not?&lt;br /&gt;&lt;br /&gt;Here's my guess: when a driver is loaded into memory at least one&lt;br /&gt;page table needs to be modified. This is a special page table which&lt;br /&gt;belongs to process zero (the swapper process). [A data structure&lt;br /&gt;called the swapper_pg_dir holds the kernel page table]. Under normal&lt;br /&gt;circumstances, every time a new process is created, it is a fork/clone&lt;br /&gt;of an existing process, so that new process gets a copy of the kernel&lt;br /&gt;page tables.&lt;br /&gt;&lt;br /&gt;But loading a driver means we cause a "warp" effect - the kernel gets&lt;br /&gt;the new mappings, but some (or all) of the user procs do not get them.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;The solution&lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;Is this a bug? Is this a misinterpretation by me? 
It feels like a bug.&lt;br /&gt;Maybe the dtrace driver is miscompiled and I haven't put the&lt;br /&gt;interrupt code into the right ELF section (so I will go and check).&lt;br /&gt;&lt;br /&gt;If it's not a compile/declaration problem, then either I need to&lt;br /&gt;update every process's page table to see the driver pages, or find&lt;br /&gt;a way to ensure that a kmalloc()ed page is visible to all processes.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;The evidence&lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;Here's some evidence to support my findings. I invoke a kernel&lt;br /&gt;function to probe three addresses: static kernel function, dtrace page&lt;br /&gt;fault handler, "other" loadable module:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;...&lt;br /&gt;5.568007271 #0 1490:1294 d87fa000&lt;br /&gt;5.568007271 #0 1490:     lookup1: c182009c 1&lt;br /&gt;5.568007271 #0 1490:     lookup2: ebc236d0 1&lt;br /&gt;5.568007271 #0 1490:     lookup3: 00000000 3&lt;br /&gt;5.568007271 #0 1490:1489 d86ef000&lt;br /&gt;5.568007271 #0 1490:     lookup1: c182009c 1&lt;br /&gt;5.568007271 #0 1490:     lookup2: ebc236d0 1&lt;br /&gt;5.568007271 #0 1490:     lookup3: eb0c52b4 1&lt;br /&gt;...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;"1490" is the PID of the process invoking the tracing. The first entry&lt;br /&gt;is for pid 1294. This PID can see the kernel function and the dtrace function,&lt;br /&gt;but not the "other" driver. Pid 1489 can see all three addresses I&lt;br /&gt;specified. There's no real logic to why pid 1294 cannot see the new&lt;br /&gt;driver.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;root      1294     1  0 21:38 ?        00:00:00 /usr/sbin/console-kit-daemon --no-daemon&lt;br /&gt;fox       1489  1289  0 21:38 ?        
00:00:03 sshd: fox@pts/1&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here's the kernel code, in case anyone is interested, which&lt;br /&gt;dumps the output:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;void xx_procs(void)&lt;br /&gt;{       struct task_struct *t;&lt;br /&gt;&lt;br /&gt;        // hack to call a GPL function in the kernel; the hard-coded&lt;br /&gt;        // address is specific to this particular kernel build&lt;br /&gt;        pte_t *(*lookup_address)(unsigned long, unsigned int *) = (void *) 0xc10277f0;&lt;br /&gt;        int     level = -1;&lt;br /&gt;        pte_t   *p;&lt;br /&gt;&lt;br /&gt;        printk("process list:\n");&lt;br /&gt;        p = lookup_address(0xc10277f0, &amp;level);&lt;br /&gt;        printk("  lookup: %p %d\n", p, level);&lt;br /&gt;        for_each_process(t) {&lt;br /&gt;                struct mm_struct *mm;&lt;br /&gt;                struct vm_area_struct *vma;&lt;br /&gt;&lt;br /&gt;                // kernel threads have no mm, so print NULL for their pgd&lt;br /&gt;                printk("%d %p\n", t-&gt;pid, t-&gt;mm ? 
t-&gt;mm-&gt;pgd : NULL);&lt;br /&gt;                if ((mm = t-&gt;mm) == NULL)&lt;br /&gt;                      continue;&lt;br /&gt;                &lt;br /&gt;                // lookup_address1() is the same as the kernels&lt;br /&gt;                // lookup_address() - but private copy to allow a &lt;br /&gt;                // procs mm_struct to be passed so we can probe another&lt;br /&gt;                // processes page table.&lt;br /&gt;                &lt;br /&gt;                // random kernel address&lt;br /&gt;                p = lookup_address1(mm, 0xc10277f0, &amp;level);&lt;br /&gt;                printk("     lookup1: %p %d\n", p, level);&lt;br /&gt;                &lt;br /&gt;                p = lookup_address1(mm, dtrace_page_fault, &amp;level);&lt;br /&gt;                printk("     lookup2: %p %d\n", p, level);&lt;br /&gt;                &lt;br /&gt;                // random "other" module address, gained from /proc/modules&lt;br /&gt;                p = lookup_address1(mm, 0xeecad000, &amp;level);&lt;br /&gt;                printk("     lookup3: %p %d\n", p, level);&lt;br /&gt;        }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.22a-b6154&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-673384051383107137?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/673384051383107137/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/01/naughty-naughty-bug.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/673384051383107137'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/673384051383107137'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/01/naughty-naughty-bug.html' title='Naughty naughty bug.'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-7375791732096108962</id><published>2012-01-15T02:06:00.001-08:00</published><updated>2012-01-15T02:06:59.202-08:00</updated><title type='text'>Dtrace progress on the impossible</title><content type='html'>Since using VMWare and the gdb debugger to reliably debug this issue,&lt;br /&gt;I have an update.&lt;br /&gt;&lt;br /&gt;It turns out that as soon as the page-fault interrupt handler is enabled,&lt;br /&gt;when we take the first page fault interrupt, we pass it over to the kernel&lt;br /&gt;default handler. On return from the kernel, the page where the&lt;br /&gt;dtrace handler is located is no longer mapped in the page tables&lt;br /&gt;(for some reason).&lt;br /&gt;&lt;br /&gt;On the next interrupt, the CPU jumps to a non-existent page, resulting&lt;br /&gt;in a nested page fault interrupt. This continues for a few thousand&lt;br /&gt;iterations, until the kernel stack blasts through something, leading to&lt;br /&gt;a double fault.&lt;br /&gt;&lt;br /&gt;Interesting that the kernel stack contains thousands of copies&lt;br /&gt;of the same data (pushing of the page fault code, CS:IP and flags&lt;br /&gt;registers).&lt;br /&gt;&lt;br /&gt;gdb under VMWare player lets me set hardware breakpoints, so I can&lt;br /&gt;single step the kernel page fault handler. 
I've just had my first&lt;br /&gt;attempt, but unfortunately, maybe because of how long I took, the&lt;br /&gt;page fault handler decided to reschedule a different process to run,&lt;br /&gt;so I lost control.&lt;br /&gt;&lt;br /&gt;It's truly great single stepping whilst gdb is showing me the line of&lt;br /&gt;code we are on.&lt;br /&gt;&lt;br /&gt;Let's see if I can get to what caused the mapping to disappear.&lt;br /&gt;&lt;br /&gt;More later.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.22a-b6154&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-7375791732096108962?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/7375791732096108962/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/01/dtrace-progress-on-impossible.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7375791732096108962'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7375791732096108962'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/01/dtrace-progress-on-impossible.html' title='Dtrace progress on the impossible'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-7351405419684236299</id><published>2012-01-14T10:14:00.001-08:00</published><updated>2012-01-14T10:14:09.581-08:00</updated><title type='text'>VMWare Player .. nice</title><content type='html'>I used to use VMware Server - quite a few years ago. I really liked it.&lt;br /&gt;But I got fed up with the server product lagging kernel development.&lt;br /&gt;New Linux kernels would come out and either I, or someone else, would&lt;br /&gt;have to figure out the compile-time changes for the drivers.&lt;br /&gt;&lt;br /&gt;VMware created the Player product - which could be used for running VMs&lt;br /&gt;but not creating them. I gave up on VMware Workstation.&lt;br /&gt;&lt;br /&gt;That was a few years ago.&lt;br /&gt;&lt;br /&gt;Today, I installed VMware Player 4.0 - and it works as expected. It's nice&lt;br /&gt;to come to it after a few years of non-use. It seems highly reliable and&lt;br /&gt;works.&lt;br /&gt;&lt;br /&gt;So, I transferred my Ubuntu 11.10 i386 VM from VirtualBox to VMware Player.&lt;br /&gt;A bit of googling showed me how to convert the disk image.&lt;br /&gt;&lt;br /&gt;The VM came up fine. I had a few issues with the network card. My&lt;br /&gt;kernel didn't have the driver - I had disabled as much as possible when&lt;br /&gt;creating a custom kernel, so spent a little while trying to find the&lt;br /&gt;PCNET32 driver. That resolved, I could remote login.&lt;br /&gt;&lt;br /&gt;So - now time to try dtrace: pretty much the same results as VirtualBox.&lt;br /&gt;Some small difference - occasionally when VB would hit the double fault&lt;br /&gt;issue, the screen would frantically scroll for 5-10s until the VM gave&lt;br /&gt;up and hung (hard). 
Doing the same on Player - it will scroll continuously.&lt;br /&gt;There appears to be an emulation difference, which, by the looks of&lt;br /&gt;it, is that VB isn't playing as well in the face of double faults.&lt;br /&gt;&lt;br /&gt;All this would be boring. But googling for 'vmware debugger' led me to&lt;br /&gt;a page which shows how to enable local gdb debugging of the VM guest,&lt;br /&gt;using TCP rather than the silly serial port emulation of classic kgdb&lt;br /&gt;setups in the kernel.&lt;br /&gt;&lt;br /&gt;So, I tried it. The first thing to note is: it just worked. gdb got&lt;br /&gt;a breakpoint in the native_safe_halt function in the kernel. What's&lt;br /&gt;interesting is that when a breakpoint is hit, you get a big "Pause" bitmap&lt;br /&gt;slap in the middle of the video console. If you hit ^C in gdb to regain&lt;br /&gt;control, or we hit a panic, the pause bitmap becomes visible and you know &lt;br /&gt;you have stopped the VM.&lt;br /&gt;&lt;br /&gt;The gdb debugger seems more resilient when debugging the double fault.&lt;br /&gt;&lt;br /&gt;Here's a fragment of the stack trace from gdb:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;.....&lt;br /&gt;#253 0xc1004adb in show_registers (regs=0xc1732a6c)&lt;br /&gt;    at arch/x86/kernel/dumpstack_32.c:106&lt;br /&gt;#254 0xc1460113 in __die (str=0xc15865f6 "general protection fault",&lt;br /&gt;    regs=0xc1732a6c, err=0) at arch/x86/kernel/dumpstack.c:275&lt;br /&gt;#255 0xc10058c2 in die (str=0xc15865f6 "general protection fault",&lt;br /&gt;    regs=0xc1732a6c, err=0) at arch/x86/kernel/dumpstack.c:308&lt;br /&gt;#256 0xc145f836 in do_general_protection (regs=0xc1732a6c, error_code=0)&lt;br /&gt;    at arch/x86/kernel/traps.c:402&lt;br /&gt;#257 &amp;lt;signal handler called&amp;gt;&lt;br /&gt;#258 0xc1048c04 in vprintk (&lt;br /&gt;    fmt=0xc158f150 "&amp;lt;0&amp;gt;PANIC: double fault, gdt at %08lx [%d bytes]\n",&lt;br /&gt;    args=0xc1732b38 "") at kernel/printk.c:827&lt;br /&gt;#259 0xc1456956 in printk 
(&lt;br /&gt;    fmt=0xc158f150 "&amp;lt;0&amp;gt;PANIC: double fault, gdt at %08lx [%d bytes]\n")&lt;br /&gt;    at kernel/printk.c:750&lt;br /&gt;#260 0xc1021bee in doublefault_fn () at arch/x86/kernel/doublefault_32.c:26&lt;br /&gt;#261 0x00000000 in ?? ()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;So - it's good that I am seeing the same thing with VMware, and I have&lt;br /&gt;a new "debug" route to try and diagnose.&lt;br /&gt;&lt;br /&gt;Just need to figure out who is at fault. It has to be "me". I hope it's not&lt;br /&gt;the virtual emulation (VB or Player). I hope it's not a CPU bug. I hope&lt;br /&gt;it's not the Linux kernel.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.22a-b6154&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-7351405419684236299?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/7351405419684236299/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/01/vmware-player-nice.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7351405419684236299'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7351405419684236299'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/01/vmware-player-nice.html' title='VMWare Player .. 
nice'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-7323655808419217307</id><published>2012-01-14T02:19:00.001-08:00</published><updated>2012-01-14T02:19:50.955-08:00</updated><title type='text'>The 'Great Bug' of 2011/2012</title><content type='html'>In my recent blog postings, I wrote about the "most difficult bug in the&lt;br /&gt;world" to resolve. On i386, loading dtrace and patching the page fault&lt;br /&gt;interrupt vector would panic/hang/double fault the kernel.&lt;br /&gt;&lt;br /&gt;I have spent the last few weeks trying absolutely everything conceivable,&lt;br /&gt;to no avail. When faced with such a bug - one has to try everything, but,&lt;br /&gt;in the corner of your mind, you know that eventually, it may not&lt;br /&gt;be the place you are looking at - which is why it can be so elusive.&lt;br /&gt;&lt;br /&gt;Let's recall the issue: when loading the driver, it works - the driver&lt;br /&gt;loads and full functionality is available. If we remove the driver, the&lt;br /&gt;system still works. If we reload the driver again, we get panics, where&lt;br /&gt;even the kernel stack dumper panics the kernel. Evidence shows corrupt&lt;br /&gt;page tables or interrupt stacks, and, using the VirtualBox debugger, we&lt;br /&gt;can probe (a little) to see what is happening in the VM.&lt;br /&gt;&lt;br /&gt;Removing the providers in dtrace, removing calls to activate timers and&lt;br /&gt;removing just about anything that does real work, shows that the problem&lt;br /&gt;remains. The interrupt patching code was modified to be more careful&lt;br /&gt;about how it does the job. 
The key difference between "it works very&lt;br /&gt;well" and "it reliably crashes" is patching the page-fault vector (0x0E).&lt;br /&gt;&lt;br /&gt;Even if the page-fault handler in dtrace is modified to be a jump to&lt;br /&gt;the original vector - no touching of registers, the problem persists.&lt;br /&gt;&lt;br /&gt;I have modified dtrace to avoid touching the page fault vector, and&lt;br /&gt;instead, allow me to on-demand update the vector to the new or&lt;br /&gt;old interrupt location with an "echo" statement to the /proc/dtrace/trace&lt;br /&gt;device.&lt;br /&gt;&lt;br /&gt;This is useful - because it helps to isolate all the things that happen&lt;br /&gt;when a driver is loaded, from the fault at hand.&lt;br /&gt;&lt;br /&gt;Still: it's erratic. I have times where I can reload the driver and&lt;br /&gt;patch the vector zillions of times, and others where, on the 2nd time,&lt;br /&gt;we go bang.&lt;br /&gt;&lt;br /&gt;I've been studying in more detail the TSS register in the CPU and better&lt;br /&gt;understanding how Linux and x86 in general handle nested interrupts&lt;br /&gt;and multiple interrupt stacks. I have been using kgdb and the&lt;br /&gt;kdebug debugger in the kernel, along with the VirtualBox debugger.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Working Backwards&lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;Nothing works deterministically: it either works brilliantly or fails&lt;br /&gt;dismally.&lt;br /&gt;&lt;br /&gt;Let's take a trip to a different place: let's work backwards.&lt;br /&gt;A "double-fault" interrupt is raised when the CPU hits a second&lt;br /&gt;exception (invalid segment selector, invalid address, page-fault, etc)&lt;br /&gt;while trying to invoke the handler for an earlier one. So how can this&lt;br /&gt;happen? Well, a likely scenario&lt;br /&gt;is that the segment registers (%GS, %FS, %DS, %ES) are wrong.&lt;br /&gt;&lt;br /&gt;There are really two scenarios for a page fault: it either happens in&lt;br /&gt;user space or kernel space. 
A user page fault might happen because&lt;br /&gt;a reference to an mmapped area has yet to page-fault-in the page just&lt;br /&gt;touched. Another case for user page faults is the stack - as the level&lt;br /&gt;of nesting in an application increases, lower pages in the stack may&lt;br /&gt;need to be allocated/mapped.&lt;br /&gt;&lt;br /&gt;User page faults are typically *rare*, especially on a small system&lt;br /&gt;because the working set can be mapped into the address space on startup.&lt;br /&gt;&lt;br /&gt;Kernel faults are most common (are they?!), e.g. read() into a large&lt;br /&gt;buffer which has been mmap()ed could cause this.&lt;br /&gt;&lt;br /&gt;I may look at which faults really are most common. (Nothing in the kernel&lt;br /&gt;tracks the types of faults [I think]). The difference is the segment&lt;br /&gt;registers: when a user app faults, the DS/ES/CS registers will point to user&lt;br /&gt;space, so the interrupt routine needs to modify these to point to the&lt;br /&gt;kernel address space. (Otherwise, things like referring to a .data or .bss&lt;br /&gt;object, in C, will generate a fault). If we trap from the kernel, then these&lt;br /&gt;registers already hold the correct values.&lt;br /&gt;&lt;br /&gt;[There's more complication here - the Linux interrupt vectors handle&lt;br /&gt;nested interrupts, double faults, NMI and other things, but let's keep it&lt;br /&gt;simple for now]&lt;br /&gt;&lt;br /&gt;Now, the dtrace page fault handler keeps a count of how many interrupts&lt;br /&gt;it sees (/proc/dtrace/stats). So, when it works, we can see it working&lt;br /&gt;reliably. The figures are actually lower than I expect, but that may be&lt;br /&gt;the kernel doing a good job of typically preloading new processes to&lt;br /&gt;minimize page faults, so I actually have to work hard to generate a high&lt;br /&gt;degree of page faults.&lt;br /&gt;&lt;br /&gt;So, when we have a double fault - we know an interrupt routine had&lt;br /&gt;a problem. 
The trouble is, in this case, we don't know which interrupt&lt;br /&gt;routine caused the original violation, because the double-fault handler&lt;br /&gt;generates a new fault. (I think the i386 kernel code is broken - it&lt;br /&gt;is walking a stack and the CPU registers are not consistent; I see streams&lt;br /&gt;of stack dumps where each stack dump is causing another, nested stack&lt;br /&gt;dump, and it all scrolls off the screen. I have used the VirtualBox debugger&lt;br /&gt;to see the true stack, and this is only partially helpful).&lt;br /&gt;&lt;br /&gt;When a nested fault occurs, the CPU will switch from the offending stack&lt;br /&gt;to a new stack, set up just for this purpose. This is a brilliant feature&lt;br /&gt;but it's causing me a problem as I am having trouble finding the original&lt;br /&gt;offending stack.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Let's patch the kernel&lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;Ok, so we know something is wrong. Maybe we can detect the issue before&lt;br /&gt;it happens and get a deterministic panic. I modified the code in the&lt;br /&gt;kernel - just before we dispatch to a new process, to validate the&lt;br /&gt;page-table, and also, to monitor the low water mark of the kernel&lt;br /&gt;stack. Suggestions on Google are that the symptoms I am seeing are&lt;br /&gt;due to stack overflow. Neither of the mods I have made has helped.&lt;br /&gt;Typically, the kernel will use about 2K of the 4K stack space available&lt;br /&gt;to it - it rarely gets close to 3K, so I don't believe we are overflowing&lt;br /&gt;the stack and corrupting key data structures.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Let's give up?&lt;/h2&gt;&lt;br /&gt;I refuse to give up on this. I am mentally walking through kernel code and&lt;br /&gt;scenarios, trying to conjure up the "it doesn't happen often" case, to detect&lt;br /&gt;what can be happening. 
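&lt;br /&gt;&lt;br /&gt;Going back to the stack low-water-mark check from the previous section:&lt;br /&gt;the idea is to fill the stack with a known byte when it is created, and later&lt;br /&gt;scan up from the lowest address to find how deep execution ever got. What&lt;br /&gt;follows is only a user-space model in Python, not the kernel patch itself: the&lt;br /&gt;poison value 0xA5 and the function names are made up for illustration, and&lt;br /&gt;only the 4K stack size comes from the text above.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
```python
FILL = 0xA5          # hypothetical poison byte written when the stack is created
STACK_SIZE = 4096    # the 4K kernel stack mentioned above


def new_stack():
    """Return a stack buffer pre-filled with the poison byte."""
    return bytearray([FILL]) * STACK_SIZE


def use_stack(stack, depth):
    """Simulate code that dirtied `depth` bytes. x86 stacks grow downwards,
    so usage starts at the highest address (the end of the buffer)."""
    for i in range(STACK_SIZE - depth, STACK_SIZE):
        stack[i] = 0x00


def max_stack_used(stack):
    """Bytes ever used: scan up from the lowest address to the first byte
    that no longer holds the poison value."""
    for offset, byte in enumerate(stack):
        if byte != FILL:
            return STACK_SIZE - offset
    return 0


stack = new_stack()
use_stack(stack, 2048)          # roughly the 2K typical usage
print(max_stack_used(stack))
```
&lt;/pre&gt;&lt;br /&gt;One known weakness of this trick: if the data written at the deepest point&lt;br /&gt;happens to equal the poison byte, the scan under-reports the usage slightly.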
&lt;br /&gt;&lt;br /&gt;Today, I was wondering whether giving the VM a nice round amount&lt;br /&gt;of memory, or an odd amount, could have an impact. I typically give&lt;br /&gt;my VM about 730MB of memory. I tried giving it exactly 256MB and it worked&lt;br /&gt;perfectly! Brilliant! Rebooted and tried again at 256MB, and it failed.&lt;br /&gt;&lt;br /&gt;There's almost a flavor to the underlying problem. Often, I am getting a scenario&lt;br /&gt;where it works for extended periods of time: I cannot crash the machine.&lt;br /&gt;Other times, it crashes exactly on the second load (sometimes, even the first&lt;br /&gt;load, although this is rare).&lt;br /&gt;&lt;br /&gt;It's almost as if the problem is to do with exactly what is allocated&lt;br /&gt;in memory. I tried a test by filling memory with a large file (full of&lt;br /&gt;zeros), and checking to see if the file was mutating. (Maybe the interrupt&lt;br /&gt;routine was firing with incorrect DS/ES registers and attempts to increment&lt;br /&gt;a counter were patching a random page in memory - that would exactly&lt;br /&gt;explain the kind of problem I am seeing).&lt;br /&gt;&lt;br /&gt;So, so far: nothing. No deterministic testing is locating the root cause&lt;br /&gt;of this fault. (I am also wondering if VirtualBox is broken - I have&lt;br /&gt;seen the many bug reports and complaints about VB on the web, but I have&lt;br /&gt;no evidence that the bugs are my problem. 
I must get qemu or VMware up and&lt;br /&gt;running to do side-by-side comparisons).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.22a-b6154&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-7323655808419217307?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/7323655808419217307/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/01/bug-of-20112012.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7323655808419217307'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7323655808419217307'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/01/bug-of-20112012.html' title='The &amp;#39;Great Bug&amp;#39; of 2011/2012'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-420416822278230683</id><published>2012-01-08T13:47:00.001-08:00</published><updated>2012-01-08T13:47:06.691-08:00</updated><title type='text'>More on the impossible</title><content type='html'>I wrote last time about the worst bug to try and diagnose, namely&lt;br /&gt;one where we lose the page table or GDT or both. 
Doing so can result&lt;br /&gt;in a double or triple fault, and no way to figure out where you came from.&lt;br /&gt;&lt;br /&gt;In user space, a jump to a virtual method (which is in essence a call&lt;br /&gt;to a function via a level of indirection), can mean the PC is set to zero,&lt;br /&gt;and, is similar, in that it is annoying to not know where you came from.&lt;br /&gt;But the fact the program counter is zero tells you that you went through&lt;br /&gt;a null pointer.&lt;br /&gt;&lt;br /&gt;So, where am I with solving the "impossible"? Not much further forward.&lt;br /&gt;I have been using the VirtualBox debugger - but it is very broken and&lt;br /&gt;pathetic. You cannot set breakpoints or hardware breakpoints if&lt;br /&gt;using the VT-x/AMD-V virtualisation acceleration. If you turn these&lt;br /&gt;off, you can, except the semantics of breakpoints break the guest&lt;br /&gt;operating system. In addition, writing to guest CPU registers is not&lt;br /&gt;implemented.&lt;br /&gt;&lt;br /&gt;I took a quick look at qemu, but I didn't like what I found -- I prefer&lt;br /&gt;a CLI in general, but for VM guests, I prefer a GUI to get me comfortable.&lt;br /&gt;The GUIs on Linux are very ugly and amateurish which didn't instill confidence&lt;br /&gt;in me. (I know, this is unfair of me). I may try again in the future.&lt;br /&gt;&lt;br /&gt;I went back to kgdb - at least this let me set breakpoints and hardware&lt;br /&gt;breakpoints, which is useful. But the process of using kgdb is very&lt;br /&gt;clunky - the guest and remote debugger get out of sync on the comms protocol.&lt;br /&gt;In any case, hitting the bug I am interested in didn't help much, with kgdb.&lt;br /&gt;I could break in the doublefault_fn function, but I couldn't really&lt;br /&gt;figure out where we had come from.&lt;br /&gt;&lt;br /&gt;I modified dtrace to allow access to the GDT and IDT via&lt;br /&gt;/proc/dtrace/gdt and /proc/dtrace/idt. 
(Not really needed, but useful&lt;br /&gt;for validating that these data structures are correct).&lt;br /&gt;&lt;br /&gt;What I am finding is that on a double fault, there is a suggestion&lt;br /&gt;that the original offending kernel stack for a process has been set to&lt;br /&gt;all zeroes. When the kernel tries to dereference an argument on the&lt;br /&gt;stack, or return from the offending function, it generates a GPF, which&lt;br /&gt;in turn generates a double-fault. (I'm not totally sure of this - a GPF&lt;br /&gt;wouldn't normally generate a double-fault, unless the GDT, page table&lt;br /&gt;or IDT were screwed up).&lt;br /&gt;&lt;br /&gt;Let's just revisit what I am doing: having cut down dtrace to a minimalist&lt;br /&gt;shell, we can override entry IDT[14], which is the page-fault vector&lt;br /&gt;entry. If we put in the actual value which is there already, everything is&lt;br /&gt;fine.&lt;br /&gt;&lt;br /&gt;If we modify the entry to point to our interrupt routine, and make&lt;br /&gt;our interrupt routine simply jump to the original kernel routine, at&lt;br /&gt;some time after this change (could be instantaneous to a minute later),&lt;br /&gt;we crash the kernel. It feels like a few pages of the kernel got overwritten,&lt;br /&gt;e.g. memset(random-ptr, 0, PAGE_SIZE). But tracking this down is &lt;br /&gt;nearly impossible.&lt;br /&gt;&lt;br /&gt;I have been adding debug code to the kernel source to try and do extra&lt;br /&gt;validation (eg in the scheduler, just before the context switch occurs),&lt;br /&gt;but this hasn't proven fruitful so far.&lt;br /&gt;&lt;br /&gt;It's almost like looking for a root kit in the kernel - I almost wonder&lt;br /&gt;if the kernel has some tamper-resistance code in there (it does, but not like&lt;br /&gt;this).&lt;br /&gt;&lt;br /&gt;I need to somehow checksum the entirety of RAM and look for something &lt;br /&gt;unexpected to happen, but doing this isn't viable. 
RAM and processes are&lt;br /&gt;changing all the time. Process creation complicates things - every fork()&lt;br /&gt;generates a new process with a new kernel stack. I need to keep walking &lt;br /&gt;every process's kernel stack to detect corruption, before we switch to the&lt;br /&gt;process.&lt;br /&gt;&lt;br /&gt;I am running on a single-CPU guest, to avoid the complexity of multi-CPU&lt;br /&gt;operations. What I cannot determine is if something is being corrupted by&lt;br /&gt;virtue of writing to the IDT, or a long time after.&lt;br /&gt;&lt;br /&gt;Alas, google searching hasn't been helpful - the symptom and problem are&lt;br /&gt;very unusual (I am not writing a rootkit, although dtrace looks an awful&lt;br /&gt;lot like a rootkit in terms of what it does), and I am not booting up a&lt;br /&gt;new operating system. Nobody describes the scenario of modifying an&lt;br /&gt;in-use IDT and the things that can go wrong. (I did find two links,&lt;br /&gt;quoted a couple of posts ago).&lt;br /&gt;&lt;br /&gt;Next is to try disabling dtrace's timer code - maybe that is causing&lt;br /&gt;non-deterministic behavior.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.22a-b6150&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-420416822278230683?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/420416822278230683/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/01/more-on-impossible.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/420416822278230683'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/8336326562741944626/posts/default/420416822278230683'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/01/more-on-impossible.html' title='More on the impossible'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-3990307281894415097</id><published>2012-01-04T14:22:00.001-08:00</published><updated>2012-01-04T14:22:00.154-08:00</updated><title type='text'>Debugging with VirtualBox</title><content type='html'>Earlier, I wrote about the worst type of bug in the world - &lt;br /&gt;one where we smash the internal CPU registers so badly, that nothing&lt;br /&gt;recovers - no interrupts, no double/triple faults.&lt;br /&gt;&lt;br /&gt;I've been experimenting with the VirtualBox debugger, and it's very&lt;br /&gt;nice, albeit a little basic. 
Anyone interested in playing with this will&lt;br /&gt;need to read the manual.&lt;br /&gt;&lt;br /&gt;But here's an illustration of a CPU-smashing bug.&lt;br /&gt;&lt;br /&gt;See&lt;br /&gt;&lt;br /&gt;https://www.virtualbox.org/manual/ch08.html#vboxmanage-debugvm&lt;br /&gt;&lt;br /&gt;If I run the following command, I can get a complete dump&lt;br /&gt;of all registers in the CPU in the VM guest:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ VBoxManage debugvm  Ubuntu-11.10-i386 getregisters all | tee /tmp/reg&lt;br /&gt;cpu0.rax               = 0x0000000000000000&lt;br /&gt;cpu0.rcx               = 0x0000000000000000&lt;br /&gt;cpu0.rdx               = 0x0000000000000000&lt;br /&gt;cpu0.rbx               = 0x00000000c1644000&lt;br /&gt;cpu0.rsp               = 0x00000000c1645f80&lt;br /&gt;cpu0.rbp               = 0x00000000c1645f98&lt;br /&gt;cpu0.rsi               = 0x00000000c1698fb8&lt;br /&gt;cpu0.rdi               = 0x000000004fcb43de&lt;br /&gt;cpu0.r8                = 0x0000000000000000&lt;br /&gt;...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now, I save this to a file, and then cause the guest to crash. We&lt;br /&gt;dump the registers again and now we can diff the results. We expect&lt;br /&gt;to see lots of differences, but here are some of the key elements:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;33,36c33,36&lt;br /&gt;&amp;lt; cpu0.gs                = 0x00e0&lt;br /&gt;&amp;lt; cpu0.gs_attr           = 0x00004091&lt;br /&gt;&amp;lt; cpu0.gs_base           = 0x00000000ecc05c00&lt;br /&gt;&amp;lt; cpu0.gs_lim            = 0x00000018&lt;br /&gt;---&lt;br /&gt;&gt; cpu0.gs                = 0x0000&lt;br /&gt;&gt; cpu0.gs_attr           = 0x00010000&lt;br /&gt;&gt; cpu0.gs_base           = 0x0000000000000000&lt;br /&gt;&gt; cpu0.gs_lim            = 0x00000000&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Note the GS register is smashed in the diff. 
There's no base address for&lt;br /&gt;the segment definitions, so any code trying to use GS will cause a &lt;br /&gt;double/triple fault. That's not good for the kernel.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;98,99c98,99&lt;br /&gt;&amp;lt; cpu0.cr2               = 0x00000000b78a0000&lt;br /&gt;&amp;lt; cpu0.cr3               = 0x000000002a945000&lt;br /&gt;---&lt;br /&gt;&gt; cpu0.cr2               = 0x00000000c1647040&lt;br /&gt;&gt; cpu0.cr3               = 0x0000000001748000&lt;br /&gt;114c114&lt;br /&gt;&amp;lt; cpu0.tsc               = 0x89fd0226&lt;br /&gt;---&lt;br /&gt;&gt; cpu0.tsc               = 0x02307c70&lt;br /&gt;119c119&lt;br /&gt;&amp;lt; cpu0.msr_gs_base       = 0x00000000ecc05c00&lt;br /&gt;---&lt;br /&gt;&gt; cpu0.msr_gs_base       = 0x0000000000000000&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Register CR3 is the page table base address. In the crashed machine, CR3&lt;br /&gt;looks "wrong". And the interactive VirtualBox debugger won't get very far with this&lt;br /&gt;wrong value as it needs the page tables to map virtual addresses&lt;br /&gt;to physical ones.&lt;br /&gt;&lt;br /&gt;Likewise, the msr_gs_base (an internal register which holds&lt;br /&gt;the place where the GS register is taken from, on a kernel switch) seems&lt;br /&gt;corrupt.&lt;br /&gt;&lt;br /&gt;This is why my guest is a smashed VM.&lt;br /&gt;&lt;br /&gt;But, alas, I don't know what's causing this. 
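&lt;br /&gt;&lt;br /&gt;Rather than eyeballing the whole diff, the comparison can be narrowed to&lt;br /&gt;the registers that matter. What follows is only a minimal sketch in Python,&lt;br /&gt;not part of dtrace or VirtualBox: the helper names parse_dump() and changed()&lt;br /&gt;are made up for illustration, and it assumes each dump line looks like&lt;br /&gt;"name = 0xvalue", as in the getregisters output above. The sample values are&lt;br /&gt;the ones from the diff in this post.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
```python
def parse_dump(text):
    """Turn 'name = 0xvalue' lines into a {name: int} dict."""
    regs = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a register line (e.g. the "..." elision)
        try:
            regs[name.strip()] = int(value.strip(), 16)
        except ValueError:
            pass  # value is not a plain hex number; skip it
    return regs


def changed(before, after, watch):
    """Return {name: (old, new)} for watched registers whose value differs."""
    return {
        name: (before.get(name), after.get(name))
        for name in watch
        if before.get(name) != after.get(name)
    }


# Values copied from the dumps taken before and after the crash.
healthy = parse_dump("""
cpu0.gs                = 0x00e0
cpu0.cr3               = 0x000000002a945000
cpu0.msr_gs_base       = 0x00000000ecc05c00
""")
crashed = parse_dump("""
cpu0.gs                = 0x0000
cpu0.cr3               = 0x0000000001748000
cpu0.msr_gs_base       = 0x0000000000000000
""")

WATCH = ["cpu0.gs", "cpu0.cr3", "cpu0.msr_gs_base"]
for name, (old, new) in changed(healthy, crashed, WATCH).items():
    print(f"{name}: {old:#x} now {new:#x}")
```
&lt;/pre&gt;&lt;br /&gt;With real dumps, the two strings would simply be the contents of the saved&lt;br /&gt;files, e.g. /tmp/reg and a second dump taken after the crash.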
&lt;br /&gt;&lt;br /&gt;Still investigating....&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.21a-b6145&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-3990307281894415097?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/3990307281894415097/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/01/debugging-with-virtualbox.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/3990307281894415097'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/3990307281894415097'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/01/debugging-with-virtualbox.html' title='Debugging with VirtualBox'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-6386117423454659084</id><published>2012-01-04T13:05:00.001-08:00</published><updated>2012-01-04T13:05:21.025-08:00</updated><title type='text'>Is that the worst you can do ?</title><content type='html'>What's the worst thing you can do to a CPU whilst executing code?&lt;br /&gt;&lt;br /&gt;How about a buffer overflow .. 
overwriting beyond the end of a buffer.&lt;br /&gt;Very soon a segmentation violation (or GPF) will happen, and the application&lt;br /&gt;will terminate, or try to recover.&lt;br /&gt;&lt;br /&gt;How about inside the kernel? Well, pretty much the same thing.&lt;br /&gt;&lt;br /&gt;The x86 architecture is well thought out. When some form of memory&lt;br /&gt;access goes awry, an interrupt is generated (technically a 'fault' or 'trap'),&lt;br /&gt;and the kernel will attempt to recover from this.&lt;br /&gt;&lt;br /&gt;The act of taking an interrupt involves pushing the current program counter&lt;br /&gt;on the stack, and jumping to a predefined location.&lt;br /&gt;&lt;br /&gt;Great. So - whether a GPF occurs in user space or kernel space, something&lt;br /&gt;will happen. This is either recoverable, or a panic/blue-screen can&lt;br /&gt;happen if the kernel doesn't know what to do.&lt;br /&gt;&lt;br /&gt;The predefined location is set up in a table called the IDT (Interrupt&lt;br /&gt;Descriptor Table). &lt;br /&gt;&lt;br /&gt;If the interrupt to handle a fault takes a fault itself, the system&lt;br /&gt;will generate a double-fault. Double-faults are very rare. (Page faults, by&lt;br /&gt;contrast, are very common, and occur under normal circumstances with memory&lt;br /&gt;mapped/anonymous memory, as pages are faulted into existence).&lt;br /&gt;&lt;br /&gt;A double fault typically indicates a flaw in a driver and can&lt;br /&gt;be caused by using an invalid pointer or a stack exception in an &lt;br /&gt;existing interrupt.&lt;br /&gt;&lt;br /&gt;A triple fault is what can happen if a double fault generates an &lt;br /&gt;exception. This would indicate the double-fault handling code hit&lt;br /&gt;an unexpected condition. On the Intel/AMD architectures, a triple fault&lt;br /&gt;will typically reset and reboot the CPU.&lt;br /&gt;&lt;br /&gt;Normally, the kernel and CPU operate together on some very key data&lt;br /&gt;structures. We mentioned the IDT, above. 
There's also the GDT - which describes&lt;br /&gt;how segments of memory map to real memory. And then there's the LDT - which&lt;br /&gt;is a per-process view of memory. Corrupting any of these can&lt;br /&gt;lead to double/triple fault behavior.&lt;br /&gt;&lt;br /&gt;But there's another data structure: the page directory and its page tables. If the&lt;br /&gt;page table is corrupted then all bets are off. The page table&lt;br /&gt;indicates which blocks of memory are present/not-present in the system and&lt;br /&gt;is the mechanism for virtual memory support. If a process's page table were&lt;br /&gt;corrupt, then the application would generate a page fault and the&lt;br /&gt;kernel would quickly shut down the offending process.&lt;br /&gt;&lt;br /&gt;But what if the kernel version of the page table were corrupt? On an&lt;br /&gt;interrupt, the CPU wouldn't be able to access the code to execute the&lt;br /&gt;interrupt handler, which in turn would lead to a double fault, and thence&lt;br /&gt;to a triple fault.&lt;br /&gt;&lt;br /&gt;All of this is well documented on the web.&lt;br /&gt;&lt;br /&gt;But I am having a hard time with dtrace on i386 architectures. After&lt;br /&gt;loading dtrace, and then removing it from the system, on a subsequent&lt;br /&gt;reload of the driver, the system crashes/hangs. Most of the time there&lt;br /&gt;is no output on the console; when there is output on the console,&lt;br /&gt;it's confused and corrupted. This suggests that one of the&lt;br /&gt;key data structures in the kernel is corrupt (IDT, page tables or GDT).&lt;br /&gt;&lt;br /&gt;And, because of this, it is nearly impossible to debug. 
Nothing in the kernel&lt;br /&gt;can help debug this scenario - we cannot print or signal what has happened&lt;br /&gt;or where we were prior to the crash.&lt;br /&gt;&lt;br /&gt;At the moment I am using the VirtualBox debugger to poke around after&lt;br /&gt;a crash, but the debugger won't let me examine memory, precisely because the&lt;br /&gt;page table is corrupt (or the CR3 register is corrupt, but I cannot&lt;br /&gt;tell the difference; CR3 is the register which points to the start&lt;br /&gt;of the page table).&lt;br /&gt;&lt;br /&gt;So, this is the worst kind of bug to resolve - no kernel debugger, printk statement,&lt;br /&gt;or anything else in the kernel will help find the cause of the strange hang.&lt;br /&gt;(Strangely, this problem does not exist in the 64b kernel).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.21a-b6145&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-6386117423454659084?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/6386117423454659084/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2012/01/is-that-worst-you-can-do.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6386117423454659084'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6386117423454659084'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2012/01/is-that-worst-you-can-do.html' title='Is that the worst you can do ?'/><author><name>Paul 
Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-4753842773900291056</id><published>2011-12-31T09:05:00.001-08:00</published><updated>2011-12-31T09:05:17.314-08:00</updated><title type='text'>Happy New Year Dtrace...You ruined my Christmas</title><content type='html'>I have just spent the last few days tracking down a strange&lt;br /&gt;issue. Having fixed up dtrace to work on Ubuntu 11.10/i386, I found&lt;br /&gt;that reloading the driver would hang/crash the kernel.&lt;br /&gt;&lt;br /&gt;This was related to the page-fault interrupt vector. If dtrace's patch to&lt;br /&gt;that vector was disabled, then all was well.&lt;br /&gt;&lt;br /&gt;Strange. I don't recall this error before....&lt;br /&gt;&lt;br /&gt;Although most recent work has been on the 64b version of dtrace,&lt;br /&gt;I had assumed the 32b version was in sync and all was well. But not so.&lt;br /&gt;&lt;br /&gt;It's an interesting trail....I thought the driver reload/reload/...&lt;br /&gt;cycle was fixed. It works well on the 64b kernel, but not on the 32b one.&lt;br /&gt;&lt;br /&gt;After a lot of searching, and narrowing the problem down to the&lt;br /&gt;page fault interrupt vector, I checked on my rock-solid Ubuntu 8.04&lt;br /&gt;(2.6.28 kernel). And hit the same problem: namely a double (or even&lt;br /&gt;single) driver reload would hang the system.&lt;br /&gt;&lt;br /&gt;I spent a lot of time on the Ubuntu 11.10 kernel - the driver&lt;br /&gt;would hang the kernel on the first load after bootup. Eventually, while&lt;br /&gt;tinkering with GRUB and turning off the splash screen, I got&lt;br /&gt;to a point where the first load would work and the 2nd would hang.&lt;br /&gt;&lt;br /&gt;Prior to this point - I had no way to debug the code. 
Any attempt&lt;br /&gt;to leave the page fault vector modification in place would hang the&lt;br /&gt;kernel .. or cause a panic in printk(). I eventually considered this&lt;br /&gt;to be a problem where the segment registers were not set up properly.&lt;br /&gt;(The kernel uses the segment registers to access kernel data and&lt;br /&gt;per-cpu items, so, if these are incorrect on a page fault, you aren't&lt;br /&gt;going very far). I even cut/pasted the existing kernel assembler code&lt;br /&gt;for page_fault, but had a lot of problems getting something repeatable.&lt;br /&gt;&lt;br /&gt;Whilst investigating this, I had to do a lot of "thought experiments": what&lt;br /&gt;was the CPU up to? Why was it having a hard time?&lt;br /&gt;&lt;br /&gt;Well, I realised a number of things:&lt;br /&gt;&lt;br /&gt;On an SMP system, each CPU has its own IDT register - set to the same&lt;br /&gt;location in memory. We might patch the interrupt vector table, but&lt;br /&gt;there was no guarantee the other CPUs would see these changes atomically.&lt;br /&gt;In addition, CPU caching might cause the other CPUs to see the old&lt;br /&gt;values of these interrupt vectors, until enough time had passed to&lt;br /&gt;force cache line flushing.&lt;br /&gt;&lt;br /&gt;Bear in mind, we are patching vectors in a table, so the CPU may not&lt;br /&gt;know we have done this. For all we know, the CPU may have cached the&lt;br /&gt;page_fault vector and may not notice our changes. Ditto for the other&lt;br /&gt;CPUs.&lt;br /&gt;&lt;br /&gt;So, Google to the rescue. 
After a short while, I found these two&lt;br /&gt;links:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;  &lt;li&gt;http://stackoverflow.com/questions/2497919/changing-the-interrupt-descriptor-table&lt;/li&gt;&lt;br /&gt;  &lt;li&gt;http://www.codeproject.com/KB/system/soviet_direct_hooking.aspx&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;The second link hinted at my problem: if you randomly change the interrupt&lt;br /&gt;vector table, then expect problems. The codeproject link didn't offer&lt;br /&gt;an explanation, but hinted that the way forward was to create a temporary IDT,&lt;br /&gt;copy the old table into it, switch the CPUs over to the copy, make the updates&lt;br /&gt;to the original, then switch back.&lt;br /&gt;&lt;br /&gt;The first link confirmed this. (Interestingly, the second link is for&lt;br /&gt;info on a Windows kernel, but the first link echoed the same sentiment&lt;br /&gt;on Linux).&lt;br /&gt;&lt;br /&gt;After a lot of fine-tuning the code and cleaning up, it now works !&lt;br /&gt;&lt;br /&gt;The trick is to switch the CPUs away whilst updating the vector&lt;br /&gt;table, and switch back when the updates are done. Also, to do&lt;br /&gt;the same during the driver teardown code, so we can load/unload&lt;br /&gt;repeatedly.&lt;br /&gt;&lt;br /&gt;On the way, I added a /proc/dtrace/idt driver so it's easier to&lt;br /&gt;see the raw interrupt descriptor table.&lt;br /&gt;&lt;br /&gt;One interesting question here is why the 64b driver didn't suffer the same&lt;br /&gt;problems. It feels like we hit a CPU bug/erratum in this area, and&lt;br /&gt;the 64b CPU mode does not suffer this problem (or, the size of the&lt;br /&gt;64b IDT vector entries "moves" the problem around).&lt;br /&gt;&lt;br /&gt;Now I just need to tidy up the code and release ... 
the last release&lt;br /&gt;of 2011.&lt;br /&gt;&lt;br /&gt;Happy new year.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.21a-b6142&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-4753842773900291056?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/4753842773900291056/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/12/happy-new-year-dtraceyou-ruined-my.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4753842773900291056'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4753842773900291056'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/12/happy-new-year-dtraceyou-ruined-my.html' title='Happy New Year Dtrace...You ruined my Christmas'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-4871526437340249295</id><published>2011-12-27T10:13:00.001-08:00</published><updated>2011-12-27T10:13:47.866-08:00</updated><title type='text'>Dtrace / Ubuntu 11.10/i686</title><content type='html'>Spent some time over Xmas adding a new provider (the 'proc' provider)&lt;br /&gt;which is a simple layering on the existing providers, and provides&lt;br /&gt;easy access to process creation/termination. 
It's unfinished.&lt;br /&gt;&lt;br /&gt;I was asked about some compile errors on Ubuntu 11.10/i686 and&lt;br /&gt;was confident there was no issue here.&lt;br /&gt;&lt;br /&gt;What I hit upon really made my stomach churn. Firstly, a simple&lt;br /&gt;error stopped compilation because of something not quite right in&lt;br /&gt;GCC. That was easily fixed.&lt;br /&gt;&lt;br /&gt;But the lack of a /usr/include/sys directory was a stunning shock. That means&lt;br /&gt;most apps are not going to compile. Period. Scanning Google showed&lt;br /&gt;that the error was common, and yet I couldn't find a solution&lt;br /&gt;that didn't involve a "hack". The hack is to manually create /usr/include/sys&lt;br /&gt;as a symlink to /usr/include/i386-linux-gnu/sys.&lt;br /&gt;&lt;br /&gt;Strange, because the 64bit release works perfectly.&lt;br /&gt;&lt;br /&gt;But then, the next horror story is that dtrace will quite&lt;br /&gt;happily crash the kernel. There seem to be two possible&lt;br /&gt;avenues of investigation. Firstly, the kernel is complaining about&lt;br /&gt;stack overflow, which suggests they built the kernel with tiny (4k? 8k?)&lt;br /&gt;stacks, and that could be a problem for dtrace. If this is the case, we will&lt;br /&gt;need to switch to our own stacks; I have not had to do that before.&lt;br /&gt;&lt;br /&gt;The other issue, which might be related, is that the interrupt&lt;br /&gt;handler does not work. Again, this could be related .. if we&lt;br /&gt;blow a stack then we will corrupt whatever is abutting our stack&lt;br /&gt;pages, and all bets are off. (I am getting quite reliable VirtualBox /&lt;br /&gt;Guru Meditation dialogs, which are a sure sign of a double or triple fault,&lt;br /&gt;again hinting at bad karma with the stacks).&lt;br /&gt;&lt;br /&gt;So, off to go fix the i386 build. 
(The Ubuntu 8.04 release&lt;br /&gt;works really nicely, as a BTW).&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.21a-b6142&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-4871526437340249295?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/4871526437340249295/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/12/dtrace-ubuntu-1110i686.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4871526437340249295'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4871526437340249295'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/12/dtrace-ubuntu-1110i686.html' title='Dtrace / Ubuntu 11.10/i686'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-5114056952242698247</id><published>2011-12-22T14:20:00.001-08:00</published><updated>2011-12-22T14:20:51.715-08:00</updated><title type='text'>dtrace update</title><content type='html'>Just released a new version of dtrace. This should fix a couple of issues.&lt;br /&gt;&lt;br /&gt;The first was that 64b ELF binaries using user-space (USDT) probes&lt;br /&gt;couldnt correctly notify the kernel of the probe points. 
This was&lt;br /&gt;tracked down to a strange issue in the libelf binary where attempts&lt;br /&gt;to update the relocatable symbol table entries weren't being committed&lt;br /&gt;to the output file. I also found a lack of symmetry in the Solaris&lt;br /&gt;code, which probably worked because the Solaris libelf routines&lt;br /&gt;tolerate storing a RELA entry into a REL slot.&lt;br /&gt;&lt;br /&gt;The second issue was that user space breakpoints were being ignored&lt;br /&gt;by the interrupt routines because of a previous cleanup/change. i386 and&lt;br /&gt;x64 are now in sync.&lt;br /&gt;&lt;br /&gt;Here's an example of the sample program demonstrating the USDT&lt;br /&gt;probes in action:&lt;br /&gt;&lt;br /&gt;In one terminal, we run the simple-c tool:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ build/simple-c&lt;br /&gt;__SUNW_dof header:&lt;br /&gt;dofh_flags     00000000&lt;br /&gt;dofh_hdrsize   00000040&lt;br /&gt;dofh_secsize   00000020&lt;br /&gt;dofh_secnum    00000009&lt;br /&gt;dofh_secoff    0x40&lt;br /&gt;dofh_loadsz    0x270&lt;br /&gt;dofh_filesz    0x400&lt;br /&gt;0: 0008 0001 0001 0000 0000023c 00000034&lt;br /&gt;1: 0010 0008 0001 0030 00000160 00000060&lt;br /&gt;2: 0011 0001 0001 0001 000001c0 00000002&lt;br /&gt;3: 0012 0004 0001 0004 000001c4 0000000c&lt;br /&gt;4: 000f 0004 0001 0000 000001d0 0000002c&lt;br /&gt;5: 000a 0008 0001 0018 00000200 00000030&lt;br /&gt;6: 000c 0004 0001 0000 00000230 0000000c&lt;br /&gt;7: 0001 0001 0000 0000 00000270 0000000a&lt;br /&gt;8: 0014 0001 0000 0000 0000027a 00000186&lt;br /&gt;PID:14632 0: here on line 93: crc=00008998&lt;br /&gt;PID:14632 here on line 95&lt;br /&gt;PID:14632 here on line 97&lt;br /&gt;PID:14632 here on line 99&lt;br /&gt;PID:14632 1: here on line 93: crc=00008998&lt;br /&gt;PID:14632 here on line 95&lt;br /&gt;PID:14632 here on line 97&lt;br /&gt;PID:14632 here on line 99&lt;br /&gt;PID:14632 2: here on line 93: crc=00008a4c&lt;br /&gt;PID:14632 here on line 95&lt;br 
/&gt;PID:14632 here on line 97&lt;br /&gt;PID:14632 here on line 99&lt;br /&gt;PID:14632 3: here on line 93: crc=00008a4c&lt;br /&gt;PID:14632 here on line 95&lt;br /&gt;PID:14632 here on line 97&lt;br /&gt;PID:14632 here on line 99&lt;br /&gt;PID:14632 4: here on line 93: crc=00008a4c&lt;br /&gt;PID:14632 here on line 95&lt;br /&gt;PID:14632 here on line 97&lt;br /&gt;PID:14632 here on line 99&lt;br /&gt;...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Ignore the dof dump at the top - that was for my benefit to debug&lt;br /&gt;what was being sent to the kernel. Note the "here on line" messages, and&lt;br /&gt;the CRC. When the app started, but before the USDT probes were enabled, the&lt;br /&gt;code segment had one checksum (00008998). After I started dtrace in another window,&lt;br /&gt;the checksum changed to 00008a4c (which shows something happened to the code segment,&lt;br /&gt;namely the NOPs were replaced by breakpoint instructions):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n :simple-c::&lt;br /&gt;dtrace: description ':simple-c::' matched 2 probes&lt;br /&gt;CPU     ID                    FUNCTION:NAME&lt;br /&gt;  1 312224                    main:saw-line&lt;br /&gt;  1 312225                    main:saw-word&lt;br /&gt;  1 312225                    main:saw-word&lt;br /&gt;  1 312224                    main:saw-line&lt;br /&gt;  1 312225                    main:saw-word&lt;br /&gt;  1 312225                    main:saw-word&lt;br /&gt;  ....&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;There are still some problems to resolve. When the simple app terminates, the&lt;br /&gt;probes are left active. 
We need to detect process exit (or exec) and&lt;br /&gt;remove the probes.&lt;br /&gt;&lt;br /&gt;Theres some more examples/details on dtrace.org on USDT, here:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://dtrace.org/blogs/dap/2011/12/13/usdt-providers-redux/"&gt;http://dtrace.org/blogs/dap/2011/12/13/usdt-providers-redux/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.21a-b6141&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-5114056952242698247?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/5114056952242698247/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/12/dtrace-update.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5114056952242698247'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5114056952242698247'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/12/dtrace-update.html' title='dtrace update'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-4653731654903376252</id><published>2011-12-17T05:53:00.001-08:00</published><updated>2011-12-17T05:53:01.584-08:00</updated><title type='text'>Your dtrace fell into my dtrace :-)</title><content type='html'>Having validated we have the dtrace print() 
function, we can&lt;br /&gt;do a CTF-style structure dump print.&lt;br /&gt;&lt;br /&gt;You are partially on your own - go find a structure to print out.&lt;br /&gt;(You can't have mine, because its mine! All mine!)&lt;br /&gt;&lt;br /&gt;This simple example shows the ctf-style print in action:&lt;br /&gt;&lt;br /&gt;Reference: Eric Schrocks dtrace blog:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://dtrace.org/blogs/eschrock/2011/10/26/your-mdb-fell-into-my-dtrace/"&gt;http://dtrace.org/blogs/eschrock/2011/10/26/your-mdb-fell-into-my-dtrace/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;br /&gt;$ build/dtrace -n 'BEGIN{print(*((struct file *)0xfff&lt;br /&gt;fffff81a156a0)); exit(0);}'&lt;br /&gt;dtrace: description 'BEGIN' matched 1 probe&lt;br /&gt;CPU     ID                    FUNCTION:NAME&lt;br /&gt;  1      1                           :BEGIN struct file {&lt;br /&gt;    union f_u = {&lt;br /&gt;        struct list_head fu_list = {&lt;br /&gt;            struct list_head *next = 0&lt;br /&gt;            struct list_head *prev = 0&lt;br /&gt;        }&lt;br /&gt;        struct rcu_head fu_rcuhead = {&lt;br /&gt;            struct rcu_head *next = 0&lt;br /&gt;            void (*)() func = 0&lt;br /&gt;        }&lt;br /&gt;    }&lt;br /&gt;    struct path f_path = {&lt;br /&gt;        struct vfsmount *mnt = 0&lt;br /&gt;        struct dentry *dentry = 0&lt;br /&gt;    }&lt;br /&gt;    const struct file_operations *f_op = 0xcf&lt;br /&gt;    spinlock_t f_lock = {&lt;br /&gt;        union  {&lt;br /&gt;            struct raw_spinlock rlock = {&lt;br /&gt;                arch_spinlock_t raw_lock = {&lt;br /&gt;                    unsigned int slock = 0&lt;br /&gt;                }&lt;br /&gt;            }&lt;br /&gt;        }&lt;br /&gt;    }&lt;br /&gt;    int f_sb_list_cpu = 0&lt;br /&gt;    atomic_long_t f_count = {&lt;br /&gt;        long counter = 0&lt;br /&gt;    }&lt;br /&gt;    unsigned int f_flags = 0&lt;br /&gt;    fmode_t f_mode = 0&lt;br /&gt; 
   loff_t f_pos = 0&lt;br /&gt;    struct fown_struct f_owner = {&lt;br /&gt;        rwlock_t lock = {&lt;br /&gt;            arch_rwlock_t raw_lock = {&lt;br /&gt;                s32 lock = 0&lt;br /&gt;                s32 write = 0&lt;br /&gt;            }&lt;br /&gt;        }&lt;br /&gt;        struct pid *pid = 0&lt;br /&gt;        enum pid_type pid_type = PIDTYPE_PID&lt;br /&gt;        uid_t uid = 0&lt;br /&gt;        uid_t euid = 0&lt;br /&gt;        int signum = 0&lt;br /&gt;    }&lt;br /&gt;    const struct cred *f_cred = 0xffffffff810267c4&lt;br /&gt;    struct file_ra_state f_ra = {&lt;br /&gt;        unsigned long start = 0&lt;br /&gt;        unsigned int size = 0x17d436&lt;br /&gt;        unsigned int async_size = 0x8&lt;br /&gt;        unsigned int ra_pages = 0x3e8&lt;br /&gt;        unsigned int mmap_miss = 0&lt;br /&gt;        loff_t prev_pos = 0x200fffd058&lt;br /&gt;    }&lt;br /&gt;    u64 f_version = 0x700000000&lt;br /&gt;    void *f_security = 0&lt;br /&gt;    void *private_data = 0xffffffff810267dd&lt;br /&gt;    struct list_head f_ep_links = {&lt;br /&gt;        struct list_head *next = 0xffffffff81026995&lt;br /&gt;        struct list_head *prev = 0&lt;br /&gt;    }&lt;br /&gt;    struct list_head f_tfile_llink = {&lt;br /&gt;        struct list_head *next = 0&lt;br /&gt;        struct list_head *prev = 0xffffffff817ae848&lt;br /&gt;    }&lt;br /&gt;    struct address_space *f_mapping = 0xffffffff00000064&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;For those of you who are observing, I picked a random symbol&lt;br /&gt;in the kernel to prove this works ok, so dont treat that file structure&lt;br /&gt;as having any meaning !&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.20a-b6134&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/8336326562741944626-4653731654903376252?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/4653731654903376252/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/12/your-dtrace-fell-into-my-dtrace.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4653731654903376252'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4653731654903376252'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/12/your-dtrace-fell-into-my-dtrace.html' title='Your dtrace fell into my dtrace :-)'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-772579653194314322</id><published>2011-12-17T02:39:00.001-08:00</published><updated>2011-12-17T02:39:31.163-08:00</updated><title type='text'>llquantify</title><content type='html'>Spent the week updating the dtrace source code to align with the&lt;br /&gt;latest illumos sources. One of the new features in the latest dtrace&lt;br /&gt;is the llquantize() function.&lt;br /&gt;&lt;br /&gt;Rather than me trying to do justice to this function, it's better to&lt;br /&gt;let Bryan Cantrill give you the low-down, as in here:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://dtrace.org/blogs/bmc/2011/02/08/llquantize/"&gt;http://dtrace.org/blogs/bmc/2011/02/08/llquantize/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;There are some kernel fixes/enhancements in here. 
(A fix for&lt;br /&gt;ungarbage collected ecb's when a module is unloaded, but thats&lt;br /&gt;currently disabled until I put in a fix for the timer callback;&lt;br /&gt;module unloading is relatively rare, and untested on Linux/dtrace).&lt;br /&gt;&lt;br /&gt;Heres the output of a run (inside a VirtualBox VM):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n "tick-1ms{@ = llquantize(i++, 10, 0, 6, 20);}&lt;br /&gt;	tick-1ms/i==1500/{exit(0);}"&lt;br /&gt;dtrace: description 'tick-1ms' matched 2 probes&lt;br /&gt;CPU     ID                    FUNCTION:NAME&lt;br /&gt;  1 279854                        :tick-1ms&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;           value  ------------- Distribution ------------- count&lt;br /&gt;             &lt; 1 |                                         1&lt;br /&gt;               1 |                                         1&lt;br /&gt;               2 |                                         1&lt;br /&gt;               3 |                                         1&lt;br /&gt;               4 |                                         1&lt;br /&gt;               5 |                                         1&lt;br /&gt;               6 |                                         1&lt;br /&gt;               7 |                                         1&lt;br /&gt;               8 |                                         1&lt;br /&gt;               9 |                                         1&lt;br /&gt;              10 |                                         5&lt;br /&gt;              15 |                                         5&lt;br /&gt;              20 |                                         5&lt;br /&gt;              25 |                                         5&lt;br /&gt;              30 |                                         5&lt;br /&gt;              35 |                                         5&lt;br /&gt;              40 |                                         5&lt;br /&gt;              45 |                
                         5&lt;br /&gt;              50 |                                         5&lt;br /&gt;              55 |                                         5&lt;br /&gt;              60 |                                         5&lt;br /&gt;              65 |                                         5&lt;br /&gt;              70 |                                         5&lt;br /&gt;              75 |                                         5&lt;br /&gt;              80 |                                         5&lt;br /&gt;              85 |                                         5&lt;br /&gt;              90 |                                         5&lt;br /&gt;              95 |                                         5&lt;br /&gt;             100 |@                                        50&lt;br /&gt;             150 |@                                        50&lt;br /&gt;             200 |@                                        50&lt;br /&gt;             250 |@                                        50&lt;br /&gt;             300 |@                                        50&lt;br /&gt;             350 |@                                        50&lt;br /&gt;             400 |@                                        50&lt;br /&gt;             450 |@                                        50&lt;br /&gt;             500 |@                                        50&lt;br /&gt;             550 |@                                        50&lt;br /&gt;             600 |@                                        50&lt;br /&gt;             650 |@                                        50&lt;br /&gt;             700 |@                                        50&lt;br /&gt;             750 |@                                        50&lt;br /&gt;             800 |@                                        50&lt;br /&gt;             850 |@                                        50&lt;br /&gt;             900 |@                                        50&lt;br /&gt; 
            950 |@                                        50&lt;br /&gt;            1000 |@@@@@@@@@@@@@                            500&lt;br /&gt;            1500 |                                         0&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.20a-b6134&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-772579653194314322?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/772579653194314322/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/12/llquantify.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/772579653194314322'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/772579653194314322'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/12/llquantify.html' title='llquantify'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-2826033009104633719</id><published>2011-12-11T13:41:00.001-08:00</published><updated>2011-12-11T13:41:01.290-08:00</updated><title type='text'>Ding dong! Who's there? Anybody? Someone say hello!</title><content type='html'>Nigel has been up to his tricks again. 
No sooner do I give him&lt;br /&gt;a new release than, after a 21h marathon, some dtrace problems surface.&lt;br /&gt;&lt;br /&gt;I'm not sure if I have fixed or found the issue...but whilst trying to&lt;br /&gt;reproduce the issue (and I am not happy having to wait 21h to get&lt;br /&gt;a feel for the issue), I found a new bad scenario.&lt;br /&gt;&lt;br /&gt;On a 3 CPU VM, if I run a simple syscall:::{exit(0);} type of probe,&lt;br /&gt;repeatedly, then running *four* of these will deadlock the system.&lt;br /&gt;&lt;br /&gt;After a lot of poking around, adding debug, and trying to understand it,&lt;br /&gt;I think I located the source of the problem.&lt;br /&gt;&lt;br /&gt;So, a process runs on cpu#0, and whilst holding on to the&lt;br /&gt;locks, is suspended by the kernel. Next, another dtrace process&lt;br /&gt;comes along and blocks waiting on the mutex held by cpu#0. Repeat&lt;br /&gt;two more times.&lt;br /&gt;&lt;br /&gt;Now, the mutex implementation is effectively a spinlock, so the three&lt;br /&gt;waiters spin on all three cpus and the first process, which holds the locks,&lt;br /&gt;never gets to run. So we have a deadlock and a hung system.&lt;br /&gt;&lt;br /&gt;The cure appears to be calling schedule() in the middle of the mutex-wait&lt;br /&gt;loop, avoiding cpu starvation.&lt;br /&gt;&lt;br /&gt;This appears to work fine.&lt;br /&gt;&lt;br /&gt;New release with this fix, and Nigel can spend another 21h finding&lt;br /&gt;more bugs for me. 
:-)&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.20a-b6134&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-2826033009104633719?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/2826033009104633719/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/12/ding-dong-who-there-anybody-someone-say.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2826033009104633719'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2826033009104633719'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/12/ding-dong-who-there-anybody-someone-say.html' title='Ding dong! Who&amp;#39;s there? Anybody? Someone say hello!'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-1396914694057202007</id><published>2011-12-09T12:49:00.001-08:00</published><updated>2011-12-09T12:49:29.931-08:00</updated><title type='text'>Couple of tools</title><content type='html'>In the previous post, I talked about the difficulty of debugging&lt;br /&gt;the cpu-stuck lock syndrome.&lt;br /&gt;&lt;br /&gt;I thought it worth documenting a couple of trivial tools I wrote.&lt;br /&gt;&lt;br /&gt;The first tool, is a simple "tail -f" tool. 
It seems that the GNU&lt;br /&gt;"tail" tool doesnt work for character special device files. So:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ tail -f /proc/dtrace/trace&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;doesnt track the new data in there. A char-special file doesnt have&lt;br /&gt;a "size", so, when EOF is hit, you cannot necessarily do the standard&lt;br /&gt;trick of seeking and rereading the new data.&lt;br /&gt;&lt;br /&gt;Whilst debugging dtrace, its useful to keep an eye on /proc/dtrace/trace.&lt;br /&gt;I had given up using kernel printk() messages for debugging, because&lt;br /&gt;when faced with cpu lockups, you may not be able to see the messages,&lt;br /&gt;and earlier lockups had implied that the lockups were a by-product&lt;br /&gt;of multiple cpus calling printk(), which in the later kernels has&lt;br /&gt;its own mutex.&lt;br /&gt;&lt;br /&gt;So, tools/tails.pl does the magic of doing the tail. The algorithm&lt;br /&gt;is very simple: keep reading lines, storing them in a hash table. If&lt;br /&gt;we havent seen the line before, print it out. On EOF, sleep for a while&lt;br /&gt;and keep rereading the file. Its cpu intensive (and potentially memory&lt;br /&gt;intensive) but for the purpose, it fits the bill.&lt;br /&gt;&lt;br /&gt;I also found it useful for tracking /var/log/messages, since in the event&lt;br /&gt;of a stuck-cpu, you may not have a chance to see the file.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;cpustuck.pl&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The next tool - which is quite interesting - tries to visually show&lt;br /&gt;a stuck cpu. Almost like a black hole, you cannot see it - its likely&lt;br /&gt;stuck in a tight loop with interrupts disabled.&lt;br /&gt;&lt;br /&gt;Heres how we do it: have a program which prints, to the console, once&lt;br /&gt;a second, the cpu we are running on. 
Using the taskset utility, we can&lt;br /&gt;ensure that this process *only runs on the nominated cpu*.&lt;br /&gt;&lt;br /&gt;So, now the algorithm is: foreach cpu, spawn a child process, which can only run&lt;br /&gt;on the specified cpu, and prints out the cpu it is on. By running these&lt;br /&gt;processes in parallel, we will see an output sequence like (this is a 3-cpu &lt;br /&gt;example):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;0120120122100010002222220012012...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The order of printing is indeterminate. Under normal circumstances,&lt;br /&gt;we expect all cpus to say something (in any order). If one or more &lt;br /&gt;cpus lock up, then we will miss their output, and the pattern breaks.&lt;br /&gt;Eventually, if all cpus lock up, we will see no more output, and&lt;br /&gt;usually no response to the keyboard, and pretty soon...the need to reset&lt;br /&gt;the machine.&lt;br /&gt;&lt;br /&gt;What this allowed was to detect *which* cpu was stuck, quite easily.&lt;br /&gt;Unfortunately, that is all it does. We know a cpu is stuck but we&lt;br /&gt;dont know what/where it is stuck.&lt;br /&gt;&lt;br /&gt;Combine this with the "tail.pl /var/log/messages", and, if we are lucky,&lt;br /&gt;when the kernel issues the cpu stuck diagnostic, we can get a stack trace,&lt;br /&gt;and get a better understanding of whats going on.&lt;br /&gt;&lt;br /&gt;Its not sufficient - even if we know the stack trace for a stuck cpu,&lt;br /&gt;it may not be clear *why* it is stuck. In my case, it seemed to be&lt;br /&gt;inside the code for kfree(), on a spinlock.&lt;br /&gt;&lt;br /&gt;But to be stuck on a spinlock implies some other cpu is doing something,&lt;br /&gt;which was not found to be the case.&lt;br /&gt;&lt;br /&gt;In the end, the problem was really a deadlock which was not&lt;br /&gt;easily understandable. 
One cpu is asking the other cpus to do something.&lt;br /&gt;Whilst this is happening, the original cpu is being asked to do something.&lt;br /&gt;But the first cpu is not listening (interrupts disabled).&lt;br /&gt;&lt;br /&gt;The ultimate cure was to understand fully the two main areas where&lt;br /&gt;dtrace blocks (mutexes and xcalls) and ensure they "knew" about&lt;br /&gt;each other (mutexes will drain the xcall queue whilst waiting for&lt;br /&gt;a mutex).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.19a-b6127&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-1396914694057202007?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/1396914694057202007/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/12/couple-of-tools.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1396914694057202007'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1396914694057202007'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/12/couple-of-tools.html' title='Couple of tools'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-5720012405168534935</id><published>2011-12-09T12:30:00.001-08:00</published><updated>2011-12-09T12:30:42.255-08:00</updated><title type='text'>Losing my marbles, err..mutexes.</title><content type='html'>Interesting week on the dtrace front. At the beginning of the week,&lt;br /&gt;dtrace was looking good. Surviving various torture tests.&lt;br /&gt;&lt;br /&gt;But then Nigel had a go, and he reported a no-go. Slightly better&lt;br /&gt;but not robust enough and could crash/hang the kernel within a few&lt;br /&gt;minutes. The test was very simple:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n fbt::[a-e]*:'{exit(0);}'&lt;br /&gt;&lt;br /&gt;and&lt;br /&gt;&lt;br /&gt;$ dtrace -ln syscall:::&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;running in parallel. The long teardown from the first dtrace&lt;br /&gt;would allow the much shorter syscall::: list to interleave, and&lt;br /&gt;sooner or later, lead to deadlock.&lt;br /&gt;&lt;br /&gt;I modified the first dtrace to fbt::[a-m]*: to improve the chance&lt;br /&gt;of the deadlock. (fbt::: is very draconian, enabling probes on all&lt;br /&gt;functions, and makes ^C or other checks for responsiveness, a little&lt;br /&gt;excessive. fbt::[a-m]*: was a middle ground - quick to fail.)&lt;br /&gt;&lt;br /&gt;In investigating this, I ended up writing a couple of very simple&lt;br /&gt;but useful tools. The mode of failure was to cause "stuck CPU" messages&lt;br /&gt;to display on the console. 
Linux has a periodic timer fire which&lt;br /&gt;can be used to detect CPUs stuck in an infinite loop - a useful feature.&lt;br /&gt;When the stuck cpu message fires, you should get a stack&lt;br /&gt;trace in /var/log/messages (Fedora), or /var/log/kern.log (Ubuntu).&lt;br /&gt;&lt;br /&gt;Unfortunately, once a CPU is stuck, chances are high that the system&lt;br /&gt;is in an unrecoverable state - and attempting to look at the stacks&lt;br /&gt;doesnt help. I occasionally could get the kernel logs.&lt;br /&gt;&lt;br /&gt;Each time, I could see a suggestion that the cpu was stuck&lt;br /&gt;in the kfree() function, freeing memory. As the dtrace session ends,&lt;br /&gt;all the probes need to be unwound, and the memory freed. For fbt::[a-m]*:,&lt;br /&gt;this is around 29,000 probes (and some multiple of this in memory allocations).&lt;br /&gt;&lt;br /&gt;Freeing 29000 memory blocks should take well under a millisecond.&lt;br /&gt;But it takes *seconds*. (See prior threads on slow teardowns of&lt;br /&gt;fbt).&lt;br /&gt;&lt;br /&gt;The reason it is so slow, is that dtrace needs to broadcast to the&lt;br /&gt;other cpus, effectively saying "Hands off this memory block". This&lt;br /&gt;involves a cpu cross-call to synchronise state. Unfortunately, this is&lt;br /&gt;slow. It also does 3 of these per probe that is dismantled.&lt;br /&gt;&lt;br /&gt;Previous attempts to optimise this failed. (Nigel had pointed out&lt;br /&gt;this is significantly faster on real hardware vs a VM). I didnt&lt;br /&gt;quite get to the bottom of why it failed...&lt;br /&gt;&lt;br /&gt;So, we know we have a stuck cpu, something to do with freeing memory.&lt;br /&gt;Very difficult to debug. I wanted to know what the other CPUs were doing.&lt;br /&gt;If one cpu is waiting for something (a lock, semaphore, spinlock, whatever),&lt;br /&gt;then the others must be causing the lock. 
The evidence showed they werent&lt;br /&gt;really doing much.&lt;br /&gt;&lt;br /&gt;During a teardown, the other cpus may continue to fire probes, so the&lt;br /&gt;chances were that another cpu was causing a deadlock.&lt;br /&gt;&lt;br /&gt;The big implication was that either my mutex implementation or my&lt;br /&gt;xcall implementation was at fault. I spent a lot of time going through&lt;br /&gt;both with a fine tooth comb, and trying various experiments.&lt;br /&gt;&lt;br /&gt;It really didnt make sense. To cut a long story short, eventually,&lt;br /&gt;after despairing I could ever find it, it started working! Having&lt;br /&gt;made lots of little changes, eventually it survived over an hour, vs&lt;br /&gt;the 1-3 minutes previously.&lt;br /&gt;&lt;br /&gt;In the end, a couple of small changes were made (it wasnt obvious&lt;br /&gt;which of the many changes were the cause of the fix). By suitable&lt;br /&gt;diffing, and cutting back of the noise, I found that the xcall code&lt;br /&gt;and mutex code needed to occasionally drain the pending xcall&lt;br /&gt;buffers. If we run a cpu with interrupts disabled, and another cpu&lt;br /&gt;is trying to call us, then we put a long delay in responding to the&lt;br /&gt;function call. Worse, we can deadlock - the other cpu is waiting for us,&lt;br /&gt;and we are waiting for that cpu. By suitable "draining" calls, the&lt;br /&gt;problems disappeared.&lt;br /&gt;&lt;br /&gt;So, hopefully it is hardened.&lt;br /&gt;&lt;br /&gt;Lets see how Nigel copes with this, over the weekend.&lt;br /&gt;&lt;br /&gt;Its worth noting that debugging this kind of code is very difficult.&lt;br /&gt;I tried using the kernel debuggers, but faced a number of problems.&lt;br /&gt;One is that you cannot interrupt the kernel to see whats going on if&lt;br /&gt;the cpus have turned off interrupts. 
(I believe).&lt;br /&gt;&lt;br /&gt;I'll document a couple of the tools I added to the war chest in a &lt;br /&gt;follow up blog post.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.19a-b6127&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-5720012405168534935?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/5720012405168534935/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/12/losing-my-marbles-errmutexes.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5720012405168534935'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5720012405168534935'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/12/losing-my-marbles-errmutexes.html' title='Losing my marbles, err..mutexes.'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-8049662369367407987</id><published>2011-12-04T07:08:00.001-08:00</published><updated>2011-12-04T07:08:41.369-08:00</updated><title type='text'>CRiSP 10.0.19</title><content type='html'>I am going to release CRiSP 10.0.19, which has a subtle change&lt;br /&gt;to the way configs are saved. It turns out that the gridlines/outline&lt;br /&gt;mode cannot be persistently turned off. 
When you restart CRiSP, they&lt;br /&gt;are back on.&lt;br /&gt;&lt;br /&gt;This is because this is the first time (in a long while) that&lt;br /&gt;the clean-install startup defaults changed, to enable these features.&lt;br /&gt;&lt;br /&gt;When the config is saved, the internal defaults were defeating the&lt;br /&gt;user selection.&lt;br /&gt;&lt;br /&gt;So, in $HOME/.Crisp/crisp8.ini, the "display_mode" call is replaced&lt;br /&gt;with two new lines: "set_display_mode" and "set_display_shift".&lt;br /&gt;&lt;br /&gt;Most CRiSP primitives use a variety of flags to control behavior, rather&lt;br /&gt;than having a single get/set(inq) function to control behavior. This&lt;br /&gt;was for back in the 4MB-was-large days. So its time to liberate some&lt;br /&gt;of the functions, where it makes sense.&lt;br /&gt;&lt;br /&gt;What this means is that if you pick up 10.0.19 and dont like it,&lt;br /&gt;and go back to an older release, those config lines will cause a &lt;br /&gt;problem (a startup warning, possibly) until the older defaults are resaved.&lt;br /&gt;&lt;br /&gt;I may upgrade crisp8.ini to crisp10.ini if this presents itself as a &lt;br /&gt;problem.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.19a-b6127&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-8049662369367407987?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/8049662369367407987/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/12/crisp-10019.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8049662369367407987'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8049662369367407987'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/12/crisp-10019.html' title='CRiSP 10.0.19'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-738951246890968130</id><published>2011-12-04T04:08:00.001-08:00</published><updated>2011-12-04T04:08:06.298-08:00</updated><title type='text'>You locked me out! No I *didnt*.</title><content type='html'>Just fixed a problem with the mutex implementation. It was trying&lt;br /&gt;to support recursive mutexes, but dtrace doesnt like or expect that.&lt;br /&gt;&lt;br /&gt;The effect was that "dtrace -l &amp; dtrace -l &amp;" would cause havoc&lt;br /&gt;and assertion failures.&lt;br /&gt;&lt;br /&gt;New code seems to fix that, and run much tighter.&lt;br /&gt;&lt;br /&gt;Lets see if this cures problems.&lt;br /&gt;&lt;br /&gt;Someone raised an issue today on USDT probes - that they are broken.&lt;br /&gt;My response is "probably". 
I havent properly validated them in a while&lt;br /&gt;in an attempt to fix all the other issues.&lt;br /&gt;&lt;br /&gt;If nothing comes up, then I shall take a look at this next.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.19a-b6127&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-738951246890968130?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/738951246890968130/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/12/you-locked-me-out-no-i-didnt.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/738951246890968130'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/738951246890968130'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/12/you-locked-me-out-no-i-didnt.html' title='You locked me out! No I *didnt*.'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-5555730072328598179</id><published>2011-12-03T02:39:00.001-08:00</published><updated>2011-12-03T02:39:45.038-08:00</updated><title type='text'>Whats a nice byte doing in an instruction like this?</title><content type='html'>Just spent a few days trying to debug a strange scenario in Fedora Core 16.&lt;br /&gt;&lt;br /&gt;Trying to enable all probes would crash the kernel. 
After a binary&lt;br /&gt;search, the function flush_old_exec was found to be the culprit.&lt;br /&gt;&lt;br /&gt;Nothing special in that function makes it stand out, but putting&lt;br /&gt;a fbt::flush_old_exec:return probe in would cause the next fork/exec&lt;br /&gt;to kill the process. After trying every conceivable thing, nothing worked.&lt;br /&gt;&lt;br /&gt;Obviously a bug in dtrace - could it be the trap handler? Interrupts?&lt;br /&gt;Pre-emption? CPU rescheduling?&lt;br /&gt;&lt;br /&gt;Whilst analysing and trying to resolve this, I did some interesting things.&lt;br /&gt;&lt;br /&gt;The nice thing about localising the probe at error, was that I&lt;br /&gt;could test (simply start a new process), and it wouldnt crash the kernel&lt;br /&gt;but kill the new process. So, a very controlled environment for&lt;br /&gt;making small changes and adding monitoring was possible.&lt;br /&gt;&lt;br /&gt;Firstly, which probe was firing? Looking at /proc/dtrace/stats showed&lt;br /&gt;that *no* probe was firing. I added some extra debug to the int1 and int3&lt;br /&gt;handlers (single step and breakpoint), and this too, showed no&lt;br /&gt;probe was firing.&lt;br /&gt;&lt;br /&gt;Not possible ! Really not possible !&lt;br /&gt;&lt;br /&gt;Ok, so next, had we actually armed the probe? Well, we can use&lt;br /&gt;/proc/dtrace/fbt to examine probes, and we can tell if a probe is armed&lt;br /&gt;(tell-tale sign is "cc" instruction as the opcode at the location). Yes,&lt;br /&gt;we are arming the probe, but, no, it does not fire.&lt;br /&gt;&lt;br /&gt;Next up is to disassemble the function itself. I have found it very&lt;br /&gt;annoying with Linux that there is no /vmlinux binary on the system -&lt;br /&gt;only the /vmlinuz (and /boot equivalents), which are not proper ELF&lt;br /&gt;files, but bootable images. Something as simple as examining the instructions&lt;br /&gt;and bytes at physical addresses is tricky. 
I had written a &lt;br /&gt;"vmlinux" extractor, but it never worked reliably.&lt;br /&gt;&lt;br /&gt;One trick I have to do this is the following:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ sudo gdb /bin/ls /proc/kcore&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We dont care about /bin/ls but use it so we can examine /proc/kcore, and&lt;br /&gt;from this, we can access physical memory addresses (eg as reported&lt;br /&gt;by kernel stack traces or the dtrace probes).&lt;br /&gt;&lt;br /&gt;What I found was curious. The two RET instructions in the function&lt;br /&gt;were slap in the middle of a CALL instruction.&lt;br /&gt;&lt;br /&gt;This meant the instruction disassembler was wrong. Looking back&lt;br /&gt;a few instructions to see why, we came across the infamous "UD2" &lt;br /&gt;instruction. UD2 is a special instruction to generate an &lt;br /&gt;undefined-opcode fault. In the old days, lots of opcodes could do this,&lt;br /&gt;but Intel formally added this instruction so that compilers and&lt;br /&gt;operating systems had a real instruction, that would never change&lt;br /&gt;in future CPUs, for the purpose of generating an illegal instruction&lt;br /&gt;trap.&lt;br /&gt;&lt;br /&gt;The Linux kernel uses this in the BUG and BUG_ON macros. Since&lt;br /&gt;these calls are called rarely, the kernel maps to an UD2 instruction&lt;br /&gt;and the fault handler can gracefully report the fault and the location&lt;br /&gt;of the error.&lt;br /&gt;&lt;br /&gt;When the INSTR provider was implemented, I came across these &lt;br /&gt;instructions and had put some special "jump-over-it" code in place&lt;br /&gt;to handle this, but either I misread the assembler, or&lt;br /&gt;the kernel changed. 
Whenever a UD2 instruction is met, the&lt;br /&gt;disassembler would jump 10 bytes forward and continue from there.&lt;br /&gt;&lt;br /&gt;This just so happened to be in the middle of a call instruction&lt;br /&gt;which happened to have 0xC3 as part of the relative address field.&lt;br /&gt;Dtrace then slapped a breakpoint on that 0xC3 byte (which looks like&lt;br /&gt;a RET opcode) and changed the call to something that was wrong. &lt;br /&gt;&lt;br /&gt;As soon as we hit the call, all bets were off, and we were lucky&lt;br /&gt;not to crash the kernel.&lt;br /&gt;&lt;br /&gt;Interesting that this didnt show up in Ubuntu 11.x or FC15, despite&lt;br /&gt;that code not really changing in a while, but it could be the&lt;br /&gt;quality of the GCC compiler code changed to cause the opcode&lt;br /&gt;to just match something plausible, whilst never tickling the bug&lt;br /&gt;on different kernels.&lt;br /&gt;&lt;br /&gt;So, one more bug down for now.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.19a-b6122&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-5555730072328598179?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/5555730072328598179/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/12/whats-nice-byte-doing-in-instruction.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5555730072328598179'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5555730072328598179'/><link rel='alternate' type='text/html' 
href='http://crtags.blogspot.com/2011/12/whats-nice-byte-doing-in-instruction.html' title='Whats a nice byte doing in an instruction like this?'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-7358135520335641407</id><published>2011-11-29T10:48:00.001-08:00</published><updated>2011-11-29T10:48:46.874-08:00</updated><title type='text'>Dtrace and Ftrace : cross-modification</title><content type='html'>Brendan Gregg does a good write up of dtrace vs system tap, here:&lt;br /&gt;&lt;br /&gt;http://dtrace.org/blogs/brendan/2011/10/15/using-systemtap/&lt;br /&gt;&lt;br /&gt;I have been thinking about systemtap and ftrace, and stumbled&lt;br /&gt;across a peculiarity in the performance game.&lt;br /&gt;&lt;br /&gt;Solaris/dtrace has very little in the way of code when wondering&lt;br /&gt;about dropping or removing probes. On x86, a probe translates into&lt;br /&gt;a single byte breakpoint instruction. Dumping this on top of an&lt;br /&gt;instruction is effectively atomic (forget about barriers and all&lt;br /&gt;the other SMP complexities...for a moment).&lt;br /&gt;&lt;br /&gt;But in diagnosing issues in linux/dtrace, I had to look at ftrace to&lt;br /&gt;check my sanity.&lt;br /&gt;&lt;br /&gt;On x86, you can safely write self-modifying code. For future portability,&lt;br /&gt;Intel recommends instruction sequences to avoid problems with other CPUs,&lt;br /&gt;and the term "cross-modifying code" is used, whereby you are modifying &lt;br /&gt;instructions which might be sitting in the i-cache.&lt;br /&gt;&lt;br /&gt;Now, in general, replacing an N-byte instruction with an M-byte, where&lt;br /&gt;N != M is a problem, depending on if M &gt; 1, and M &gt; N. 
Atomically&lt;br /&gt;modifying memory which might be executed by another cpu is "hard".&lt;br /&gt;&lt;br /&gt;Linux 3.x and the ftrace code honor the Intel recommendation. They&lt;br /&gt;*stop* all the other CPUs whilst making these probe changes and&lt;br /&gt;use NMI interrupts to implement a hard locking region. Neat trick.&lt;br /&gt;&lt;br /&gt;But very costly.&lt;br /&gt;&lt;br /&gt;Solaris/dtrace doesnt. Neither does Linux/dtrace. We just "plop" our breakpoint&lt;br /&gt;wherever we see fit and the CPU does the rest of the work.&lt;br /&gt;&lt;br /&gt;This is great. But, Solaris/dtrace has a problem which they havent&lt;br /&gt;noticed yet. In Linux/dtrace, I provide the instruction provider.&lt;br /&gt;This provides additional ways to drop probes on "interesting instructions"&lt;br /&gt;and there is a chance that the user could drop a probe on the same&lt;br /&gt;address via both FBT and INSTR.&lt;br /&gt;&lt;br /&gt;So, when a breakpoint is hit, which one wins? On Solaris, I dont think&lt;br /&gt;this can happen. It could in Linux, and at the moment, FBT will win&lt;br /&gt;and INSTR wont get a chance to handle the probe.&lt;br /&gt;&lt;br /&gt;Thats not so much a problem.&lt;br /&gt;&lt;br /&gt;But consider this: how do you remove a probe? Well, you&lt;br /&gt;just overwrite the instruction where you placed a breakpoint with&lt;br /&gt;the original opcode byte.&lt;br /&gt;&lt;br /&gt;There. Done. Nice. Neat.&lt;br /&gt;&lt;br /&gt;Hm...not so...&lt;br /&gt;&lt;br /&gt;Solaris/FBT does not have a notion of a probe being enabled or not. &lt;br /&gt;It does...but it uses the breakpoint byte to indicate whether the probe is&lt;br /&gt;set or not. In fact, Solaris/FBT doesnt really care.&lt;br /&gt;&lt;br /&gt;So consider this: when an FBT probe is disabled by overwriting the instruction,&lt;br /&gt;another CPU which has yet to see the modified byte may still execute that&lt;br /&gt;instruction and, in fact, take the breakpoint trap, even though the probe&lt;br /&gt;is undone. 
So, now another CPU fires a FBT probe which was disabled.&lt;br /&gt;And Solaris/FBT lets it happen. It blindly fires the probe.&lt;br /&gt;&lt;br /&gt;But, at this time, *nobody is listening*. This is fine - but a probe&lt;br /&gt;is potentially firing when it should not do.&lt;br /&gt;&lt;br /&gt;The reason is, that Solaris doesnt flush the other CPUs i-cache. &lt;br /&gt;&lt;br /&gt;I noticed this on Linux, that probes were firing, even after dtrace had&lt;br /&gt;terminated because I was handling the enabled/disabled state. This&lt;br /&gt;caused a problem. If FBT knows the probe cannot fire, then it wont&lt;br /&gt;intercept it. And if this happens, then we have a breakpoint&lt;br /&gt;trap and the rest of dtrace wont know how to handle the breakpoint - we&lt;br /&gt;wont know what the original byte of the instruction was.&lt;br /&gt;&lt;br /&gt;I had to disable this feature and allow FBT to process probe traps&lt;br /&gt;even if the probe points had been disabled.&lt;br /&gt;&lt;br /&gt;Which all comes down to the fact that cross-modifying code is&lt;br /&gt;extremely intrusive, but you can get away with it, until one day,&lt;br /&gt;you might not. ("One day" might mean on a different CPU architecture&lt;br /&gt;or some future Intel chips).&lt;br /&gt;&lt;br /&gt;And finally: if Solaris had done the "right thing" it might be very&lt;br /&gt;slow at placing lots of traps or removing them. And this might account&lt;br /&gt;for ftrace losing some aspect of performance compared to dtrace. 
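&lt;br /&gt;&lt;br /&gt;As a rough sketch of the disable race described above, here is a toy Python model.&lt;br /&gt;It is not the dtrace/FBT code: FbtProbe, handle_trap and the "strict" flag are names&lt;br /&gt;invented for this illustration, and the cpu with a stale view is simulated by a saved&lt;br /&gt;copy of the byte. It only shows why refusing traps for a disabled probe leaves the&lt;br /&gt;breakpoint unhandled, while still accepting them lets the original byte be used.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
```python
# Toy model with hypothetical names; not taken from the dtrace sources.
BREAKPOINT = 0xCC                    # x86 int3 opcode


class FbtProbe:
    def __init__(self, addr, orig_byte):
        self.addr = addr
        self.orig_byte = orig_byte   # saved so the instruction can be emulated
        self.enabled = False


def enable(memory, probe):
    memory[probe.addr] = BREAKPOINT
    probe.enabled = True


def disable(memory, probe):
    memory[probe.addr] = probe.orig_byte
    probe.enabled = False


def handle_trap(probes, addr, strict):
    """Return what the breakpoint handler does for a trap at addr."""
    probe = probes.get(addr)
    if probe is None:
        return "unhandled"           # not one of our probe sites
    if strict and not probe.enabled:
        return "unhandled"           # we know the original byte but refuse to use it
    return "emulated 0x%02x" % probe.orig_byte


memory = {0x1000: 0x55}              # 0x55 is "push %rbp", a typical first byte
probe = FbtProbe(0x1000, memory[0x1000])
probes = {probe.addr: probe}

enable(memory, probe)
stale_view = memory[probe.addr]      # another cpu fetched the byte while armed
disable(memory, probe)

assert stale_view == BREAKPOINT      # that cpu can still take the trap
print("strict: ", handle_trap(probes, probe.addr, strict=True))
print("lenient:", handle_trap(probes, probe.addr, strict=False))
```
&lt;/pre&gt;&lt;br /&gt;In this model the strict handler reports the late trap as unhandled, while the&lt;br /&gt;lenient handler still finds the saved byte and can carry on, which corresponds to letting&lt;br /&gt;FBT process traps even for probe points that have already been disabled.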
&lt;br /&gt;&lt;br /&gt;BTW the current latest release of dtrace is proving remarkably resilient.&lt;br /&gt;I am up to 5.3 billion probes running the torture tests, on real hardware,&lt;br /&gt;and no problems so far.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.18a-b6115&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-7358135520335641407?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/7358135520335641407/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/dtrace-and-ftrace-cross-modification.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7358135520335641407'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7358135520335641407'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/dtrace-and-ftrace-cross-modification.html' title='Dtrace and Ftrace : cross-modification'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-5385667776024740206</id><published>2011-11-28T14:56:00.001-08:00</published><updated>2011-11-28T14:56:37.435-08:00</updated><title type='text'>I am going nuts.</title><content type='html'>So, Nigel got me to check dtrace with real hardware. And my heart stopped.&lt;br /&gt;It ran, and then hung the machine. 
I narrowed it down to an "fbt::form*:" (46&lt;br /&gt;probes) and it kept hanging.&lt;br /&gt;&lt;br /&gt;Nothing I did could make it work (relaxing locks, removing&lt;br /&gt;all the actual probe capture code). I assumed the worst - a CPU&lt;br /&gt;L1/L2 caching issue against the SMP cores. &lt;br /&gt;&lt;br /&gt;So, I tried on my VirtualBox VMs, and it was the same! Now, this&lt;br /&gt;worked perfectly yesterday, and I hadn't changed anything. Amazingly,&lt;br /&gt;"fbt:::" worked well, but a narrow FBT probe did not.&lt;br /&gt;&lt;br /&gt;After narrowing it down further, it transpired that some of the calls&lt;br /&gt;to printk() (which now map to the internal dtrace_printf, rather than&lt;br /&gt;the real kernel printk) were causing some form of recursion fault.&lt;br /&gt;After modifying the dtrace_printf() and printk() code to avoid this,&lt;br /&gt;it worked as I had expected before. Nicely. On real hardware.&lt;br /&gt;&lt;br /&gt;Now, something else is strange. I keep talking about rapid-fbt-teardowns&lt;br /&gt;being slow. I had done a lot of work to optimise this, and was proud&lt;br /&gt;of my ~10s of execution, down from more than 30-40s. On real hardware,&lt;br /&gt;this was coming in at sub-second. Now, on my VirtualBox, it was&lt;br /&gt;sub-second too! 
I dont know if the act of rebooting my laptop and&lt;br /&gt;starting afresh had fixed the issue, or some other nastiness (like&lt;br /&gt;the recursion above) had caused a performance issue to disappear.&lt;br /&gt;&lt;br /&gt;I checked the timer (tick-1ms) again - and thats still bad,&lt;br /&gt;at about 350 x 1ms ticks per real second.&lt;br /&gt;&lt;br /&gt;(I wonder if my VirtualBox slowdowns could be triggered by a paused VM).&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.18a-b6115&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-5385667776024740206?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/5385667776024740206/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/i-am-going-nuts.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5385667776024740206'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5385667776024740206'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/i-am-going-nuts.html' title='I am going nuts.'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-6199308875415936570</id><published>2011-11-28T09:36:00.001-08:00</published><updated>2011-11-28T09:36:21.370-08:00</updated><title 
type='text'>Slaphead time...</title><content type='html'>Whilst diagnosing something fishy in dtrace the other week&lt;br /&gt;(why xcalls are so slow), I was going to write that its obvious:&lt;br /&gt;when running in a VM, things are slower.&lt;br /&gt;&lt;br /&gt;I got distracted, and have been chasing up issues and breakages in dtrace.&lt;br /&gt;&lt;br /&gt;Nigel - such a great person for feeding back breakage and issues,&lt;br /&gt;was asking me why I was spending so much time on the teardown code.&lt;br /&gt;Its rare that people do "dtrace -n fbt:::" and they get what they deserve.&lt;br /&gt;I said, well, it has to be safe.&lt;br /&gt;&lt;br /&gt;He then said, well, the figures in my prior blog didnt add up: I was&lt;br /&gt;quoting a great achievement of ~10s to teardown, and he was seeing&lt;br /&gt;sub-second teardowns. *On Real Hardware*.&lt;br /&gt;&lt;br /&gt;Duh!&lt;br /&gt;&lt;br /&gt;So, I just checked. My 10s teardown is about 0.5s on a Pentium Core 2.33GHz&lt;br /&gt;machine:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;218.620320647 #0 teardown start 1322505021.895618798 xcalls=90677 probes=423796&lt;br /&gt;218.680320738 #0 teardown done 0.60000091 xcalls=181282 probes=291&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;So, double Duh! 
I also checked the tick-1 test:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n '&lt;br /&gt;	int cnt_1ms, cnt_1s;&lt;br /&gt;	tick-1ms {&lt;br /&gt;	        cnt_1ms++;&lt;br /&gt;	        printf("%d %d.%09d", cnt_1ms, timestamp / 1000000000, &lt;br /&gt;			timestamp % (1000000000));&lt;br /&gt;	        }&lt;br /&gt;	tick-1s { cnt_1s++;&lt;br /&gt;	        printf("tick-1ms=%d tick-1s=%d", cnt_1ms, cnt_1s);&lt;br /&gt;		cnt_1ms = 0;&lt;br /&gt;	        }&lt;br /&gt;	tick-5s {&lt;br /&gt;	        printf("the end: got %d + %d\n", cnt_1ms, cnt_1s);&lt;br /&gt;	        exit(0);&lt;br /&gt;	        }&lt;br /&gt;'&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;And I get around 980 x 1ms ticks per second, vs around 250 x 1ms on a &lt;br /&gt;VirtualBox VM (running on an i7 2630 chip).&lt;br /&gt;&lt;br /&gt;So, there we have it. Proof that a VM is slower -- very significantly&lt;br /&gt;in some areas, and one must be very careful to attribute performance&lt;br /&gt;in a VM to anything like real world. (Its not really surprising -&lt;br /&gt;playing with timers or doing cross-cpu interrupts, has to be mediated&lt;br /&gt;by the host VM). 
These dtrace tests are extreme and so amplify the weakness&lt;br /&gt;of a guest VM.&lt;br /&gt;&lt;br /&gt;So, I have partially wasted my time in doing these optimisations, but&lt;br /&gt;actually, they are useful, allowing me to refine the port to Linux, but&lt;br /&gt;also to look for and find, potential issues which might show up after&lt;br /&gt;a much longer period of soak testing on real hardware.&lt;br /&gt;&lt;br /&gt;Now...off to fix the next bug....&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.18a-b6115&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-6199308875415936570?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/6199308875415936570/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/slaphead-time.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6199308875415936570'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6199308875415936570'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/slaphead-time.html' title='Slaphead time...'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-2025113620064350329</id><published>2011-11-27T09:55:00.001-08:00</published><updated>2011-11-27T09:55:47.384-08:00</updated><title 
type='text'>DTrace and the quest for resilience.</title><content type='html'>So, third time lucky at resolving the fast-teardown issue.&lt;br /&gt;The fast-teardown was proving too erratic.&lt;br /&gt;&lt;br /&gt;This release re-enables the standard teardown. The performance difference&lt;br /&gt;here of classic vs fast teardown is about 40+s to terminate an fbt:::&lt;br /&gt;vs 3-4s.&lt;br /&gt;&lt;br /&gt;I've put in a different optimisation, which, while not as good as the&lt;br /&gt;original fast teardown, is a decent optimisation (around 10s).&lt;br /&gt;&lt;br /&gt;The new optimisation realises that during teardown, not only is&lt;br /&gt;the current cpu invoking probes (eg dtrace_xcall invokes the functions&lt;br /&gt;to send an IPI interrupt to the other cpus, and the probes for our&lt;br /&gt;current cpu are totally pointless), but also the others are&lt;br /&gt;just passing the time, doing "stuff" and invoking probes. So, we try&lt;br /&gt;to "slow down" the other cpus - as soon as they try to probe, we&lt;br /&gt;put them in a small poll loop, checking for xcall calls.&lt;br /&gt;&lt;br /&gt;I put some stats into /proc/dtrace/trace to show the tail of the&lt;br /&gt;teardown. Here's an example (4-cpu VM):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;46.251921756 #3 teardown start 1322415220.918787912 xcalls=35 probes=2964975&lt;br /&gt;51.106543255 #3 [3] x_call: re-entrant call in progress.&lt;br /&gt;56.645252318 #3 teardown done 10.393330562 xcalls=108922 probes=1161780&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The teardown took 10.3 seconds, and during this time, 1,161,780 probes&lt;br /&gt;fired. We did 108,922 xcalls. 
(Previously we did a handful only - the&lt;br /&gt;xcalls are very expensive).&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.18a-b6115&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-2025113620064350329?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/2025113620064350329/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/dtrace-and-quest-for-resilience.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2025113620064350329'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2025113620064350329'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/dtrace-and-quest-for-resilience.html' title='DTrace and the quest for resilience.'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-8600022844462404527</id><published>2011-11-26T10:12:00.001-08:00</published><updated>2011-11-26T10:12:50.489-08:00</updated><title type='text'>gcc 4.6.1</title><content type='html'>The world of compilers should be boring. But I am impressed by&lt;br /&gt;GCC 4.6.1.&lt;br /&gt;&lt;br /&gt;Having upgraded my Ubuntu system, I now get the pleasure of the&lt;br /&gt;latest compiler. 
&lt;br /&gt;&lt;br /&gt;After 25+ years of CRiSP development, the code is pretty stable,&lt;br /&gt;and most bad things have been ironed out of the code base. Having&lt;br /&gt;been ported to just about every operating system out there&lt;br /&gt;(from Cray, to a 512K RAM 80186 MSDOS handheld), and used just&lt;br /&gt;about every compiler ever, its refreshing when the compiler takes&lt;br /&gt;warning-free code, and starts telling you about variables which&lt;br /&gt;are set and never used.&lt;br /&gt;&lt;br /&gt;Previously, the compiler would warn on a variable defined, but&lt;br /&gt;unused, but wouldn't notice something like:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;	void func(void)&lt;br /&gt;	{	int	n;&lt;br /&gt;&lt;br /&gt;		n = some_other_func();&lt;br /&gt;	}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;gcc 4.6 does. So, you get to look to see why you are assigning a&lt;br /&gt;result but never using it, and, in some instances, the call to the&lt;br /&gt;function might hint the call is not needed, especially if that&lt;br /&gt;function is a "pure" function.&lt;br /&gt;&lt;br /&gt;Nice.&lt;br /&gt;&lt;br /&gt;My only complaint now is that its a bit noisier in a couple of other&lt;br /&gt;benign warnings (and hence, its more difficult to see the errors&lt;br /&gt;of interest).&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.18a-b6115&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-8600022844462404527?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/8600022844462404527/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/gcc-461.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8600022844462404527'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8600022844462404527'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/gcc-461.html' title='gcc 4.6.1'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-6650024797765322190</id><published>2011-11-25T15:29:00.001-08:00</published><updated>2011-11-25T15:29:46.011-08:00</updated><title type='text'>New dtrace release</title><content type='html'>After the release last weekend, more work was done to chase down&lt;br /&gt;reliability issues. They never seem to disappear. Lets hope this is&lt;br /&gt;better than the last "good one".&lt;br /&gt;&lt;br /&gt;The Fedora Core 15 kernel has some "different" stuff in it compared&lt;br /&gt;to Ubuntu - I tracked down a few unique instructions at the probe&lt;br /&gt;entry points which werent emulated properly.&lt;br /&gt;&lt;br /&gt;I also fixed some other bugs in the instruction emulator.&lt;br /&gt;&lt;br /&gt;Next, the fast-teardown mode wasnt resilient enough and had to be&lt;br /&gt;rewritten. Without this, Ctrl-C a fbt::: probe could take 20+ seconds&lt;br /&gt;due to the frantic xcall conversations as each probe is removed one by one.&lt;br /&gt;Ubuntu and Fedora would complain about a stuck CPU when this happens, &lt;br /&gt;which isnt very social.&lt;br /&gt;&lt;br /&gt;The new fast-teardown removes all dtrace_sync() calls, bar a few.&lt;br /&gt;(I would like to know how Apple does it; I know how Solaris does it; I&lt;br /&gt;dont know if FreeBSD is "as good as" Apple, or very poor). 
What is very&lt;br /&gt;interesting in this area is that I don't believe the Solaris mechanism&lt;br /&gt;can work on Linux, since Linux can have interrupt-disabled spinlocks which&lt;br /&gt;can prevent cross-cpu interrupts from happening and so cause deadlock.&lt;br /&gt;My solution was to keep the lockless solution from before, but&lt;br /&gt;avoid the 2x calls to dtrace_sync per probe teardown, by "using a better&lt;br /&gt;algorithm". For a simple one or two probes, it doesn't matter, but when you&lt;br /&gt;have 40,000+ probes in flight it makes a big difference.&lt;br /&gt;&lt;br /&gt;Now the next issue is the tick/profile provider. I found today&lt;br /&gt;that the "profile" provider isn't implemented (it generates a TODO&lt;br /&gt;warning). On re-reading the Solaris tick/profile provider pages, I &lt;br /&gt;found they are "wrong". The documentation says that&lt;br /&gt;you can use, e.g. "tick-1us" to have microsecond level granularity probes,&lt;br /&gt;but they are not supported. Don't know why (maybe my source code is&lt;br /&gt;out of date, or they felt it too worrisome to support them).&lt;br /&gt;&lt;br /&gt;My favorite probe at present is "tick-1ms" - a one millisecond tick.&lt;br /&gt;I knew dtrace/linux could "drop" probes when under pressure, but I haven't&lt;br /&gt;had time to investigate why. 
A simple test:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;dtrace -n '       int cnt_1ms, cnt_1s;&lt;br /&gt;        tick-1ms {&lt;br /&gt;                cnt_1ms++;&lt;br /&gt;                printf("%d %d.%09d", cnt_1ms, timestamp / 1000000000, timestamp % (1000000000));&lt;br /&gt;                }&lt;br /&gt;        tick-1s { cnt_1s++;&lt;br /&gt;                printf("tick-1ms=%d tick-1s=%d", cnt_1ms, cnt_1s);&lt;br /&gt;                }&lt;br /&gt;        tick-5s {&lt;br /&gt;                printf("the end: got %d + %d\n", cnt_1ms, cnt_1s);&lt;br /&gt;                exit(0);&lt;br /&gt;                }&lt;br /&gt;'&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This lets you quickly see if there are 1000 tick-1ms probes in a second.&lt;br /&gt;&lt;br /&gt;There aren't.&lt;br /&gt;&lt;br /&gt;Linux is achieving about 750/sec, and MacOS achieves around 997/sec.&lt;br /&gt;&lt;br /&gt;I suspect my attempts to get it to "work" have left a gap in that the&lt;br /&gt;timer is not a cyclic timer, but a refiring timer, so the latency between&lt;br /&gt;handling the end of one tick and repriming the timer is causing us to expose&lt;br /&gt;a window. That's probably a part of the explanation; there may be more.&lt;br /&gt;On an idle system, losing 25% of the ticks seems a bit excessive.&lt;br /&gt;(Dtrace/Linux can achieve in excess of 1-2m probes/sec, so 1000 measly&lt;br /&gt;ticks shouldn't be too bad).&lt;br /&gt;&lt;br /&gt;I need to experiment further. 
Some things that could be affecting this include: &lt;br /&gt;(a) bad (my) code, (b) virtualisation, (c) high latency kernel scheduling.&lt;br /&gt;&lt;br /&gt;I hope it is (a).&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6112&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-6650024797765322190?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/6650024797765322190/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/new-dtrace-release.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6650024797765322190'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6650024797765322190'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/new-dtrace-release.html' title='New dtrace release'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-2903557857236871761</id><published>2011-11-22T14:52:00.001-08:00</published><updated>2011-11-22T14:52:23.355-08:00</updated><title type='text'>1 in a million</title><content type='html'>So, Nigel had reported on the sporadic crashes. Had me very&lt;br /&gt;stumped - running the test suite and being really nasty to the kernel&lt;br /&gt;caused everything to pass. 
However, I did get a "simple" test to try:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n fbt:::entry'{cnt++;}'&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Some minutes after leaving this running would cause a kernel&lt;br /&gt;assertion about interrupts being disabled when they shouldnt be.&lt;br /&gt;&lt;br /&gt;This was a real needle in a haystack - a problem, which occurs&lt;br /&gt;after millions of probes is almost impossible to find. Was it one&lt;br /&gt;specific probe, or maybe single-step, or a race condition in&lt;br /&gt;terminating interrupt routines.&lt;br /&gt;&lt;br /&gt;How to find it?&lt;br /&gt;&lt;br /&gt;Well, I started by adding /proc/dtrace/fbt - which gives a birds eye&lt;br /&gt;view of every FBT probe, along with state, and how many times it had&lt;br /&gt;been hit. I augmented this with the instruction which was probed.&lt;br /&gt;&lt;br /&gt;Running a full fbt::: would cause a large fraction of all kernel&lt;br /&gt;functions to be hit, but, equally, large numbers of functions that are&lt;br /&gt;never called (eg for device drivers which are not active).&lt;br /&gt;&lt;br /&gt;In a sense, this device gives a coverage view of FBT traps.&lt;br /&gt;&lt;br /&gt;By casting out the known "good" probes, I was left with potentially&lt;br /&gt;hundreds of distinct instructions/probes to wade through.&lt;br /&gt;&lt;br /&gt;I didnt get far, other than to look for patterns. If we have a 1:1,000,000&lt;br /&gt;failure, then something has to be happening infrequently. Using the&lt;br /&gt;probe counter for each probe, we get some idea of the rarely called functions.&lt;br /&gt;&lt;br /&gt;Whatever was causing the issue wasnt so much a single function being called&lt;br /&gt;and blowing up the kernel, but a small piece of "damage" because the &lt;br /&gt;trap handler didnt do the right thing. 
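&lt;br /&gt;&lt;br /&gt;The hunt for rarely hit probes described above can be sketched as follows.&lt;br /&gt;This is a rough Python illustration, not part of dtrace. It assumes the column&lt;br /&gt;layout of the two /proc/dtrace/fbt rows quoted later in this post (probe index,&lt;br /&gt;hit count in hex, patchpoint address, opcode, inslen, modrm, name). The helper&lt;br /&gt;names parse_fbt and rarely_hit, and the threshold of 100 hits, are made up for&lt;br /&gt;the example.&lt;br /&gt;&lt;pre&gt;
```python
SAMPLE = """\
# count patchpoint opcode inslen modrm name
21603 0000 ffffffff81032704 9c 1 -1 kernel:end_pv_irq_ops_restore_fl:entry 9c
24295 0010 ffffffff81539c00 9c 1 -1 kernel:native_load_gs_index:entry 9c
"""


def parse_fbt(text):
    """Return one dict per probe row; header and '...' lines are skipped."""
    probes = []
    for line in text.splitlines():
        fields = line.split()
        has_all_columns = len(fields[:7]) == 7
        if not has_all_columns or not fields[0].isdigit():
            continue
        probes.append({
            "index": int(fields[0]),
            "count": int(fields[1], 16),       # assumed hex: 0010 is 16 hits
            "patchpoint": int(fields[2], 16),
            "opcode": int(fields[3], 16),
            "name": fields[6],
        })
    return probes


def rarely_hit(probes, max_hits):
    """Probes hit at least once but no more than max_hits times, rarest first."""
    hit = [p for p in probes if p["count"] in range(1, max_hits + 1)]
    return sorted(hit, key=lambda p: p["count"])


probes = parse_fbt(SAMPLE)
never = [p["name"] for p in probes if p["count"] == 0]
rare = [p["name"] for p in rarely_hit(probes, max_hits=100)]
print("never hit:", never)
print("rarely hit:", rare)
```
&lt;/pre&gt;&lt;br /&gt;Probes that were never hit cannot be the culprit, and probes hit millions of&lt;br /&gt;times would fail far more often, so the rarely hit ones are the short list.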
&lt;br /&gt;&lt;br /&gt;The kernel kept saying the same thing: interrupts were unexpectedly disabled,&lt;br /&gt;but knowing which trap was called just prior to this event was nearly&lt;br /&gt;impossible to determine.&lt;br /&gt;&lt;br /&gt;Finally, I started the "binary search" mode of attack: Lets ignore&lt;br /&gt;all instructions which start 0xFn, then 0xEn, ... At some point, around about&lt;br /&gt;0x9n, the problem seemed to disappear. (Absence of evidence isnt evidence of&lt;br /&gt;absence!). Disabling all 0x9n instructions would allow dtrace to run&lt;br /&gt;for hours without issue. Re-enabling would cause a warning within minutes.&lt;br /&gt;&lt;br /&gt;Ok, so what are the 0x9n family of instructions? Well, 0x90 is a NOP - quite&lt;br /&gt;common in the kernel, and nothing of interest. Heres the full list:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;       90                      nop&lt;br /&gt;       91                      xchg   %eax,%ecx&lt;br /&gt;       92                      xchg   %eax,%edx&lt;br /&gt;       93                      xchg   %eax,%ebx&lt;br /&gt;       94                      xchg   %eax,%esp&lt;br /&gt;       95                      xchg   %eax,%ebp&lt;br /&gt;       96                      xchg   %eax,%esi&lt;br /&gt;       97                      xchg   %eax,%edi&lt;br /&gt;       98                      cwtl&lt;br /&gt;       99                      cltd&lt;br /&gt;       9a                      (bad)&lt;br /&gt;       9b                      fwait&lt;br /&gt;       9c                      pushfq&lt;br /&gt;       9d                      popfq&lt;br /&gt;       9e                      sahf&lt;br /&gt;       9f                      lahf&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The main instruction of interest above is pushfq. So, exactly how many probes&lt;br /&gt;have PUSHFQ in the first instruction of the function? 
Well, exactly two, on my&lt;br /&gt;kernel:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;# count patchpoint opcode inslen modrm name&lt;br /&gt;...&lt;br /&gt;21603 0000 ffffffff81032704 9c 1 -1 kernel:end_pv_irq_ops_restore_fl:entry 9c&lt;br /&gt;...&lt;br /&gt;24295 0010 ffffffff81539c00 9c 1 -1 kernel:native_load_gs_index:entry 9c&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now, end_pv_irq_ops_restore_fl hasn't been called (column 2), but&lt;br /&gt;native_load_gs_index was called 16 times (the count is in hex). So, this isn't on every&lt;br /&gt;syscall or interrupt, but it's a rare occurrence. And that is what&lt;br /&gt;we are after.&lt;br /&gt;&lt;br /&gt;So, off to examine the code, and I found a bug. The handling of PUSHFQ was deliberately&lt;br /&gt;disabling interrupts (because of a bug), whereas a real PUSHFQ doesn't touch the&lt;br /&gt;interrupt flag. So, bingo! We could return back to the kernel with interrupts&lt;br /&gt;disabled, and, luckily, the sporadic consistency checks would tell&lt;br /&gt;us that interrupts were disabled when they shouldn't be.&lt;br /&gt;&lt;br /&gt;When writing code in an SMP kernel, having interrupts disabled and sleeping&lt;br /&gt;waiting for an event can lead to deadlock. 
Thats why the checks are there.&lt;br /&gt;And they helped, very nicely, to say "something is wrong" although, not&lt;br /&gt;exactly where.&lt;br /&gt;&lt;br /&gt;So far, dtrace is holding up well.&lt;br /&gt;&lt;br /&gt;Lets see if Nigel is happy now.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6112&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-2903557857236871761?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/2903557857236871761/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/so-nigel-had-reported-on-sporadic.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2903557857236871761'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2903557857236871761'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/so-nigel-had-reported-on-sporadic.html' title='1 in a million'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-9108448796280917615</id><published>2011-11-20T14:21:00.001-08:00</published><updated>2011-11-20T14:21:04.357-08:00</updated><title type='text'>Dtrace bugs...</title><content type='html'>Nigel has been letting me know things are not 100% yet. 
He has a very&lt;br /&gt;rare scenario of a crash, and I have some issues with launching&lt;br /&gt;firefox whilst tracing all system calls.&lt;br /&gt;&lt;br /&gt;Looks like at least one isnt totally correct for some reason.&lt;br /&gt;&lt;br /&gt;May take a while to debug these scenarios (syscall tracing should be&lt;br /&gt;deterministically easy), but 1:1,000,000,000 is more tricky to&lt;br /&gt;diagnose. Maybe I can add enforced long delays to force interrupt&lt;br /&gt;traffic to jam up across the cpus.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6112&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-9108448796280917615?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/9108448796280917615/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/nigel-has-been-letting-me-know-things.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/9108448796280917615'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/9108448796280917615'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/nigel-has-been-letting-me-know-things.html' title='Dtrace bugs...'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-2812758526222610269</id><published>2011-11-20T14:18:00.001-08:00</published><updated>2011-11-20T14:18:56.037-08:00</updated><title type='text'>NMI revisited</title><content type='html'>After researching and reminding myself how it works, we can have&lt;br /&gt;probe points from an NMI, but first, we need to fix the interrupt&lt;br /&gt;handlers.&lt;br /&gt;&lt;br /&gt;An NMI interrupt can interrupt a normal interrupt routine. A normal&lt;br /&gt;interrupt cannot interrupt the NMI. *But* an NMI can take a trap, e.g.&lt;br /&gt;a breakpoint trap.&lt;br /&gt;&lt;br /&gt;So, now the fun starts. When an interrupt terminates, the handler&lt;br /&gt;executes an IRET instruction. An IRET is very similar to a POPF/RET&lt;br /&gt;instruction sequence except for one very subtle point.&lt;br /&gt;&lt;br /&gt;The subtle point is that the IRET will dismiss an NMI. If we&lt;br /&gt;execute an IRET from a breakpoint trap which trapped inside an NMI interrupt,&lt;br /&gt;then chances are that the NMI will be immediately reasserted - from&lt;br /&gt;inside the NMI interrupt handler.&lt;br /&gt;&lt;br /&gt;This blows up and the CPU will hit a double or triple fault and reboot.&lt;br /&gt;&lt;br /&gt;So, to restate, no interrupt can execute an IRET if we are nested inside&lt;br /&gt;an NMI. Therefore we need to keep some state.&lt;br /&gt;&lt;br /&gt;Heres the hint at what can happen:&lt;br /&gt;&lt;br /&gt;https://lkml.org/lkml/2010/7/14/417&lt;br /&gt;&lt;br /&gt;and&lt;br /&gt;&lt;br /&gt;http://linux.derkeiler.com/Mailing-Lists/Kernel/2010-07/msg05459.html&lt;br /&gt;&lt;br /&gt;which details the tricks the kernel is doing to handle the nested&lt;br /&gt;interrupt structure.&lt;br /&gt;&lt;br /&gt;Now, I need to figure out how to do the same thing without damaging&lt;br /&gt;the kernel. 
(I have some prototype code but need to fix one issue).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6112&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-2812758526222610269?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/2812758526222610269/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/after-researching-and-reminding-myself.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2812758526222610269'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2812758526222610269'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/after-researching-and-reminding-myself.html' title='NMI revisited'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-3284255927457074941</id><published>2011-11-18T14:58:00.001-08:00</published><updated>2011-11-18T14:58:17.016-08:00</updated><title type='text'>NMI #2</title><content type='html'>Just put out a new release which hacks around the NMI issue - whilst&lt;br /&gt;dtrace is loaded, we do not propagate the NMI to the kernel. 
This&lt;br /&gt;certainly seems to fix the dtrace-on-real-hardware issue.&lt;br /&gt;&lt;br /&gt;This comes at the cost of losing NMIs entirely.&lt;br /&gt;&lt;br /&gt;Next up is to contemplate only disabling the NMI while probes are active,&lt;br /&gt;but better is to figure out how to avoid a double-trap if we hit an NMI&lt;br /&gt;probe. I don't know if that's possible, by definition of an NMI.&lt;br /&gt;&lt;br /&gt;(Solaris gets away with it, I suspect, because the NMI code is such that&lt;br /&gt;nothing uses it and/or dtrace won't allow probes on those functions?)&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6103&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-3284255927457074941?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/3284255927457074941/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/just-put-out-new-release-which-hacks.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/3284255927457074941'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/3284255927457074941'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/just-put-out-new-release-which-hacks.html' title='NMI #2'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-2912226433044953824</id><published>2011-11-18T14:40:00.001-08:00</published><updated>2011-11-18T14:40:28.419-08:00</updated><title type='text'>Dtrace and the NMI Interrupt</title><content type='html'>Nigel has helped greatly in moving us forward on Dtrace. The latest&lt;br /&gt;release works well inside a VM, but alas, might not work on real hardware.&lt;br /&gt;&lt;br /&gt;I ran some tests on Ubuntu 11.04 on real hardware and dtrace was&lt;br /&gt;rock solid (well, it survived 500m+ probes and running the test&lt;br /&gt;suite twice over).&lt;br /&gt;&lt;br /&gt;But, if the real hardware is generating NMI interrupts then we are toast.&lt;br /&gt;Ubuntu doesnt do this by default. Maybe Fedora does, or its a function&lt;br /&gt;of the hardware and cpu.&lt;br /&gt;&lt;br /&gt;What I found is that if I loaded the oprofile package (which uses NMI&lt;br /&gt;interrupts to feed the profiler), then the host will reboot if dtrace&lt;br /&gt;is loaded or if its invoked.&lt;br /&gt;&lt;br /&gt;The reason is more than likely that within the NMI handler of the kernel,&lt;br /&gt;if we place a dtrace probe, then we will trigger a breakpoint trap&lt;br /&gt;from inside the NMI handler. 
I don't believe this is valid or meaningful&lt;br /&gt;(nothing should interrupt an NMI - it should only be used for small&lt;br /&gt;lightweight and contextless operations, such as watchdogs).&lt;br /&gt;&lt;br /&gt;So, we have a problem because we don't know the call graph of an NMI&lt;br /&gt;interrupt, so we don't know what is safe to probe (even if we did know,&lt;br /&gt;chances are high that common/useful routines would have to be&lt;br /&gt;excluded).&lt;br /&gt;&lt;br /&gt;I will experiment with turning off the NMI whilst dtrace is loaded.&lt;br /&gt;That's a very unfair thing to do (disabling oprofile, or other real&lt;br /&gt;hardware events which need NMI), but at least we would be safe.&lt;br /&gt;&lt;br /&gt;I am going to research what Solaris does for NMI interrupts. Maybe&lt;br /&gt;that will educate me about the problem.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6103&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-2912226433044953824?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/2912226433044953824/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/nigel-has-helped-greatly-in-moving-us.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2912226433044953824'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2912226433044953824'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/nigel-has-helped-greatly-in-moving-us.html' title='Dtrace and the NMI Interrupt'/><author><name>Paul 
Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-5302942235169733401</id><published>2011-11-17T14:31:00.001-08:00</published><updated>2011-11-17T14:31:32.839-08:00</updated><title type='text'>Dtrace .. new release</title><content type='html'>I just put out a new release. Hopefully this fixes a number&lt;br /&gt;of stability issues and some silly bugs. I have rewritten the mutex&lt;br /&gt;code to decouple from the kernel, and avoid reentrancy issues.&lt;br /&gt;&lt;br /&gt;Just as I was about to release this a few days back, I found a regression&lt;br /&gt;where, on repeated driver reloads, the kernel would crash or complain&lt;br /&gt;of interrupts being disabled when they shouldn't be.&lt;br /&gt;&lt;br /&gt;I spent ages trying to drill down to where I had screwed up, and&lt;br /&gt;in the end, it was a really stupid typo (sizeof(mp) vs sizeof(*mp)).&lt;br /&gt;&lt;br /&gt;Glad to fix that and get back to a degree of sanity.&lt;br /&gt;&lt;br /&gt;So this release has been tested exhaustively on Fedora Core 15 and&lt;br /&gt;Ubuntu 11.10 (Linux kernel 3.0). I may have broken the build on&lt;br /&gt;earlier kernels, and am looking to try and test on the earlier releases.&lt;br /&gt;&lt;br /&gt;Please give it a try.&lt;br /&gt;&lt;br /&gt;As an aside, I did get an email from someone trying out Dtrace on an&lt;br /&gt;Oracle Linux host, and getting some strange complaints when /usr/bin/dtrace&lt;br /&gt;didn't like the command line. /usr/bin/dtrace is *Oracle's* dtrace, not mine,&lt;br /&gt;and hopefully the person is in a better position now running a working&lt;br /&gt;dtrace on the system.&lt;br /&gt;&lt;br /&gt;Let's see how much damage people can do to this. 
Many thanks to Nigel Smith&lt;br /&gt;who kept harping on at me with "this doesn't work", which led me to deep dive&lt;br /&gt;into a number of corner cases.&lt;br /&gt;&lt;br /&gt;I have temporarily disabled some aspects of pid tracing/shadowing&lt;br /&gt;and will need to fix that before evaluating the whole pid tracing space.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6103&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-5302942235169733401?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/5302942235169733401/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/i-just-put-out-new-release.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5302942235169733401'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5302942235169733401'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/i-just-put-out-new-release.html' title='Dtrace .. 
new release'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-1827369063025847926</id><published>2011-11-14T13:14:00.001-08:00</published><updated>2011-11-14T13:14:07.648-08:00</updated><title type='text'>Scrollbars: Time to banish them</title><content type='html'>There are two implementations of scrollbars which need to be&lt;br /&gt;fixed.&lt;br /&gt;&lt;br /&gt;First, Ubuntu scrollbars are annoying. Very annoying. The Ubuntu&lt;br /&gt;scrollbar is invisible until you move the mouse close to where&lt;br /&gt;the scrollbar is and then the partial scrollbar appears. Why?&lt;br /&gt;It's unobvious, and it is annoying to have your eyes hunt for the&lt;br /&gt;scrollbar before you can scroll. 30+ years of GUI&lt;br /&gt;development chucked down the drain.&lt;br /&gt;&lt;br /&gt;Second, the iPod scrollbar. Many shows I record&lt;br /&gt;from TV have adverts. They are predictable. If I am watching a 2hr or&lt;br /&gt;more show, and an advert appears, then I use my finger to scroll&lt;br /&gt;past it. But on a long show, a 2-5min interlude is a tiny&lt;br /&gt;fraction of the scroller, and it's nearly impossible to do this accurately,&lt;br /&gt;especially as the iPod/iPhone scrollbar is not analogue, but digital&lt;br /&gt;in nature, jumping in fragments of 10s to 1m or more. 
The ipod has a way&lt;br /&gt;to fine-scroll, but for a long film, its very difficult to scroll&lt;br /&gt;accurately whilst keeping an eye out for where the advert break ends.&lt;br /&gt;&lt;br /&gt;Whats really needed for the ipod is a scrollbar which is wider&lt;br /&gt;than the screen, so you can quickly "flick thru" without worrying&lt;br /&gt;that the longer the show, the more inaccurate the scroller is.&lt;br /&gt;&lt;br /&gt;(Another bad design: why cant you see the title of what you are watching&lt;br /&gt;on the scroll area? One has to stop watching the show to see what it&lt;br /&gt;is you are watching.)&lt;br /&gt;&lt;br /&gt;The scroller size on an ipad is not an issue, because there are so many&lt;br /&gt;pixels to play with.&lt;br /&gt;&lt;br /&gt;If you think scrollbars are easy, then think again. If you are&lt;br /&gt;scrolling through a 1-million line file, there are not enough pixels to&lt;br /&gt;scroll to an arbitrary line in the file. Even CRiSP uses a&lt;br /&gt;fractional position, which for larger files can mean a single pixel&lt;br /&gt;move of the scrollbar could map to 1000 lines in a file on a 1000-pixel&lt;br /&gt;high scrollbar.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6103&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-1827369063025847926?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/1827369063025847926/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/scrollbars-time-to-banish-them.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1827369063025847926'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1827369063025847926'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/scrollbars-time-to-banish-them.html' title='Scrollbars: Time to banish them'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-5907537086071651208</id><published>2011-11-12T15:23:00.001-08:00</published><updated>2011-11-12T15:23:33.657-08:00</updated><title type='text'>In..Out..In..Out..Shake it all about !</title><content type='html'>In the beginning was the mutex. And lo! This was mapped to a Linux mutex.&lt;br /&gt;But this caused problems, because a mutex cannot be used inside an interrupt&lt;br /&gt;routine.&lt;br /&gt;&lt;br /&gt;On the second day, this was mapped to a semaphore. Semaphores are good.&lt;br /&gt;They can be used in an interrupt routine.&lt;br /&gt;&lt;br /&gt;On the third day, the semaphores were replaced with custom mutexes &lt;br /&gt;(effectively spinlocks). Because a semaphore could suspend the calling&lt;br /&gt;process inside a nested interrupt.&lt;br /&gt;&lt;br /&gt;On the fourth day, the custom mutexes were replaced by semaphores, &lt;br /&gt;because a timer probe would invoke calls to spinlocks and preempt&lt;br /&gt;disables, and lead to recursive probe faults.&lt;br /&gt;&lt;br /&gt;On the fifth day, the semaphores were replaced with different custom&lt;br /&gt;mutexes. Ones which supported nested operation, and avoided depending on&lt;br /&gt;Linux functions.&lt;br /&gt;&lt;br /&gt;Get the picture? Its complicated. &lt;br /&gt;&lt;br /&gt;Why is it so complicated? 
Because Solaris interrupt management&lt;br /&gt;(via splx()) doesn't map to the cli/sti mode of a processor. Solaris'&lt;br /&gt;interrupt mechanism has existed for nearly two decades, based on&lt;br /&gt;a processor model defined for the PDP-11. Linux's model is sophisticated,&lt;br /&gt;and different.&lt;br /&gt;&lt;br /&gt;One thing I realised this week (why it took me so long, I don't know)&lt;br /&gt;is that when you take a breakpoint trap, interrupts are left enabled.&lt;br /&gt;This means, during an FBT probe, the timer can fire and you have a nested&lt;br /&gt;interrupt.&lt;br /&gt;&lt;br /&gt;This week has seen me try to solve Nigel's Fedora Core issues where,&lt;br /&gt;under load, the system would panic and reboot. The root cause here is nested&lt;br /&gt;interrupts, and the dtrace module not segregating itself strongly enough&lt;br /&gt;from the real kernel, allowing nested operation and deadlocks to arise.&lt;br /&gt;&lt;br /&gt;I have to be very careful to get the key execution paths working in my&lt;br /&gt;head (an fbt/breakpoint interrupt, a timer/tick interrupt - both standalone&lt;br /&gt;and on top of an fbt/breakpoint interrupt, and the xcall/deadlock issue).&lt;br /&gt;&lt;br /&gt;I also believe I have caught another implementation issue. Interestingly,&lt;br /&gt;I believe Linux's ftrace mechanism suffers the same issue. Namely,&lt;br /&gt;when single stepping, one has to be careful of 64-bit instructions&lt;br /&gt;which use %RIP relative addressing modes. Both dtrace and ftrace almost&lt;br /&gt;do the right thing, but can fail if dtrace is more than 2GB away from&lt;br /&gt;where the kernel is loaded, since a %RIP-relative displacement is a signed 32-bit value. 
(I think this might explain some erraticness&lt;br /&gt;on large RAM machines).&lt;br /&gt;&lt;br /&gt;I am not sure there is a cure for this (well, not a quick/easy one), and&lt;br /&gt;I may have to disable those offending probes (typically only a handful on&lt;br /&gt;a normal kernel - not a great loss).&lt;br /&gt;&lt;br /&gt;Anyway, let me continue experimenting with mutexes and see how close I&lt;br /&gt;can get. (My current experiment is real good, except for problems when&lt;br /&gt;kzalloc is called with a mutex...lets see if I can fix that).&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6103&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-5907537086071651208?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/5907537086071651208/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/inoutinoutshake-it-all-about.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5907537086071651208'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5907537086071651208'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/inoutinoutshake-it-all-about.html' title='In..Out..In..Out..Shake it all about !'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-8987489973143310469</id><published>2011-11-07T15:30:00.001-08:00</published><updated>2011-11-07T15:30:12.022-08:00</updated><title type='text'>Dtrace fixed (hopefully)</title><content type='html'>I just put out a new release which fixes the terrors of the&lt;br /&gt;last release. Hopefully this is more stable.&lt;br /&gt;&lt;br /&gt;I found a bug in strace which caused repeated SIGSEGVs in one&lt;br /&gt;of the tests (strace fails to trace a process it is launching &lt;br /&gt;in about 1:100 test cases), even without dtrace active at the time.&lt;br /&gt;&lt;br /&gt;I havent integrated in the fast-teardown/xcall optimisation yet,&lt;br /&gt;as I need to tidy that up. Hopefully in the next release.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6103&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-8987489973143310469?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/8987489973143310469/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/dtrace-fixed-hopefully.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8987489973143310469'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8987489973143310469'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/dtrace-fixed-hopefully.html' title='Dtrace fixed (hopefully)'/><author><name>Paul 
Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-7075054328160559703</id><published>2011-11-07T14:36:00.001-08:00</published><updated>2011-11-07T14:36:47.357-08:00</updated><title type='text'>Dtrace .. what a crock</title><content type='html'>I put out a new release over the weekend, and all I can say is apologies.&lt;br /&gt;It appears to be a very bad release. Initial testing was positive,&lt;br /&gt;but Nigel pointed out some glaring errors, and on investigation&lt;br /&gt;some of my changes were bad regressions.&lt;br /&gt;&lt;br /&gt;I think I have the issues resolved/understood, and hope to put&lt;br /&gt;out a new release today to fix this mess.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6103&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-7075054328160559703?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/7075054328160559703/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/dtrace-what-crock.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7075054328160559703'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7075054328160559703'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/dtrace-what-crock.html' 
title='Dtrace .. what a crock'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-1132784680306042242</id><published>2011-11-06T05:01:00.001-08:00</published><updated>2011-11-06T05:01:04.685-08:00</updated><title type='text'>Dtrace TCP Provider</title><content type='html'>Now that some of the hard work of getting 3.0 kernels working is out&lt;br /&gt;of the way, I am now looking to add the TCP provider.&lt;br /&gt;&lt;br /&gt;Heres the first very simple cut:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -l -P tcp&lt;br /&gt;   ID   PROVIDER            MODULE                          FUNCTION NAME         48        tcp                                                     state-change&lt;br /&gt;$ dtrace -P tcp&lt;br /&gt;dtrace: description 'tcp' matched 1 probe&lt;br /&gt;&lt;br /&gt;CPU     ID                    FUNCTION:NAME&lt;br /&gt;  0     48                    :state-change&lt;br /&gt;  0     48                    :state-change&lt;br /&gt;  0     48                    :state-change&lt;br /&gt;  0     48                    :state-change&lt;br /&gt;...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;It doesnt provide the arg0/arg1 values in the probe callback, but will&lt;br /&gt;see what I can do.&lt;br /&gt;&lt;br /&gt;Nigel just reported that the new release is better (thanks for the quick&lt;br /&gt;feedback Nigel), but that this is showing a problem:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n 'fbt:kernel:: BEGIN {exit(0);}'&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;On my kernel, it slowed down enormously, but after 60s of no RCU response,&lt;br /&gt;the kernel just hangs. 
So, off to fix that.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6103&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-1132784680306042242?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/1132784680306042242/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/dtrace-tcp-provider.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1132784680306042242'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1132784680306042242'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/dtrace-tcp-provider.html' title='Dtrace TCP Provider'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-979037072784023801</id><published>2011-11-05T16:46:00.001-07:00</published><updated>2011-11-05T16:46:33.407-07:00</updated><title type='text'>How fast is fast?</title><content type='html'>In deeper diving the dtrace lockup on 3.x kernels, I have&lt;br /&gt;been revisiting the xcall code.&lt;br /&gt;&lt;br /&gt;Heres a test for you to try (on FreeBSD, MacOS, Solaris and Linux):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n fbt:::&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This will trace all the FBT probes (on my Linux VM, thats 
about 48000&lt;br /&gt;probes). "Time" how quick it is to start scrolling the output. (About 1-2s?)&lt;br /&gt;&lt;br /&gt;Now ^C (twice, you need to do this twice for some reason in dtrace).&lt;br /&gt;&lt;br /&gt;How long until you get your shell prompt back? (On my Linux system,&lt;br /&gt;it's down to 1-2s vs maybe 3-4s on a dual core MacOS box).&lt;br /&gt;&lt;br /&gt;Why is it *slower* to exit than it is to invoke?&lt;br /&gt;&lt;br /&gt;The answer is: IPI or interprocessor interrupts. Now, this may&lt;br /&gt;be fast on Solaris - it's engineered nicely. On Mac and Linux at least,&lt;br /&gt;it's not. It's really difficult to work out "why".&lt;br /&gt;&lt;br /&gt;When you ^C dtrace, it has to tear down all the probes. In theory&lt;br /&gt;this is easy, and faster than the initial construction. But the&lt;br /&gt;Solaris/Dtrace code has a nasty performance issue. For every probe,&lt;br /&gt;dtrace_sync() is invoked three times, and each call involves&lt;br /&gt;communication with the N-1 other CPUs to process an IPI interrupt.&lt;br /&gt;&lt;br /&gt;This is what I emulate on Linux. But it's slow. My 48000 probes&lt;br /&gt;involve nearly 200k IPI interrupts to the N-1 processors. (I am&lt;br /&gt;testing on a 4-cpu VM). And the IPI is either delivered "slowly" or&lt;br /&gt;"received slowly" on the target CPUs.&lt;br /&gt;&lt;br /&gt;What is worse, far far worse, is the Linux 3.0.4 kernel I am&lt;br /&gt;using in Ubuntu 11.10 (I compiled my own; the default distro is&lt;br /&gt;3.0.12). If the tear down takes too long, the kernel may notice&lt;br /&gt;a "hung" cpu, and after a minute or two, will hang the kernel, hard,&lt;br /&gt;due to the lack of responsiveness. 
(I will need to see&lt;br /&gt;if I can find out how it knows the CPU is not-idle, and maybe fool it).&lt;br /&gt;&lt;br /&gt;I really dislike this 3.0 kernel - it's a very harsh environment&lt;br /&gt;for a buggy driver to live in, and dtrace has to work even harder&lt;br /&gt;to avoid being caught in the searchlight of the kernel.&lt;br /&gt;&lt;br /&gt;I have a hack/optimisation for this problem, which is proving&lt;br /&gt;rewarding (if the "other" cpu is not sitting inside a dtrace probe&lt;br /&gt;handler, then we have nothing to do, so we can skip the IPI interrupt.&lt;br /&gt;But it's not bullet-proof in the few lines of code I added).&lt;br /&gt;&lt;br /&gt;How does Solaris handle this? Well, on Solaris, direct interrupt&lt;br /&gt;disabling does not happen. Instead, a software processor level flag&lt;br /&gt;is set, and the interrupt handler can allow interrupts, even if&lt;br /&gt;logically, the code in question does not want to be interrupted.&lt;br /&gt;I believe that, by making direct dtrace checks, IPIs across cpus&lt;br /&gt;can happen even if the other cpu is in a critical section. I wish&lt;br /&gt;I could prove my understanding of the code (but that would be a distraction).&lt;br /&gt;&lt;br /&gt;Hm. Just took a look at Oracle's version:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;void dtrace_xcall(processorid_t cpu, dtrace_xcall_t func, void *arg)&lt;br /&gt;{&lt;br /&gt;        if (cpu == DTRACE_CPUALL) {&lt;br /&gt;               smp_call_function(func, arg, 1);&lt;br /&gt;        } else&lt;br /&gt;               smp_call_function_single(cpu, func, arg, 1);&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;All I can say is good luck to them if that's what they think is sufficient&lt;br /&gt;to do the job. 
Thats pretty much the code I implemented originally, and&lt;br /&gt;it doesnt work.&lt;br /&gt;&lt;br /&gt;Oh well.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6103&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-979037072784023801?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/979037072784023801/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/how-fast-is-fast.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/979037072784023801'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/979037072784023801'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/how-fast-is-fast.html' title='How fast is fast?'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-298104714716849308</id><published>2011-11-05T02:47:00.001-07:00</published><updated>2011-11-05T02:47:34.161-07:00</updated><title type='text'>Dtrace progress</title><content type='html'>I've spent the last week or so looking at why Dtrace on the 3.x&lt;br /&gt;kernel is misbehaving and found some interesting things on the journey.&lt;br /&gt;&lt;br /&gt;Firstly, printk() seems to be broken. 
In the 3.x kernel, they&lt;br /&gt;appear to have added recursion detection and stronger locking&lt;br /&gt;semantics, which means attempts to call printk() from an interrupt&lt;br /&gt;(or some other strong locking mode) will hang the kernel.&lt;br /&gt;&lt;br /&gt;That has made debugging the other problems very difficult.&lt;br /&gt;So be it: I have removed or disabled most of the printk() console&lt;br /&gt;output.&lt;br /&gt;&lt;br /&gt;I have an internal function - dtrace_printf() - which does what&lt;br /&gt;printk() does, but writes to an internal circular buffer, which is&lt;br /&gt;safe, but isn't much help if the kernel locks up.&lt;br /&gt;&lt;br /&gt;I have decided to move totally away from the kernel spin locks&lt;br /&gt;and mutex/semaphore support, and use my own mutex implementation.&lt;br /&gt;This avoids problems where the kernel mutexes can allow preemption,&lt;br /&gt;and other re-entrancy problems, which were probably causing some of the lockups.&lt;br /&gt;&lt;br /&gt;I even went the whole hog and added the kernel debuggers to try and&lt;br /&gt;debug the problem (kdb/kgdb). This was totally useless for two&lt;br /&gt;reasons. One, I could not set breakpoints in the kernel (despite&lt;br /&gt;trying). I had to turn off the kernel RODATA config option,&lt;br /&gt;since this prevents some parts of the kernel being writable,&lt;br /&gt;but I still got errors. Two, I could never get the debugger to wake&lt;br /&gt;up on an RCU lockup or allow interactive "break-in" to the locked-up&lt;br /&gt;kernel.&lt;br /&gt;&lt;br /&gt;The kernel rarely panics - it just locks up. 
Very frustrating.&lt;br /&gt;&lt;br /&gt;Having fixed various things, dtrace appears to work much better.&lt;br /&gt;The "dtrace -n fbt:::" works very well, EXCEPT ctrl-c can hang the kernel.&lt;br /&gt;&lt;br /&gt;What is interesting is that during ctrl-c of dtrace, the amount&lt;br /&gt;of work to tear down the probes is huge - specifically, the number of&lt;br /&gt;xcalls made on teardown is about 3x the number of probes actually&lt;br /&gt;placed. So, tearing down an fbt::: probe, which places around 40,000&lt;br /&gt;probes, means that on the ^C, the system is sluggish for a number of seconds as&lt;br /&gt;tens/hundreds of thousands of xcalls are invoked. I think this is a&lt;br /&gt;problem which needs to be addressed in the real dtrace.&lt;br /&gt;&lt;br /&gt;The problem is that as each probe is torn down, a number of dtrace_sync()&lt;br /&gt;calls are invoked, and each one is talking to all the other CPUs&lt;br /&gt;to ensure synchronisation. For reasons I cannot tell, this&lt;br /&gt;is very expensive - presumably the other CPUs may be in a disabled&lt;br /&gt;interrupt state, so we have to wait for that&lt;br /&gt;cpu to come out of the interrupt region, or leverage the NMI&lt;br /&gt;to break the deadlock.&lt;br /&gt;&lt;br /&gt;I'm still trying to understand how xcall works -- if we ask&lt;br /&gt;another cpu to take an IPI and it is stuck in a long&lt;br /&gt;spinlock, then the xcall may take a while or forever to be invoked.&lt;br /&gt;I don't understand how Solaris breaks this deadlock, but it does&lt;br /&gt;appear Solaris tries hard to keep interrupts enabled at all times,&lt;br /&gt;which is different from Linux. 
I may have misunderstood that in Solaris.&lt;br /&gt;&lt;br /&gt;MacOS does something different/similar in that on an xcall, the&lt;br /&gt;invoking CPU keeps an eye out for recursive TLB shootdowns.&lt;br /&gt;&lt;br /&gt;I hope to make a new release of dtrace in the next few days if&lt;br /&gt;I feel happy I haven't made things worse.&lt;br /&gt;&lt;br /&gt;I really don't like the Ctrl-C/teardown performance - it can&lt;br /&gt;lead to the kernel complaining about stuck CPUs.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6103&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-298104714716849308?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/298104714716849308/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/11/dtrace-progress.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/298104714716849308'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/298104714716849308'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/11/dtrace-progress.html' title='Dtrace progress'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-2911598341174170291</id><published>2011-10-30T15:25:00.001-07:00</published><updated>2011-10-30T15:25:27.823-07:00</updated><title type='text'>Dtrace and printk()</title><content type='html'>Been debugging lockups in 3.0 kernels. Very difficult to debug, since&lt;br /&gt;all attempts to diagnose what was causing it were met with kernel&lt;br /&gt;lockups.&lt;br /&gt;&lt;br /&gt;Sometimes a trace like:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n fbt::[a-e]*:&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;would work, and sometimes not. Lots of variations and thought processes&lt;br /&gt;were applied, and nothing worked.&lt;br /&gt;&lt;br /&gt;Then, after a little rest, I went back to basics. Lets assume one function&lt;br /&gt;blocks us, so we try the binary search to see which fbt function it is.&lt;br /&gt;&lt;br /&gt;Turns out that the new kernel has modified printk() - the kernel&lt;br /&gt;printing function in some way. (I think its to do with recursive prints,&lt;br /&gt;but not concluded this yet).&lt;br /&gt;&lt;br /&gt;What appears to be happening is if printk() is called at the wrong&lt;br /&gt;time, the kernel will lock up, waiting on a semaphore, to detect if&lt;br /&gt;the console is free for printing.&lt;br /&gt;&lt;br /&gt;printk() is not normally called much during dtrace, but there&lt;br /&gt;seem to be enough places. If I map printk() to a do-nothing function,&lt;br /&gt;then sanity appears to be restored and I can run against fbt:::.&lt;br /&gt;&lt;br /&gt;So I need to either avoid printk() in dtrace, or, be judicious where&lt;br /&gt;its used. 
(Dtrace already has an internal dtrace_printf function to write&lt;br /&gt;to an internal circular buffer, but thats not visible if the kernel&lt;br /&gt;crashes; I may need to fix that).&lt;br /&gt;&lt;br /&gt;So, if you are having trouble on Ubuntu 11.04 or 11.10, or other&lt;br /&gt;equivalent, using Linux 3.0.x, then stay tuned.&lt;br /&gt;&lt;br /&gt;Thanks Nigel Smith, for pushing me to go hunt this down.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6103&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-2911598341174170291?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/2911598341174170291/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-and-printk.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2911598341174170291'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2911598341174170291'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-and-printk.html' title='Dtrace and printk()'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-2520799697216788302</id><published>2011-10-29T01:23:00.001-07:00</published><updated>2011-10-29T01:23:17.941-07:00</updated><title type='text'>Dtrace .. does it work. Yes. 
No. Yes. No. What?!</title><content type='html'>Got a strange one here. Nigel Smith reported issues running dtrace&lt;br /&gt;on real hardware (FC15). I could reproduce the issue on&lt;br /&gt;Ubuntu 11.10 (Linux 3.0.0). A simple&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n vminfo:::'{printf("%s", execname);}'&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;would panic or hang the kernel - not straightaway, but within, say, 5 mins,&lt;br /&gt;especially if doing a:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ while true&lt;br /&gt;&gt; do&lt;br /&gt;&gt; date&lt;br /&gt;&gt; done&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;in another window. &lt;br /&gt;&lt;br /&gt;I've been staring at the dtrace code and trying various things to&lt;br /&gt;see what causes it. It's an annoying panic because I lose control of the&lt;br /&gt;kernel and have no way of figuring out what happened immediately leading&lt;br /&gt;up to the issue. The stack trace on the panic doesn't help &lt;br /&gt;(I am seeing the same panic in the e1000 driver cleanup code,&lt;br /&gt;but no references to dtrace causing this). &lt;br /&gt;&lt;br /&gt;I suspect dtrace is taking an interrupt, maybe not restoring a &lt;br /&gt;register and sometimes, that register happens to be important.&lt;br /&gt;&lt;br /&gt;Especially strange, as, running on Ubuntu 11.04 (2.6.38 kernel),&lt;br /&gt;it works fine. I can really torture the system and it stays up.&lt;br /&gt;&lt;br /&gt;I need to dive more into the entry_64.S code to examine what changed&lt;br /&gt;in the way an interrupt is handled. If I am lucky I may&lt;br /&gt;be able to localise this to a register issue (%GS is a high probability).&lt;br /&gt;&lt;br /&gt;Linux is really missing a kernel debugger. 
Theres kgdb and remote&lt;br /&gt;debugging available, but this is really painful, when you suddenly&lt;br /&gt;need to have to compile a new kernel, waste more than 1GB of disk&lt;br /&gt;because of the symbol table, and then try and get it all "working".&lt;br /&gt;&lt;br /&gt;What is needed is a better way to take control on a panic, and&lt;br /&gt;poke around, similar to kadb for older Sun machines.&lt;br /&gt;&lt;br /&gt;I might have to start writing a crude debugger to help with&lt;br /&gt;these annoying "you died but I am not going to tell you why" issues.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6103&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-2520799697216788302?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/2520799697216788302/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-does-it-work-yes-no-yes-no-what.html#comment-form' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2520799697216788302'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2520799697216788302'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-does-it-work-yes-no-yes-no-what.html' title='Dtrace .. does it work. Yes. No. Yes. No. 
What?!'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-7116594256718129664</id><published>2011-10-26T14:53:00.001-07:00</published><updated>2011-10-26T14:53:13.535-07:00</updated><title type='text'>Dtrace release 20111026</title><content type='html'>This release is an interim release. It fixes the issue I wrote about&lt;br /&gt;earlier, but likely will not compile on kernels earlier than 2.6.39.&lt;br /&gt;&lt;br /&gt;It is the first working example of the vminfo provider. Here's a small&lt;br /&gt;sample of the vminfo probes:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;   48     vminfo            pgmajfault&lt;br /&gt;   49     vminfo            unevictable_mlockfreed&lt;br /&gt;   50     vminfo            pgfree&lt;br /&gt;   51     vminfo            unevictable_mlockfreed&lt;br /&gt;   52     vminfo            compactsuccess&lt;br /&gt;   53     vminfo            compactfail&lt;br /&gt;   54     vminfo            pgdeactivate&lt;br /&gt;   55     vminfo            pgrotated&lt;br /&gt;   56     vminfo            pgactivate&lt;br /&gt;   57     vminfo            kswapd_low_wmark_hit_quickly&lt;br /&gt;   58     vminfo            kswapd_high_wmark_hit_quickly&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;These correspond to the entries in /proc/vmstat and you can now intercept&lt;br /&gt;calls to them, e.g.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n pgmajfault&lt;br /&gt;...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I will attempt to update the release to fix the build on the earlier kernels&lt;br /&gt;shortly.&lt;br /&gt;&lt;br /&gt;(Getting the /proc/vmstat stuff involves examining an enum and isn't amenable&lt;br /&gt;to #ifdef coding 
practise, so I may need to autodetect which ones are&lt;br /&gt;available for the current kernel).&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6101&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-7116594256718129664?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/7116594256718129664/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-release-20111026.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7116594256718129664'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7116594256718129664'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-release-20111026.html' title='Dtrace release 20111026'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-6747871314303963556</id><published>2011-10-26T13:45:00.001-07:00</published><updated>2011-10-26T13:45:13.621-07:00</updated><title type='text'>Linux! How Dare you?!</title><content type='html'>Strange. I had started work on proving the vminfo provider. It&lt;br /&gt;was very close to showing results.&lt;br /&gt;&lt;br /&gt;Then I hit a strange problem. One of those "Duh!" 
moments.&lt;br /&gt;&lt;br /&gt;So, to check out the vminfo provider, I need to run dtrace to intercept&lt;br /&gt;the probes. But the kernel kept panicking.&lt;br /&gt;&lt;br /&gt;Here's the panic:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;[  458.807224] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Didn't make sense. It worked a moment ago. Or so I thought.&lt;br /&gt;&lt;br /&gt;So next, let's downgrade our probe. We *know* syscalls work because I had&lt;br /&gt;tried that on getting the Ubuntu 11.10 release and validating on the 3.0 kernel.&lt;br /&gt;Let's try something different:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n fbt::sys_chdir:&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;(Given a choice of 250,000 probes, I want one which I *know* exists without&lt;br /&gt;looking it up, and one which is executed rarely, preferably on demand). Yup.&lt;br /&gt;This dies too.&lt;br /&gt;&lt;br /&gt;Ok, so revert the code to the last release. Repeat. Panic.&lt;br /&gt;&lt;br /&gt;Strange.&lt;br /&gt;&lt;br /&gt;Let's try a random.other.kernel (Ubuntu 10.04). No problem.&lt;br /&gt;&lt;br /&gt;What?!&lt;br /&gt;&lt;br /&gt;Ok, the Linux kernel guys are smart. Very smart. What did they do?&lt;br /&gt;&lt;br /&gt;They changed the way a kernel is mapped into memory. Previously, all&lt;br /&gt;kernel pages were pretty much executable (especially .data/.bss). This was&lt;br /&gt;unnecessary and they appear to have fixed this. Dtrace does something slightly&lt;br /&gt;suboptimal - a char[] array is declared for executing the breakpoint&lt;br /&gt;trampolines, but this is a BSS symbol. And the page the array resides in&lt;br /&gt;is no longer executable.&lt;br /&gt;&lt;br /&gt;*That* explains why it suddenly broke on the latest kernels. The fix&lt;br /&gt;is easy: just mark the page as executable. 
I would like to use the&lt;br /&gt;proper API or GCC __attribute__ specifier, but the API calls are problematic -&lt;br /&gt;some are GPL-only exports; others dont expose the pgprot permissions etc.&lt;br /&gt;The "lets modify the page table directly" approach seems to work.&lt;br /&gt;&lt;br /&gt;So, I'll release a new dtrace which fixes this problem (and hopefully&lt;br /&gt;a working vminfo provider too).&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6101&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-6747871314303963556?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/6747871314303963556/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/10/linux-how-dare-you.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6747871314303963556'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6747871314303963556'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/10/linux-how-dare-you.html' title='Linux! 
How Dare you?!'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-6739480668996593812</id><published>2011-10-25T15:15:00.001-07:00</published><updated>2011-10-25T15:15:36.259-07:00</updated><title type='text'>dtrace update - attempting the vminfo probe</title><content type='html'>I've started tackling the vminfo probe provider, which is effectively&lt;br /&gt;an SDT probe - statically compiled into the kernel. The approach is&lt;br /&gt;hands-off with respect to the kernel source, but I have a tactic which I hope&lt;br /&gt;will work.&lt;br /&gt;&lt;br /&gt;There are some issues, such as slightly different code for 32- vs 64-bit&lt;br /&gt;kernels, but the approach seems sound.&lt;br /&gt;&lt;br /&gt;In effect, this is a google map of the kernel binary - searching the&lt;br /&gt;kernel for the instructions which represent increments to the&lt;br /&gt;vmstat data counters, and enabling them for probes.&lt;br /&gt;&lt;br /&gt;Will report back in a while if this looks good enough to use.&lt;br /&gt;Many of the other providers are similar in style, but there are some&lt;br /&gt;issues, such as not all counters in the kernel following the vmstat&lt;br /&gt;pattern. There is also the issue of provider function names&lt;br /&gt;matching Solaris (some will, some will be extras). 
And also the callback&lt;br /&gt;arguments need to match Solaris spec or something useful (more troublesome).&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6101&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-6739480668996593812?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/6739480668996593812/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-update-attempting-vminfo-probe.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6739480668996593812'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6739480668996593812'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-update-attempting-vminfo-probe.html' title='dtrace update - attempting the vminfo probe'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-1797084265130343725</id><published>2011-10-21T15:46:00.001-07:00</published><updated>2011-10-21T15:46:08.354-07:00</updated><title type='text'>iOS5 .. 
some thoughts</title><content type='html'>Having used iOS5 for a week or two now, I thought I would add&lt;br /&gt;my thoughts.&lt;br /&gt;&lt;br /&gt;Wins: iPod Touch - they finally fixed the bug where switching away&lt;br /&gt;from video and back took 5+s to figure out what's going on.&lt;br /&gt;Thanks. I appreciate that. Only took 2 years to fix.&lt;br /&gt;&lt;br /&gt;Loss: Watching videos on the iPad is an exercise in frustration. I don't&lt;br /&gt;get why Movies and TV Series are different. With Movies, I can&lt;br /&gt;tell what the movie is (my ripper adds a front screen and title). But&lt;br /&gt;for TV Series, all I see is an array of images for all TV series.&lt;br /&gt;I cannot tell which is which. Please! Why can't a title be added?&lt;br /&gt;I have to guess which one I want and drill down to see what it really&lt;br /&gt;is.&lt;br /&gt;&lt;br /&gt;Worse, they removed support for having playlists of TV Series sitting&lt;br /&gt;in the iPod playlists section. So, there is no text to navigate what to&lt;br /&gt;see. &lt;br /&gt;&lt;br /&gt;Why are the iPod and iPad different?&lt;br /&gt;&lt;br /&gt;Next: iTunes. I like iTunes - it's not bad. It's not good either.&lt;br /&gt;The most recent release disallows selecting a selection of music&lt;br /&gt;tracks and editing the displayed image. Previously I could go to,&lt;br /&gt;for example, Amazon, and drag/drop the cover art into the Get-Info&lt;br /&gt;popup. This doesn't work when multiple tracks are selected.&lt;br /&gt;&lt;br /&gt;It's always a gamble with iOS and iTunes whether the next release&lt;br /&gt;is retrograde or forward thinking.&lt;br /&gt;&lt;br /&gt;Not being able to have a "guest" iPod/iPad so you can selectively copy,&lt;br /&gt;e.g. home videos to a relative's device, is another bad point.&lt;br /&gt;&lt;br /&gt;On the plus side, if Apple had gotten this right, we wouldn't talk about&lt;br /&gt;them so much.&lt;br /&gt;&lt;br /&gt;Oh, and the 64GB phone! At last. A phone with 64GB. 
Android cannot&lt;br /&gt;compete. I wish Google and the device manufacturers could find a place&lt;br /&gt;for people who want to load lots of video onto a device. Shame that&lt;br /&gt;Apples phone price is so high. &lt;br /&gt;&lt;br /&gt;Best to wait til next year to see if we get SDXC or 128GB device&lt;br /&gt;support in a phone.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6092&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-1797084265130343725?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/1797084265130343725/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/10/ios5-some-thoughts.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1797084265130343725'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1797084265130343725'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/10/ios5-some-thoughts.html' title='iOS5 .. 
some thoughts'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-6762152142643046067</id><published>2011-10-21T15:15:00.001-07:00</published><updated>2011-10-21T15:17:54.577-07:00</updated><title type='text'>CRiSP and FCTerm</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/-OUQzBxTm-pE/TqHvhjE1zzI/AAAAAAAAABM/C74IfAVXoEw/s1600/20111021-fcterm.png"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 320px; height: 190px;" src="http://3.bp.blogspot.com/-OUQzBxTm-pE/TqHvhjE1zzI/AAAAAAAAABM/C74IfAVXoEw/s320/20111021-fcterm.png" alt="" id="BLOGGER_PHOTO_ID_5666073165839060786" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;People may wonder why I keep mentioning fcterm and CRiSP&lt;br /&gt;when they are really waiting for dtrace. Well, consider it an&lt;br /&gt;advert - dtrace is the "draw-in", but CRiSP is the product.&lt;br /&gt;&lt;br /&gt;What is CRiSP? It's an editor. It's been my hobby horse for so&lt;br /&gt;long now, that it's old enough to be married, have children,&lt;br /&gt;and you can find it hanging out in bars, wondering why it&lt;br /&gt;didn't listen to its Daddy and get a decent job.&lt;br /&gt;&lt;br /&gt;It's been relatively stagnant for a while - I had run out of things&lt;br /&gt;to implement and support.&lt;br /&gt;&lt;br /&gt;CRiSP has been a multiplatform editor since before that was de rigueur. It&lt;br /&gt;runs on Windows/Mac/Linux, and is pure C code. It's small and&lt;br /&gt;tight in terms of code (by today's measuring sticks).&lt;br /&gt;&lt;br /&gt;Fcterm - is a color terminal xterm. 
Started many many moons ago&lt;br /&gt;when Sun brought out SunOS 3.x, and color monitors were just starting&lt;br /&gt;to appear. In those days, "shelltool" and "cmdtool" and the whole&lt;br /&gt;XView desktop was just too "black and white". Color xterms didn't&lt;br /&gt;really exist (AIX had a nice one). So, fcterm was born. It&lt;br /&gt;was small and fast.&lt;br /&gt;&lt;br /&gt;As CPUs have gotten faster and faster, it's still small and fast, but&lt;br /&gt;over recent years it has had new features added (infinite scroll,&lt;br /&gt;graphical drawing ability). (See prior post on "proc" using this&lt;br /&gt;to effect -- http://crtags.blogspot.com/2011/08/some-illustrations-of-proc.html).&lt;br /&gt;&lt;br /&gt;There's a limit to how much you can add to an xterm. Or is there?&lt;br /&gt;&lt;br /&gt;See below for a screen shot of the latest fcterm. This is character&lt;br /&gt;mode crisp running inside the window, providing graphical features&lt;br /&gt;(also available in the graphical version of CRiSP). 
I spend&lt;br /&gt;most of my time in an xterm - the Ctrl-Z/fg aspect of switching from&lt;br /&gt;editing to "doing" is convenient, and its worth making the terminal&lt;br /&gt;emulator comfortable.&lt;br /&gt;&lt;br /&gt;[fcterm has a butt-ugly popup window for setting attributes; on my&lt;br /&gt;todo list to revamp that one day].&lt;br /&gt;&lt;br /&gt;Look carefully at the outlining margin and the gridlines for tab&lt;br /&gt;indenting.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="post-comment-link"&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6092&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-6762152142643046067?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/6762152142643046067/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/10/crisp-and-fcterm.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6762152142643046067'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6762152142643046067'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/10/crisp-and-fcterm.html' title='CRiSP and FCTerm'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-OUQzBxTm-pE/TqHvhjE1zzI/AAAAAAAAABM/C74IfAVXoEw/s72-c/20111021-fcterm.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-3291106219624493919</id><published>2011-10-20T15:34:00.001-07:00</published><updated>2011-10-20T15:34:40.366-07:00</updated><title type='text'>80pixels</title><content type='html'>My "new" laptop has developed a HW fault on the screen. (A band around 80 pixels&lt;br /&gt;high, about 1/3rd of the way down). I can still use the laptop but have to&lt;br /&gt;move what I am doing out of the affected zone.&lt;br /&gt;&lt;br /&gt;Although Dell have a sophisticated web site, it is sub-par when&lt;br /&gt;it comes to reporting a fault. Following the various strands on the web&lt;br /&gt;does not let you have an online chat to figure out how to actually&lt;br /&gt;report the fault. The pay-phone number will be expensive, and I will have&lt;br /&gt;to do that tomorrow.&lt;br /&gt;&lt;br /&gt;The "Lets download a plugin and sod-you if you are not on Windows&lt;br /&gt;or are using Firefox" is very offensive. Customer care is really an&lt;br /&gt;afterthought.&lt;br /&gt;&lt;br /&gt;At least Dell have a twitter feed (@DellCares) which is potentially good&lt;br /&gt;but probably a time waster - in wanting details, sprawled out over&lt;br /&gt;hours of waiting for replies and using terse abbreviations (which&lt;br /&gt;is understandable given Twitter's 140 char length). &lt;br /&gt;&lt;br /&gt;It's really very comical - that a company the size of DELL with&lt;br /&gt;a sophisticated web site (which I hate and like at the same time),&lt;br /&gt;has to "breathe through a straw" to communicate with customers and&lt;br /&gt;not use the interactive IM mechanism they have on their site.&lt;br /&gt;&lt;br /&gt;My banding on the screen went from 1-pixel high to about 60-80 pixels.&lt;br /&gt;Oh well. 
Why does this happen 6 months into the laptop's life and not after&lt;br /&gt;2-3 years (which I so desperately wanted my old one to do, so I could&lt;br /&gt;justify this one!)?&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6085&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-3291106219624493919?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/3291106219624493919/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/10/80pixels.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/3291106219624493919'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/3291106219624493919'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/10/80pixels.html' title='80pixels'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-713666030245603743</id><published>2011-10-18T13:55:00.001-07:00</published><updated>2011-10-18T13:55:29.933-07:00</updated><title type='text'>DTrace and GIT</title><content type='html'>I have had a few emails asking "where is the latest dtrace?" despite&lt;br /&gt;it being posted on every web page I maintain, in the lowest common&lt;br /&gt;denominator format - tarballs! 
Such a nice word :-)&lt;br /&gt;&lt;br /&gt;People ask, can't I do "git"? Well...the simple answer is "no". There&lt;br /&gt;is an unmaintained(?) dtrace GitHub page, but it wasn't set up by me;&lt;br /&gt;it was set up by an enthusiastic supporter.&lt;br /&gt;&lt;br /&gt;I can understand people either wanting to track changes or make&lt;br /&gt;contributions.&lt;br /&gt;&lt;br /&gt;So, I am opening up the conversation to people: Just how badly can&lt;br /&gt;I damage GIT ?!&lt;br /&gt;&lt;br /&gt;I have recently started using GIT and automating the commits at home,&lt;br /&gt;but I am lacking an understanding of git and how to cope with complexity.&lt;br /&gt;&lt;br /&gt;Here's the deal. In theory I have two main machines - a server,&lt;br /&gt;rarely switched on, but the "master", and my laptop, where I do most&lt;br /&gt;of my work -- manually syncing changes (not just dtrace, but&lt;br /&gt;for CRiSP and other things) across the machines. &lt;br /&gt;&lt;br /&gt;I set up git on my master and laptop ($HOME/git) and use symlinks in&lt;br /&gt;my source code dirs so that the git repository is in its own tree.&lt;br /&gt;&lt;br /&gt;Previously, I would just create periodic tarballs as snapshots -&lt;br /&gt;which are mostly fine, but not necessarily synchronised to the sync points.&lt;br /&gt;&lt;br /&gt;I rsync my laptop/master git repositories - probably a bad thing. Is it?&lt;br /&gt;&lt;br /&gt;So, if an external facing git repository is available, what does it&lt;br /&gt;achieve?&lt;br /&gt;&lt;br /&gt;Q1: I can sync to the external repository from my internal, and stop doing &lt;br /&gt;tarballs? (Or keep doing both).&lt;br /&gt;&lt;br /&gt;Q2: Who can touch the git repo? Presumably whoever I permission, or, is&lt;br /&gt;it a free-for-all?&lt;br /&gt;&lt;br /&gt;Q3: Assuming it's a trusted circle of people, then how do I sync&lt;br /&gt;from the repo back to my local git repo? 
&lt;br /&gt;&lt;br /&gt;I really want to review what people do, and will likely either reject&lt;br /&gt;some contributions or recode them to fit in with my "style".&lt;br /&gt;&lt;br /&gt;I don't want to be a Linus/Git-meister (but will if need be - if it&lt;br /&gt;helps the greater good).&lt;br /&gt;&lt;br /&gt;So, educate me or be gentle with me.&lt;br /&gt;&lt;br /&gt;(I am busy adding some new features to CRiSP and fcterm to show&lt;br /&gt;outline grids whilst editing, and when I finish this, I may go back to&lt;br /&gt;Dtrace and start to remember "What was I planning to do next").&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.17a-b6082&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-713666030245603743?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/713666030245603743/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-and-git.html#comment-form' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/713666030245603743'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/713666030245603743'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-and-git.html' title='DTrace and GIT'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-6561961535467402984</id><published>2011-10-14T13:49:00.001-07:00</published><updated>2011-10-18T13:46:48.983-07:00</updated><title type='text'>Dtrace on Ubuntu 11.10</title><content type='html'>Now that I have figured out the issue on Ubuntu 11.04, a few minutes later I can&lt;br /&gt;confirm it appears to work on 11.10 - with the new Linux 3.0 kernel.&lt;br /&gt;&lt;br /&gt;So, that's a relief.&lt;br /&gt;&lt;br /&gt;(Hm. Why did I make such a basic typo above which spoilt the meaning? Why did someone have to post it to me?!)&lt;br /&gt;&lt;br /&gt;&lt;span class="post-comment-link"&gt;&lt;br /&gt;Post created by CRiSP v10.0.16a-b6074&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-6561961535467402984?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/6561961535467402984/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-on-ubuntu-1110.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6561961535467402984'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6561961535467402984'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-on-ubuntu-1110.html' title='Dtrace on Ubuntu 11.10'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-1362128061739049361</id><published>2011-10-14T13:33:00.001-07:00</published><updated>2011-10-14T13:33:21.274-07:00</updated><title type='text'>Dtrace fixed for Ubuntu 11.04</title><content type='html'>I reported yesterday that dtrace had suddenly broken on my Ubuntu 11.04 release.&lt;br /&gt;Now resolved.&lt;br /&gt;&lt;br /&gt;At some point in the recent past, /proc/kallsyms was layered in&lt;br /&gt;security. Looking at the file as a non-root user means we don't&lt;br /&gt;have access to the symbol table (all values are zero). We fell&lt;br /&gt;over in a heap since we couldn't find the right places to patch&lt;br /&gt;in the kernel.&lt;br /&gt;&lt;br /&gt;My silliness really - I should either run "make load" (or tools/load.pl)&lt;br /&gt;as root, or be more careful when symbol lookups fail. The&lt;br /&gt;script and driver don't handle the null pointers too well.&lt;br /&gt;&lt;br /&gt;The simple fix is to shroud the opening of /proc/kallsyms in a call&lt;br /&gt;to sudo.&lt;br /&gt;&lt;br /&gt;I have put up a new release to fix this.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.16a-b6074&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-1362128061739049361?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/1362128061739049361/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-fixed-for-ubuntu-1104.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1362128061739049361'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1362128061739049361'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-fixed-for-ubuntu-1104.html' title='Dtrace fixed for Ubuntu 11.04'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-2031334350902514371</id><published>2011-10-13T15:00:00.001-07:00</published><updated>2011-10-13T15:00:53.899-07:00</updated><title type='text'>Dtrace updates..</title><content type='html'>Dtrace has been quiet lately, as I fix up some other things.&lt;br /&gt;&lt;br /&gt;The recent news about Oracle doing Dtrace has generated a bit more&lt;br /&gt;interest in Dtrace, along with some support issues. I put out a couple&lt;br /&gt;of minor fixes for later kernels.&lt;br /&gt;&lt;br /&gt;I just tried dtrace on my Ubuntu 11.04 release (the day that 11.10 has&lt;br /&gt;come out), and it panicked my kernel. 
Strange, because it did work a while&lt;br /&gt;ago, although I haven't done heavy bare metal usage (I do get bored watching&lt;br /&gt;Linux/KDE reboot :-) ).&lt;br /&gt;&lt;br /&gt;So, I am downloading the 11.04 and 11.10 ISOs to give them the VM treatment&lt;br /&gt;and see what gives.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.16a-b6074&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-2031334350902514371?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/2031334350902514371/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-updates.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2031334350902514371'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2031334350902514371'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-updates.html' title='Dtrace updates..'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-8389529085201900937</id><published>2011-10-07T12:15:00.001-07:00</published><updated>2011-10-07T12:15:05.718-07:00</updated><title type='text'>DTrace update...sort of.</title><content type='html'>Just pushed out a minor update to fix a problem on SLES11 systems.&lt;br /&gt;&lt;br 
/&gt;It's a few days after OracleWorld and the Dtrace announcement and&lt;br /&gt;people may bump into this blog because of Adam Leventhal's post.&lt;br /&gt;&lt;br /&gt;So, just a few words on the Dtrace/Linux port which has been available&lt;br /&gt;for 2+ years now.&lt;br /&gt;&lt;br /&gt;From what I understand Oracle are planning to port Dtrace to Linux, despite&lt;br /&gt;this release being available. That is a *good* thing, because it&lt;br /&gt;means a 4th (to my knowledge) port of Dtrace. First was FreeBSD, then&lt;br /&gt;there was MacOSX, then there was this Linux/Dtrace.&lt;br /&gt;&lt;br /&gt;I have learnt a lot from reviewing and understanding the differing&lt;br /&gt;implementations. My work on Dtrace is "not-bad" if I am going to self-rate.&lt;br /&gt;It is mired in details to do with multiple kernel support and lack of&lt;br /&gt;source code modifications to a kernel: Dtrace/Linux is simply a loadable&lt;br /&gt;kernel module.&lt;br /&gt;&lt;br /&gt;Oracle will pick up the latest code of Dtrace from the Solaris area,&lt;br /&gt;and will have some fiddliness to insert into Linux. They&lt;br /&gt;pay their employees to do this - and that's great. They can change&lt;br /&gt;the kernel source code, and provide a value added kernel (just&lt;br /&gt;like Google does with Android and all the other hardware/software&lt;br /&gt;vendors who leverage Linux).&lt;br /&gt;&lt;br /&gt;This will presumably give them a competitive advantage: for those&lt;br /&gt;customers who want Dtrace, Oracle potentially looks attractive.&lt;br /&gt;Many people in the Linux community will write-off Oracle as "not team&lt;br /&gt;players". That happened before, with Sun. This leads to healthy,&lt;br /&gt;sometimes silly but entertaining debates on the interweb.&lt;br /&gt;&lt;br /&gt;That Oracle is not using this port of Dtrace as the basis of their&lt;br /&gt;work does not imply the death of this project. 
The likelihood is&lt;br /&gt;that more people will stumble upon it, leading to more support questions or&lt;br /&gt;requests to finish or fix issues.&lt;br /&gt;&lt;br /&gt;Oracle could add new providers to the kernel - and this release of&lt;br /&gt;Dtrace will not be able to match these, without patching kernel source.&lt;br /&gt;I don't know how this will evolve. Maybe Oracle will show us what to do;&lt;br /&gt;maybe there will be sufficient impetus to get dtrace into the master Linux&lt;br /&gt;kernel, but I doubt that will happen.&lt;br /&gt;&lt;br /&gt;The GPL vs CDDL debate still rages. I've written numerous times on&lt;br /&gt;my opinion of the debate (namely, I don't have an opinion!).&lt;br /&gt;&lt;br /&gt;So continue to try/play with Dtrace.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.16a-b6073&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-8389529085201900937?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/8389529085201900937/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-updatesort-of.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8389529085201900937'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8389529085201900937'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-updatesort-of.html' title='DTrace update...sort of.'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-4566760579419090067</id><published>2011-10-04T14:02:00.001-07:00</published><updated>2011-10-04T14:02:06.199-07:00</updated><title type='text'>Dtrace on linux and Oracle?</title><content type='html'>Great news today - everyone is talking about Oracle shipping DTrace&lt;br /&gt;on the next Oracle Linux release.&lt;br /&gt;&lt;br /&gt;Whilst Oracle are not seen as the great open-source giver-aways,&lt;br /&gt;this can only be good.&lt;br /&gt;&lt;br /&gt;I have no inside knowledge on DTrace for Oracle, and twitter (http://twitter.com/#!/search/%23dtrace)&lt;br /&gt;has references to people rejoicing or denouncing it.&lt;br /&gt;&lt;br /&gt;Is dtrace battle hardened for production use? No. One person&lt;br /&gt;isn't going to prove that DTrace works everywhere for everyone on every&lt;br /&gt;kernel version. I have tried; one gets bogged down in details, especially&lt;br /&gt;for legacy releases and the myriad distros that have inconsistent&lt;br /&gt;packages and package names.&lt;br /&gt;&lt;br /&gt;It works (mostly) for me; I know it crashes on a 16-core box occasionally.&lt;br /&gt;(Alas, I don't have a 16-core box. Maybe it's the 48GB of RAM which is the&lt;br /&gt;issue rather than the number of cores, which affects the page table&lt;br /&gt;layout of that system).&lt;br /&gt;&lt;br /&gt;As anyone knows who manages software projects, a software project&lt;br /&gt;manager probably does little "coding" .. the closer to completion&lt;br /&gt;the project gets, the harder (and more necessary) it is to reach&lt;br /&gt;100% perfection. Ask Linus. 
I don't know how much real coding vs&lt;br /&gt;patching/merging/overseeing he does.&lt;br /&gt;&lt;br /&gt;Even for the big guns out there, like RedHat and Oracle, much of the&lt;br /&gt;time and expense goes on tracking down bugs. There are a few people&lt;br /&gt;who add value. This is what software engineering is about. Finding&lt;br /&gt;your place, and managing those around you to optimise delivery.&lt;br /&gt;&lt;br /&gt;I don't know what Oracle is doing; even if it's a closed-wall effort, there&lt;br /&gt;are benefits. When Apple introduced DTrace, it did a great job of&lt;br /&gt;allowing them to fix and optimise and understand their own systems. Sure,&lt;br /&gt;it wasn't perfect in the early days. &lt;br /&gt;&lt;br /&gt;So, let's see if Oracle can influence.&lt;br /&gt;&lt;br /&gt;From what I read, Adam Leventhal has been experimenting with&lt;br /&gt;adding kernel probes to the Linux source. Even if this is the only&lt;br /&gt;thing he has done, then it's a great news banner. I'm loath to walk&lt;br /&gt;that plank knowing that maintaining such deltas is difficult when you&lt;br /&gt;are not a part of a key release. Maybe if Ubuntu or RedHat or someone&lt;br /&gt;would offer to allow such merges in, it would be fun to add them.&lt;br /&gt;&lt;br /&gt;Solaris has hundreds/thousands of probe points, added over the last&lt;br /&gt;5+ years. Each would have required consideration about what to measure,&lt;br /&gt;whether the correct probes were in the right places, and supporting/testing&lt;br /&gt;them. Probe-dropping is laborious and nobody will thank you for&lt;br /&gt;those probes. A lot of people will benefit when "it just works".&lt;br /&gt;&lt;br /&gt;So, let's see what happens.&lt;br /&gt;&lt;br /&gt;And why have I been quiet recently? Well, various other mini projects&lt;br /&gt;needed to be addressed. 
"proc" is one of them; I am not happy with it as&lt;br /&gt;yet - sometimes the results seem to be suspicious, but it does look good.&lt;br /&gt;&lt;br /&gt;But my recent project is to update CRiSP a little. It now supports&lt;br /&gt;grids/gridlines, so you can see the true structure of a file.&lt;br /&gt;&lt;br /&gt;Keep watching.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.15a-b6064&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-4566760579419090067?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/4566760579419090067/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-on-linux-and-oracle.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4566760579419090067'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4566760579419090067'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/10/dtrace-on-linux-and-oracle.html' title='Dtrace on linux and Oracle?'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-3575917571030690572</id><published>2011-08-10T14:24:00.002-07:00</published><updated>2011-08-10T14:31:36.001-07:00</updated><title type='text'>Some illustrations of "proc"</title><content type='html'>&lt;a onblur="try 
{parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/-XxKNSZbcKw4/TkL4j_mT-BI/AAAAAAAAAAw/ke-hqE3VRhs/s1600/20110810-proc2.png"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 320px; height: 194px;" src="http://3.bp.blogspot.com/-XxKNSZbcKw4/TkL4j_mT-BI/AAAAAAAAAAw/ke-hqE3VRhs/s320/20110810-proc2.png" alt="" id="BLOGGER_PHOTO_ID_5639342980672518162" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/-O6sGsXErkRM/TkL4jm1RWzI/AAAAAAAAAAo/-apuNK_imMA/s1600/20110810-proc1.png"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 320px; height: 190px;" src="http://4.bp.blogspot.com/-O6sGsXErkRM/TkL4jm1RWzI/AAAAAAAAAAo/-apuNK_imMA/s320/20110810-proc1.png" alt="" id="BLOGGER_PHOTO_ID_5639342974024375090" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/-xqP5RBmQOeY/TkL4kCsNcEI/AAAAAAAAAA4/Kfu9zEvqDms/s1600/20110810-proc3.png"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 320px; height: 137px;" src="http://2.bp.blogspot.com/-xqP5RBmQOeY/TkL4kCsNcEI/AAAAAAAAAA4/Kfu9zEvqDms/s320/20110810-proc3.png" alt="" id="BLOGGER_PHOTO_ID_5639342981502562370" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;These partial screen shots show "proc" in action. It's still a work in progress. See the prior post for more background on "proc", and download the package and look at the README.
&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-3575917571030690572?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/3575917571030690572/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/08/some-illustrations-of-proc.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/3575917571030690572'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/3575917571030690572'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/08/some-illustrations-of-proc.html' title='Some illustrations of &quot;proc&quot;'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-XxKNSZbcKw4/TkL4j_mT-BI/AAAAAAAAAAw/ke-hqE3VRhs/s72-c/20110810-proc2.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-6196359224258946656</id><published>2011-08-10T14:24:00.001-07:00</published><updated>2011-08-10T14:24:02.055-07:00</updated><title type='text'>Two new tools..well old ones. Revamped</title><content type='html'>"top" is a great tool - very old. And lacking in many ways, 20+years after&lt;br /&gt;it first appeared.&lt;br /&gt;&lt;br /&gt;"proc" is my version of top. I wrote it many many years ago for Solaris,&lt;br /&gt;so it could do what *I* wanted. 
I've had this tool for a long time. I&lt;br /&gt;ported it to Linux, and have been happy with it.&lt;br /&gt;&lt;br /&gt;Why is it better? Because it does color. It crams more data into the &lt;br /&gt;screen real estate. It uses a better sorting algorithm (there are lots&lt;br /&gt;to choose from), and highlights memory deltas.&lt;br /&gt;&lt;br /&gt;"proc" is available from my tools area (next door to dtrace).&lt;br /&gt;&lt;br /&gt;Why am I bothering to talk about proc? &lt;br /&gt;&lt;br /&gt;Because I realised that as CPUs get faster, and with 8 cores on an i7,&lt;br /&gt;drilling down and understanding what's going on in a system requires&lt;br /&gt;more esoteric tools.&lt;br /&gt;&lt;br /&gt;So, what's wrong with "proc"? For one thing, it's easy to see something&lt;br /&gt;on screen which is of interest, but, on the next screen update, it might have&lt;br /&gt;disappeared. This can be annoying - as machines get bigger, with more&lt;br /&gt;processes running, the screen real estate cannot show everything at once.&lt;br /&gt;Sometimes you want to go *backwards* and rewind what you just saw.&lt;br /&gt;&lt;br /&gt;Well, that requires a bit of re-engineering. But it's done. By default&lt;br /&gt;you can cycle back up to 20 mins of history. History is stored in&lt;br /&gt;/tmp/proc; by the time we factor in the process table, the amount of&lt;br /&gt;data is quite staggering (about 1GB of data per hour). This includes&lt;br /&gt;the key process attributes, along with extensions for /proc/pid/wchan,&lt;br /&gt;/proc/pid/stack etc. (Nearly everything is kept, but not absolutely everything;&lt;br /&gt;e.g. signal masks are not stored). And this includes the threads.&lt;br /&gt;&lt;br /&gt;We also keep /proc/meminfo, /proc/vmstat and many more.&lt;br /&gt;&lt;br /&gt;There's so much data that visually monitoring it is actually quite&lt;br /&gt;difficult. 
/proc/meminfo alone has so many fields that, just staring at it,&lt;br /&gt;one cannot comprehend what is happening from one second&lt;br /&gt;to the next.&lt;br /&gt;&lt;br /&gt;Even with history, it's not comprehensible.&lt;br /&gt;&lt;br /&gt;So, the second major update to "proc" is graphics. The ability&lt;br /&gt;to see, in graphical format, what is happening to the various key stats&lt;br /&gt;is very educational and illuminating.&lt;br /&gt;&lt;br /&gt;The implementation of graphs is interesting. Rather than creating&lt;br /&gt;an X11 application or KDE or GNOME, I decided to implement this&lt;br /&gt;inside the terminal emulator. "fcterm" is my emulator of choice -&lt;br /&gt;and fcterm was recently enhanced to support various escape sequences to&lt;br /&gt;do line and rectangle drawing. By using simple printf/escape-sequences,&lt;br /&gt;anything can be drawn - sufficient for drawing graphs.&lt;br /&gt;&lt;br /&gt;[I have a sampler in the fcterm/ctw distribution, available on&lt;br /&gt;my site, written in Perl, to show just about every /proc entry&lt;br /&gt;as a graph. It's crude but effective, and it quickly shows that dumping&lt;br /&gt;all graphs onto a page just overwhelms; that is why proc was enhanced].&lt;br /&gt;&lt;br /&gt;I have just uploaded proc-b21, for you to play with (but you must&lt;br /&gt;use it inside fcterm; I haven't validated what it does in another&lt;br /&gt;xterm). 
It is still a work in progress, so don't bother reporting&lt;br /&gt;bugs to me yet.&lt;br /&gt;&lt;br /&gt;I will upload a few images in the next blog post.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.13a-b6043&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-6196359224258946656?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/6196359224258946656/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/08/two-new-toolswell-old-ones-revamped.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6196359224258946656'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6196359224258946656'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/08/two-new-toolswell-old-ones-revamped.html' title='Two new tools..well old ones. Revamped'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-68880837136587582</id><published>2011-08-06T14:20:00.001-07:00</published><updated>2011-08-06T14:20:45.896-07:00</updated><title type='text'>Virginmedia Tivo</title><content type='html'>Sometimes, I don't understand the web. If you read the reviews&lt;br /&gt;of Virginmedia's Tivo product, they all rave about how good it is.&lt;br /&gt;&lt;br /&gt;But it isn't. 
The interface is buggy and designed by people who haven't&lt;br /&gt;tried to read the text from across a living room, even on a big screen.&lt;br /&gt;&lt;br /&gt;The fact that you cannot archive programs from the device without&lt;br /&gt;watching the same program on the main TV is, well, somewhat myopic.&lt;br /&gt;&lt;br /&gt;The ethernet and USB interfaces do nothing (as yet). Why? It's 2011.&lt;br /&gt;(I think I know why, because the film industry doesn't want people to watch&lt;br /&gt;films away from a DRM controlled environment). &lt;br /&gt;&lt;br /&gt;The on-demand and catch-up services are badly thought out and confusing.&lt;br /&gt;&lt;br /&gt;The remote control is as bad as all other remote controls - it&lt;br /&gt;cannot be held in the hand and used in a one-handed mode - not if you&lt;br /&gt;are fast forwarding.&lt;br /&gt;&lt;br /&gt;The fast forward on tivo is very badly thought out and confusing.&lt;br /&gt;Its idea of fast forward is 2x or 3x. One cannot quickly scroll&lt;br /&gt;through without a lot of button pressing. (The jump in 10min&lt;br /&gt;intervals is broken and confusing).&lt;br /&gt;&lt;br /&gt;The suggestions do not understand recording and watching something -&lt;br /&gt;it only seems to work based on thumbs up/down.&lt;br /&gt;&lt;br /&gt;Tivo does not understand a family who have different tastes and&lt;br /&gt;preferences.&lt;br /&gt;&lt;br /&gt;The YouTube app is a joke. Watching 240x320 YouTube videos in&lt;br /&gt;degraded quality on a 40" TV is about as ugly as you can get.&lt;br /&gt;&lt;br /&gt;http://virgintivo.blogspot.com/2011/07/virgin-media-ceo-tivo-impact-will-equal.html&lt;br /&gt;&lt;br /&gt;The above is typical of the glowing self-satisfying reports on&lt;br /&gt;the product. Multi-room streaming? How will that work? If that means I&lt;br /&gt;can watch tivo on my PC in another room, then I am salivating.&lt;br /&gt;&lt;br /&gt;If it means I can watch one tivo from another room, then think on! 
Who&lt;br /&gt;is going to have two Tivos in a household?&lt;br /&gt;&lt;br /&gt;Applications on the tivo are just a real joke. On the ipad,&lt;br /&gt;applications are great because it allows a degree of 'context'&lt;br /&gt;without having everything coming through the browser.&lt;br /&gt;&lt;br /&gt;BTW the virgin Android TV guide app is poor. Very poor. It is welcomed -&lt;br /&gt;it is better than nothing, but it is one of those 'why am I wasting&lt;br /&gt;storage space on my device' apps. (You can control your&lt;br /&gt;tivo device from this app, to do remote recordings; but theres a serious&lt;br /&gt;glitch in the way it works - you cannot record a program if it is&lt;br /&gt;on in less than 35mins from now. Why?)&lt;br /&gt;&lt;br /&gt;The tivo is 'not bad' but its certainly not a step up from the&lt;br /&gt;prior Virgin+ device. &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.13a-b6043&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-68880837136587582?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/68880837136587582/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/08/virginmedia-tivo.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/68880837136587582'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/68880837136587582'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/08/virginmedia-tivo.html' title='Virginmedia Tivo'/><author><name>Paul 
Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-2937253187889282162</id><published>2011-07-22T15:33:00.001-07:00</published><updated>2011-07-22T15:33:47.231-07:00</updated><title type='text'>cpu visualisation</title><content type='html'>Its quite interesting to contemplate different ways of looking at&lt;br /&gt;things.&lt;br /&gt;&lt;br /&gt;I have an Intel i7 machine - its fast (its a laptop, so it could be &lt;br /&gt;faster if I had a desktop CPU).&lt;br /&gt;&lt;br /&gt;Linux provides a lot of raw data, but one thing that "top" lacks is &lt;br /&gt;more detailed info. There are display widgets for KDE and GNOME&lt;br /&gt;which help you visualise cpu load, but this display shows something&lt;br /&gt;interesting:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;last pid: 4792 in: 4448 load avg: 1.28 0.71 0.43                      23:21:45&lt;br /&gt;CPU: 8(HT)  @ 2.00GHz, proc:231, thr:464, zombies: 1, stopped: 5, running: 3 [t&lt;br /&gt;dixxy:  7.3% usr, 0.1% nice, 1.5% sys, 84.6% idle, 6.4% iow, 0.1% sirq&lt;br /&gt;RAM:7918M RSS:0K Free:303M Cached:1913M Dirty: 664K Swap:225M Free:7878M&lt;br /&gt;cpu&lt;br /&gt;Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz&lt;br /&gt;          usr   nice    sys   idle    iow    irq   sirq  steal  guest  gnice&lt;br /&gt;CPU0     8.4%   0.0%   2.6%  73.6%  15.2%   0.0%   0.8%   0.0%   0.0%   0.0%&lt;br /&gt;CPU1    63.6%   0.0%   1.8%  21.0%  14.2%   0.0%   0.0%   0.0%   0.0%   0.0%&lt;br /&gt;CPU2     0.2%   0.0%   1.0%  99.2%   0.0%   0.0%   0.2%   0.0%   0.0%   0.0%&lt;br /&gt;CPU3     2.4%   0.0%   1.0%  97.0%   0.4%   0.0%   0.0%   0.0%   0.0%   0.0%&lt;br /&gt;CPU4     0.0%   0.2%   0.2% 101.0%   0.0%   0.0%   0.0%   0.0%   0.0%   
0.0%&lt;br /&gt;CPU5     0.0%   0.0%   0.2% 100.2%   0.0%   0.0%   0.0%   0.0%   0.0%   0.0%&lt;br /&gt;CPU6     0.2%   0.0%   0.8%  99.4%   0.2%   0.0%   0.0%   0.0%   0.0%   0.0%&lt;br /&gt;CPU7     0.0%   0.2%   0.6%  98.6%   1.2%   0.0%   0.0%   0.0%   0.0%   0.0%&lt;br /&gt;&lt;br /&gt;           MHz    Cache  Bogomips&lt;br /&gt;CPU0  2001.000  6144 KB  3990.88&lt;br /&gt;CPU1  1400.000  6144 KB  3990.92&lt;br /&gt;CPU2   800.000  6144 KB  3990.97&lt;br /&gt;CPU3   800.000  6144 KB  3990.93&lt;br /&gt;CPU4   800.000  6144 KB  3990.96&lt;br /&gt;CPU5   800.000  6144 KB  3990.96&lt;br /&gt;CPU6   800.000  6144 KB  3990.94&lt;br /&gt;CPU7   800.000  6144 KB  3990.98&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The info is taken from /proc/cpuinfo (this is the "proc" utility - available&lt;br /&gt;at my website; run it and type 'cpu' at the command line to see this&lt;br /&gt;display).&lt;br /&gt;&lt;br /&gt;Note that CPU0 is running at 2GHz - to be expected, although slightly strange.&lt;br /&gt;It's strange because this represents the cpu that the proc command&lt;br /&gt;is instantaneously running on. It doesn't use much cpu, but the cpu&lt;br /&gt;has adjusted the clock to give it speed. (Note that, as an i7, this&lt;br /&gt;CPU should be able to ramp up to 2.9GHz but I haven't seen evidence in&lt;br /&gt;/proc/cpuinfo that this occurs).&lt;br /&gt;&lt;br /&gt;Note also that cpus 2-7 are idle (800MHz is the lowest speed&lt;br /&gt;without actually sleeping).&lt;br /&gt;&lt;br /&gt;CPU1 is running at 1.4GHz - I have a backup job running in another&lt;br /&gt;window. The question is - *what is cpu1?* I presume it's the&lt;br /&gt;hyperthreaded cpu, and therefore should run slower than cpu0. 
Ideally,&lt;br /&gt;jobs should run on: cpu0, cpu2, cpu4, cpu6, cpu1, cpu3, cpu5, cpu7, in that&lt;br /&gt;order.&lt;br /&gt;&lt;br /&gt;The question in my mind - what is hyperthreading -- is it an attribute&lt;br /&gt;of the cpu, which is fixed, or does it meander from one cpu to another.&lt;br /&gt;If the hyperthreaded sibling is solely virtual, then one can deduce&lt;br /&gt;that for this system, we should get unequal performance as the 5th cpu&lt;br /&gt;is made to do work.&lt;br /&gt;&lt;br /&gt;I just did a test (seeing how many "counts" we can do per second), and&lt;br /&gt;ran 5 of them in parallel. Certainly, one of them was not as busy as&lt;br /&gt;the other 4. [This was not a good test, since the counter-loop doesnt&lt;br /&gt;exercise cache-misses and hyperthread ability, but solely relies&lt;br /&gt;on the Linux scheduler to run the processes].&lt;br /&gt;&lt;br /&gt;Definitely requires more investigation to understand the effects.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.12a-b6036&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-2937253187889282162?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/2937253187889282162/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/07/cpu-visualisation.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2937253187889282162'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2937253187889282162'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/07/cpu-visualisation.html' title='cpu 
visualisation'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-4876277026745459412</id><published>2011-07-19T14:21:00.001-07:00</published><updated>2011-07-19T14:21:16.507-07:00</updated><title type='text'>warning! warning! warning! In the beginning was more. Then there was
less.</title><content type='html'>In the very old days of computing, you could sit in front of a screen&lt;br /&gt;or a teletype and watch the output, a character at a time.&lt;br /&gt;110 baud or 300 baud was eminently readable.&lt;br /&gt;&lt;br /&gt;As output devices progressed to 9600 baud serial lines, one could fill&lt;br /&gt;a screen in a couple of seconds (80x24). And "cat" or "make" on its own was&lt;br /&gt;not good enough to read the text frantically scrolling off the screen.&lt;br /&gt;&lt;br /&gt;Zoom forward a few years, and with today's multi-GHz cpus and fast&lt;br /&gt;screens, one can 'cat' a 10MB file to the screen in a few seconds.&lt;br /&gt;&lt;br /&gt;Did you see the error on line 12,723,104? No? Didn't think so.&lt;br /&gt;&lt;br /&gt;Tools like "more" and "less" are great for paging slowly&lt;br /&gt;through a file and allow searching and backwards motion.&lt;br /&gt;&lt;br /&gt;Or, one can use an editor, such as vim/emacs/CRiSP. &lt;br /&gt;&lt;br /&gt;These are great.&lt;br /&gt;&lt;br /&gt;As projects have gotten bigger, it can be difficult when building&lt;br /&gt;software (e.g. with gcc/g++) to spot an error in the middle&lt;br /&gt;of a huge amount of benign output. Worse, gcc has a tendency to&lt;br /&gt;overdo the warnings. Scrolling in an xterm to review the output is&lt;br /&gt;frustrating, trying to spot the magic "error" in the midst of warnings&lt;br /&gt;(or other output).&lt;br /&gt;&lt;br /&gt;There are many solutions (such as viewing the output in "more" or "less",&lt;br /&gt;and relying on highlighting to find the item you are after). "less" can&lt;br /&gt;do highlighting, but "more" cannot. CRiSP can do highlighting too.&lt;br /&gt;&lt;br /&gt;fcterm (my own personal terminal emulator) can do this too, but&lt;br /&gt;you have to tell it what to search for. (I must modify it to have&lt;br /&gt;a default set of words - having a single search pattern is not good enough).&lt;br /&gt;&lt;br /&gt;I wrote a simple tool called "warn". 
You use it like this:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ warn make&lt;br /&gt;...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;and all error output lines are shown in red, with warnings in yellow.&lt;br /&gt;(My default console is green on black).&lt;br /&gt;&lt;br /&gt;Very useful for spotting the wood for the trees.&lt;br /&gt;&lt;br /&gt;I havent released it as a standalone tool (it has bare minimum&lt;br /&gt;requirements - its plain C code). If people are interested, I will put it&lt;br /&gt;out.&lt;br /&gt;&lt;br /&gt;Next up is to fix fcterm...&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.12a-b6034&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-4876277026745459412?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/4876277026745459412/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/07/warning-warning-warning-in-beginning.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4876277026745459412'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4876277026745459412'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/07/warning-warning-warning-in-beginning.html' title='warning! warning! warning! In the beginning was more. 
Then there was&#xA;less.'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-2063837151641212532</id><published>2011-07-17T07:03:00.001-07:00</published><updated>2011-07-17T07:03:56.168-07:00</updated><title type='text'>What is the meaning of 1?</title><content type='html'>Timestamp: 2011-07-17 12:59:11&lt;br /&gt;Title:	What does '1' mean?&lt;br /&gt;Body:&lt;br /&gt;In the context of load average on a system, a load avg of 1 is&lt;br /&gt;something meaningful, if you are on a single cpu system. It means&lt;br /&gt;the cpu is busy continuously.&lt;br /&gt;&lt;br /&gt;Now consider multicore/multicpu machines. A load avg of 1 is not&lt;br /&gt;quite so meaningful. On Linux, the load average represents a moving&lt;br /&gt;average of the number of processes which are runnable or blocked&lt;br /&gt;(in uninterruptible sleep). It slowly ramps up and ramps&lt;br /&gt;down.&lt;br /&gt;&lt;br /&gt;Doing heavy duty work (like parallel compilation) means that "gmake -j"&lt;br /&gt;doesn't have enough information to determine if the system is busy.&lt;br /&gt;&lt;br /&gt;In the old days, when a source file compilation could take many seconds&lt;br /&gt;or minutes, the load average told us what the system was doing. 
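&lt;br /&gt;&lt;br /&gt;As a rough sketch of the arithmetic (in Python for brevity; load_per_cpu() is a made-up name, it is not part of top or proc): dividing the load average by the number of cpus gives a per-cpu figure, so a load avg of 1 on an 8-cpu box works out at 0.125. This assumes the cpu count is at least 1, and that os.getloadavg() is available (Unix-like systems only).&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
import os

def load_per_cpu(load_avg, ncpu):
    """Scale a load average by the cpu count (ncpu is assumed to be 1 or more)."""
    return load_avg / ncpu

if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages (Unix only).
    one_min, five_min, fifteen_min = os.getloadavg()
    # os.cpu_count() can return None, so fall back to 1.
    ncpu = os.cpu_count() or 1
    print("load avg %.2f on %d cpus = %.3f per cpu"
          % (one_min, ncpu, load_per_cpu(one_min, ncpu)))
&lt;/pre&gt;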
&lt;br /&gt;&lt;br /&gt;On an 8-core (Intel i7) cpu, doing 'gmake -j' can invoke&lt;br /&gt;tens of parallel compilations, yet 'top' can show the system as&lt;br /&gt;being idle, because the load average takes a while to ramp up.&lt;br /&gt;&lt;br /&gt;On an 8-core system, with one cpu being busy, should we say 'the system is&lt;br /&gt;busy' (system usage == 100%), or should we say it is idle (system usage == 12.5%)?&lt;br /&gt;&lt;br /&gt;The answer depends on what you are measuring and how you want to handle&lt;br /&gt;it. If 1 out of 8 cpus is busy (maybe the application is broken and&lt;br /&gt;stuck, and eating cpu continuously), then that is important. The&lt;br /&gt;system may be busy, but noticing that rogue application is useful.&lt;br /&gt;If we ignore it until all 8 cores are busy, we may never notice it.&lt;br /&gt;&lt;br /&gt;An additional complexity is that on a totally idle system, a single&lt;br /&gt;CPU can ramp up the clock speed; but if that cpu is not doing useful&lt;br /&gt;work, then the second cpu may not be able to ramp up as high, and&lt;br /&gt;will get worse performance.&lt;br /&gt;&lt;br /&gt;In the end, what is useful is to notice one or more processes&lt;br /&gt;'behaving badly', e.g. 
consuming too much cpu, or too many failed&lt;br /&gt;syscalls, or too much I/O.&lt;br /&gt;&lt;br /&gt;Today top (or my application, 'proc') does not readily show that, but&lt;br /&gt;that needs to change.&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-2063837151641212532?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/2063837151641212532/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/07/what-is-meaning-of-1.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2063837151641212532'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2063837151641212532'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/07/what-is-meaning-of-1.html' title='What is the meaning of 1?'/><author><name>Paul Fox</name><uri>http://www.blogger.com/profile/11969759101059066480</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-5758078070418575267</id><published>2011-07-13T14:38:00.001-07:00</published><updated>2011-07-13T14:38:54.138-07:00</updated><title type='text'>dtrace gripe</title><content type='html'>I really dislike some aspects of dtrace. 
It's a great tool,&lt;br /&gt;but the "let's pretend we are C" attitude, when it isn't, is a nuisance.&lt;br /&gt;Macro languages should be designed to be expressive, but&lt;br /&gt;dtrace's D language is annoying.&lt;br /&gt;&lt;br /&gt;Firstly, the lack of if-then-else is a problem. It leads to&lt;br /&gt;convoluted use of ?: (which cannot handle multiple statements).&lt;br /&gt;I really don't understand why if-then-else isn't there. It doesn't&lt;br /&gt;harm the "Thou shalt not have loops" rule, which exists because loops can lock up a kernel.&lt;br /&gt;&lt;br /&gt;What's annoying is that D copies the C programming language&lt;br /&gt;to an extent that is .. well, annoying!&lt;br /&gt;&lt;br /&gt;Consider this: I want a probe which can exit after 5s of execution&lt;br /&gt;time. Here's the naive implementation:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;BEGIN {&lt;br /&gt;	t = timestamp;&lt;br /&gt;}&lt;br /&gt;tick-1ms {&lt;br /&gt;	timestamp - t &gt; 5*1000*1000*1000 ? exit(0) : 1;&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This isn't possible, because exit(0) is a void function.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;BEGIN {&lt;br /&gt;	t = timestamp;&lt;br /&gt;}&lt;br /&gt;tick-1ms {&lt;br /&gt;	timestamp - t &gt; 5*1000*1000*1000 ? (int) exit(0) : 1;&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;But, oh-no! You cannot cast a "void" to an "int". In C, I can understand&lt;br /&gt;that (almost) but it leads to painful workarounds. In D, there is even&lt;br /&gt;less reason: if a (void) could be cast to "(int) 0", then the above&lt;br /&gt;would work. 
It's still ugly, but functional.&lt;br /&gt;&lt;br /&gt;The actual solution is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;BEGIN {&lt;br /&gt;	t = timestamp;&lt;br /&gt;}&lt;br /&gt;tick-1ms / timestamp - t &gt; 5*1000*1000*1000 / {&lt;br /&gt;	exit(0);&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Which is fine - although I haven't determined whether the predicate is&lt;br /&gt;cheaper or more expensive than the equivalent code. What is annoying is that&lt;br /&gt;the predicate is a "different part of the language". What if I wanted&lt;br /&gt;to do this:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;tick-1ms {&lt;br /&gt;	do-some-stuff;&lt;br /&gt;	if (var &gt; somevalue) { printf("hello"); exit(0);}&lt;br /&gt;	do-some-more-stuff;&lt;br /&gt;	if (var &gt; someOthervalue) {printf("world"); }&lt;br /&gt;	...&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This can be translated into predicate format, but the transformation&lt;br /&gt;can get ugly, especially if the do-stuff&lt;br /&gt;lines of code are complex in themselves.&lt;br /&gt;&lt;br /&gt;It's time to start addressing these deficiencies in dtrace (at the risk&lt;br /&gt;of adding non-standard extensions to the true code).&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.12a-b6033&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-5758078070418575267?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/5758078070418575267/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/07/dtrace-gripe.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5758078070418575267'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5758078070418575267'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/07/dtrace-gripe.html' title='dtrace gripe'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-1511282549940422211</id><published>2011-07-12T13:51:00.001-07:00</published><updated>2011-07-12T13:51:38.059-07:00</updated><title type='text'>CRiSP website updated</title><content type='html'>After many years of staring at abject ugliness,&lt;br /&gt;http://www.crisp.demon.co.uk has been given a lick of paint, more in tune&lt;br /&gt;with the &lt;a href="http://crtags.blogspot.com"&gt;blog&lt;/a&gt; site in terms&lt;br /&gt;of look and feel and stylesheet.&lt;br /&gt;&lt;br /&gt;I have updated some of the very dated things, and hope to update it more&lt;br /&gt;so.&lt;br /&gt;&lt;br /&gt;Obligatory plug: you can now purchase CRiSP (via paypal) if you so&lt;br /&gt;choose.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.12a-b6033&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-1511282549940422211?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/1511282549940422211/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://crtags.blogspot.com/2011/07/crisp-website-updated.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1511282549940422211'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1511282549940422211'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/07/crisp-website-updated.html' title='CRiSP website updated'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-4954765937174078041</id><published>2011-06-20T15:10:00.001-07:00</published><updated>2011-06-20T15:10:48.133-07:00</updated><title type='text'>dtrace update</title><content type='html'>I've updated dtrace to slightly improve the xcall code. Having tried&lt;br /&gt;it on an AS5 kernel, I hit some other issues.&lt;br /&gt;&lt;br /&gt;Some build issues are fixed (2.6.18 kernels confuse the syscall&lt;br /&gt;extraction code); it mostly works - but some warnings are present.&lt;br /&gt;Additionally, a 'dtrace -n syscall:::' will crash the kernel. I suspect&lt;br /&gt;some mismatch on the ptregs syscalls and/or 32b syscalls on this kernel.&lt;br /&gt;Need to debug.&lt;br /&gt;&lt;br /&gt;Also found that on a 16-core machine, the xcall code leads to a lot&lt;br /&gt;of noise when things aren't the way it expects. 
This eventually led&lt;br /&gt;to an assertion failure in dtrace.c (on a buffer switch - which is&lt;br /&gt;consistent with dtrace_sync() not hitting the expected cpus, i.e.&lt;br /&gt;some race condition/bug), and eventually a complaint from the kernel that&lt;br /&gt;a vm_free was invalid.&lt;br /&gt;&lt;br /&gt;Oh dear.&lt;br /&gt;&lt;br /&gt;To date I have been testing on dual-core cpus. I need to get an i7&lt;br /&gt;so I can ramp up to 8 cores and do more heavy torture tests.&lt;br /&gt;&lt;br /&gt;So, keep an eye out for updates (which are likely to be slow in&lt;br /&gt;coming in the next week or two), whilst I hopefully try to resolve&lt;br /&gt;the xcall issue.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.12a-b6033&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-4954765937174078041?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/4954765937174078041/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/06/dtrace-update.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4954765937174078041'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4954765937174078041'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/06/dtrace-update.html' title='dtrace update'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-3363995722518138827</id><published>2011-06-19T14:17:00.001-07:00</published><updated>2011-06-19T14:17:48.804-07:00</updated><title type='text'>NMI support added</title><content type='html'>I modified the cross-cpu call code to allow use of the NMI interrupt&lt;br /&gt;when the IPI interrupt is not responding. Hopefully this will stop&lt;br /&gt;the xcall code from busting a lock due to a deadlock/timeout.&lt;br /&gt;&lt;br /&gt;It looks like the APIC allows specific interrupts to be marked as&lt;br /&gt;NMI - which would be great since rather than sharing the NMI with&lt;br /&gt;other users of the interrupt, we could just make the IPI interrupt&lt;br /&gt;work like an NMI and avoid the deadlock scenario.&lt;br /&gt;&lt;br /&gt;For now, the interrupt handler tries to be careful and not trigger&lt;br /&gt;when it's uncalled for. It does present a problem if we need the NMI&lt;br /&gt;and someone else does at the same time, but I can investigate how&lt;br /&gt;the APIC works a little better (or check the Solaris code to see&lt;br /&gt;if indeed, that is what it does).&lt;br /&gt;&lt;br /&gt;I also need to update the dtrace_linux.c code so that I don't just&lt;br /&gt;grab interrupt vector 0xea (random interrupt which appears not to be&lt;br /&gt;used, but it could be). 
I am a naughty programmer.&lt;br /&gt;&lt;br /&gt;Release 20110619 contains the above fixes.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.12a-b6033&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-3363995722518138827?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/3363995722518138827/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/06/nmi-support-added.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/3363995722518138827'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/3363995722518138827'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/06/nmi-support-added.html' title='NMI support added'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-4133557192618721468</id><published>2011-06-19T04:56:00.001-07:00</published><updated>2011-06-19T04:56:00.601-07:00</updated><title type='text'>The Final Phase of dtrace</title><content type='html'>I have been writing about the issues of inter cpu cross function&lt;br /&gt;calls (xcall) - a key part of dtrace. 
This feature isn't used very&lt;br /&gt;often, but it's an important part of SMP to ensure consistency in&lt;br /&gt;accessing buffers.&lt;br /&gt;&lt;br /&gt;After a lot of effort, writing, rewriting and rewriting yet again, the&lt;br /&gt;code is (nearly) finished. It looks good - it handles arbitrary&lt;br /&gt;cpus calling into other cpus and allows for an xcall to be interrupted&lt;br /&gt;by another call to xcall (effectively a mesh of NCPU * NCPU callers).&lt;br /&gt;&lt;br /&gt;However, I have found a flaw. If I modify the dtrace_sync() function&lt;br /&gt;to sync 100-200 times instead of just once, then occasionally there are delays&lt;br /&gt;and kernel printk()s from the code - where spinlocks are taking too long.&lt;br /&gt;&lt;br /&gt;Turns out, we could deadlock if we try to invoke an IPI on another&lt;br /&gt;CPU which has interrupts disabled. Not totally sure how Solaris handles&lt;br /&gt;this - I get a little lost in the maze of mutex_enter() and splx() code.&lt;br /&gt;&lt;br /&gt;There is a solution to take us to the next level - NMI - the NMI interrupt&lt;br /&gt;is not maskable (unless an NMI is in progress). 
NMIs are typically used&lt;br /&gt;by Linux for a watchdog facility - make sure CPUs arent locking up, as&lt;br /&gt;well as "danger signals" (like ECC/parity memory errors).&lt;br /&gt;&lt;br /&gt;I will experiment to see if I can run via an NMI rather than a normal&lt;br /&gt;interrupt and that should help reduce the problems of lock-busting &lt;br /&gt;significantly.&lt;br /&gt;&lt;br /&gt;At the moment dtrace is pretty good - my ultra-torture tests really&lt;br /&gt;are horrible, and most people wont do that in real life.&lt;br /&gt;&lt;br /&gt;So, as always, tread carefully until *you* feel happy this is not&lt;br /&gt;going to panic your production system.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.12a-b6033&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-4133557192618721468?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/4133557192618721468/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/06/final-phase-of-dtrace.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4133557192618721468'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4133557192618721468'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/06/final-phase-of-dtrace.html' title='The Final Phase of dtrace'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-2465457418490584776</id><published>2011-06-16T15:24:00.001-07:00</published><updated>2011-06-16T15:24:01.854-07:00</updated><title type='text'>Update to prior post</title><content type='html'>My findings in the prior post are not strictly the end of the story.&lt;br /&gt;Subsequent to this, I found that I needed to resort to the cmpxchg &lt;br /&gt;instructions to beef up the resilience of dtrace_xcalls. Current&lt;br /&gt;results look good.&lt;br /&gt;&lt;br /&gt;More testing to follow and I need to fix AS4 (Linux 2.6.9) kernel&lt;br /&gt;compilation issues.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.12a-b6030&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-2465457418490584776?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/2465457418490584776/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/06/update-to-prior-post.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2465457418490584776'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2465457418490584776'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/06/update-to-prior-post.html' title='Update to prior post'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-4952407518504264590</id><published>2011-06-16T12:14:00.001-07:00</published><updated>2011-06-16T12:14:53.275-07:00</updated><title type='text'>DTrace xcall issue -- fixed? Website of the day.</title><content type='html'>&lt;a href='http://forum.osdev.org/viewtopic.php?f=1&amp;t=21768&amp;start=0'&gt;http://forum.osdev.org/viewtopic.php?f=1&amp;t=21768&amp;start=0&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Now, this has been driving me nuts for months. Why was my spanking new&lt;br /&gt;cross-cpu code hanging occasionally? I had spent ages building up&lt;br /&gt;the courage to write it, and was fairly proud of it. But it just&lt;br /&gt;wasn't reliable enough and I disabled it in recent releases of dtrace.&lt;br /&gt;&lt;br /&gt;Here's the problem: a cross-cpu synchronisation call is needed in dtrace.&lt;br /&gt;Not often, but in key components. I feel like the way this was done in&lt;br /&gt;dtrace was almost laziness, because there are other ways to achieve&lt;br /&gt;this (I believe). But the single cross call (in dtrace_sync()) is&lt;br /&gt;a problem...&lt;br /&gt;&lt;br /&gt;Interestingly, I was surprised it was called so often. It's called&lt;br /&gt;during the tear-down of /usr/bin/dtrace as the process exits. I had&lt;br /&gt;wondered why dtrace intercepts ^C and doesn't die immediately. It&lt;br /&gt;does something very curious - it intercepts ^C and asks the driver nicely&lt;br /&gt;to tear down the probes we may have set up. Of course, you can kill -9&lt;br /&gt;the process, and it works. *But*. *But*. If you do that the probes&lt;br /&gt;aren't torn down! Instead, they are left running. After about 20-30s,&lt;br /&gt;since nothing in user land empties the buffers, the kernel auto&lt;br /&gt;garbage collects, but it means that in a kill -9 scenario, whatever&lt;br /&gt;you were tracing may continue to take effect. 
&lt;br /&gt;&lt;br /&gt;I don't like the way ^C works in dtrace and I may attempt to fix it&lt;br /&gt;(eg fork a child to tear down the probes; tear down is done by a STOP&lt;br /&gt;ioctl(), btw).&lt;br /&gt;&lt;br /&gt;Ok - so cross calls happen a lot, especially during tear down (and also&lt;br /&gt;during timer/tick interrupt handling).&lt;br /&gt;&lt;br /&gt;So .. what happens? Well, on a two cpu system, the cpu invoking&lt;br /&gt;the cross call deadlocks against the other cpu waiting for the remote&lt;br /&gt;procedure call to be acknowledged. &lt;br /&gt;&lt;br /&gt;With the original Linux smp_call_function() there were lots of issues&lt;br /&gt;in calling it with interrupts disabled (ie from the timer tick interrupt).&lt;br /&gt;This is not allowed - two cpus calling each other at the same time&lt;br /&gt;will deadlock.&lt;br /&gt;&lt;br /&gt;The cross-call code has to run with interrupts enabled and that&lt;br /&gt;means being very careful with reentrancy and mutual invocation.&lt;br /&gt;&lt;br /&gt;One day I put some debug code into the code to try and spot mutual or&lt;br /&gt;nested invocations and I got a hit. On a real machine. But never on&lt;br /&gt;my VMs.&lt;br /&gt;&lt;br /&gt;I modified the code to allow a break-out - after too long waiting, the&lt;br /&gt;code gives up and allows the machine to stay intact. Without this, the&lt;br /&gt;machine would lock up (deadlock with interrupts disabled).&lt;br /&gt;&lt;br /&gt;I fixed the code to handle mutual invocation and recursion.&lt;br /&gt;&lt;br /&gt;But I could not figure out what the locked-up CPU was doing. I tried&lt;br /&gt;to get stack dumps from the locked CPU - but these would only happen&lt;br /&gt;after dtrace had given up waiting. It's as if the other CPU was asleep&lt;br /&gt;and wouldn't wake up until the primary CPU had given up looking&lt;br /&gt;(a definite Heisenbug!).&lt;br /&gt;&lt;br /&gt;The web link at the top of this page illustrates the exact same&lt;br /&gt;setup I was seeing. 
So, I followed the page (it explains that &lt;br /&gt;acknowledging an end of interrupt to the APIC prematurely may not&lt;br /&gt;work on a VM).&lt;br /&gt;&lt;br /&gt;Not only had I spent a huge amount of time understanding, fixing and engineering&lt;br /&gt;a solution but I almost had a working solution without realising it.&lt;br /&gt;I had moved the APIC_EOI code to the end of the interrupt routine&lt;br /&gt;previously, but because of the lack of support for mutual invocation, it&lt;br /&gt;hadn't worked. So I put it back again.&lt;br /&gt;&lt;br /&gt;So I think this is looking good - much better than before. I need&lt;br /&gt;to do more torture testing and cleanup before I release.&lt;br /&gt;&lt;br /&gt;On the way, I tried, or started trying, lots of things&lt;br /&gt;(like using a crash dump to analyse this problem .. which wasn't&lt;br /&gt;successful). Or, using NMI interrupts instead of normal interrupts.&lt;br /&gt;I've learnt a lot and been frustrated by a lot too along the way.&lt;br /&gt;&lt;br /&gt;Keep an eye on twitter .. 
I'll report a status update if I think&lt;br /&gt;I am not close enough.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.12a-b6030&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-4952407518504264590?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/4952407518504264590/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/06/dtrace-xcall-issue-fixed-website-of-day.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4952407518504264590'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4952407518504264590'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/06/dtrace-xcall-issue-fixed-website-of-day.html' title='DTrace xcall issue -- fixed? Website of the day.'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-4169740272738568492</id><published>2011-06-12T09:21:00.001-07:00</published><updated>2011-06-12T09:21:01.231-07:00</updated><title type='text'>My blogs</title><content type='html'>Just reading Nigel Smiths Blog (http://nwsmith.blogspot.com/), which&lt;br /&gt;has a nice back-reference to this blog and dtrace.&lt;br /&gt;&lt;br /&gt;People may find my blogs a bit confusing. 
I thought it worth detailing "why".&lt;br /&gt;&lt;br /&gt;Originally I set up a series of blog posts, using my own Perl blog code,&lt;br /&gt;which was, in turn, based on the nanoblogger code &lt;br /&gt;(http://nanoblogger.sourceforge.net/).&lt;br /&gt;&lt;br /&gt;The website I publish to (www.crisp.demon.co.uk) is interesting in itself.&lt;br /&gt;Demon was the first ISP in the UK (back in the early 1990s) to offer access&lt;br /&gt;to the Internet. Alas, they have never done anything useful since then,&lt;br /&gt;and I pay subscriptions for a near-useless service (teeny amount of&lt;br /&gt;web space, no perl or cli or anything else). Because space is so tight,&lt;br /&gt;I tend to leave most things, including CRiSP and Dtrace downloads, on my&lt;br /&gt;internet-facing machine at home. The only thing that Demon usefully&lt;br /&gt;serves me is the email address, although I do try to get people to switch&lt;br /&gt;to my (numerous) gmail accounts.&lt;br /&gt;&lt;br /&gt;I was using the Dyndns service for a DNS entry but due to some silliness&lt;br /&gt;on my part, I lost the name entry, which put dtrace off the map&lt;br /&gt;for many people. I set up a new address (via crisp.dyndns-server.com).&lt;br /&gt;&lt;br /&gt;I should just pay for a normal DNS entry but I haven't decided what I want.&lt;br /&gt;&lt;br /&gt;The crisp.demon.co.uk site is costly, much more costly than a decent hosted&lt;br /&gt;web appliance, so I do need to do something.&lt;br /&gt;&lt;br /&gt;At home, I have two main dev machines - and when I blog post, I try to&lt;br /&gt;update both the original Demon hosted site, and also blogger. It turns&lt;br /&gt;out to be easier to update blogspot first, and then Demon at a later&lt;br /&gt;date when I "get around to it". ("Get around to it" means powering on&lt;br /&gt;my main PC, running a script, and shutting it down again). 
Things&lt;br /&gt;got confused because I have two dev machines and have to be careful how&lt;br /&gt;I sync to and from each other.&lt;br /&gt;&lt;br /&gt;So, thats the feeble excuse for me appearing and disappearing in&lt;br /&gt;the waves.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.11a-b6022&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-4169740272738568492?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/4169740272738568492/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/06/my-blogs.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4169740272738568492'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4169740272738568492'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/06/my-blogs.html' title='My blogs'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-7399527913215346869</id><published>2011-06-11T02:00:00.001-07:00</published><updated>2011-06-11T02:00:01.119-07:00</updated><title type='text'>dtrace progress</title><content type='html'>Been continuing work to increase resilience of dtrace. 
One thing&lt;br /&gt;I found was that there are some syscalls which have a different &lt;br /&gt;calling sequence compared to the others (fork, clone, sigreturn, execve&lt;br /&gt;and a few others).&lt;br /&gt;&lt;br /&gt;Bear in mind when we think of a kernel - there are multiple &lt;br /&gt;views of the kernel:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  - 64b kernel running 64b apps&lt;br /&gt;  - 32b kernel running 32b apps&lt;br /&gt;  - 64b kernel running 32b apps&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The apps get to the kernel via system calls. System calls are implemented&lt;br /&gt;in a variety of ways - depending on the kernel version and the&lt;br /&gt;CPU. (Some older cpus, such as the i386 and i486, don't support instructions&lt;br /&gt;like SYSCALL and SYSENTER).&lt;br /&gt;&lt;br /&gt;So dtrace traps the system calls by patching the system call table.&lt;br /&gt;The code is mostly the same but subtly different for a 32b and 64b &lt;br /&gt;kernel.&lt;br /&gt;&lt;br /&gt;But when a 32b app is running on a 64b kernel - the app doesn't know&lt;br /&gt;any different, but the kernel does. The kernel has two system call&lt;br /&gt;tables: a given system call, e.g. "open", has a different index in the&lt;br /&gt;32b and 64b tables. The two kernels developed differently. i386 kernels have&lt;br /&gt;had to maintain backwards compatibility, but the amd64 kernel did not&lt;br /&gt;and started afresh at the point these cpus became available.&lt;br /&gt;&lt;br /&gt;Dtrace handles that.&lt;br /&gt;&lt;br /&gt;Except it didn't handle the special syscalls: when a 32b app invokes&lt;br /&gt;fork(), clone(), etc, we usually ended up panicking the kernel.&lt;br /&gt;&lt;br /&gt;Most Linux distros are "pure": a 64b distro has 64b apps, so you&lt;br /&gt;rarely see the effect of a 32b app.&lt;br /&gt;&lt;br /&gt;Linux/dtrace has a nice interface for system calls. 
The probe name, e.g.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n syscall:::&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;matches all system calls. But the 32b and 64b calls are different probes.&lt;br /&gt;So, you can intercept all 32b syscalls on a 64b system:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n syscall:x32::&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;which is useful in many ways.&lt;br /&gt;&lt;br /&gt;I have nearly fixed these special syscalls on the 64b kernel - just&lt;br /&gt;have clone() to fix. The symptom of not fixing is a cascade of&lt;br /&gt;kernel OOPs and panics (because the kernel stack layout is not&lt;br /&gt;what it should be).&lt;br /&gt;&lt;br /&gt;I hope to release later today a fix for this problem.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.10a-b6012&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-7399527913215346869?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/7399527913215346869/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/06/dtrace-progress.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7399527913215346869'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7399527913215346869'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/06/dtrace-progress.html' title='dtrace progress'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-294244672141045480</id><published>2011-06-05T13:37:00.001-07:00</published><updated>2011-06-05T13:37:47.962-07:00</updated><title type='text'>dtrace -- some updates</title><content type='html'>After spending a lot of effort on the xcall issue, I hit an issue&lt;br /&gt;where occasionally, system calls would fail. The regression&lt;br /&gt;test shows this up by running a perl script which continuously&lt;br /&gt;opens an existing and a non-existing file, plus a variety of other things.&lt;br /&gt;&lt;br /&gt;Very occasionally, Perl would emit a warning about a reference to a file&lt;br /&gt;handle belonging to a file which couldn't be opened&lt;br /&gt;(/etc/hosts - which always exists).&lt;br /&gt;&lt;br /&gt;Similarly, other apps would occasionally fail to start with rtld&lt;br /&gt;linker errors.&lt;br /&gt;&lt;br /&gt;This proved very hard to track down: I was pretty certain it was&lt;br /&gt;related to the xcall work I was doing. The errors were rare - less&lt;br /&gt;than 1 in a million - and almost impossible to track down.&lt;br /&gt;&lt;br /&gt;I moved away from xcall debugging and found that with two&lt;br /&gt;simple perl scripts (on a dual core machine), which continuously opened&lt;br /&gt;files and did nothing else, the error rate would increase whilst&lt;br /&gt;the two scripts ran.&lt;br /&gt;&lt;br /&gt;To try and get a better handle on this, I moved from 64-bit kernel&lt;br /&gt;debugging to 32-bit kernel, where the error rate was significantly&lt;br /&gt;higher.&lt;br /&gt;&lt;br /&gt;After a lot of experimentation, it transpired that the error wasn't to do&lt;br /&gt;with xcall, but with the syscall provider. Specifically, a piece of&lt;br /&gt;assembler glue turned out to be rubbish. 
I am not sure why it appeared to&lt;br /&gt;work, but it didnt. (I had made some changes earlier on which may&lt;br /&gt;have broken the syscall tracing on 32-bit kernels).&lt;br /&gt;&lt;br /&gt;After recoding the assembler glue - things looked much better. The&lt;br /&gt;errors in syscall processing appeared to be gone. But a new problem&lt;br /&gt;surfaced - one I wasnt too surprised to see. There are a handful&lt;br /&gt;of 32-bit syscalls which use a differing calling convention to the others.&lt;br /&gt;(The 64-bit code handles this, but not the 32-bit code).&lt;br /&gt;&lt;br /&gt;I have nearly finished redoing the 32-bit syscall tracing, and, once&lt;br /&gt;done, will need to validate the 64-bit syscall tracing.&lt;br /&gt;&lt;br /&gt;If I am lucky, hopefully in the next few days or weeks, the resiliency&lt;br /&gt;issues will disappear and I can put out a new release.&lt;br /&gt;&lt;br /&gt;The syscall tracing code is horribly ugly - because we have to support&lt;br /&gt;different calling conventions across the two types of cpu architecture.&lt;br /&gt;I may split the code up into an x86 and x86_64 code file.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.11a-b6022&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-294244672141045480?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/294244672141045480/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/06/dtrace-some-updates.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/294244672141045480'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/8336326562741944626/posts/default/294244672141045480'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/06/dtrace-some-updates.html' title='dtrace -- some updates'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-3460524130861507347</id><published>2011-06-02T14:03:00.001-07:00</published><updated>2011-06-02T14:03:38.523-07:00</updated><title type='text'>Bad websites</title><content type='html'>Bad websites. Whats with them?&lt;br /&gt;&lt;br /&gt;I have a beef with a variety of websites - nice websites, let down by the&lt;br /&gt;"We dont care attitude" or "We didnt test it".&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;&lt;a href="http://www.tvguide.co.uk"&gt;http://www.tvguide.co.uk&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;br /&gt;I despair of this web site. Its a great guide to TV channels for UK&lt;br /&gt;people. Nice layout. Lots of content.&lt;br /&gt;&lt;br /&gt;So, whats wrong with it?&lt;br /&gt;&lt;br /&gt;A number of things. One - the menu bar at the top of the screen is over&lt;br /&gt;engineered. If you try to do something, like select one of the sub-menu items,&lt;br /&gt;the ability to navigate and not lose context is near impossible. Try and&lt;br /&gt;select something, e.g. "New series". I leave you to find which submenu&lt;br /&gt;thats under (a minor annoyance).&lt;br /&gt;&lt;br /&gt;Secondly, the huge amount of real estate given over to pointless banners.&lt;br /&gt;These arent advertising banners, but program banners. On a small screen&lt;br /&gt;you have no information content on the first screen at all. 
On a large screen&lt;br /&gt;you barely get 50% of your screen with the TV grid.&lt;br /&gt;&lt;br /&gt;The search function is badly over engineered using javascript. &lt;br /&gt;&lt;br /&gt;And if you turn off some ad sites via an ad-blocker, the whole page becomes non-functional.&lt;br /&gt;&lt;br /&gt;And lastly, the page quite often forgets who you are and your&lt;br /&gt;channel selections.&lt;br /&gt;&lt;br /&gt;I gave up with this site, and wrote my own TV highlighting application.&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;BBC RadioTimes&lt;/h1&gt;&lt;br /&gt;&lt;br /&gt;The BBC provides XML files containing 14 days of TV schedules. This is&lt;br /&gt;a great source of data (which I use in my TV planning application).&lt;br /&gt;&lt;br /&gt;But the reviews are *awful*. No, make that, *truly awful*. When I see&lt;br /&gt;a film or a series of potential interest, the paragraph of review is of&lt;br /&gt;this form:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;This film, made by XXX YYY, is a follow on to his earlier work ZZZ, AAA, BBB.&lt;br /&gt;The director did blah, and the actors did bloop. The film won an award at&lt;br /&gt;Cannes, and went straight to video.&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Can you tell whats wrong with the above? Its totally devoid of any information&lt;br /&gt;about what the film or program is *about*. The reviews/write-ups on&lt;br /&gt;tvguide.co.uk at least tell you what the program is about.&lt;br /&gt;&lt;br /&gt;Heres a real quote from the BBC:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;One of two low-budget westerns made by Barbara Stanwyck - the other was &lt;br /&gt;1956's The Maverick Queen - before she found her glorious late-career &lt;br /&gt;stride with such titles as Forty Guns and TV's The Big Valley. 
&lt;br /&gt;Aided by thoughtful direction from the prolific and talented Allan Dwan, &lt;br /&gt;this movie now has great curiosity value, in that the leading man is &lt;br /&gt;former US president Ronald Reagan, a bland and colourless performer &lt;br /&gt;when pitted against screen villains Gene Evans and Jack Elam. &lt;br /&gt;The location scenery is very attractive, the action sequences &lt;br /&gt;well staged, and Stanwyck as tough as ever: it's a shame the &lt;br /&gt;script didn't give her or any of the cast more opportunities. &lt;br /&gt;Still, this will pass the time nicely, and teenage girls might &lt;br /&gt;discover a useful role model.&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;&lt;a href="http://www.gizmodo.com"&gt;Gizmodo.com&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;br /&gt;My first site of the day is http://www.dailymail.co.uk. Yes, I know - thats&lt;br /&gt;a poor choice of a news website, but consider it bubblegum for the&lt;br /&gt;brain first thing in the morning. My second is &lt;br /&gt;&lt;a href="http://www.engadget.com"&gt;Engadget&lt;/a&gt;. A very nice&lt;br /&gt;and highly fluid website with news stories of interest to me.&lt;br /&gt;&lt;br /&gt;And this, *was* Gizmodo. But I have removed the link from my web browser.&lt;br /&gt;&lt;br /&gt;On my ipad, I have a cached page dating back to April - I cannot get it to&lt;br /&gt;update. I dont know what they did. On my other devices, I dont have the&lt;br /&gt;caching problem. Gizmodo used to track Engadget in style and content.&lt;br /&gt;But recently, they have overhauled it. And they have not done any&lt;br /&gt;user testing as far as I can tell.&lt;br /&gt;&lt;br /&gt;First, I would be redirected to the mobile site, even though I dont&lt;br /&gt;want that. 
Now, they have reformatted the website, and it's totally&lt;br /&gt;devoid of content on the front page.&lt;br /&gt;&lt;br /&gt;It used to work and be a great site, but I waste my monthly bandwidth&lt;br /&gt;quota visiting Gizmodo and hoping for something useful to browse.&lt;br /&gt;&lt;br /&gt;So, goodbye to Gizmodo. Maybe, when others start linking to it again&lt;br /&gt;and it contains useful content (even if it's a rehash of other sites),&lt;br /&gt;I will revisit.&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;&lt;a href="http://slashdot.org"&gt;Slashdot&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;br /&gt;This site has been great for years. Until now. The pool of people-powered&lt;br /&gt;news stories they have is great. But slashdot have been playing games &lt;br /&gt;with their presentation and the site - my 4th choice of read of the day - is&lt;br /&gt;close to being binned as well.&lt;br /&gt;&lt;br /&gt;For starters, the three column format is annoying. Very annoying.&lt;br /&gt;When browsing on a mobile/small screen device, the left hand column &lt;br /&gt;requires you to scroll the screen to view the text. I never look at the&lt;br /&gt;left hand column - because I know it never changes. So why waste prime&lt;br /&gt;real estate with that, *there*?&lt;br /&gt;&lt;br /&gt;Next. Slashdot has tried to avoid slow and large home page loads. I applaud&lt;br /&gt;that. But they have done that by limiting the number of visible stories&lt;br /&gt;to about 6. Given that they seem to dribble items out at about one&lt;br /&gt;per hour, that means it's pointless visiting the site&lt;br /&gt;repeatedly during the day. And if you leave it too long, you lose&lt;br /&gt;continuity of stories you have read/not-read. 
(You have to scroll&lt;br /&gt;to the bottom of the screen, click on "More", wait, wait, and&lt;br /&gt;then you see the stories you saw a few hours ago).&lt;br /&gt;&lt;br /&gt;Slashdot seems to have "lost it" - it used to be an interesting&lt;br /&gt;place to read non-news stories, about technology, but they have&lt;br /&gt;taken the Gizmodo approach - reduce the amount of useful info&lt;br /&gt;on the page to the point where visiting it has taken on a boring&lt;br /&gt;attitude.&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;&lt;a href="http://www.bbc.co.uk"&gt;BBC&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;br /&gt;BBC - what a poor website. It used to be awful. Now it is pointless.&lt;br /&gt;Another home page devoid of content. Its full of flash cleverness where&lt;br /&gt;you can edit the layout, but I dont want to do that. I want to see news.&lt;br /&gt;The news page is devoid of information - almost like it is a commodity&lt;br /&gt;which is in short supply.&lt;br /&gt;&lt;br /&gt;(Compare the BBC news defaults with the Dailymail website - theres enough&lt;br /&gt;information in each paragraph on Dailymail to decide if you&lt;br /&gt;want to read further. On BBC, you have to guess if the news item says&lt;br /&gt;anything useful).&lt;br /&gt;&lt;br /&gt;Next, try reading BBC on a mobile device. The customisations do not work&lt;br /&gt;(at least, not on my android device). The site is untested in real life.&lt;br /&gt;I rarely look at BBC - every few years when I look, I think the same thing.&lt;br /&gt;A waste.&lt;br /&gt;&lt;br /&gt;There *is* good content on the BBC site - if you spend the time to find&lt;br /&gt;the programme schedule and radio information. But using the BBC website&lt;br /&gt;is like having an unfaithful lover: things move around so much you are&lt;br /&gt;never sure if the site will be the same when you visit it. It would not&lt;br /&gt;be so bad if it got better when the changes happen. 
But it gets worse.&lt;br /&gt;&lt;br /&gt;The real-estate vs information content is so low, that it reminds me&lt;br /&gt;of the days of a Teletype (ASR-33 with a paper punch drive).&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;Can I do better?&lt;/h1&gt;&lt;br /&gt;&lt;br /&gt;I dont for one moment think I can do better than these sites. I have&lt;br /&gt;learnt lots of interesting things (both in terms of content and in&lt;br /&gt;terms of presentation). But the dilution of news sites which&lt;br /&gt;all feed off each other, has made the internet quite boring.&lt;br /&gt;&lt;br /&gt;Which is a shame.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.10a-b6012&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-3460524130861507347?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/3460524130861507347/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/06/bad-websites_855.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/3460524130861507347'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/3460524130861507347'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/06/bad-websites_855.html' title='Bad websites'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-5118414331196066256</id><published>2011-05-31T13:30:00.001-07:00</published><updated>2011-05-31T13:30:37.898-07:00</updated><title type='text'>dtrace update 20110531</title><content type='html'>It's been a while since I put out a dtrace update and I thought it&lt;br /&gt;worthwhile to give a brief update of what is annoying me.&lt;br /&gt;&lt;br /&gt;The most annoying thing in dtrace at the moment is me. I have spent&lt;br /&gt;the last two months trying hard to resolve some resilience issues.&lt;br /&gt;&lt;br /&gt;At the moment, there are two of them: (1) the xcall code, and&lt;br /&gt;(2) something to do with syscall tracing.&lt;br /&gt;&lt;br /&gt;As related in prior blog posts, dtrace_xcall() (which&lt;br /&gt;does inter-cpu synchronisation) doesn't map to Linux very well. On&lt;br /&gt;Solaris, the inter-cpu calling code works from interrupt context,&lt;br /&gt;but we cannot do that in Linux. (Linux will write a warning to&lt;br /&gt;the /var/log/messages file when this happens - although it does mostly&lt;br /&gt;work).&lt;br /&gt;&lt;br /&gt;I have tried a number of variants of a native IPI system in dtrace and&lt;br /&gt;they have failed with various problems. The biggest problem is that&lt;br /&gt;on an SMP system, cpu#1 will invoke a call to cpu#2 but cpu#2 won't respond&lt;br /&gt;until cpu#1 finishes the xcall (a deadlock). In the code, I have&lt;br /&gt;resolved the deadlock by giving up after a suitable period of time,&lt;br /&gt;but that's not good enough. Trying to find out what cpu#2 is doing&lt;br /&gt;when it refuses to respond to the interrupt is very tricky. Various&lt;br /&gt;ad-hoc debug tricks (like using the native smp_call_function() to dump&lt;br /&gt;stacks) failed. 
Additionally, the ordering of messages&lt;br /&gt;written to /var/log/messages is horrendous when I am doing my implementation&lt;br /&gt;of xcall - the cpus write out of order with timestamps going backwards.&lt;br /&gt;(I can understand why, but it doesn't help).&lt;br /&gt;&lt;br /&gt;I have given up resolving the SMP cross-call issue, and instead have&lt;br /&gt;been trying something different. The only place where the&lt;br /&gt;xcall issue is a problem is the profile/tick provider hr_timer clock&lt;br /&gt;interrupts. So I have modified the code to use a tasklet structure instead.&lt;br /&gt;This seems to work (I have a race condition to fix before I&lt;br /&gt;can release it).&lt;br /&gt;&lt;br /&gt;But, during all this testing, I hit another strange and annoying scenario.&lt;br /&gt;Doing:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n syscall:::&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;and doing intensive things in another window, like:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ while true ; do date ; done&lt;br /&gt;...&lt;br /&gt;date: error while loading shared libraries: /lib64/ld-linux-x86-64.so.2: cannot apply additional memory protection after relocation: Error 9&lt;br /&gt;...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Very occasionally, the system barfs. I have seen the output from dtrace&lt;br /&gt;hang (it hangs until I press a key on the keyboard). I have tracked&lt;br /&gt;this down : when a write() syscall is being executed, it's being&lt;br /&gt;turned into a read() syscall.&lt;br /&gt;&lt;br /&gt;The event is very rare - 1 in hundreds of thousands of syscalls, but it's&lt;br /&gt;horrible. And it's *this* problem which is likely what prompted me to go&lt;br /&gt;on the xcall wild-goose chase. 
The "make test" regression suite is very&lt;br /&gt;good at pushing the cpu load to the max whilst doing dtrace things, but&lt;br /&gt;it occasionally would have issues.&lt;br /&gt;&lt;br /&gt;So, if I can chase the 1:100,000 issue in syscall tracing, then I can&lt;br /&gt;move forward. (I suspect a timer interrupt coming in during a syscall might&lt;br /&gt;be causing the issue).&lt;br /&gt;&lt;br /&gt;As always, I will release the code when I feel its better than where&lt;br /&gt;we are.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.10a-b6012&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-5118414331196066256?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/5118414331196066256/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/05/dtrace-update-20110531.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5118414331196066256'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5118414331196066256'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/05/dtrace-update-20110531.html' title='dtrace update 20110531'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-6201558940744284459</id><published>2011-05-20T15:27:00.001-07:00</published><updated>2011-05-20T15:27:01.976-07:00</updated><title type='text'>Dtrace progress - update on xcall</title><content type='html'>I have spent the last few weeks trying to perfect the dtrace_xcall&lt;br /&gt;emulation. It's quite possible I have been wasting my time and/or&lt;br /&gt;looking at the wrong problem and solution.&lt;br /&gt;&lt;br /&gt;What is dtrace_xcall? Very occasionally, dtrace needs to broadcast to the&lt;br /&gt;other CPUs to ensure they are in sync. Typically this happens when&lt;br /&gt;shutting down the user space application as the trace buffers need to be&lt;br /&gt;dropped, but it happens in other scenarios too.&lt;br /&gt;&lt;br /&gt;Solaris provides a cross-cpu function call mechanism, as does Linux.&lt;br /&gt;But they do not work the same way.&lt;br /&gt;&lt;br /&gt;On later Linux kernels, you may see a BUG warning in the&lt;br /&gt;kernel logs, like this:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;[  245.215564] WARNING: at kernel/smp.c:421 smp_call_function_many+0x69/0x1b9()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This is the kernel being nice, and warning that the smp_call_function_many()&lt;br /&gt;function is being abused -- specifically being called from an interrupt&lt;br /&gt;routine.&lt;br /&gt;&lt;br /&gt;The problem is to do with locking and rescheduling should delays happen.&lt;br /&gt;At worst, this can lead to deadlock and a hang of the system you are using.&lt;br /&gt;&lt;br /&gt;In practice, this problem is rare - but that is not good enough.&lt;br /&gt;Dtrace needs to be rock solid. 
As part of the distro, I include a&lt;br /&gt;torture test ("make test") which tries various things, whilst putting&lt;br /&gt;excessive load on the system (forked processes, opening files and doing&lt;br /&gt;things which I have found to cause problems on earlier dtrace releases).&lt;br /&gt;&lt;br /&gt;The test runs well, despite the problem described above, but&lt;br /&gt;very occasionally, one sees horrible "failed to open shlib" type errors&lt;br /&gt;as Linux applications are spawned - rare, but enough to demonstrate we&lt;br /&gt;are doing something wrong and/or have race and locking problems.&lt;br /&gt;&lt;br /&gt;After carefully reviewing the Solaris code and the Linux code, I have&lt;br /&gt;tried about 3 algorithms to resolve the problem. The latest code&lt;br /&gt;is good, and is a fairly close emulation of what Solaris does, but&lt;br /&gt;it is difficult to prove correctness. In this latest code, the&lt;br /&gt;driver allocates its own distinct interrupt vector, and uses this to&lt;br /&gt;send an inter-cpu interrupt (IPI), whilst spinning, waiting for the&lt;br /&gt;other CPUs to complete.&lt;br /&gt;&lt;br /&gt;It works really well, except occasionally - an attempt to raise an interrupt&lt;br /&gt;on the other cpus fails, and we have deadlock. I put some counters&lt;br /&gt;and delays into the code and some "get out of jail" logic to avoid&lt;br /&gt;total machine hang, but that's not good enough. I have tried various&lt;br /&gt;other hacks, but it is demonstrable that we ask for an interrupt to be fired&lt;br /&gt;and sometimes it never arrives. This, I fear, is me not fully understanding&lt;br /&gt;how the APIC works or having interference with other interrupt sources.&lt;br /&gt;&lt;br /&gt;But an important question in all of this is why do we get&lt;br /&gt;that BUG warning, as illustrated above?&lt;br /&gt;&lt;br /&gt;The reason is the timer interrupt. 
If you use the tick/profile probes to&lt;br /&gt;get periodic probes (and even if you do not), the clock will fire&lt;br /&gt;and occasionally this will happen whilst a probe is being serviced.&lt;br /&gt;(You get dtrace timer interrupts even without the use of the&lt;br /&gt;profile provider, since dtrace internally uses this for deadlock or&lt;br /&gt;lack of system responsiveness detection). The more probes you fire&lt;br /&gt;the higher the chance of collision.&lt;br /&gt;&lt;br /&gt;Add to this the complexity of a multicpu system, and debugging is very &lt;br /&gt;difficult. One of the most difficult things to do is to see, from one&lt;br /&gt;CPU, what another CPU is doing. It's easy to litter the driver code&lt;br /&gt;with printk() tracing, and to generate a stack trace when interesting events&lt;br /&gt;happen - but doing this for another cpu to see what/where it is stuck is&lt;br /&gt;not easy (or, at least, I haven't found simple code to do it - I am sure&lt;br /&gt;it's doable, since the kernel provides support via SYSRQ to do this, but&lt;br /&gt;possibly using the same smp_call_function_many calls we are trying to debug).&lt;br /&gt;&lt;br /&gt;But - back to the timer interrupts -- should they even be happening?&lt;br /&gt;&lt;br /&gt;I am about to research this: dtrace_probe() disables interrupts during&lt;br /&gt;operation, so the CPU invoking this function cannot have a reentrancy problem.&lt;br /&gt;&lt;br /&gt;But the cyclic timer code eventually invokes the dtrace_xcall function:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;[  245.215564]  &amp;lt;IRQ&gt;  [&amp;lt;ffffffff8104d9ad&gt;] warn_slowpath_common+0x85/0x9d&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffffa02472d6&gt;] ? dtrace_sync_func+0x0/0xb [dtracedrv]&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffffa02472d6&gt;] ? 
dtrace_sync_func+0x0/0xb [dtracedrv]&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffff8104d9df&gt;] warn_slowpath_null+0x1a/0x1c&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffff81077c34&gt;] smp_call_function_many+0x69/0x1b9&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffffa02472d6&gt;] ? dtrace_sync_func+0x0/0xb [dtracedrv]&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffff81077da6&gt;] smp_call_function+0x22/0x26&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffffa025575d&gt;] orig_dtrace_xcall+0x35/0x4b [dtracedrv]&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffffa0255a85&gt;] dtrace_xcall+0xe/0x1b [dtracedrv]&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffffa0248c9f&gt;] dtrace_sync+0x1a/0x1c [dtracedrv]&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffffa022f9cc&gt;] dtrace_state_deadman+0x46/0x89 [dtracedrv]&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffffa022c875&gt;] be_callback+0x1d/0x2f [dtracedrv]&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffff810692ed&gt;] __run_hrtimer+0xbb/0x143&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffffa022c858&gt;] ? be_callback+0x0/0x2f [dtracedrv]&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffff81069aa6&gt;] hrtimer_interrupt+0xd4/0x1b3&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffff8146ff95&gt;] smp_apic_timer_interrupt+0x79/0x8c&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffff8100a693&gt;] apic_timer_interrupt+0x13/0x20&lt;br /&gt;[  245.215564]  &amp;lt;EOI&gt;  [&amp;lt;ffffffff81397010&gt;] ? 
read_pmtmr+0x10/0x17&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffff81397025&gt;] acpi_pm_read+0xe/0x12&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffff8106dce9&gt;] timekeeping_get_ns+0x1b/0x3d&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffff8106e1ce&gt;] getnstimeofday+0x54/0x89&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffff81065f2c&gt;] sys_clock_gettime+0x61/0x90&lt;br /&gt;[  245.215564]  [&amp;lt;ffffffff81009cf2&gt;] system_call_fastpath+0x16/0x1b&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;So, now I question whether the invocation of a timer interrupt&lt;br /&gt;should invoke a direct call into dtrace, or whether this should be &lt;br /&gt;queued up and deferred. I am 50:50 on this - deferring would imply&lt;br /&gt;complication which doesn't appear in the dtrace code. It's possible&lt;br /&gt;that in the non-dtrace Solaris code, this scenario is handled in the interrupt&lt;br /&gt;handlers - maybe some form of decoupling of the interrupt from the&lt;br /&gt;probe firing. (I can't figure out if/how that can be done, without &lt;br /&gt;the kernel doing occasional checks).&lt;br /&gt;&lt;br /&gt;It is possible that Solaris (and MacOS) are both broken and that&lt;br /&gt;it is possible for the system to be deadlocked, but I haven't&lt;br /&gt;found evidence this can happen - so, somehow, this is resolved.&lt;br /&gt;&lt;br /&gt;(Note that dtrace linux takes some cheap shortcuts in the hrtimer&lt;br /&gt;code and avoids some of the complexity of the cyclic timer code - I haven't&lt;br /&gt;understood enough of the details to see what it does and if we need it&lt;br /&gt;in Linux; maybe eventually for accurate and non-skewable clock sources).&lt;br /&gt;&lt;br /&gt;Before I release a new dtrace, I need to decide what to do with some&lt;br /&gt;of the experimental code (presently, there is missing code on the i386&lt;br /&gt;side, and issues with older kernel compilations, along with a bit of&lt;br /&gt;dirtiness in stealing an interrupt rather than properly allocating one&lt;br 
/&gt;via the kernel APIs).&lt;br /&gt;&lt;br /&gt;If anyone is still reading this, you can occasionally pick up private&lt;br /&gt;beta releases (dtrace-tmp.tar.bz2) in the dtrace ftp dir, but I dont&lt;br /&gt;advise touching these - as the code will be in an indeterminate state&lt;br /&gt;along with too much debugging enabled.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.10a-b6012&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-6201558940744284459?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/6201558940744284459/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/05/dtrace-progress-update-on-xcall.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6201558940744284459'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6201558940744284459'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/05/dtrace-progress-update-on-xcall.html' title='Dtrace progress - update on xcall'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-7482839026811709986</id><published>2011-05-10T15:28:00.001-07:00</published><updated>2011-05-10T15:28:23.139-07:00</updated><title type='text'>dtrace update -- xcall 
interim</title><content type='html'>I hope I have this fixed - am just trying to prove the quality of the&lt;br /&gt;implementation. I'll write up the experimentation and results another&lt;br /&gt;day when I have time.&lt;br /&gt;&lt;br /&gt;I have implemented my own inter-processor function call interrupt because&lt;br /&gt;the Linux one isn't viable from within an interrupt routine.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.9a-b6004&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-7482839026811709986?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/7482839026811709986/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/05/dtrace-update-xcall-interim.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7482839026811709986'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7482839026811709986'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/05/dtrace-update-xcall-interim.html' title='dtrace update -- xcall interim'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-7332385073911577827</id><published>2011-05-02T13:21:00.001-07:00</published><updated>2011-05-02T13:21:08.030-07:00</updated><title type='text'>Trials and Tribulations 
of Dtrace</title><content type='html'>I reported a while back about the issue with cross-cpu function&lt;br /&gt;calls (via the smp_call_function() series of functions).&lt;br /&gt;&lt;br /&gt;On Solaris, the way dtrace_xcall() works is to invoke an inter-cpu &lt;br /&gt;interrupt in order to ensure the CPUs are in sync with certain data structures.&lt;br /&gt;&lt;br /&gt;Alas, on Linux, the smp_call_function() family is not callable from an interrupt&lt;br /&gt;routine. Unluckily, dtrace needs to do exactly that, and breaking the API&lt;br /&gt;contract in this way can cause the system to deadlock / hang.&lt;br /&gt;&lt;br /&gt;I tried an alternate implementation of dtrace_xcall() using timers,&lt;br /&gt;but found problems in getting this to work.&lt;br /&gt;&lt;br /&gt;I tried a 3rd variant - not invoking cross-cpu calls, but, instead&lt;br /&gt;executing the cross-cpu call 'on-behalf-of' the current cpu. This&lt;br /&gt;looks more promising, but requires putting a mutex (actually, a spin-lock)&lt;br /&gt;around the dtrace_probe function, to avoid one cpu executing a probe, whilst&lt;br /&gt;another cpu is doing the housework to clean up.&lt;br /&gt;&lt;br /&gt;This does look better, but I am currently having rare scenarios&lt;br /&gt;where some syscalls can break: a repeated series of process forks can&lt;br /&gt;occasionally give rise to a problem mmap-ing one of the shared libraries.&lt;br /&gt;This is very difficult to debug - as many millions of calls can work&lt;br /&gt;correctly, before one of them fails. &lt;br /&gt;&lt;br /&gt;By "thinking about the problem" it is likely that some form of stack&lt;br /&gt;or register corruption is happening somewhere. 
What might be worse is&lt;br /&gt;CPU cache consistency issues happening - I am fearful of cache consistency&lt;br /&gt;issues being very difficult to debug, but lets see what happens.&lt;br /&gt;&lt;br /&gt;I have put extra debug code into dtrace to try and spot the issues.&lt;br /&gt;&lt;br /&gt;There is another approach which is to intercept the NMI interrupt which&lt;br /&gt;is where IPI interrupts come from, but I need to think this one through&lt;br /&gt;before attempting this.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.7a-b5984&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-7332385073911577827?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/7332385073911577827/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/05/trials-and-tribulations-of-dtrace.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7332385073911577827'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7332385073911577827'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/05/trials-and-tribulations-of-dtrace.html' title='Trials and Tribulations of Dtrace'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-6107381875782321646</id><published>2011-04-30T11:34:00.001-07:00</published><updated>2011-04-30T11:34:05.945-07:00</updated><title type='text'>Natty Ubuntu - n.i.c.e !</title><content type='html'>After a year of Ubuntu failing to work with suspend-to-ram (despite&lt;br /&gt;upgrading kernels, debugging the powersave etc), it now works in 11.04.&lt;br /&gt;&lt;br /&gt;Thank you!&lt;br /&gt;&lt;br /&gt;My only complaint is that iwconfig broke in this release.&lt;br /&gt;My prior command to set up the wifi stopped working, and it took a number&lt;br /&gt;of rereadings of the man page to figure out what was wrong.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;iwconfig eth1 key XXXXXX&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Firstly, my wifi changed from wlan0 to eth1. But the key was not being&lt;br /&gt;accepted no matter what I tried. I eventually did this:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;iwconfig eth1 key XXXXXX '[1]'&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;and that fixed it. Strange. Ubuntu 11.04 is a 2.6.38.3 kernel - the same&lt;br /&gt;kernel I had previously (or maybe it is 2.6.38.8 - am a bit confused&lt;br /&gt;about what it did to my /boot and /usr/src directories).&lt;br /&gt;&lt;br /&gt;Just found they broke my googlecl install. 
Had to reinstall python-gdata&lt;br /&gt;and the googlecl package to allow me to post to the blog again.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.7a-b5984&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-6107381875782321646?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/6107381875782321646/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/04/natty-ubuntu-nice.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6107381875782321646'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6107381875782321646'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/04/natty-ubuntu-nice.html' title='Natty Ubuntu - n.i.c.e !'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-2748088092118491854</id><published>2011-04-29T08:56:00.001-07:00</published><updated>2011-04-29T08:56:37.809-07:00</updated><title type='text'>CRiSP Adding gridlines support</title><content type='html'>I've added support for gridlines - which show you a line denoting&lt;br /&gt;the indentation structure of the file you are editing. 
Its interesting&lt;br /&gt;because what should just be a 'here it is' feature turns into a possible&lt;br /&gt;feature with lots of sub-options.&lt;br /&gt;&lt;br /&gt;I have added a 'set [no]gridlines' command as an interim to turn it&lt;br /&gt;on/off, so I can become comfortable with what it shows me.&lt;br /&gt;&lt;br /&gt;This is an alternative to the 'ruler' which is more 'in your face' in the&lt;br /&gt;way the display is done.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.7a-b5983&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-2748088092118491854?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/2748088092118491854/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/04/crisp-adding-gridlines-support.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2748088092118491854'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/2748088092118491854'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/04/crisp-adding-gridlines-support.html' title='CRiSP Adding gridlines support'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-440289449224502114</id><published>2011-04-26T13:26:00.001-07:00</published><updated>2011-04-26T13:26:11.448-07:00</updated><title type='text'>Sometimes, things are too fast.</title><content type='html'>As machines have gotten faster and networks better, one doesn't notice&lt;br /&gt;those annoying things which happen in this non-perfect world.&lt;br /&gt;&lt;br /&gt;I have been using X11 CRiSP over a transatlantic link and the&lt;br /&gt;startup time is annoying, along with visible artefacts whilst drawing&lt;br /&gt;the color gradients. &lt;br /&gt;&lt;br /&gt;On investigation, the problem is our old friend: latency. Some&lt;br /&gt;X11 function calls block, waiting for a reply from the server, with&lt;br /&gt;querying color pixel mappings being the main one. The gradient and &lt;br /&gt;pixmap drawing code relies on lots of color allocations, forcing&lt;br /&gt;round trips - large numbers of them.&lt;br /&gt;&lt;br /&gt;On a locally connected machine or network, this happens so fast, you&lt;br /&gt;rarely notice it.&lt;br /&gt;&lt;br /&gt;CRiSP was written back when monochrome displays were common - certainly&lt;br /&gt;color was a very rare thing. Today, we luxuriate in 24/32 bit video&lt;br /&gt;displays and rarely even think about it.&lt;br /&gt;&lt;br /&gt;So, these round trips are pointless when we know what the RGB&lt;br /&gt;mapping will be - no more XQueryColors, or even XAllocColor.&lt;br /&gt;&lt;br /&gt;Doing this gives a dramatic performance improvement over these&lt;br /&gt;long-latency links. 
(You wont notice the speedup on a local&lt;br /&gt;machine - not unless you use special measuring tools).&lt;br /&gt;&lt;br /&gt;The first part of this went into CRiSP 10.0.6, and the pixmap&lt;br /&gt;enhancements in 10.0.7.&lt;br /&gt;&lt;br /&gt;[Note, I am dropping the trailing letter in CRiSP version numbers - &lt;br /&gt;they never served a real purpose, and people sometimes forget to&lt;br /&gt;feed this back on error reports; the build number tells me everything&lt;br /&gt;I need to track back to specific source code changes].&lt;br /&gt;&lt;br /&gt;At the moment, this feature is enabled by setting an environment&lt;br /&gt;variable:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ export CR_XS_DIRECT=1&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;http://www.crisp.demon.co.uk/download.html&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.7a-b5977&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-440289449224502114?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/440289449224502114/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/04/sometimes-things-are-too-fast.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/440289449224502114'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/440289449224502114'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/04/sometimes-things-are-too-fast.html' title='Sometimes, things are too fast.'/><author><name>Crisp 
Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-6717844337223720768</id><published>2011-04-24T03:11:00.001-07:00</published><updated>2011-04-24T03:11:43.069-07:00</updated><title type='text'>Dtrace bugs</title><content type='html'>Hm. Dtrace has a lot of bugs in it...as I try to ensure dtrace&lt;br /&gt;cannot crash the linux kernel, I am stumbling on to some "thinko"&lt;br /&gt;errors in dtrace.&lt;br /&gt;&lt;br /&gt;Try the following on a Mac:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -l &gt;/dev/null &amp; dtrace -l &gt;/dev/null &amp; dtrace -l &gt;/dev/null &amp; dtrace -l &gt;/dev/null &amp;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We get interrupted system calls and issues on the major/minor numbers&lt;br /&gt;(see 'dmesg' after it fails).&lt;br /&gt;&lt;br /&gt;Next up is DTRACE_ENABLEIOC. This calls dtrace_copyin_dof() which validates&lt;br /&gt;the main dtrace mutex is not asserted. Normally it isnt, unless someone&lt;br /&gt;else is running a heavy handed probe trace, in which case a nasty race&lt;br /&gt;condition exists. 
Unless a VFS lock is applied, this could spell out&lt;br /&gt;danger or kernel panic (on Solaris, FreeBSD and MacOS).&lt;br /&gt;&lt;br /&gt;dtrace_xcall is another area of potential for kernel deadlock if&lt;br /&gt;multiple dtraces are firing on multiple cpus.&lt;br /&gt;&lt;br /&gt;I have nearly finished my "safety" checks in dtrace, having written&lt;br /&gt;my own dtrace_xcall, and am checking that kernels dont deadlock on you.&lt;br /&gt;&lt;br /&gt;But its uncovering some nasty race conditions in dtrace as a whole.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.5a-b5971&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-6717844337223720768?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/6717844337223720768/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/04/dtrace-bugs.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6717844337223720768'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/6717844337223720768'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/04/dtrace-bugs.html' title='Dtrace bugs'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-9126220521051161768</id><published>2011-04-22T14:31:00.001-07:00</published><updated>2011-04-22T14:31:16.801-07:00</updated><title type='text'>How does this work? (wifi)</title><content type='html'>Very strange. For *months* (in fact, over a year) my laptop has worked&lt;br /&gt;fine with the wifi router...occasionally losing connections when&lt;br /&gt;the microwave is on.&lt;br /&gt;&lt;br /&gt;Tonight, my wifi broke, totally, on my laptop. Spent the last few hours&lt;br /&gt;slaving over the system, a driver or two and all the config in the&lt;br /&gt;world.&lt;br /&gt;&lt;br /&gt;Today was the day I threw out most of my old PCMCIA wifi cards. Typical.&lt;br /&gt;&lt;br /&gt;Ok, out with the screwdrivers, take the back of the laptop off (and&lt;br /&gt;learnt that my laptop can take two hard drives at the same time).&lt;br /&gt;&lt;br /&gt;No joy. Niente. Nothing. Nada.&lt;br /&gt;&lt;br /&gt;Same problem.&lt;br /&gt;&lt;br /&gt;Use the Android phone to analyse the wifi around me.&lt;br /&gt;&lt;br /&gt;Use the iwlist command to probe access routers around me. Someone must&lt;br /&gt;have installed a new one.&lt;br /&gt;&lt;br /&gt;Next....err....why does my startup script set the channel to 11,&lt;br /&gt;yet iwlist is showing it on channel 6?!&lt;br /&gt;&lt;br /&gt;Ok, so set the channel to 6. And Voilà! It works.&lt;br /&gt;&lt;br /&gt;Weird. 
Very weird.&lt;br /&gt;&lt;br /&gt;If the channel was so wrong, how did it work at all?&lt;br /&gt;&lt;br /&gt;Just dont *ever* assume I know what I am talking about.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.5a-b5971&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-9126220521051161768?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/9126220521051161768/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/04/how-does-this-work-wifi.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/9126220521051161768'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/9126220521051161768'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/04/how-does-this-work-wifi.html' title='How does this work? 
(wifi)'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-1828860848045738774</id><published>2011-04-19T15:57:00.001-07:00</published><updated>2011-04-19T15:57:42.639-07:00</updated><title type='text'>smp_call_function and dtrace_sync</title><content type='html'>I've been reading Adam Leventhal's (old) blog entry on the&lt;br /&gt;IPI mechanism (inter-processor interrupt) and how some of the&lt;br /&gt;code in dtrace works in order to try and debug the Linux problem,&lt;br /&gt;whereby the timer interrupt can invoke a call to dtrace_sync and hence&lt;br /&gt;dtrace_xcall. dtrace_xcall in turn invokes smp_call_function() (and&lt;br /&gt;its friends), but Linux explicitly disallows this behavior - calling&lt;br /&gt;the function from an interrupt or whilst interrupts are disabled.&lt;br /&gt;&lt;br /&gt;Linux's implementation seems ok. But it is at right angles to the Solaris&lt;br /&gt;implementation. Solaris seems to have a higher level of semantics in&lt;br /&gt;this area, allowing interrupt code to invoke inter-cpu synchronisations.&lt;br /&gt;&lt;br /&gt;I have a question - and if anyone (Adam?) knows the answer, feel&lt;br /&gt;free to educate me. Without an answer, I may have to wrap the&lt;br /&gt;Linux APIC interrupt handlers with code that resembles Solaris.&lt;br /&gt;&lt;br /&gt;What does dtrace_sync() *actually do*? In reading the code, it is trying&lt;br /&gt;to ensure other cpus are synchronised in terms of the probe states,&lt;br /&gt;but the implementation *looks wrong*. dtrace_sync() is a way of ensuring&lt;br /&gt;that another cpu is either not running the dtrace code, or, if it is,&lt;br /&gt;is at a sync point. 
But there are many sync points in the dtrace driver, and&lt;br /&gt;no guarantee that the other cpus are anywhere close to where the invoking&lt;br /&gt;cpu is asking for help.&lt;br /&gt;&lt;br /&gt;It's a bit like a 3-dimensional goto - trying to prove safety via&lt;br /&gt;the various code paths is not easy (maybe not possible). &lt;br /&gt;&lt;br /&gt;Normally, mutexes are used to guard regions of code, along&lt;br /&gt;with interrupt enable/disable, to prevent nested interrupts.&lt;br /&gt;&lt;br /&gt;But dtrace_sync() is different - it suspends the invoking cpu&lt;br /&gt;until the other cpus have acknowledged the interrupt - and the&lt;br /&gt;acknowledgement is not done based on where the other cpus are.&lt;br /&gt;&lt;br /&gt;The problem that dtrace/linux is having is mostly around the timer &lt;br /&gt;interrupt - breaking the kernel contract on interrupts and&lt;br /&gt;scheduling state. It's not possible (without hacking or damage) to conform&lt;br /&gt;to the contract, which means I need to either stop using smp_call_function&lt;br /&gt;or seek some other mechanism.&lt;br /&gt;&lt;br /&gt;I need to sleep on this and work out the various permutations of&lt;br /&gt;code paths.&lt;br /&gt;&lt;br /&gt;[Note, I do have a safe workaround, but the workaround will cause the&lt;br /&gt;odd timer tick drop within dtrace (the rest of the kernel is not affected).&lt;br /&gt;I may have to release this as a temporary safe fix].&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.5a-b5971&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-1828860848045738774?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/1828860848045738774/comments/default' title='Post Comments'/><link rel='replies' 
type='text/html' href='http://crtags.blogspot.com/2011/04/smpcallfunction-and-dtracesync.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1828860848045738774'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1828860848045738774'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/04/smpcallfunction-and-dtracesync.html' title='smp_call_function and dtrace_sync'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-1086759472677893219</id><published>2011-04-18T13:35:00.001-07:00</published><updated>2011-04-18T13:35:28.512-07:00</updated><title type='text'>opensuse .. and the differences</title><content type='html'>The opensuse kernel is interesting. I am using this kernel:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;2.6.31.14-0.6-desktop&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Its caused me a few days of frustration, but I didnt want to be beat.&lt;br /&gt;I have mostly cracked it now. The issue is around syscalls. The&lt;br /&gt;way dtrace intercepts the system calls is by patching the system&lt;br /&gt;call table. This normally points to the C code function to implement&lt;br /&gt;the activity, and a small amount of assembler glue and C is used&lt;br /&gt;to "hook" or "wrap" the system call.&lt;br /&gt;&lt;br /&gt;One of the problems is that all system calls in dtrace go through&lt;br /&gt;the same hooking code. 
When a system call is invoked, the %RAX register&lt;br /&gt;contains the system call number, and dtrace assumed this register was&lt;br /&gt;left intact.&lt;br /&gt;&lt;br /&gt;It turns out it isn't. But why it isn't is interesting.&lt;br /&gt;&lt;br /&gt;The opensuse kernel appears to be one of the most resilient kernels&lt;br /&gt;to kernel crashes. Many a time an array of general protection faults&lt;br /&gt;and illegal instruction traps fired, but I have rarely had to reboot&lt;br /&gt;the VM. &lt;br /&gt;&lt;br /&gt;Two things distinguish this kernel: DWARF stack walking and&lt;br /&gt;stack protection. The opensuse kernel has a DWARF stack walker which&lt;br /&gt;helps to ensure more accurate stacks are displayed when a fault&lt;br /&gt;occurs. (Similar to the work I started, but which I abandoned.&lt;br /&gt;Maybe I can look at that code and see what style of approach they used).&lt;br /&gt;[Stack walking is problematic generically, because of the 32 + 64 bit&lt;br /&gt;kernels, along with all the permutations of GCC compiler switches&lt;br /&gt;which make it difficult to ensure the code base can handle these&lt;br /&gt;variations].&lt;br /&gt;&lt;br /&gt;The stack protection ensures that if a buffer overflow or some&lt;br /&gt;other bad thing happens, then this is caught very fast.&lt;br /&gt;&lt;br /&gt;The approach that GCC takes is to snapshot a random value&lt;br /&gt;on the stack at the start of the function and validate the value&lt;br /&gt;is still there on exit. This code utilises the %RAX register, which is&lt;br /&gt;what was tickling my problem.&lt;br /&gt;&lt;br /&gt;After various attempts to "jam" the uncorrupted %RAX into the C&lt;br /&gt;arguments to the dtrace handler, I gave up. 
On 64-bit code, arguments&lt;br /&gt;can be passed on the stack or via registers (the first 6 arguments only),&lt;br /&gt;which means some degree of register fiddling, but also sensitivity&lt;br /&gt;to compiler regimes and kernel compilation modes.&lt;br /&gt;&lt;br /&gt;What I did was create a new per-cpu data structure so that&lt;br /&gt;the %RAX register could be saved, without corruption by another cpu.&lt;br /&gt;This data structure can then be used by the syscall wrapper code and&lt;br /&gt;the results look good.&lt;br /&gt;&lt;br /&gt;The existing systrace code has to handle 6 of the syscalls specially&lt;br /&gt;(eg clone, fork and a few others), because, by definition, these&lt;br /&gt;syscalls don't take the normal "exit" from the kernel route, but&lt;br /&gt;hopefully I can fix these.&lt;br /&gt;&lt;br /&gt;The next step is to see if this "better" code works for the other&lt;br /&gt;kernels I did not have problems with, and then to contemplate looking&lt;br /&gt;at the 32-bit code implementation.&lt;br /&gt;&lt;br /&gt;Hopefully, an update in a few days (or over the long weekend).&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.5a-b5969&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-1086759472677893219?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/1086759472677893219/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/04/opensuse-and-differences.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1086759472677893219'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/8336326562741944626/posts/default/1086759472677893219'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/04/opensuse-and-differences.html' title='opensuse .. and the differences'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-8356025687631890295</id><published>2011-04-15T14:58:00.001-07:00</published><updated>2011-04-15T14:58:31.479-07:00</updated><title type='text'>Why?</title><content type='html'>Why do people have to complicate something that was&lt;br /&gt;complicated already?&lt;br /&gt;&lt;br /&gt;OpenSUSE 11.x - very nice system, very slow network gui configurator&lt;br /&gt;(GUI network configurators drive me mad also - every one is different,&lt;br /&gt;none of them as good as the old fashioned SunOS /etc/rc.local files,&lt;br /&gt;and all of them bloat. OpenSUSE cannot even have a settings control&lt;br /&gt;panel that even looks familiar to any other operating system).&lt;br /&gt;&lt;br /&gt;But my beef is the kernel source/images. Everyone has the&lt;br /&gt;module include files in /lib/modules/`uname -r`/build/. Except&lt;br /&gt;OpenSUSE, which has them in the /lib/modules/`uname -r`/source directory.&lt;br /&gt;(Maybe they are available in the build/ dir if I had installed the&lt;br /&gt;right extra package, but it's like Russian roulette trying to guess&lt;br /&gt;where people have stored things or what the package names are).&lt;br /&gt;&lt;br /&gt;apt-get, zypper, pkgadd, ... the list of names for proprietary package&lt;br /&gt;installers is maddening. WTF is "zypper"? At least "pkgadd" hints what&lt;br /&gt;it is. 
apt-get - which I am now familiar with, is standard across&lt;br /&gt;many distros, yet the name of the package installer on Fedora eludes me.&lt;br /&gt;&lt;br /&gt;Whilst I am bitching, what is "plymouth", what is "policykit", what is&lt;br /&gt;"hald-runner", and the other ten zillion memory hogging wasteful utilities&lt;br /&gt;on a Linux system? Go look for "plymouth" on google and see how well&lt;br /&gt;you find references.&lt;br /&gt;&lt;br /&gt;What happened to meaningful names. Even "dtrace" hints at what it might&lt;br /&gt;be, and is unique to not clash with any other English word.&lt;br /&gt;&lt;br /&gt;So....why do I care? Because I am setting up a new dual-core VM&lt;br /&gt;to debug the smp_call_function issue. I think I know why smp_call_function&lt;br /&gt;might be causing me erratic reliability issues, but I need a good&lt;br /&gt;kernel to demonstrate/cause the bug I am trying to find.&lt;br /&gt;&lt;br /&gt;Incidentally, I have turned up the clock ticks in the tests/tests.d script&lt;br /&gt;to 1mS, to force a higher rate of clock-ticks interrupting caught probes to&lt;br /&gt;help diagnose the tests. 
(I might even go higher - I want to torture&lt;br /&gt;any system I come into contact with, because that is the only way&lt;br /&gt;to validate interrupt-based coding systems; reading code or test-driven&lt;br /&gt;scenarios can never handle the multi-dimensionality of interrupts occurring&lt;br /&gt;at just the right + wrong points in time).&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.5a-b5969&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-8356025687631890295?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/8356025687631890295/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/04/why.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8356025687631890295'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8356025687631890295'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/04/why.html' title='Why?'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-5246797667249897581</id><published>2011-04-13T14:11:00.001-07:00</published><updated>2011-04-13T14:11:09.455-07:00</updated><title type='text'>smp_call_function and friends</title><content type='html'>Dtrace has one reliability bug in it - which affects multicore 
cpus.&lt;br /&gt;I have tried hard to get the reliability up in recent releases, torture&lt;br /&gt;testing the code on various Linux platforms - many survive very well, some&lt;br /&gt;don't.&lt;br /&gt;&lt;br /&gt;The torture test involves use of copyinstr() against an argument which is&lt;br /&gt;not a string pointer. Good ones to test are fbt:::return, where the&lt;br /&gt;argN parameters are whatever is left on the stack or in registers.&lt;br /&gt;Dereferencing these leads to kernel level GPF or page faults.&lt;br /&gt;Dtrace should be resilient to these.&lt;br /&gt;&lt;br /&gt;Yet, it still can crash.&lt;br /&gt;&lt;br /&gt;Some kernels are better at handling this than others - the later releases&lt;br /&gt;generally have better interrupt handling routines or BUG_ON warnings&lt;br /&gt;which log a message when a driver breaks a programming contract.&lt;br /&gt;&lt;br /&gt;Which brings us to smp_call_function and its friends. From day 1, these&lt;br /&gt;functions caused me pain; fortunately, Linux provides a set of these&lt;br /&gt;functions, very similar to Solaris, so we won, handsomely.&lt;br /&gt;&lt;br /&gt;Or didn't.&lt;br /&gt;&lt;br /&gt;What are they? On a multicpu system, it is sometimes necessary for&lt;br /&gt;drivers to invoke synchronisation barriers on other cpus. Typically&lt;br /&gt;an SMP kernel and driver will utilise arrays of structures - one per cpu,&lt;br /&gt;so that each cpu can process work independent of what another cpu is doing.&lt;br /&gt;In the case of dtrace, we may be handling an fbt trap on one cpu, whilst&lt;br /&gt;another is doing a system call. As the number of cpus goes up, the permutation&lt;br /&gt;of scenarios of user code, kernel code, drivers, shared structures, etc&lt;br /&gt;goes up.&lt;br /&gt;&lt;br /&gt;What does dtrace do to warrant inter-cpu function calls?&lt;br /&gt;&lt;br /&gt;Well, dtrace, the user space program, works by periodically polling&lt;br /&gt;the driver for trace information. 
Each cpu utilises a double-buffering&lt;br /&gt;approach for tracing: traps and probes are recorded in an event buffer.&lt;br /&gt;When the buffer is full, it can switch to an alternate buffer. When&lt;br /&gt;bin/dtrace asks for a buffer dump, the buffer is emptied, and dtrace&lt;br /&gt;asks the kernel driver to switch to the alternate buffer - just like&lt;br /&gt;video double buffering.&lt;br /&gt;&lt;br /&gt;A single bin/dtrace is effectively asking for data from all cpus.&lt;br /&gt;So, the cpu which takes the ioctl() to ask for the buffer dump has&lt;br /&gt;to tell the other cpus to "empty their buffers". It could do this&lt;br /&gt;by swizzling the pointers in the per-cpu buffer, but this is dangerous -&lt;br /&gt;the other cpu may be executing code leveraging these things. You&lt;br /&gt;cannot simply disable interrupts and lock them out with a normal&lt;br /&gt;type of mutex (you could do, but the number of places to litter&lt;br /&gt;mutex can be high). Instead, because this is so rare, the&lt;br /&gt;smp_call_function functions are invoked to ask each CPU in turn to&lt;br /&gt;do an action on behalf of the invoker. Its quite elegant.&lt;br /&gt;&lt;br /&gt;This is all based around the APIC implementation on the Intel/AMD&lt;br /&gt;cpus which provides a mechanism to send a forced-interrupt to another cpu.&lt;br /&gt;The target cpu takes the interrupt (assuming interrupts are not presently&lt;br /&gt;disabled on that cpu), performs the task, and exits the interrupt.&lt;br /&gt;&lt;br /&gt;The Solaris and Linux implementations are similar, but different. That&lt;br /&gt;difference hurts. &lt;br /&gt;&lt;br /&gt;Ok, so now to the tricky bit. Consider at any point in time, what&lt;br /&gt;is a cpu doing? It could be in user space, or kernel space. 
In kernel&lt;br /&gt;space, interrupts may be enabled or disabled.&lt;br /&gt;&lt;br /&gt;Let's consider a kernel running with interrupts enabled - another cpu&lt;br /&gt;invokes smp_call_function; the first cpu takes the interrupt and returns.&lt;br /&gt;&lt;br /&gt;Now, Linux has a programming contract: when invoking the function, we&lt;br /&gt;must honor the following contract (taken from comments in kernel/smp.c):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt; * You must not call this function with disabled interrupts or from a&lt;br /&gt; * hardware interrupt handler or from a bottom half handler. Preemption&lt;br /&gt; * must be disabled when calling this function.&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;It is saying that the invoker of the function cannot be an interrupt&lt;br /&gt;handler. Guess where (my) dtrace implementation invokes smp_call_function&lt;br /&gt;from? The hrtimer code. A timer callback, by definition, is an interrupt.&lt;br /&gt;So, because of this contract, which we have broken, profile tick probes&lt;br /&gt;can interrupt the kernel when they should not, leading to either&lt;br /&gt;kernel warnings or kernel deadlocks.&lt;br /&gt;&lt;br /&gt;I recently attempted to avoid recursive dtrace probes, but if we are&lt;br /&gt;not careful, we will lose timer tick probes, which can break&lt;br /&gt;scripts. (My own regression tests, which terminate after 5s, never&lt;br /&gt;terminate because the tick we need is discarded).&lt;br /&gt;&lt;br /&gt;So, we need to fix this problem. 
The SMP function calls rely on a fair&lt;br /&gt;amount of fabric to control the APICs and other cpu register masks, so&lt;br /&gt;its not as simple as reimplementing the code to shield from&lt;br /&gt;kernel artefacts.&lt;br /&gt;&lt;br /&gt;So, I need to come up with a plan.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.5a-b5969&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-5246797667249897581?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/5246797667249897581/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/04/smpcallfunction-and-friends.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5246797667249897581'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5246797667249897581'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/04/smpcallfunction-and-friends.html' title='smp_call_function and friends'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-5257420656305616325</id><published>2011-04-10T13:13:00.001-07:00</published><updated>2011-04-10T13:13:25.496-07:00</updated><title type='text'>Diary of a compiler bug</title><content type='html'>I do hate compiler bugs. 
They waste a lot of time and energy before you realise&lt;br /&gt;you have hit one. The quality of today's compilers is excellent, spurred&lt;br /&gt;by automatic regression tests, and the vastness of code which users&lt;br /&gt;want to compile.&lt;br /&gt;&lt;br /&gt;GCC - an excellent compiler - has been around for many years. I think&lt;br /&gt;I first learned of it back in 1987 or so, and it has not only had a huge number&lt;br /&gt;of updates and releases, but works on nearly every CPU/OS combination ever&lt;br /&gt;invented.&lt;br /&gt;&lt;br /&gt;But, when you hit a problem with the compiler, you are hosed -- you&lt;br /&gt;could report a bug, but this may be fruitless, especially when the compiler&lt;br /&gt;version is quite a few years out of date.&lt;br /&gt;&lt;br /&gt;Today, I hit a compiler bug causing strange behavior in Dtrace on Ubuntu 8.04,&lt;br /&gt;using the gcc 4.2.4 compiler. Yes this is old, and there may be patches&lt;br /&gt;for it, but who actually runs Ubuntu 8.04 today? Is my Ubuntu even&lt;br /&gt;patched up to date? (No, it isn't).&lt;br /&gt;&lt;br /&gt;And does this bug exist in many other prior/next compiler versions?&lt;br /&gt;&lt;br /&gt;There isn't enough time in the world to validate this. So a kludge, but a nice&lt;br /&gt;one, was applied:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;if (dp-&gt;dtdo_rtype.dtdt_kind ==&lt;br /&gt;    DIF_TYPE_STRING) {&lt;br /&gt;	char c = '\0' + 1;&lt;br /&gt;	int intuple = act-&gt;dta_intuple;&lt;br /&gt;	size_t s;&lt;br /&gt;&lt;br /&gt;	for (s = 0; s &lt; size; s++) {&lt;br /&gt;		if (c != '\0')&lt;br /&gt;			c = dtrace_load8(val++);&lt;br /&gt;&lt;br /&gt;#if linux	&lt;br /&gt;		/***********************************************/&lt;br /&gt;		/*   This  pointless  code,  which will never  */&lt;br /&gt;		/*   fire,  is  to work around a gcc compiler  */&lt;br /&gt;		/*   bug  which  causes  a page fault because  */&lt;br /&gt;		/*   'act' gets overwritten. 
I havent exactly  */&lt;br /&gt;		/*   figured  out  whats  going  on here, but  */&lt;br /&gt;		/*   turning off optimisation (which is not a  */&lt;br /&gt;		/*   good  plan  for  __dtrace_probe())  isnt  */&lt;br /&gt;		/*   viable. I have seen this on Ubuntu 8.04,  */&lt;br /&gt;		/*   gcc 4.2.4, i386.			       */&lt;br /&gt;		/***********************************************/&lt;br /&gt;		if (act == valoffs) {&lt;br /&gt;			printk("defeat compiler bug! %p act=%p s=%x/%x %x %x\n", &lt;br /&gt;				&amp;act, valoffs, end, act, s, size);&lt;br /&gt;		}&lt;br /&gt;#endif&lt;br /&gt;		DTRACE_STORE(uint8_t, tomax,&lt;br /&gt;		    valoffs++, c);&lt;br /&gt;&lt;br /&gt;		if (c == '\0' &amp;&amp; intuple)&lt;br /&gt;			break;&lt;br /&gt;	}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The above code is embedded into the middle of the very important dtrace_probe()&lt;br /&gt;function. (Renamed to __dtrace_probe, to allow/detect re-entrancy problems,&lt;br /&gt;which lead to finding this bug).&lt;br /&gt;&lt;br /&gt;The code is too large to easily find this bug or refactor. I verified that&lt;br /&gt;turning off optimisation avoids the bug, but thats not a good thing,&lt;br /&gt;as per the comment above. &lt;br /&gt;&lt;br /&gt;What seems to be going wrong is register allocation/spilling in the compiler.&lt;br /&gt;I *hope* it only affects this code fragment, but its very difficult to tell.&lt;br /&gt;&lt;br /&gt;The effect of the bug was to cause a key variable ("act") to be overwritten. &lt;br /&gt;Fortunately, the kernel survived the subsequent panic/page-fault, but&lt;br /&gt;its not comforting to have this happening in an unexpected way.&lt;br /&gt;&lt;br /&gt;BTW I managed to nuke my first VM last night - Fedora Core 14. After&lt;br /&gt;debugging something similar on FC14, I started getting strange filesystem&lt;br /&gt;bugs. Even a reboot didnt fix things. 
I got annoyed and fsck'ed the root&lt;br /&gt;filesystem manually, despite fsck telling me "this was a bad idea". And it&lt;br /&gt;was right. Because of the LVM, I nuked the root filesystem, and had to reinstall.&lt;br /&gt;&lt;br /&gt;I do love VMs - took less than an hour to recover (oh, I wish I had&lt;br /&gt;snapshotted the filesystem first, but, well, you know, I didn't!)&lt;br /&gt;&lt;br /&gt;New dtrace later tonight to fix the compiler issue above. &lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.3b-b5955&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-5257420656305616325?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/5257420656305616325/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/04/diary-of-compiler-bug.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5257420656305616325'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/5257420656305616325'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/04/diary-of-compiler-bug.html' title='Diary of a compiler bug'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-8923445001813431245</id><published>2011-04-09T00:27:00.001-07:00</published><updated>2011-04-09T00:27:03.547-07:00</updated><title type='text'>Dtrace and the insane dtrace_probe_error call</title><content type='html'>I wrote last night about how the Sun DTrace implementation seems to get&lt;br /&gt;the passback of the error address wrong. Looking at the Apple/Darwin&lt;br /&gt;implementation, they fixed this.&lt;br /&gt;&lt;br /&gt;So, leveraging the fact that I am not insane, I can fix it too.&lt;br /&gt;&lt;br /&gt;(It's possible that in looking at the Open Solaris implementation of&lt;br /&gt;dtrace, it's missing some key features, as I find it difficult to believe&lt;br /&gt;real Solaris has this issue).&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.3b-b5955&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-8923445001813431245?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/8923445001813431245/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/04/dtrace-and-insane-dtraceprobeerror-call.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8923445001813431245'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8923445001813431245'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/04/dtrace-and-insane-dtraceprobeerror-call.html' title='Dtrace and the insane dtrace_probe_error 
call'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-3611977850568485462</id><published>2011-04-08T15:45:00.001-07:00</published><updated>2011-04-08T15:45:09.246-07:00</updated><title type='text'>Either I am insane or dtrace is... dtrace_probe_error()</title><content type='html'>Inside the driver, when an address/fault occurs, a function called&lt;br /&gt;dtrace_probe_error() is invoked. This recursively calls dtrace_probe()&lt;br /&gt;to raise a probe against the ERROR probe. Nice.&lt;br /&gt;&lt;p&gt;&lt;br /&gt;Here's the function prototype:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;void dtrace_probe_error(dtrace_state_t *state, dtrace_epid_t epid, int which, int fault, int fltoffs, uintptr_t illval)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This function, which is (perversely) written in assembler in FreeBSD and Solaris,&lt;br /&gt;calls dtrace_probe(). An additional arg is passed back to dtrace_probe --&lt;br /&gt;the internal id of the ERROR probe (dtrace_probeid_error).&lt;br /&gt;&lt;br /&gt;All good so far.&lt;br /&gt;&lt;br /&gt;But - and it's late here, maybe I am too tired - the number of arguments is&lt;br /&gt;*wrong*. 
The last parameter passed to dtrace_probe() is the "illval" argument&lt;br /&gt;which is the address of a faulting instruction.&lt;br /&gt;&lt;br /&gt;Consider this D script:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;        BEGIN&lt;br /&gt;        {&lt;br /&gt;                x = (int *)NULL;&lt;br /&gt;                y = *x;&lt;br /&gt;                trace(y);&lt;br /&gt;        }&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We should get an error from Dtrace telling us we tried to access address&lt;br /&gt;0x0000. But in Linux, we get some other random value. This is because&lt;br /&gt;we cannot pass in the illval address - we ran out of arguments (arg0..arg4).&lt;br /&gt;&lt;br /&gt;*Yet* FreeBSD/Solaris pass in the extra arg anyway - one arg too many.&lt;br /&gt;&lt;br /&gt;I need to read this more to understand what is going on.&lt;br /&gt;&lt;br /&gt;Heres the solaris code:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;        ENTRY(dtrace_probe_error)&lt;br /&gt;        pushq   %rbp&lt;br /&gt;        movq    %rsp, %rbp&lt;br /&gt;        subq    $0x8, %rsp&lt;br /&gt;        movq    %r9, (%rsp) # &amp;lt;=== what is this?!&lt;br /&gt;        movq    %r8, %r9&lt;br /&gt;        movq    %rcx, %r8&lt;br /&gt;        movq    %rdx, %rcx&lt;br /&gt;        movq    %rsi, %rdx&lt;br /&gt;        movq    %rdi, %rsi&lt;br /&gt;        movl    dtrace_probeid_error(%rip), %edi&lt;br /&gt;        call    dtrace_probe&lt;br /&gt;        addq    $0x8, %rsp&lt;br /&gt;        leave&lt;br /&gt;        ret&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.3b-b5955&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-3611977850568485462?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://crtags.blogspot.com/feeds/3611977850568485462/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/04/either-i-am-insane-or-dtrace-is.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/3611977850568485462'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/3611977850568485462'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/04/either-i-am-insane-or-dtrace-is.html' title='Either I am insane or dtrace is... dtrace_probe_error()'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-4097021003790106250</id><published>2011-04-06T14:27:00.001-07:00</published><updated>2011-04-06T14:27:53.559-07:00</updated><title type='text'>New dtrace release</title><content type='html'>This is an interim fix for the recent pgfault problem. 
A silly mistake in the&lt;br /&gt;handler meant we didn't restore the interrupt stack properly, resulting&lt;br /&gt;in a hung/broken kernel if a bad probe function, such as copyinstr(arg0),&lt;br /&gt;triggered a page fault.&lt;br /&gt;&lt;br /&gt;Initial results look good - I haven't done a diverse validation across&lt;br /&gt;my kernels, but it should boost dtrace usability.&lt;br /&gt;&lt;br /&gt;I am seeing a lot of these on FC14, indicating some kernel API protocols&lt;br /&gt;are not being conformed to... that's next on my list:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;[  311.686851] BUG: sleeping function called from invalid context at arch/x86/mm/fault.c:1074&lt;br /&gt;[  311.687789] in_atomic(): 0, irqs_disabled(): 1, pid: 2553, name: tests.pl&lt;br /&gt;[  311.687789] Pid: 2553, comm: tests.pl Tainted: P      D W   2.6.35.6-45.fc14.x86_64 #1&lt;br /&gt;[  311.687789] Call Trace:&lt;br /&gt;[  311.687789]  [&amp;lt;ffffffff8103d12b&gt;] __might_sleep+0xeb/0xf0&lt;br /&gt;[  311.687789]  [&amp;lt;ffffffff8146c374&gt;] do_page_fault+0x15c/0x265&lt;br /&gt;[  311.687789]  [&amp;lt;ffffffff814697f5&gt;] page_fault+0x25/0x30&lt;br /&gt;[  311.687789]  [&amp;lt;ffffffffa023ddfe&gt;] ? dt_try+0x0/0xa [dtracedrv]&lt;br /&gt;[  311.687789]  [&amp;lt;ffffffffa021f0b2&gt;] ? dtrace_load8+0x41/0x90 [dtracedrv]&lt;br /&gt;[  311.687789]  [&amp;lt;ffffffffa022842b&gt;] dtrace_probe+0x202b/0x2420 [dtracedrv]&lt;br /&gt;[  311.687789]  [&amp;lt;ffffffff8111f4d1&gt;] ? path_put+0x22/0x27&lt;br /&gt;[  311.687789]  [&amp;lt;ffffffff811201e9&gt;] ? putname+0x34/0x36&lt;br /&gt;[  311.687789]  [&amp;lt;ffffffff81116511&gt;] ? 
do_sys_open+0xfe/0x110&lt;br /&gt;[  311.687789]  [&amp;lt;ffffffffa024011e&gt;] dtrace_systrace_syscall2+0x208/0x21b [dtracedrv]&lt;br /&gt;[  311.687789]  [&amp;lt;ffffffffa02402ab&gt;] dtrace_systrace_syscall+0xb2/0xb4 [dtracedrv]&lt;br /&gt;[  311.687789]  [&amp;lt;ffffffff81009cf2&gt;] system_call_fastpath+0x16/0x1b&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.3b-b5955&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-4097021003790106250?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/4097021003790106250/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/04/new-dtrace-release.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4097021003790106250'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4097021003790106250'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/04/new-dtrace-release.html' title='New dtrace release'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-132310348525462980</id><published>2011-04-04T12:53:00.001-07:00</published><updated>2011-04-04T12:53:23.426-07:00</updated><title type='text'>Tales of an Interrupt -- ud2a</title><content type='html'>Debugging interrupt 
routines is tiring - anything wrong and BOOM!&lt;br /&gt;&lt;br /&gt;So, whilst enabling lots of fbt probes, I was finding AS4 kernels&lt;br /&gt;dying on me. It didn't appear to happen on other kernels, but probably&lt;br /&gt;could/would do.&lt;br /&gt;&lt;br /&gt;On investigation, we had executed an invalid instruction. Weird, as&lt;br /&gt;in theory, the dtrace disassembler should just *work*.&lt;br /&gt;&lt;br /&gt;But it turns out that because Linux makes judicious use of inlined assembler&lt;br /&gt;code, we had hit a problem. The BUG_ON macro in the kernel, used for&lt;br /&gt;asserting broken behavior (typically of a driver), is implemented by&lt;br /&gt;utilising an invalid instruction. The invalid instruction (UD2A) is fine -&lt;br /&gt;we handle that ok, but subsequent to this instruction is a 10 byte field&lt;br /&gt;which encodes the file and line number of where the bug happened.&lt;br /&gt;&lt;br /&gt;Interesting.&lt;br /&gt;&lt;br /&gt;Rather than have the compiler embed lots of __FILE__ strings into the kernel&lt;br /&gt;memory, the strings are uniquified and stored in a separate lookup table,&lt;br /&gt;utilising tricks of the GCC compiler.&lt;br /&gt;&lt;br /&gt;Of course, Solaris doesn't do this - and so the FBT code need not worry about&lt;br /&gt;this. 
(But FreeBSD and MacOSX *could* if they so chose).&lt;br /&gt;&lt;br /&gt;A quick patch to the prov_common.c file stops us tripping over and losing&lt;br /&gt;sync with the instruction stream, and this fixes the problem with:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n fbt::a*:&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.3b-b5955&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-132310348525462980?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/132310348525462980/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/04/tales-of-interrupt-ud2a.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/132310348525462980'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/132310348525462980'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/04/tales-of-interrupt-ud2a.html' title='Tales of an Interrupt -- ud2a'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-4370725937838556913</id><published>2011-04-03T07:17:00.001-07:00</published><updated>2011-04-03T07:17:55.355-07:00</updated><title type='text'>dtrace update 20110403</title><content type='html'>I've fixed the problem with 
etc/io.d and etc/sched.d causing runtime&lt;br /&gt;errors on those platforms where we cannot compile ctfconvert. This&lt;br /&gt;is only a partial fix, as it's not acceptable to preclude ctfconvert&lt;br /&gt;from being available. ctfconvert is important so we can access Linux&lt;br /&gt;kernel structures from D scripts, and its absence impairs the ability of &lt;br /&gt;dtrace to do interesting things. (On these platforms, fbt + syscall&lt;br /&gt;are still available).&lt;br /&gt;&lt;br /&gt;I've rewritten the interrupt routines for i386 and amd64 - the interrupts&lt;br /&gt;are now heavily macro-ised, which will help with future support and should&lt;br /&gt;fix some issues.&lt;br /&gt;&lt;br /&gt;I note on one of my kernels, that the io::: provider will hang/crash&lt;br /&gt;the kernel, so I need to validate what is causing this breakage.&lt;br /&gt;&lt;br /&gt;I have added a better 'make test' which can be used to do some minor&lt;br /&gt;stress testing, and will add more functional tests as I hit bugs.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.3b-b5955&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-4370725937838556913?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/4370725937838556913/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/04/dtrace-update-20110403.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4370725937838556913'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/4370725937838556913'/><link rel='alternate' type='text/html' 
href='http://crtags.blogspot.com/2011/04/dtrace-update-20110403.html' title='dtrace update 20110403'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-8645358975521324543</id><published>2011-03-29T14:57:00.001-07:00</published><updated>2011-03-29T14:57:41.474-07:00</updated><title type='text'>dtrace pgfault handling</title><content type='html'>Just spent the last two weeks or so debugging one of those&lt;br /&gt;"but it used to work!" bugs.&lt;br /&gt;&lt;br /&gt;Here's the script:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ dtrace -n syscall::open*:'{printf("%s", stringof(arg0));}'&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;It doesn't do anything useful - it intercepts all the open system&lt;br /&gt;calls and prints the name of the file to be opened. Because the&lt;br /&gt;last part of the probe description is wildcarded, we match the "entry" and&lt;br /&gt;"return" paths of the function.&lt;br /&gt;&lt;br /&gt;On return from a function, arg0 and friends are mostly irrelevant, random&lt;br /&gt;and pointless.&lt;br /&gt;&lt;br /&gt;So - by doing this, stringof() is being called on a bogus pointer,&lt;br /&gt;which should lead to a GPF interrupt. This works well on the later&lt;br /&gt;kernels.&lt;br /&gt;&lt;br /&gt;But on RedHat AS4, it panicked the kernel.&lt;br /&gt;&lt;br /&gt;After a lot of investigation, it transpires that, on AS4, we are taking&lt;br /&gt;a page fault, not a gpf. 
And my page fault handler was not handling&lt;br /&gt;the fact that on a page fault, the CPU pushes an extra word onto the&lt;br /&gt;stack.&lt;br /&gt;&lt;br /&gt;So dtrace is/was dangerously unstable when given rogue D scripts, like this one.&lt;br /&gt;&lt;br /&gt;Very difficult to debug - because I would keep panicking the kernel&lt;br /&gt;as I tried all sorts of experiments to locate the area of the problem&lt;br /&gt;(the interrupt code and the C callback code). Having narrowed the&lt;br /&gt;problem down to the interrupt code, it took quite a few days to work out what&lt;br /&gt;was wrong (I was ignoring the extra word pushed when the fault is taken).&lt;br /&gt;But this was good - I had often stared at the Linux interrupt handlers&lt;br /&gt;to understand the very subtle effect of how the traps are handled&lt;br /&gt;vs the "struct pt_regs" layout. I was having problems with the pt_regs&lt;br /&gt;pointer being "garbage" in the various pieces of dtrace code, and it&lt;br /&gt;was because, even if I survived panicking the kernel, pt_regs was out&lt;br /&gt;by one word.&lt;br /&gt;&lt;br /&gt;Having exercised (exorcised) the code very hard, I feel much more confident&lt;br /&gt;that a user cannot crash the kernel - just as Solaris had led us to believe.&lt;br /&gt;&lt;br /&gt;[I note that in the Solaris kernel, special code is in place in the&lt;br /&gt;interrupt trap code (assembly), to determine if the CPU_DTRACE_NOFAULT&lt;br /&gt;flag is set. 
This flag is set within the dtrace code to tell the &lt;br /&gt;gpf/pgfault handler not to take the trap but to skip over the offending&lt;br /&gt;instruction (which is most likely a MOV instruction)].&lt;br /&gt;&lt;br /&gt;So, now we have better handling of gpf + pgfault (although I still&lt;br /&gt;worry whether we can take a pgfault while handling a GPF).&lt;br /&gt;Not sure this matters, because if it's *our* pgfault, then we only&lt;br /&gt;skip over the offending instruction; we don't try to read other parts of&lt;br /&gt;memory.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;ctfconvert / libdwarf problems&lt;/h3&gt;&lt;br /&gt;&lt;br /&gt;Another fix I hope to have in this release addresses the build&lt;br /&gt;problems people are reporting to me, caused by the changes in the&lt;br /&gt;last release (stub dwarf.h added to the ctfconvert utility). AS4 doesn't&lt;br /&gt;have a viable libdwarf.so library - so either I work out what it has&lt;br /&gt;and patch the ctfconvert code, or add in a libdwarf release (which would&lt;br /&gt;bloat the distro). The main problem here is we *need* ctfconvert if the&lt;br /&gt;files in etc/*.d are not to cause a run-time syntax error, as kernel&lt;br /&gt;structs are referred to in the translators.&lt;br /&gt;&lt;br /&gt;I may have to patch the dtrace command to ignore such errors when &lt;br /&gt;auto-inclusion is enabled when parsing user scripts.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Testing&lt;/h3&gt;&lt;br /&gt;&lt;br /&gt;Now that I am getting more familiar with dtrace, I hope to include better&lt;br /&gt;tests to avoid problems where things break. 
I probably won't enable this&lt;br /&gt;in the next release, but definitely for the one after this.&lt;br /&gt;&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.3b-b5950&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-8645358975521324543?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/8645358975521324543/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/03/dtrace-pgfault-handling.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8645358975521324543'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/8645358975521324543'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/03/dtrace-pgfault-handling.html' title='dtrace pgfault handling'/><author><name>Crisp Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8336326562741944626.post-7334294056792205486</id><published>2011-03-27T00:31:00.001-07:00</published><updated>2011-03-27T00:31:08.555-07:00</updated><title type='text'>Why the ipad1 is better than the ipad2</title><content type='html'>I feel sorry for people who have the ipad2. 
Despite its lighter weight,&lt;br /&gt;faster cpu, cameras, magnetic cover and other tech, it is no match&lt;br /&gt;for an ipad1.&lt;br /&gt;&lt;br /&gt;Why?&lt;br /&gt;&lt;br /&gt;Our aging cat, who has become slightly incontinent in her last days,&lt;br /&gt;decided to pee on my ipad1. Maybe she has a preference for Android, I don't&lt;br /&gt;know. But the moleskin cover did a 100% job of avoiding damage or&lt;br /&gt;leakage on the device. The (fake) moleskin cover, now washed and covered in&lt;br /&gt;washing powder, detergent, soap, really doesn't care what you do to it.&lt;br /&gt;It doesn't care what cats think about it. It just survives.&lt;br /&gt;&lt;br /&gt;So there you have it. 1 out of 5 cats prefer ipad1 to ipad2.&lt;br /&gt;&lt;br /&gt;And I shan't leave my ipad on the floor, charging, waiting to be&lt;br /&gt;used as a litter tray again.&lt;br /&gt;&lt;span class='post-comment-link'&gt;&lt;br /&gt;Post created by CRiSP v10.0.3b-b5950&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8336326562741944626-7334294056792205486?l=crtags.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://crtags.blogspot.com/feeds/7334294056792205486/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://crtags.blogspot.com/2011/03/why-ipad1-is-better-than-ipad2.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7334294056792205486'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8336326562741944626/posts/default/7334294056792205486'/><link rel='alternate' type='text/html' href='http://crtags.blogspot.com/2011/03/why-ipad1-is-better-than-ipad2.html' title='Why the ipad1 is better than the ipad2'/><author><name>Crisp 
Editor</name><uri>http://www.blogger.com/profile/14144625547464350210</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
