TESTFRAME is dead. This accelerates us some.

Eric S. Raymond esr at thyrsus.com
Thu Sep 29 17:15:35 UTC 2016


I just took TESTFRAME out behind the barn and shot it.  Here's the
change comment:

    TESTFRAME: Withdraw the TESTFRAME code.
    
    There's an incompatible split between KERNEL_PLL and non-KERNEL_PLL
    capture logs - neither can be interpreted by the replay logic that
    would work for the other.
    
    Because we can't get rid of KERNEL_PLL without seriously hurting
    convergence time, this means the original dream of a single set of
    regression tests that can be run everywhere by waf check is dead.
    Possibly to be revived if we solve the slow-convergence problem
    and drop KERNEL_PLL, but that's far in the future.
    
    Various nasty kludges could be attempted to partly save the concept
    by, for example, having two different sets of capture logs.  But, as
    the architect of TESTFRAME, I have concluded that this would be
    borrowing trouble we don't need - there are strong reasons to suspect
    the additional complexity would be a defect attractor.
    
    One problem independent of the KERNEL_PLL/non-KERNEL_PLL split is that
    once capture mode was (mostly) working, it became apparent that the
    log format is very brittle in the sense that captures would easily be
    rendered invalid for replay by even minor logic changes.
    
    Best to fill in this rathole and move on.

It hurt to do this; I put a huge amount of effort into trying to make
TESTFRAME work. But it was like being Sisyphus pushing his rock -
every time I felt like I was on the verge of success, some unexpected
issue would tumble the boulder back down on me. I went through at least
four cycles of this.

A few days ago I realized that spending a day and a half raiding the Classic
code for unit tests we didn't have had improved our coverage more than the
weeks I had spent on TESTFRAME.  That's when I started to seriously consider
cutting our losses.

The news isn't all bad.  Yes, we've taken a hit to our long-term
expected defect rate by not being able to test as deterministically as
I had hoped.  On the other hand, now I can put all my energy into
actual forward movement towards 1.0.  The first step in that process
will be landing Daniel's refactoring of the protocol machine.

And, in truth, I'm less worried about the kind of iatrogenic problem
TESTFRAME was designed to head off than I was nine months ago.  We've
done a remarkably good job of avoiding those without TESTFRAME. There
are no guarantees anywhere, but our odds of maintaining that record
don't seem horrible.
-- 
		Eric S. Raymond <http://www.catb.org/~esr/>

