jUnit and What Makes a Successful Tool ("Packages", Part 2)
Welcome to Oddly Influenced, a podcast about how people have applied ideas from *outside* software *to* software. Episode 3: jUnit and what makes a successful tool.
This is the second of four episodes on Joan Fujimura’s idea of “packages” for spreading theory and technology together. I know I said in the last episode that there would be three episodes about packages. I’ll explain what’s changed in a bit.
First, though, let me give you the core of Fujimura’s history of cancer genetics in case you skipped all the detail in the previous episode. Although simplified, I think it’s true enough for our purposes.
Scientists wanted to understand cancer.
Understanding cancer means understanding cells. But the technology before around the 1970s wasn’t really up to the task of understanding human, animal, or plant cells. So investigation shifted to easier targets: bacteria and viruses.
At that time, it was known that there were viruses that could cause cancer, but that was considered unimportant. They didn’t cause cancers that anyone then cared about. Tumor virology was a scientific backwater.
Still: such tumor viruses were a good object of study because their genetics were so simple.
Meanwhile, the people studying bacteria and the viruses that infect bacteria discovered that bacterial defenses against viruses involved enzymes that could snip DNA sequences at specific places. That led to recombinant DNA or gene splicing, and also easier DNA sequencing.
Because tumor viruses don’t have much DNA, tumor virologists were early adopters of DNA tools.
Those virologists developed a theory: normal cells had genes that did useful things but still were *almost* cancer-causing. Damage to those normal genes could turn them into cancer genes, called “oncogenes”. The theory claiming that cancer was caused by normal genes transitioning into oncogenes was called the “proto-oncogene theory of cancer”. The normal genes were “proto-oncogenes”, meaning “before cancerous genes”.
The tumor virologists had the recombinant DNA tech to demonstrate their case. And they did.
The tech they used was so impressive that everyone – all sorts of scientists and all sorts of labs – picked it up.
As they did so, those scientists also picked up the assumptions of the cancer theory that went along with the tools. In a sense, they were “infected” by those theories because of the tools.
That is, the modern theory of cancer was largely established because the tools associated with it were *just so good*.
In her book, /Crafting Science/, Joan Fujimura calls tools and theories that spread together “packages”. Implicitly, she’s saying that if you’ve got a great theory that you want everyone to adopt, package it with an attractive new tool.
But. Before I go further, I find I need to, uh, reframe the discussion. You see, as I was writing the script for this episode, it occurred to me that Fujimura describes the proto-oncogene theory of cancer as being wildly successful. It must have infected way over 90% of people exposed to it.
Test-driven design (TDD) has not been that infectious. So I was curious: how infectious was it? How many people who *could* be doing TDD *are* doing TDD? That is, how many people bought into the theory, rather than just the tooling?
The results were a shock. I would have put 10% as a pessimistic estimate. But GeePaw Hill’s tweet is representative:
“I have no hard data. Anecdotally, I'd suggest less than 1% actually do TDD. Something more like 30-40% do a kind of ‘test-looking-thing-that-isn't-a-test-after-the-code-is-done’ as a metric game, which they sometimes call TDD, or BDD, or just ‘testing’.”
However, there’s another group, including many old-timers like me, who did experience TDD as a bandwagon. As Chas Emerick puts it: "I think it's absolutely right that the nomenclature and "spirit" (big heavy scare quotes there) of TDD had a huge impact on nearly everyone.”
So I find myself faced with two questions, right in the middle of this series that was planned to be simple and straightforward.
1. Why did jUnit and programmer testing succeed among people for whom TDD failed? Because I think it’s true that jUnit and the related idea of programmer testing *did* succeed. How many languages today come without a unit testing framework? How many development environments come without a nice interface to a test runner? And even if GeePaw is right that only 30-40 percent of programmers do some kind of testing, that’s still *way* more than it was before jUnit.
2. Why was TDD a smashing success for some people and a flop for others? GeePaw spoke for a lot of us when he wrote about the poor adoption of TDD: “It breaks my heart, of course. The TDD me and my teams do has dramatically improved our productivity.” But it’s equally possible to find people who reject TDD with the same certainty. What’s going on here?
Dealing with those two questions means I’m obliged to say more words about the topic of packages than I’d planned, which means this episode will be about the “tool” part of a package: specifically, what characteristics make a tool infectious? The next episode will be about what makes the theory part of a package infectious.
We’re starting with the tools. What was so special about recombinant DNA technologies?
Fujimura says the scientists she studied focused on “doable problems”, ones that have a reasonable chance of being solved. Importantly, the next doable problem usually stems from problems that have already been successfully “done”. There is continuity in the work. I imagine science as a sort of ink blot or blob. The dark interior is solved problems, and the white exterior is unsolved problems. An individual lab is almost always pushing the boundary between the two outward in some more-or-less constant direction.
Recombinant DNA technologies meshed with a lab’s doable problems for two reasons. First, they accelerated the movement of the boundary. Particular problems could be solved in less time. That’s important because my impression is that the competition to be first in biology makes competition between software companies seem positively tame. It’s the ultimate in first-mover advantage: it’s your name that appears in the textbooks.
Second, the technologies allowed new problems to become doable. Using my metaphor, a lab growing the boundary used to have to avoid some attractive direction because the problems in that direction just weren’t doable enough. Recombinant DNA opened up such spaces, made them doable.
Here’s something to keep in mind, though: the overall direction was roughly unchanged. A lab that was, say, heading roughly north-northwest wasn’t forced by the new tools to turn around and push on the southern boundary instead. The tools did not make most scientists rethink their whole career trajectory.
And, from what I’ve read, the work didn’t *feel* essentially different. Technicians and scientists were still working with beakers and centrifuges and reagents and purified proteins. They kept doing all the kinds of things I don’t really understand because I flunked chemistry my freshman year in college. Their work wasn’t disrupted the way that, say, machinists’ work was disrupted by the rise of computer-controlled machine tools.
Another reason that the tools were easy to adopt was that they were what I’m going to call *sharable*. First, using them was much more akin to following a recipe than figuring out how to apply a theorem or construct a proof.
As an example, /Molecular Cloning: A Laboratory Manual/ was first published in 1982. The first chapter of the 1992 edition is called “Isolation and Quantification of DNA”. It contains a long set of “protocols” (which is what people in the life sciences call what I’d call a “recipe” or a “procedure”). The first four protocols are:
Protocol 1: Preparation of Plasmid DNA by Alkaline Lysis with SDS: Minipreps
Protocol 2: Preparation of Plasmid DNA by Alkaline Lysis with SDS: Maxipreps
Protocol 3: Isolating DNA from Gram-Negative Bacteria
Protocol 4: Precipitation of DNA with Ethanol
… and so on and on *and on*.
Someone, if exceptional, *could* achieve competence in recombinant DNA techniques by reading and trying things out, but that would be wasteful. So it was extremely common to send graduate students or postdocs to other laboratories to soak up skills by participating in their work. Like medicine or architecture or programming, recombinant DNA has a lot of tacit knowledge that’s best picked up by working alongside other people and discussing things happening right in front of you. (That’s another teaser for Donald Schön’s notion of “reflective practice”.)
I also get the impression that there was, in the early days, a sort of gift economy – things like “I’ll let you have some of my lab’s reagents with the expectation that someday you’ll return a similar favor.” Community matters in the early days. (And the gift economy is another topic I want to bring up in this podcast, though not in the… well… *wrong* ways the idea has been used in open source software.)
Related to sharability is what I’ll call “tinkerability”. Recombinant DNA wasn’t a machine you bought and put samples into. It was a loosely organized collection of individual protocols and reagents and enzymes and pieces. Individual labs modified protocols the way skilled cooks individualize written recipes by trying out variants and settling on the ones they like best. My mother adapted a traditional “Linsen und Spätzle” (lentils and spaetzle) recipe from her region of Germany after she came to America, and that Marick variant has been adopted by two following generations. (Recipe available upon request.) Such tinkering is just the sort of thing that happens in craft work, which is what lab work and programming work are.
A final point on adoptability is that you could use the tools without making a strong commitment to the underlying proto-oncogene theory of cancer. You’d probably end up tacitly accepting it – because everyone else around you was – but you *could* solve your doable problems using recombinant DNA tech, largely independently of your understanding of, or commitment to, the associated theory.
Let’s compare jUnit to recombinant DNA. jUnit was certainly *sharable*, in much the same ways. First, it was free and downloadable.
And I claim the early documentation was atypical. The first thing many people read was an article called “Test Infected: Programmers Love Writing Tests”. Notice that the title grabs your attention by asserting something counter to almost everyone’s experience at the time. The whole short, clearly written article is *technical marketing*: it very explicitly wants to persuade you, not just give you the facts.
Such documentation was unusual for the time. Around 2000 there were still a lot of people producing tools with the attitude of the apocryphal painter: “I’ve suffered for my art; now it’s your turn”. Or a slogan many people have attributed to the creators of various programming languages: “You have to *earn* the right to use our code.” The bar for documentation back then was pretty low, and jUnit cleared it easily. A little empathy for your user – or even just for your earlier, more ignorant self – really does go a long way.
jUnit was also the right-sized tool to teach at a half-day workshop at a conference or local gathering. And it lent itself to learning through pair programming (which was becoming somewhat fashionable at the same time).
I’d say the equivalent of /Molecular Cloning: A Laboratory Manual/ was the testdrivendevelopment Yahoo mailing list. Books on TDD tended to emphasize theory more than the equivalent of laboratory protocols, which was probably a mistake, but I want to give a shoutout to J.B. Rainsberger’s /JUnit Recipes/, published in 2004, and Gerard Meszaros’s /xUnit Test Patterns/, published in 2007.
jUnit was also *tinkerable*, both because it was open source, and because it was designed so that some of its parts could be swapped out. And documentation once again played a role, in the form of “jUnit: a Cook’s Tour”, which described the structures and design patterns used in jUnit.
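To make “swapped out” concrete, here’s a minimal sketch in the jUnit 3.8-era style, as I remember the API: the framework shipped a `junit.extensions` package whose decorators you could wrap around a whole suite. The `AccountTest` suite and the database details below are invented for illustration.

```java
import junit.framework.Test;
import junit.framework.TestSuite;
import junit.extensions.TestSetup;

// A sketch of jUnit 3-era tinkering: TestSetup is a decorator that wraps
// a whole suite so its setUp/tearDown run once around *all* the tests,
// instead of once per test method.
public class AllTests {
    public static Test suite() {
        TestSuite suite = new TestSuite(AccountTest.class); // AccountTest is invented
        return new TestSetup(suite) {
            protected void setUp() {
                // e.g., start an in-memory database shared by every test
            }
            protected void tearDown() {
                // e.g., shut the database down again
            }
        };
    }
}
```

Because the pieces were small classes rather than a sealed machine, a team could write its own decorator – a repeat-until-failure runner, say – without asking anyone’s permission.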
The first jUnit article wasn’t tied to TDD. jUnit started as a more-competent-than-most programmer testing tool, at a time when almost all such tools were homegrown. As such, it made the pre-existing idea that we should do better testing more *doable*.
And I think it’s important that jUnit was happening at roughly the same time people – especially people working in companies with big Java codebases – were abandoning terminal-centric tools like vi or Emacs for tools like JetBrains’ IDEA or Eclipse. That is, tools that say, “hell, we’re OK if we’re not backwards compatible with an emulator that pretends to be a 1978-era VT100 terminal; instead, we will exploit the now-dominant graphical user interfaces.” Because jUnit was quickly integrated with those environments, it became a readily-available option. There were fewer excuses *not* to do programmer testing.
Moreover, jUnit introduced an expectation that programmers across teams would do testing in roughly the same way and use roughly the same words to describe it. “Setup” and “teardown” and even idiosyncratic (and frankly weird) words like “fixture” meant the same thing across teams and across companies.
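For anyone who never saw that shared shape, here’s a minimal sketch of it – the kind of test case the decorator example above was wrapping. The vocabulary is jUnit 3’s; the `Account` class is invented.

```java
import junit.framework.TestCase;

// A minimal jUnit 3-style test case. The "fixture" was the shared name for
// the objects a test needs; setUp() builds it and tearDown() cleans it up,
// once around *each* test method.
public class AccountTest extends TestCase {
    private Account account; // the fixture (Account is an invented example)

    protected void setUp() {
        account = new Account(100);
    }

    protected void tearDown() {
        account = null;
    }

    public void testWithdrawalReducesBalance() {
        account.withdraw(30);
        assertEquals(70, account.balance());
    }
}
```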
A complementary way of looking at jUnit is that, even after it became associated with TDD, it remained theory-indifferent. Everyone kind of knew they should do testing, and jUnit made it easier for them… to kind of do testing. (Typically poorly, from the point of view of people who’d actually studied testing.)
People could ignore TDD, whereas cancer researchers could not as easily ignore the proto-oncogene theory. I’ll have some speculation about the difference in the next episode.
But let’s look at the people who *didn’t* ignore TDD, people who embraced it. What was it about their “theory of programming” that made them susceptible to infection by a rapid test-code-refactor loop? I’m going to give two answers, one this week and one next week.
Let me start with a claim. I want to say a programmer’s “doable problem” is most often a *programming* problem. The theory of TDD turned testing into a little episode in the larger work of solving a programming problem, an episode that felt more like design than like most programmers’ previous experience of testing. Before TDD, a programmer’s job was programming and then doing *something else* that *promised* to save time in the long run but most certainly *cost* time in the short run. That’s one of the reasons why even the improvements jUnit brought to programmer testing still didn’t make that testing popular enough to be a majority practice.
TDD allowed programmers to feel like they were doing *exclusively* programming (or a mix of programming and design). The test-writing provoked much the same attitude as waiting for the compiler – that is, it’s a part of the job you wouldn’t track separately because it didn’t feel like a different task.
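To ground that feeling in code, here’s one invented TDD “episode”, compressed into a single snippet – a sketch, not anyone’s canonical example, reusing the made-up `Account` from above.

```java
import junit.framework.TestCase;

// One red-green-refactor loop, compressed. In real work the test comes
// first and fails; the production code below it is then written to pass.
class Account {
    private int balance;
    Account(int opening) { balance = opening; }
    int balance() { return balance; }

    // Green: the smallest change that makes testCannotOverdraw() pass.
    boolean withdraw(int amount) {
        if (amount > balance) return false;
        balance -= amount;
        return true;
    }
}

public class OverdraftTest extends TestCase {
    // Red: written before withdraw() knew how to refuse an overdraft.
    public void testCannotOverdraw() {
        assertFalse(new Account(100).withdraw(150));
    }
    // Refactor: with the bar green, tidy names and duplication, rerun,
    // and start the next episode. Each loop takes a minute or two.
}
```

The point isn’t the banking logic; it’s that each loop is short enough to feel like part of programming rather than a separate testing task.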
After making the claim you just heard, I thought I’d check whether it made sense to other TDD practitioners. To some, it did.
Others expressed things a bit more modestly. Glenn Vanderburg, generally less excitable than me, wrote "I certainly wouldn’t have phrased it that way… but it did feel like a much easier, lighter weight context switch between writing the “real” code and writing the tests. Much easier to maintain context and a kind of flow as we switched back and forth.”
Johannes Link wrote: “I used to frame it as ‘designing, programming & testing is now a single activity’. But the most important emotion I had while doing TDD was being relieved about getting control back, even if a task was beyond my personal complexity threshold.” That’s reminiscent of expanding the available landscape of doable problems.
Being the kind of guy I am, I want to tie all three of those together by claiming TDD satisfied our desire to *work with ease*. What do I mean by that? My beloved wife used to do surgery on cows. She had this way of describing what it was like to work with a really jelled surgical team. She would hold her hand upright in front of her, then sweep it down and out to the right, then – without pause – complete a circular motion – up and to the left – ending with her hand reaching down into the imaginary cow’s abdomen to clamp, suture, or incise. The idea was that, maybe without her even having to say what she wanted, the right surgical instrument had simply *appeared* in her hand.
I’ve always found that image very compelling and have frequently acted it out for other people. Some wince at the thought of groveling around in a cow’s body cavity, but most then come to agree that that feeling of ease is a valuable thing to have at work.
Therefore, I claim the following: TDD infected primarily people who strongly desire to work with ease and whose work centers on code. People with different values can remain immune.
Now, is my new theory *true*?
That question gives me an opportunity to state a theme that’s been in my work for decades. I read books like Fujimura’s that contain theories about how people do their work. I’m not so concerned with whether the theories are *true* as with whether they are *suggestive* – that is, do they give me ideas about how software people should do *our* work? The next task is to try those ideas out and see whether they have good results in our work. Because everyone’s theory about people is somewhere between fully wrong and fully right, and is always incomplete.
I’d like you to do the same with any of my conclusions, including this one. In the end, it’s really up to you to figure out what my conclusions, or anyone’s conclusions, mean to you and if they give you any ideas about, say, how to get your favorite tool more widely adopted.
There’s more to say, but I want to keep these episodes relatively short. The next one will be about what characteristics of a theory make it suitable to tag along with tools. That is, some theories (like the proto-oncogene theory) catch on when they’re delivered alongside a tool. Others (like TDD) largely don’t. What’s the difference? And why is the reaction against TDD so often *fierce*?
In the meantime, thank you for listening.