“Fully automatic software testing now possible” – Really? Hmm? Soooo?

Part of the gap between computer science as taught at universities and software development as practiced in our industry is what Alistair Cockburn lists as one of the early methodologist errors: I did this, now do what I did, and you’ll be happy ever after. This notion is not only disrespectful of the achievements our industry has brought about, but it also ignores the difference between lab situations and the context of software development shops all around the globe. This is nothing I came up with, but an observation I have made while teaching anything. The first reaction people have to something new that’s imposed on them is “but this does not work for me” until you show them how that’s possible – and find out yourself that the combination of Spring, SOA, JBoss, GWT, SWING, and Ruby, or any other combination of buzzword technologies from the past two decades, comes with its own pitfalls and fallacies, and your beloved approach ends up being useful. In fact, a while ago James Bach claimed that Quality is dead due to the unmanageable stack of technologies and abstractions our industry has to deal with. I would even go further and claim that no one will be able to handle the Y10k problem in eight thousand years if we continue like this.

One of the fads that seems to be reappearing is the idea of automating humans out of software development work altogether. This fad came up with the rise of UML, and its most recent incarnation appears to be model-based testing. One of the interesting things I noticed is the ignorance of other past movements. It seems that universities keep producing new talents from time to time who claim to save the world, because universities favor a particular competition-based learning model in which everyone wants to be the next hero. Of course this is garnished with some flavor of Pandora’s Pox:

Nothing new ever works, but there’s always hope that this time will be different.

(The Secrets of Consulting, Gerald M. Weinberg, Dorset House Publishing, page 142)

On a side note, the same author just recently wrote about this ignorance of past experience in the context of development models like structured programming or Agile.

Up until now my hope was that model-based testing was a fad that would disappear quickly, with industry leaders ignoring it completely to start with. But it seems that the hope that model-based testing will be different keeps recurring despite the voices of highly skilled consultants in our field – for example, take this blog entry from James Bach, dated 2007. Four years have passed since then.

I decided to ignore this fad for as long as possible, but this morning I read about model-based testing in a way that made me angry. Of course, the problem is not model-based testing in itself, but my reaction to what was written on that particular webpage. “Fully automatic software testing now possible” is the title of this webpage. So far, I have criticized some of the work on model-based testing. These discussions almost always ended in an agreement that you shouldn’t base all of your testing efforts on model-based testing (MBT) but combine it with other approaches as well. I am fine with this. If MBT serves you better than another approach for a particular effort, go ahead. But I wouldn’t invest all of my money in a single stock option. Or run a consultancy with a single client. In fact, most countries won’t allow you to start a consultancy if you have just a single client. Think about why for yourself.

But the article – no, I wouldn’t call it an article, it’s rather a marketing piece for a particular company – seems to fully ignore such contextual considerations. Let’s take a look at some sentences before I explain the pitfalls of model-based testing, and what to do instead.

The system not only facilitates quick and accurate software testing, but it will also save software developers a great deal of money.

Really? Every software developer in the world? Even those who are performing highly using TDD and Specification by Example? I doubt this. Since no data are mentioned at all, this claim is hard to justify. And in the end, how much is “a great deal of money” in the first place? A great deal for my private life would be 100,000 euros, but a great deal of money for an enterprise surely looks different.

Our automated method can improve product quality and significantly shorten the testing phase, thereby greatly reducing the cost of software development.

And if I don’t treat testing as a separate phase to start with, what happens then? Also notice that their understanding of product quality is not defined. (On a side note, I added the word “quality” to my set of lullaby language terms.) So, what does an improvement in product quality mean? How would I notice it? Connecting the first quote with this second one, it also appears that shortening testing is assumed to be correlated with the costs of software development. This commits three fallacies from Jurgen Appelo’s Management 3.0 class at once:

  • Machine Metaphor Fallacy – Don’t treat organizations as a machine (organizations are complex systems).
  • Linear Behavior Fallacy – Don’t assume things behave in a linear way (things behave in complex ways in a complex system).
  • Unknown-Unknowns Fallacy – Don’t think you have covered all risks (the Titanic effect will hit you if you do).

The testing phase for new software consists of three steps: developing the tests, running the tests and evaluating the results.

I think this belief is the reason why testers are treated like monkeys, and why software testing keeps on being outsourced to low-wage countries where the power system still has outages. What’s missing in this list? Think about it. There is one key ingredient missing, and I have not seen anyone trying to automate this key ingredient. You’re with me? Yes? Exactly! It’s the learning. Test execution and test results lead to learning when the tests are executed manually, or when they are supervised. This learning informs the development of the next test – in one way or the other. There are learning approaches that could work for some particular part of software testing. For example, there are systems in place which learn to spot risky code changes based on changes that introduced bugs in the past. But this addresses only one of many risks in software development.

When used properly, the method completely eliminates the need for manual software testing

Besides the caveat “When used properly” (people are not very disciplined at using something properly), I sincerely doubt that the need for manual software testing will be eliminated by their system. Prove me wrong. Please.

Model-Based Testing has a number of major advantages: it makes the software testing process faster, cheaper and more accurate.

I’m speechless about such statements. Speechless except for: “compared to what?”

It is not uncommon for manual software testing to take anywhere from several months to years.

Finally, a hint at the problem they are trying to solve. So “manual testing takes too much time” appears to me to be the problem that MBT addresses. But I don’t see a clue as to why testing is taking so long. Matt Heusser taught me a while back that this is an impossible question to answer. “So, testing should take less time. Which of these risks should I leave unaddressed with my testing then?” is a more congruent answer to that question.

The pitfalls

One of the pitfalls for MBT, if sold with the sense of humor as in the above statements – and I really hope they don’t mean these statements seriously – is the belief that I have covered all risks. My favorite quote from Jerry Weinberg on this is the Titanic Effect from Quality Software Management Volume 1 – Systems Thinking:

The thought that disaster is impossible often leads to an unthinkable disaster.

If you look for such occurrences, you don’t have to go as far into the past as the Titanic incident. Take Challenger, volcano eruptions or Fukushima – just to name a few.

But words can describe only so much. Let’s run an experiment. I ran through this experiment a few months back when I approached Michael Bolton to explore MBT. So, this is really his exercise. Let’s apply model-based testing to a simple website for children. Here is the Horse Lunge Game. Develop a model for this game. I will wait until you come back.

Finished? Already? Maybe you should look deeper. Go once again over your model to see if you left something out.

Alright, this should be fine for now. Now, how many bugs did you notice which were not part of your model? None? To get an idea, watch the fence in the background. Is it moving continuously at the same pace, or is there a jump in it? Part of your model? Really? If this didn’t convince you yet, does your model incorporate that none of the horses should have wireframes? No? Well, do any of the horses have wireframes?
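To make the point concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the two-state model, the `Game` class and the fence offsets are hypothetical and are not taken from the Horse Lunge Game or from any MBT tool. The model only knows whether the horse is standing or trotting, so the generated tests can only check that.

```python
from itertools import product

# Hypothetical model: (current state, action) -> expected next state.
MODEL = {
    ("standing", "start"): "trotting",
    ("standing", "stop"): "standing",
    ("trotting", "start"): "trotting",
    ("trotting", "stop"): "standing",
}
ACTIONS = ("start", "stop")


class Game:
    """Hypothetical implementation under test."""

    def __init__(self):
        self.state = "standing"
        self.steps = 0
        self.fence_offset = 0  # rendering detail the model knows nothing about

    def act(self, action):
        self.state = "trotting" if action == "start" else "standing"
        if self.state == "trotting":
            self.steps += 1
            # Deliberate bug outside the model: every third step the fence
            # jumps by 25 instead of moving by 10.
            self.fence_offset += 25 if self.steps % 3 == 0 else 10


def run_model_based_tests(max_length=4):
    """Run every action sequence up to max_length and compare states with the model."""
    executed, failures = 0, []
    for length in range(1, max_length + 1):
        for actions in product(ACTIONS, repeat=length):
            game, expected = Game(), "standing"
            for action in actions:
                game.act(action)
                expected = MODEL[(expected, action)]
                if game.state != expected:
                    failures.append(actions)
                    break
            executed += 1
    return executed, failures


def fence_steps(actions):
    """What a human watching the screen would see: how far the fence moves per action."""
    game, moves = Game(), []
    for action in actions:
        before = game.fence_offset
        game.act(action)
        moves.append(game.fence_offset - before)
    return moves


if __name__ == "__main__":
    print(run_model_based_tests())      # every generated test passes
    print(fence_steps(("start",) * 3))  # the jump shows up only here
```

All generated sequences pass, because the fence was never part of the model. The jump only becomes visible once somebody looks at something the model does not describe.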

My point is that the notion of MBT as presented on that website I linked earlier (I won’t do that again!) ignores the impossibility of testing everything. Usually I refer to testing a compiler at this point. Testing a compiler completely would mean that I run a test for each program this compiler will compile – ever. So, I must digest any program that is in existence for the particular language that I compile (even my compiler might be written in that language), and any program that will be invented in the next – say – twenty years in that language. Then I run each and every program through this compiler, and check the results. This does not only hold for compilers, but for frameworks in general as well. (Side note: Frameworks are the new programming languages.) This is not even possible. Now, with MBT I would have to create a model for each program that is going to be created in the future. So, I would need to invent a time machine first. Maybe science has found a way to travel in time, and MBT is a hint at this. But I doubt it.
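A rough back-of-the-envelope calculation shows the size of the problem. The sketch below is Python and rests on assumptions I picked for illustration: source files made of the 95 printable ASCII characters only, and a hypothetical compiler that handles one million inputs per second. Most of these texts are not valid programs, but a compiler has to reject those correctly too, so they all count as inputs.

```python
ALPHABET_SIZE = 95  # assumption: printable ASCII characters only
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def candidate_sources(max_chars):
    """Number of distinct source texts with at most max_chars characters."""
    return sum(ALPHABET_SIZE ** n for n in range(max_chars + 1))


def years_to_compile_all(max_chars, inputs_per_second=1_000_000):
    """Years needed to feed every candidate text to the compiler once.

    The throughput of one million inputs per second is a made-up figure.
    """
    return candidate_sources(max_chars) / inputs_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    for size in (5, 10, 20):
        print(size, candidate_sources(size), f"{years_to_compile_all(size):.3g} years")
```

Even for inputs of at most 20 characters, shorter than most single lines of code, there are more than 10^39 candidates. Real programs are a lot longer than that.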

This leads me to another point. How much does it cost to create the model? How much does it cost to maintain the model? How long does it take to run the millions of tests referenced in the text? There is no statement about this. Our industry does not even agree on how many tests to automate or what the return on investment (ROI) for software test automation is. And now we are faced with a different question based on MBT which is what the ROI of test automation AND model-based testing will be. These are two variables in a complex formula. Do you see now why I don’t believe that I can save a bunch of money just by applying MBT?

A different approach

An approach that works for me is to apply ATDD or Specification by Example in combination with session-based Exploratory Testing. Using Specification by Example helps me derive specifications for the software from the people that matter. Session-based Exploratory Testing helps me to grow a mental model of the software, and cover the risks that are relevant at the time. I don’t claim that I can cover all risks in the time I have. Instead I can cover the risks that are most meaningful right now. Remembering Rudy’s Rutabaga Rule (Secrets of Consulting, page 15):

Once you eliminate your number one problem, number two gets a promotion.

in a complex system like software development this seems to be an approach that brings more value. And besides that, Exploratory Testing puts a strong emphasis on learning, which MBT appears to neglect completely.

On Programmer-Tester separation

One of the things that testers in my classes on ATDD and Exploratory Testing struggle with the most is that programmers and testers appear to give up independence with this approach. My first reaction to this is often that I hear them asking to leave their testing silos in place, and I start to convince them to collaborate more with the programmers. The hard way, I recently learned that the Helpful Model (“No matter how it looks, everyone is trying to be helpful.” Secrets of Consulting, page 101) and the Rule of Three Interpretations (“If I can’t think of at least three different interpretations of what I received, I haven’t thought enough about what it might mean.”, Quality Software Management Volume 2 First-order measurement, page 90) also apply to me.
