German Agile Testing and Exploratory (GATE) Workshop

Maik Nogens and I are organizing the German Agile Testing and Exploratory (GATE) Workshop, and we are proud to announce the call for contributions. It will be held in Hamburg, Germany, on October 1st, 2011.

GATE will be a low-budget, non-profit peer workshop. This means that we may split the expenses for the location and amenities equally among the participants (probably below € 100). Every participant covers the remaining costs, e.g. travel and lunch, themselves.

We captured the main goal in the following elevator pitch:

For ambitious testers who want to learn established approaches and practices in the craft of software testing, the GATE workshop provides a platform for knowledge exchange among equals. In contrast to traditional conference formats, the GATE workshop offers a practical, low-budget, controversial experience of software testing.

That said, we are interested in contributions such as

  • realistic experience reports
  • controversial testing techniques
  • testing in practice (Testing Dojos, Hands-on testing)
  • options for distributed software testing

Feel free to contact us if you are unsure. We will be glad to provide feedback.

The language of the workshop will depend on the participants. It might be German all day, but if we get international contributions, we may decide to hold the workshop in English.

If you want to attend the first German Agile Testing and Exploratory Workshop, come up with a contribution and send it to us by September 5th. Further details on the attendees, the program, and travel information will follow later.

If you are interested in the format, there have been quite a few of these peer workshops in the testing space recently. Here are some pointers:

Hope to meet you in the nicest city in the world. Hope you are as excited as we are.

Testing Dojos from the Back of the Room

Last week my colleague Meike Mertsch and I organized a Testing Dojo for a client of ours. The testers there had attended some Testing Dojos in the past, but with some drawbacks. As part of the clarification process the testers picked an internal application, their billing system. I knew that a billing system can be quite extensive to test within a single hour. So I asked for an introduction in the morning in order to make up my mind about the mission for that dojo. One thing I noticed in the morning was how well the 4Cs scheme would apply to the dojo.

The 4Cs are described in Sharon Bowman’s book Training from the Back of the Room. The 4Cs stand for Connections, Concept, Concrete Practice, and Conclusion. My colleague Stefan Roock also explained to me that he sometimes swaps Concrete Practice and Concept in his classes. I found that we could apply the same pattern to the billing system Testing Dojo.

Connections

Since the billing system is large, we needed to make a connection to what was already known about it. The teams rarely had to deal with the billing system, so the tacit knowledge was hard to grasp, and it was hard to retain the lessons learned about that system. I knew we needed to bring back that past knowledge. We decided to start with a group brainstorming to surface the elements of the billing system that had stuck in the testers’ heads. We spent 10 minutes on that.

Concrete Practice

We had roughly two hours overall. I had planned to run a combination of Weekend Testing sessions in pairs of two or three and Testing Dojos, interspersed with short debriefings about the findings. I had run a similar setting at WeekNight Testing live in March 2011. We scheduled two to three 20-minute testing sessions with 5 minutes of debriefing and pair switching. Twenty minutes seemed to be enough to run one or two tests in the billing system, so it was a good fit.

Concept

We mixed this up with the concept the testers should learn more about. The billing system has a RESTful interface. It was introduced and documented in the company’s wiki, but just one tester appeared to know about it. So we derived a mission which included the REST API for testing the billing system. We had print-outs of the documentation with us, so every pair could get a grip on it. This worked surprisingly well.
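
To give an idea of what the pairs worked against, here is a minimal sketch of the kind of REST-level check a pair might script in such a session. The host, the resource path, and the expected "invoices" element are hypothetical; the real billing API looked different, and this only reconstructs the general shape.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical smoke check against a billing REST API, roughly what one pair
// might put together during a 20-minute dojo session. Host, path, and the
// expected field name are invented for illustration.
public class BillingApiSmokeCheck {

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://billing.example.local/api/accounts/4711/invoices"))
                .header("Accept", "application/json")
                .GET()
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // A first, shallow oracle: the resource exists and looks like an invoice list.
        if (response.statusCode() != 200) {
            throw new AssertionError("Expected 200 OK but got " + response.statusCode());
        }
        if (!response.body().contains("\"invoices\"")) {
            throw new AssertionError("Response did not contain an invoices element: " + response.body());
        }
        System.out.println("Billing API answered with " + response.body().length() + " bytes of invoice data.");
    }
}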

Conclusion

For the conclusion we sat together for the final half hour and talked about ideas for improving the billing system. To get the discussion started I asked the participants to apply wishful thinking and come up with anything they would like to see there. Besides things like influencing time and running bills for single accounts, there were also concerns about traceability, such as logfiles and progress reports.

After that Meike came up with an action list. She asked for things that the participants could apply right away as well as for volunteers to carry out the actions and follow up on them. We identified two points which two of the participants put on their agenda.

Reflection

So, the 4C model worked surprisingly well with this sort of Testing Dojo. It was absolutely necessary to make the connection to the system in the beginning. Collectively the attendees knew a lot about the system; we had to bring this knowledge together. Given the complexity of the billing system, we would have been lost if we had skipped that part.

The debriefings after each session, together with the pair switching, were also necessary, not only to spread the knowledge but also to bring together the learning insights gained on the different machines. From the discussions in the morning I got the impression that previous Testing Dojos had come with neither a mission nor debriefings. Since the Exploratory Test Management Roundtable at last year’s EuroSTAR conference I consider the mission and the debriefing the most essential elements of any session-based test management implementation. The experiences at this client underlined this impression.

On the PageObject Pattern

Recently I started to think more in depth about the PageObject pattern – at least that is what some folks called it initially. One problem I noticed in discussions with colleagues and clients is the lack of a pattern description for this so-called pattern. I decided to explore this problem in more depth and try to come up with a constructive solution.
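
To make clear which pattern I mean, here is a minimal sketch of what is commonly presented as a PageObject, using Selenium WebDriver. The page under test, its element ids, and the class names are invented for illustration.

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

// A minimal PageObject pair, as the pattern is commonly illustrated with
// Selenium WebDriver. The page under test and its element ids are hypothetical.
class SearchPage {
    private final WebDriver driver;

    SearchPage(WebDriver driver) {
        this.driver = driver;
    }

    // Expose what a user can do on the page, not how the page is built.
    SearchResultsPage searchFor(String term) {
        driver.findElement(By.id("query")).sendKeys(term);
        driver.findElement(By.id("submit")).click();
        // Hand navigation over to the next page object, keeping locator
        // details out of the tests themselves.
        return new SearchResultsPage(driver);
    }
}

class SearchResultsPage {
    private final WebDriver driver;

    SearchResultsPage(WebDriver driver) {
        this.driver = driver;
    }

    int numberOfResults() {
        return driver.findElements(By.cssSelector(".result")).size();
    }
}

The tests then talk to SearchPage and SearchResultsPage instead of locators and raw WebDriver calls, which is usually given as the whole point of the pattern.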

Continue reading On the PageObject Pattern

EuroPLoP from a first-timers perspective

Last week I had the pleasure of attending EuroPLoP. I submitted a pattern that I had started back on October 1st at the inaugural AA-FTT pattern writing workshop. Back then I called it Essential Examples; through several rounds of shepherding and workshops I currently ended up with the name One clear purpose.

From the conference I took away several impressions as well as some insights which I would have loved to get earlier. I decided to write these down for future first-timers to consider – even before submitting a pattern in the first place.

Continue reading EuroPLoP from a first-timers perspective

Improvement vs. Radical Change

In the Kanban, Scrum and Lean sphere there is a continuous discussion about the radical change nature of Scrum vs. the evolutionary change approach of Kanban. In the Kanban community the two are often referred to as Kaikaku and Kaizen. But what’s the difference? When do I need one or the other? And what does this say about change in general?

Kaizen

Kaizen is about improvement. Retrospectives in the Agile methodologies help foster continuous improvement. After a limited time the team gets together and reflects on the past few weeks. In the original methods retrospectives were bound to the iteration length, like one to four weeks. With the advent of iteration-less methodologies like Kanban, retrospectives get their own cadence and don’t necessarily fit the planning boundary.

Retrospectives help to improve one to three things that didn’t work well. Ideally the actions from a retrospective change the development system just a little bit. In complex systems such changes may have unpredictable consequences. This is why we restrict changes to one to three items. If we try to implement more changes at a time, we are likely to turn the underlying system completely around, thereby getting an unpredictable system.

Over time such little changes eventually lead to a system which gets stuck. If you keep on improving a little bit, time after time, you climb uphill towards a local optimum of improvements. Once you have picked a particular route on that journey, you might find yourself on a local optimum beside two higher mountains. But how do you get from your local optimum to a higher one?

Kaikaku

This is where Kaikaku comes into play. If you are stuck in a local optimum, you may have to apply a radical change to your system in order to improve further. This comes with a risk: once you set out for a radical change, you will get a different system. Introducing Scrum to most organizations, for example, comes with the effect of radical change. Where does my project manager belong in the three roles of team, ScrumMaster and ProductOwner? How do we integrate our testing department into the development teams? These are rather large steps.

Comparing this with the landscape of mountains, Kaikaku is a large jolt, like an earthquake, which might throw you onto a completely different area of the landscape. This might mean that you find yourself on another mountain top afterwards. It might also mean that you find yourself in a valley between two larger mountains. It might even mean that you end up in the desert of lost hopes.

This also explains why too much radical change eventually leads to an uncontrolled system. Since you keep on jumping from left to right, you never settle down to get a stable system in which you can apply smaller improvement steps. In fact your system might totally collapse from too much Kaikaku.

This also explains why you should seek the right mix of smaller improvements and larger radical changes. Once you get stuck, reach for a larger step, but just one at a time. From that new ground, start to go uphill again by taking smaller improvements – one at a time. Over time you will eventually end up with a system that is continuously improving itself.
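
For the programmers among my readers, here is a toy hill-climbing sketch of that argument. The "improvement landscape" function and all the numbers are made up; the only point is that small steps settle on the nearest peak, and an occasional large jump is the only way off it – for better or worse.

import java.util.Random;

// A toy hill-climbing sketch of the Kaizen/Kaikaku argument: small steps
// (Kaizen) climb to the nearest peak of an imaginary improvement landscape,
// and an occasional large random jump (Kaikaku) is the only way to leave a
// local optimum. The landscape function and all numbers are invented.
public class KaizenKaikaku {

    // An arbitrary bumpy landscape with several local peaks.
    static double landscape(double x) {
        return Math.sin(x) + 0.4 * Math.sin(3 * x) + 0.1 * x;
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        double position = 0.0;

        for (int step = 1; step <= 200; step++) {
            // Kaizen: try a small change, keep it only if it improves things.
            double candidate = position + (random.nextDouble() - 0.5) * 0.2;
            if (landscape(candidate) > landscape(position)) {
                position = candidate;
            }

            // Kaikaku: every 50 steps, risk a big jump to somewhere new.
            // The jump may land on a higher slope, in a valley, or in the desert.
            if (step % 50 == 0) {
                position = position + (random.nextDouble() - 0.5) * 10.0;
                System.out.printf("Kaikaku at step %d -> value %.2f%n", step, landscape(position));
            }
        }
        System.out.printf("Final value after mixing both: %.2f%n", landscape(position));
    }
}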

The Landscape of Testing

In the past few days I had the opportunity to stay with Diana Larsen. She explained a bit about her current studies in Human Systems Dynamics, and introduced some tools from this school of thought to my colleagues and me.

One of the interesting tools was the Landscape Diagram, which is explained in more depth here. The basic concept is built around two axes (on a meta level, a lot of consulting tools are). The y-axis relates to the level of agreement within the team on the approach to take. This might be agreement on using a particular process like Scrum, or on introducing a particular tool into the software development process like FitNesse or JUnit. The x-axis represents the level of certainty about this decision. This might relate to the familiarity with the process or tool introduced. On the lower end of the y-axis there is high agreement, while on the upper end the team is far from agreement. On the left end of the x-axis the certainty about the decision is very high, while the right end represents a great level of uncertainty.

Now, this Landscape Diagram reminded me of the Stacey diagram which I knew from the CSM course I took a few months back. Just like the Stacey diagram, the Landscape Diagram has some interesting areas. When the level of agreement and the degree of certainty are both high, in the lower left corner, we speak of a high degree of order and organization. This is where domains like accounting and taxes belong, where a high degree of regulation is the way to go. Though turning in your tax statement might horrify you each year, you probably want to have that kind of order to support your community.

In the upper right corner, in the area of low agreement and high uncertainty, lies the area of chaos. If you don’t agree on anything and you are new to the field, then you are basically cowboy-coding.

Between these two extremes lies the field of self-organization. With some agreement and some certainty your team will self-organize. But this also holds for low certainty if you agree on what to do, and for low agreement if you have high certainty or familiarity with what you are doing.
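
If you prefer code over diagrams, here is a toy sketch of how the two axes carve out these areas. The 0-to-1 scales and the thresholds are made up; the real diagram treats the boundaries as fuzzy, not sharp.

// A toy classification of the Landscape Diagram's areas, just to make the
// two axes concrete. The 0..1 scales and the thresholds are invented; the
// actual diagram has fuzzy regions, not sharp boundaries.
public class LandscapeDiagram {

    enum Area { ORGANIZED, SELF_ORGANIZING, CHAOTIC }

    // agreement: 0 = far from agreement, 1 = full agreement
    // certainty: 0 = completely uncertain, 1 = completely certain
    static Area classify(double agreement, double certainty) {
        if (agreement > 0.8 && certainty > 0.8) {
            return Area.ORGANIZED;        // lower left: accounting, taxes, regulation
        }
        if (agreement < 0.2 && certainty < 0.2) {
            return Area.CHAOTIC;          // upper right: cowboy coding
        }
        return Area.SELF_ORGANIZING;      // the broad band in between
    }

    public static void main(String[] args) {
        System.out.println(classify(0.9, 0.9)); // ORGANIZED
        System.out.println(classify(0.1, 0.1)); // CHAOTIC
        System.out.println(classify(0.6, 0.3)); // SELF_ORGANIZING
    }
}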

Interestingly, you can move from chaos towards the more ordered fields by providing structure, thereby creating certainty and agreement. You can move in the other direction by opening the thought process to new ideas. I think Jerry Weinberg refers to this in some of his books (e.g. Exploring Requirements: Quality Before Design) when asking “Do I have too few ideas? Or too many? Or just the right amount?” You can either amplify your ideas and generate more perspectives when you have too few of them, or remove ideas when you have too many.

But what does this have to do with testing, you ask? Well, let’s take a closer look at the four schools of thought in testing as Bret Pettichord defines them.

First, there is the Analytical school which

sees testing as rigorous and technical with many proponents in academia.

This is the school of testing which motivated analytical approaches to prove the software correct theoretically. I think this school of thought was largely abandoned in the 80s, but it provided rather ordered techniques to achieve its goal. So I would place it in the lower left corner of the Landscape Diagram.

The Standard school

sees testing as a way to measure progress with emphasis on cost and repeatable standards.

Here we have a high degree of certainty based on measured progress, and the emphasis on cost and repeatable standards means a high agreement on what we are supposed to do. This also sits in the lower left area of the Landscape Diagram.

Let’s check the Quality school. It

emphasizes process, policing developers and acting as the gatekeeper.

From my observation this also means high certainty, provided by policies and processes, and a high degree of agreement on the gatekeeper role. Again the lower left, organized area.

The Context-driven school

emphasizes people, seeking bugs that stakeholders care about.

This is interesting in that people bring in a new perspective here. People are sometimes hard to predict, and this results in a lower degree of certainty in our process. On the other hand, the agreement to seek bugs that stakeholders care about provides the prerequisite for a self-organizing team. If you add agreement practices like Specification by Example, you support this process as well. The structures of Exploratory Testing provide another form of agreement that helps the self-organizing process. So the context-driven school appears to me to be the only one of the four mentioned here that actually emphasizes the self-organization of teams.

The bigger problem for me comes when noticing that the highly organized field de-skills and un-skills highly educated people in self-organization (which I actually consider a key property of being human). This de-skilling more often than not leads to de-motivation, and over time ends up in a high degree of uncertainty and disagreement, with people trying to trick their boss about the crap he proposed. From the managers’ point of view, though, they keep believing that they have a highly organized process in place. If you don’t believe me, look for effects like watermelon reporting, in which you have a green status when viewed from the outside of the system, but you won’t know whether your system is red or rotten until you peek inside.

Now, compare this to a self-organized way to test. Here people have some simple rules on how to act, like focusing on stakeholder value and supporting your team, in order to achieve the same goals. To some people this looks scary and chaotic. The Landscape Diagram explains that it is a bit more chaotic, but not completely chaotic when applied correctly. The problem often comes with the correct application. If you end up in a chaotic system, you should first of all notice where you are, and then stop pretending that you are organized. Provide some simple rules to overcome the chaotic state and achieve a basic level of either certainty or agreement in order to reach the self-organizing area of the landscape. And yes, this might mean that you have to do some management work. Hooray!

Now, get out of your chair, see how well your team agrees and how certain it currently is, and take some action on it if necessary.

The World Quality Report

An article recently turned up in one of my newsfeeds claiming that cloud computing pushes the demand for quality forward. The referenced study is nothing less than the World Quality Report.

I was about to rant about this and the flaws I see, starting with the fact that with Capgemini, Sogeti and HP at least two cloud computing companies have paid for this study and report. But I figured that for the majority of my readers this wouldn’t be necessary at all. You are probably aware that quality is value to some person (Quality Software Management, Volume 1: Systems Thinking, Gerald M. Weinberg, Dorset House), and that the number three obstacle to innovation is the single-solution belief – the belief that modern psychology has all the right answers (Becoming a Technical Leader, Gerald M. Weinberg, Dorset House).

So, go ahead, read the report for yourself, and leave me any flaws and/or supportive statements in the comments.

Some Software Craftsmanship treasures

While reviewing some proposals for the SoCraTes (un)conference, the German Software Craftsmanship and Testing conference, I wanted to look something up in the principles statements that we came up with back in 2009, shortly after writing down the manifesto. Unfortunately I found out that Google Groups is going to drop the files and pages features inside groups, and for now you can only download the latest versions of the files and pages.

After downloading them, I found some treasures which I would like to keep – even after Google takes down the pages section in their groups. So here they are.

Software Craftsmanship Ethics

I was involved in the discussion that came up to identify principle statements similar to the Agile manifesto and its principles. It was Doug Bradbury from 8thLight who constantly tracked what the other twelve people on the mail thread were replying, and who derived something meaningful out of it in the end. I don’t recall why these principles – which we later called the ethics – were never published on the manifesto page, but I think it had something to do with the discussion on the mailing list after we announced the final draft for discussion. (I obviously didn’t take the time to follow that discussion. There were too many replies for me to keep track of.) So here is the final version. Interestingly, the four main categories also show up in the four statements of the manifesto.

The Software Craftsman’s Ethic

***DRAFT*****

We Care
We consider it our responsibility
  to gain the trust of the businesses we serve;
    therefore, we
      take our customer’s problems as seriously as they do and
      stake our reputation on the quality of the work we produce.

We Practice
We consider it our responsibility
  to write code that is defect-free, proven, readable, understandable and malleable;
    therefore, we
      follow our chosen practices meticulously even under pressure and
      practice our techniques regularly.

We Learn
We consider it our responsibility
  to hone our craft in pursuit of mastery;
    therefore, we
      continuously explore new technologies and
      read and study the work of other craftsmen.

We Share
We consider it our responsibility
  to perpetuate the craft of Software;
    therefore, we
      enlist apprentices to learn it and
      actively engage other craftsmen in dialogue and practice.

Original Software Craftsmanship Charter

In the early days we were struggling with how to get started. Back in November and December 2008 we collected some statements from the books that we felt strongly about. In the archive, this is kept as the original Software Craftsmanship charter. Later some of these statements made it into the manifesto and the principles. You can already see the structure of the final manifesto in there, but it’s still merely a brainstorming list. Here is the version from the Google Groups pages:

Original Software Craftsmanship Charter

Raising the Bar

As an aspiring craftsman/professional,

… we can say no – Do no harm

… we can work in a way we can take pride in.

… we take responsibility for the code we write

… we believe the code is also an end, not just a means.

… we follow a strict set of practices and disciplines that ensure the quality in our work

… we live and work in a community with other craftsmen

… we will help other craftsmen in their journey

… are proud of my portfolio of successful projects

… can point to the people who mentored me and who I mentored

Here are some of my suggestions: (DougB)

As aspiring Software Craftsmen we are raising the bar of professional software development.
  • We are proud of our work and the manner of our work
  • We follow a set of practices and disciplines that ensure quality in our work
  • We take responsibility for the code we write
  • We live and work in a community with other craftsmen

  • We are proud of our portfolio of successful projects
  • We can point to the people who influenced us and who we influenced
  • We believe the code is also an end, not just a means.
  • We say no to prevent doing harm to our craft

My suggestions: (Matt Heusser)
  • We take responsibility for the code we write ++
  • We take responsibility for our own personal software process(*)
  • We take responsibility for the outcome of the process
      That is to say, a laborer delivers software on specification
      A craftsman develops a solution to an interesting and ambiguous problem in a way that delights the customer

(*) – not the one owned by Watts Humphries

 
 
Suggested by Ben Rady
“We follow our chosen practices deliberately, to ensure quality in our work”

 

List of questions

Someone (I forgot who) mentioned a list of interview questions from the 8thLight office. They hosted some of the inaugural software craftsmanship user group meetings back in December 2008 in Chicago, and eventually crafted a basis for the manifesto there. One of the attendees wrote down some interview questions that were floating around. Here is the list:

List of Questions

What questions should be asked of a candidate to understand if he/she cares about and understands what software craftsmanship is?
 
  • Do you follow a particular process to create your work?
  • What tools have you built to enhance your work?
  • When do you stop re-factoring and enhancing your code?
  • What are your training techniques?
  • How much time do you spend per week coding outside your main job?
  • How do you react when you discover a bug in your own software?
  • What are the first things you would teach a new apprentice?
  • How many languages do you know and can use consistently in the same project?
  • What are your most important influences in the programmers community?
  • Who is the best developer in the world in your opinion?
  • What makes you passionate about software?
  • Who else would call you a craftsman?
  • Do you consider yourself involved with the software community?
  • Can you deliver consistent results in your code?
  • Can you define what good code is?
  • Can you point to some source code that you consider a masterpiece?
  • How do you react to something that you are forced to ship but is not consistent with your practices? (for example not tested?)
  • How do you stay current with industry standards?
  • Would you go back to a past customer project to fix your own bugs?
  • How do you define aesthetics and pragmatism in software?

Final words

So, having put these artifacts from the early days of software craftsmanship on my blog, I hope they won’t get lost. I still hope that the ethics statements we came up with will make it onto the manifesto page one day, but until then I can reference this blog entry.

Pomodoro Testing

In the past week I was asked whether I could test something. Being of a curious nature, I took the challenge to learn something new. In this case it was an iPhone application. I own an iPhone myself, but I had never tested an application on the iPhone purposefully – I had tested some of them in terms of the banana software that seems to get shipped despite (or maybe just because of?) Apple’s checks before an app reaches the App Store. But I digress.

I planned a whole day – Thursday – for this. I knew I was going to explore the app in sessions influenced by the Bachs’ session-based test management paper, and I knew that I had to take a closer look at the specialties of an iPhone application, like memory management and stress testing. Prior to Thursday I had already taken a look at the application and seen some of its features. My colleagues shared the current feature backlog with me, covering the features that were already implemented.

Another thing I knew was that I would have to deal with some customers during the day in order to arrange some new consulting gigs. So I was rather sure that I wouldn’t be able to spend the whole day testing the application.

Taking all these considerations into the context of my testing approach, I felt that a test session of one hour was too long to test the application and still deal with interruptions during the day. Additionally, one hour seemed too long to cover one aspect of the application; I could easily go for multiple features in that one hour. A year back I had come across the Pomodoro Technique to manage my time, and a combination of session-based test management and the Pomodoro Technique looked promising to me. In the Pomodoro Technique a time slot – called a Pomodoro – is usually 25 minutes long, followed by a 5-minute break for mails, fetching a new coffee, or a body break. This seemed to be just the right time frame for my testing sessions. That’s when I coined the term Pomodoro Testing.
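
To illustrate, here is a rough sketch of how such a day might be laid out in code; the charters and the start time are invented examples.

import java.time.Duration;
import java.time.LocalTime;
import java.util.List;

// A rough sketch of a day of "Pomodoro Testing": 25-minute test sessions,
// each with its own charter, separated by 5-minute breaks. The charters and
// the start time are invented examples, not the actual plan from that day.
public class PomodoroTestingPlan {

    record Session(String charter) {}

    public static void main(String[] args) {
        List<Session> sessions = List.of(
                new Session("Recon: learn as much as possible about the app"),
                new Session("Explore the settings screen"),
                new Session("Interrupt the app: incoming calls, backgrounding"),
                new Session("Stress memory by switching views quickly"));

        LocalTime clock = LocalTime.of(9, 0);
        Duration pomodoro = Duration.ofMinutes(25);
        Duration shortBreak = Duration.ofMinutes(5);

        // Print a simple schedule: one charter per pomodoro, break in between.
        for (Session session : sessions) {
            System.out.printf("%s to %s  %s%n", clock, clock.plus(pomodoro), session.charter());
            clock = clock.plus(pomodoro).plus(shortBreak);
        }
    }
}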

So far, so good. I took my Time Timer with me that day to keep track of the session time frames. In the morning I used a first pomodoro to plan my work for the day. I quickly identified that I needed a brief overview of the application in order to learn more about the sessions for the day. So I planned a first session of 25 minutes of inspectional testing followed by a debriefing, in order to lay out the additional sessions after I had gathered more information. Having learned some things about iPhone application testing a few days before, I also planned sessions for special behavior on the iPhone. And I planned the work I had to follow up on besides my testing activities for the day.

I started my first testing pomodoro with the mission to find out as much as possible about the application. I identified several areas of the application that I would like to see more of. There were ten or so main categories which I wanted to cover. In the first pomodoro I initially took some notes on paper, but collected them in a mindmap during the debriefing. For each of the ten main categories I planned a pomodoro, and realized that this would probably be too much for a single day.

After the first two testing pomodoros I got more and more familiar with the application. I noticed some things in the settings and some things in the UI which seemed confusing to me, but there was nothing serious for me to note down. The application was pretty straightforward.

In the afternoon I noticed, though, that I was able to pull more than one planned session into a single pomodoro. I think this was caused by me getting more and more familiar with some aspects of the application. I had planned on a very granular level for this application; still, I made slower progress in the beginning, and later could easily deal with two or three topics that I had initially planned separately.

At the end of the day I had collected many things in my mindmap, which I then shared with the product owner and the whole team. I annotated bugs that I found, and things that I found inconsistent, with little icons so they could be grasped quickly, and put some more lines about my major findings into that email.

Pomodoro Testing for me is the opposite of thread-based test management. In thread-based test management there are aspects of an application that may take more than a single session of focused testing. For an iPhone application it seemed right to me to shorten the session length further, since there were not as many aspects I wanted to explore. In addition, it seemed the right level of granularity to plan my testing activities for a whole day.

In reflection, I wouldn’t want to cut the session length down any further than that. 25 minutes was pretty short for some things, and I also extended that time frame in one or two sessions to finish up what I was dealing with rather than forcefully stopping my curiosity.

Since I worked alone I didn’t do many debriefings over the day. I held one in order to plan additional sessions after the first pomodoro, and I talked to a colleague later for a complete debrief about my findings. That’s pretty much it. If I had applied this in a team, I would have experimented with debriefing times in order to get one or two debriefings over the course of the whole day.

I think there are some smaller applications – like mobile apps – where Pomodoro Testing is the right practice to apply. For the majority of web and desktop applications, though, this technique is probably too time-constrained to serve the purpose.

“Fully automatic software testing now possible” – Really? Hmm? Soooo?

Part of the gap between computer science as taught at universities and software development as done in our industry is what Alistair Cockburn lists as one of the early methodologist errors: I did this, now do what I did, and you’ll be happy ever after. This notion is not only disrespectful of the achievements that our industry has brought about, but it also ignores the particular difference between lab situations and the context of software development in development shops all around the globe. This is nothing I came up with, but an observation I made while teaching. The first reaction people have to something new that’s imposed on them is “but this does not work for me” until you show them how it’s possible – and find out yourself that the combination of Spring, SOA, JBoss, GWT, Swing, and Ruby, or any other combination of buzzword technologies from the past two decades, comes with its own pitfalls and fallacies before your beloved approach ends up being useful. In fact a while ago James Bach claimed that Quality is dead due to the unmanageable stack of technologies and abstractions our industry has to deal with. I would even go further and claim that no one will be able to handle the Y10k problem in eight thousand years if we continue like this.

One of the fads that seems to be reappearing is the idea of automating humans out of software development work altogether. This fad came up with the rise of UML, and the most recent incarnation appears to be model-based testing. One of the interesting things I noticed is the ignorance of past movements. It seems that universities keep bringing up new talents from time to time who claim to save the world, because universities favor a particular competition-based learning model in which everyone wants to be the next hero. Of course this is garnished with some flavor of Pandora’s Pox:

Nothing new ever works, but there’s always hope that this time will be different.

(The Secrets of Consulting, Gerald M. Weinberg, Dorset House Publishing, page 142)

On a side note, the same author just recently wrote about this ignorance of past experiences in the context of development models like structured programming or Agile.

Up until now my hope was that model-based testing was a fad that would disappear quickly, with industry leaders ignoring it completely to start with. But the hope that model-based testing will be different keeps recurring, despite the voices of highly skilled consultants in our field – for example take this blog entry from James Bach, dated 2007. Four years have passed since then.

I decided to ignore this fad for as long as possible, but this morning I read about model-based testing in a way that made me angry. Of course, the problem is not model-based testing in itself, but my reaction to what was written on that particular webpage. “Fully automatic software testing now possible” is the title of that webpage. So far, I have criticized some of the work on model-based testing. These discussions almost always ended in an agreement that you shouldn’t base all of your testing efforts on model-based testing (MBT) but should combine it with other approaches as well. I am fine with this. If MBT serves you better than another approach for a particular effort, go ahead. But I wouldn’t invest all of my money in a single stock. Or run a consultancy with a single client. In fact, most countries won’t allow you to start a consultancy if you have just a single client. Think about why for yourself.

But the article – no, I wouldn’t call it an article, it’s rather a marketing piece for a particular company – seems to fully ignore such contextual considerations. Let’s take a look at some sentences before I explain the pitfalls of model-based testing, and what to do instead.

The system not only facilitates quick and accurate software testing, but it will also save software developers a great deal of money.

Really? Every software developer in the world? Even those who are performing highly using TDD and Specification by Example? I doubt this. Since no data is mentioned at all, this claim is hard to justify. And in the end, how much is “a great deal of money” in the first place? A great deal for my private life would be 100,000 Euros, but a great deal of money for an enterprise surely looks different.

Our automated method can improve product quality and significantly shorten the testing phase, thereby greatly reducing the cost of software development.

And if I don’t treat testing as a separate phase to start with, what happens then? Also notice that their understanding of product quality is not defined. (On a side note, I added the word “quality” to my set of lullaby language terms.) So, what does an improvement in product quality mean? How would I notice it? Connecting the first quote with this second one, it also appears that shortening testing is assumed to translate directly into lower costs of software development. This runs into three fallacies from Jurgen Appelo’s Management 3.0 class at once:

  • Machine Metaphor Fallacy – Don’t treat organizations as a machine (organizations are complex systems).
  • Linear Behavior Fallacy – Don’t assume things behave in a linear way (things behave in complex ways in a complex system).
  • Unknown-Unknowns Fallacy – Don’t think you have covered all risks (the Titanic effect will hit you if you do).

The testing phase for new software consists of three steps: developing the tests, running the tests and evaluating the results.

I think this belief is the reason why testers are treated like monkeys, and why software testing keeps being outsourced to underpaid countries where the power system still has outages. What’s missing in this list? Think about it. There is one key ingredient missing, and I have not seen anyone trying to automate this key ingredient. You’re with me? Yes? Exactly! It’s the learning. Test execution and test results lead to learning when executed manually, or when supervised. This learning informs the development of the next test – in one way or the other. There are learning approaches that could work for some particular parts of software testing. For example, there are systems in place which learn which code changes are risky based on changes that introduced bugs in the past. But this addresses only one of many risks in software development.
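
As an aside, here is a deliberately naive sketch of that kind of "learning": rank files by how often past bug-fix changes touched them, so a tester knows where to look first. The commit history here is invented; real tools mine it from version control.

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// A naive sketch of learning from past bug-introducing changes: count how
// often bug-fix changes touched each file and rank the files by that count.
// The change history is invented for illustration.
public class RiskyFileRanking {

    record Change(List<String> files, boolean bugFix) {}

    public static void main(String[] args) {
        List<Change> history = List.of(
                new Change(List.of("Billing.java", "Invoice.java"), true),
                new Change(List.of("Billing.java"), true),
                new Change(List.of("ReportView.java"), false),
                new Change(List.of("Billing.java", "ReportView.java"), true));

        Map<String, Long> bugFixTouches = new LinkedHashMap<>();
        for (Change change : history) {
            if (change.bugFix()) {
                for (String file : change.files()) {
                    bugFixTouches.merge(file, 1L, Long::sum);
                }
            }
        }

        // Files touched most often by bug fixes get flagged as risky for testers.
        bugFixTouches.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .forEach(e -> System.out.println(e.getKey() + ": " + e.getValue() + " bug-fix touches"));
    }
}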

When used properly, the method completely eliminates the need for manual software testing

Besides the hint at “When used properly” (people are not very disciplined at using anything properly), I sincerely doubt that the need for manual software testing will be eliminated by their system. Prove me wrong. Please.

Model-Based Testing has a number of major advantages: it makes the software testing process faster, cheaper and more accurate.

I’m speechless about such statements. Speechless except for: “compared to what?”

It is not uncommon for manual software testing to take anywhere from several months to years.

Finally, a hint at the problem they are trying to solve. So, “manual testing takes too much time” appears to be the problem that MBT addresses. But I don’t see a clue as to why testing is taking so long. Matt Heusser taught me a while back that this is an impossible question to answer. “So, testing should take less time. Which of these risks should I leave unaddressed with my testing, then?” is a more congruent response.

The pitfalls

One of the pitfalls of MBT, if it is sold with the sense of humor evident in the statements above – and I really hope they don’t mean these statements seriously – is the belief that I have covered all risks. My favorite quote from Jerry Weinberg on this is the Titanic Effect from Quality Software Management, Volume 1: Systems Thinking:

The thought that disaster is impossible often leads to an unthinkable disaster.

If you look for such occurrences you don’t have to go as far back as the Titanic. Take the Challenger disaster, volcanic eruptions, or Fukushima – just to name a few.

But words can only describe so much. Let’s run an experiment. I ran through this experiment a few months back when I approached Michael Bolton to explore MBT, so this is really his exercise. Let’s apply model-based testing to a simple website for children. Here is the Horse Lunge Game. Develop a model for this game. I will wait until you come back.

Finished? Already? Maybe you should look deeper. Go over your model once again to see if you left something out.

Alright, this should be fine by now. Now, how many bugs did you notice which were not part of your model? None? To get an idea, watch the fence in the background. Is it floating continuously at the same pace, or is there a jump in it? Was that part of your model? Really? If this didn’t convince you yet, does your model state that none of the horses should have wireframes? No? Well, do any of the horses have wireframes?

My point is that the notion of MBT as presented on that website I linked to earlier (I won’t do it again!) ignores the impossibility of testing everything. Usually I refer to testing a compiler at this point. Testing a compiler completely would mean that I run a test for each program this compiler will ever compile. So I must digest every program in existence for the particular language that I compile (even my compiler might be written in that language) and every program that will be invented in that language in the next – say – twenty years. Then I run each and every program through this compiler and check the results. This does not only hold for compilers, but for frameworks in general as well. (Side note: frameworks are the new programming languages.) This is simply not possible. Now, with MBT I would have to create a model for each program that is going to be created in the future. So I would need to invent a time machine first. Maybe science has found a way to time travel, and MBT is a hint of this. But I doubt it.

This leads me to another point. How much does it cost to create the model? How much does it cost to maintain the model? How long does it take to run the millions of tests referenced in the text? There is no statement about this. Our industry does not even agree on how many tests to automate or what the return on investment (ROI) for software test automation is. And now we are faced with a different question based on MBT, namely what the ROI of test automation AND model-based testing will be. These are two variables in a complex formula. Do you see now why I don’t believe that I can save a bunch of money just by applying MBT?
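
To make the two-variable point concrete, here is a back-of-the-envelope sketch. Every number in it is invented; the only point is that the model adds a second build and maintenance cost which never shows up in the marketing numbers.

// A back-of-the-envelope ROI sketch for "test automation plus MBT". All the
// numbers are invented; the point is only that model creation and maintenance
// add a second cost variable to an already uncertain formula.
public class MbtRoiSketch {

    public static void main(String[] args) {
        double manualTestingCostPerRelease = 20_000;
        double automationBuildCost = 30_000;
        double automationMaintenancePerRelease = 3_000;
        double modelBuildCost = 40_000;             // the extra MBT variable ...
        double modelMaintenancePerRelease = 5_000;  // ... and its ongoing share

        int releases = 10;
        double saved = manualTestingCostPerRelease * releases;
        double spent = automationBuildCost + modelBuildCost
                + (automationMaintenancePerRelease + modelMaintenancePerRelease) * releases;

        // Nudge the maintenance numbers up a little and this goes negative.
        double roi = (saved - spent) / spent;
        System.out.printf("ROI over %d releases: %.0f%%%n", releases, roi * 100);
    }
}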

A different approach

An approach that works for me is to apply ATDD or Specification by Example in combination with session-based Exploratory Testing. Specification by Example helps me derive specifications for the software from the people that matter. Session-based Exploratory Testing helps me grow a mental model of the software and cover the risks that are relevant at the time. I don’t claim that I can cover all risks in the time I have. Instead I can cover the risks that are most meaningful right now. Remembering Rudy’s Rutabaga Rule (Secrets of Consulting, page 15):

Once you eliminate your number one problem, number two gets a promotion.

in a complex system like software development, this seems to be an approach that brings more value. And besides that, Exploratory Testing has a strong emphasis on learning, which MBT appears to neglect completely.
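
To make the Specification by Example half of this a bit more tangible, here is a small sketch written as a JUnit 5 parameterized test – one of several possible tools; the same examples could just as well live in a FitNesse table. The discount rule, its thresholds, and the class names are invented.

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;

// A Specification by Example sketch as a JUnit 5 parameterized test. The
// discount rule and its numbers are an invented example; the concrete cases
// come from a conversation with the people that matter, not from the code.
class DiscountSpecification {

    // Assumed rule under discussion: 5% discount from 100 EUR, 10% from 500 EUR.
    static double discountFor(double orderTotal) {
        if (orderTotal >= 500) return 0.10;
        if (orderTotal >= 100) return 0.05;
        return 0.0;
    }

    @ParameterizedTest
    @CsvSource({
            "99.99, 0.00",
            "100.00, 0.05",
            "499.99, 0.05",
            "500.00, 0.10"
    })
    void discountMatchesTheAgreedExamples(double orderTotal, double expectedDiscount) {
        assertEquals(expectedDiscount, discountFor(orderTotal), 0.0001);
    }
}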
