Seventeen years of testing experience, in test engineer, test manager, QA manager and head of test engineering positions, and last week I made a rookie mistake, one that cost us today’s release.
A week ago, I made a simple fix to our product, in an area of the code where I have expertise. I made this fix following some in-depth debugging to determine the exact cause of the bug that our customers were experiencing. It was a two-line fix, and I got it reviewed by one of the developers. Simple.
I started the release process this morning after freezing ‘trunk’, pinning our dependencies and so on. I validated the overnight automated tests and performed most of the upgrade/downgrade tests.
However, during the test of the standalone installer, a clean install in a clean VM, it became apparent that the original bug was not, in fact, completely fixed.
It turns out that the bug depends on a combination of variables, and one of them escaped me when I originally tested the fix. This inevitably came back to bite me in the arse.
Thankfully, we hadn’t made the release claiming this bug was fixed, so small mercies and all that, but even so I felt very ashamed that I made such a rookie mistake. DON’T TEST YOUR OWN FIXES!
Something we tell our developers all the time is that a fix needs to be checked and verified by someone other than its author. That someone will have a different perspective on the issue, and will not be swayed by the author’s assumptions and preconceptions.
I subsequently identified the missing piece of the puzzle for this bug, fixed it (another two-line fix!) and tested it, but this time someone else is validating the fix. Actually, two people are!
This has all been very embarrassing, and I should know better. I DO know better, but it seemed like such a cut-and-dried fix that I bypassed our procedures.
It reminds me of my flying instructor insisting that I always follow the pre-flight checklist religiously. He said that you can absolutely guarantee the one time you get complacent and skip over some bits on the checklist will be the one time there is a problem with that instrument, control surface, fuel and so on. Of course the consequences of something going wrong whilst flying are somewhat more dire than our software showing slightly undesirable behaviour, but the sentiment is the same.
So the moral of this story is: don’t get complacent. Sod’s Law says that if anything can go wrong, it will, and you can absolutely guarantee it will go wrong when you don’t do something that you should have done!
[ed] At the time of writing, the fix is looking good and we are set to release this afternoon!