One of the students taking my Software Engineering Management course asked me a question about root cause analysis. They had a Software Engineering team that always produced buggy software. It had little fault tolerance, and where it failed, it didn’t do so gracefully. Most of the bugs were caught by testers and sent back, but this is an inefficient loop. No matter how hard they tried, they could not get the software engineers to write less buggy code.
We eliminated tools, process, and technology and concluded (as I often do) that the root cause was most likely in the recruitment process. Job descriptions and pre-screening by HR had a bias towards optimists. Anyone with an ounce of skepticism was rejected.
I always advocate a balance between optimists and skeptics in any Software Engineering Team:
- Optimists assume their code will never fail, or that some exceptions will never occur. They reduce testing as a result and always believe they have done enough.
- Skeptics assume that even code that appears to work well will fail. They capture all known exceptions and ensure that whenever there is a failure, it is graceful. They test extensively.
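To make the contrast concrete, here is a minimal sketch in Python. The scenario (loading a JSON config file), the function names, and the default values are all hypothetical, chosen only to illustrate the two mindsets; they are not taken from the team in question.

```python
import json

# Hypothetical fallback values, used only for this illustration.
DEFAULT_CONFIG = {"retries": 3}


def load_config_optimist(path):
    # Optimist: assumes the file exists and always contains valid JSON.
    # Any problem surfaces as an unhandled exception for the caller.
    with open(path) as f:
        return json.load(f)


def load_config_skeptic(path):
    # Skeptic: assumes the file may be missing, unreadable, or corrupt,
    # and fails gracefully by falling back to known defaults.
    try:
        with open(path) as f:
            config = json.load(f)
    except (OSError, ValueError):
        # OSError covers missing or unreadable files; ValueError covers
        # invalid JSON (json.JSONDecodeError is a subclass) and bad encoding.
        return dict(DEFAULT_CONFIG)
    if not isinstance(config, dict):
        # Valid JSON, but not the shape the rest of the program expects.
        return dict(DEFAULT_CONFIG)
    return config
```

Both versions behave identically when everything goes right, which is why the optimist's code can look finished. The difference only shows up when the file is missing or malformed, and that is the case a skeptic would also write a test for.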
I think it is imperative that the Software Engineering Manager is involved in all stages of recruitment for their team.