What are we testing: Chasing metrics versus actual software quality
- 6 August, 2020
- Article - Software Best Practices
Many engineering teams focus on input metrics as evidence that “quality assurance” has succeeded on their technology implementation projects. Are we so busy measuring the number of test cases, pass rates, code coverage and bugs discovered that we have forgotten what real quality means in the eyes of our users?
Are we blinded by automation, chasing widely advertised but theoretical best practices in the pursuit of quality?
We should instead work backwards from the outcomes we want, and apply only the tools that measure those outcomes.
In the broader sense, quality means that a solution meets the following requirements:
1. The system is robust and reliable
2. It does what it says on the box
3. It does not frustrate its users
Seems obvious, right? But do our measurements actually speak to these requirements or are we simply ticking predefined boxes?
The rise of DevOps has allowed us to streamline and automate the entire software development lifecycle. DevOps assumes that just about everything, and certainly testing, can be completely automated. Automation has made processes easier to implement and standardise, but standardisation blinds us to the real objectives. It’s easy, and that’s where the danger lies. We are forgetting to work backwards from results and to think critically about the quality we’re putting forward. Instead, we risk using analysis tools and automation as a crutch.
Similarly, code coverage has been idolised as a way to prove that all code works end-to-end. Looking back at how we defined quality above, it is clear that while coverage gives the development team confidence, it does little for the users. The truth is, we don’t actually need 100% code coverage. Coverage may correlate with quality, but it doesn’t guarantee it. If our coverage doesn’t speak to the end goal, we can cover every line, build a system that never breaks, and still ship something that frustrates our users and doesn’t do what it’s intended to.
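To make that concrete, here is a minimal sketch (the function, the discount rule and the bug are all invented for illustration) of a test that earns a perfect coverage score while checking nothing a user would care about:

```python
# Hypothetical example: apply_discount and its 10%-off rule are invented.

def apply_discount(price: float, is_member: bool) -> float:
    """Members are supposed to get 10% off."""
    if is_member:
        return price * 0.09  # bug: charges 9% of the price, not 90%
    return price


def test_apply_discount():
    # Exercises both branches, so line and branch coverage report 100%,
    # but nothing asserts the value a user would actually be charged:
    # the 91%-discount bug above sails straight through.
    apply_discount(100.0, is_member=True)
    apply_discount(100.0, is_member=False)
```

Run under a coverage tool such as coverage.py (or pytest with the pytest-cov plugin), this suite reports full coverage while shipping a pricing bug. The metric is satisfied; the user is not.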
Why aren’t we determining what software quality actually means for each unique system, and testing for that?
Testing the right things, while balancing time, money and value, is what matters. We shouldn’t spend time testing boilerplate code or unit testing third-party frameworks; there is a degree of assumption we’re allowed to make, and not all failures are equal. Testing high-risk functionality such as price calculations for insurance premiums is crucial, because the damage it will cause to both the system and the business when it breaks is immeasurable. In 2016, a system error in the NHS caused 300,000 heart patients to receive the wrong medication – something thorough testing could have prevented. Investing time in testing low-impact, cosmetic ‘features’, however, can simply be wasteful.
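As a sketch of where that testing effort should go, the example below concentrates its assertions on a business-critical calculation and its boundary conditions. The monthly_premium function, its rates and its age rules are entirely hypothetical; the point is the shape of the tests, not the domain logic.

```python
import pytest

# Hypothetical premium calculator standing in for the kind of high-risk
# logic worth testing thoroughly; the rates and age rules are invented.
def monthly_premium(age: int, base_rate: float) -> float:
    if age < 18:
        raise ValueError("policyholder must be 18 or older")
    loading = 1.5 if age >= 65 else 1.0  # assumed senior loading
    return round(base_rate * loading, 2)


def test_standard_adult_pays_base_rate():
    assert monthly_premium(age=30, base_rate=100.0) == 100.0


def test_senior_loading_is_applied():
    assert monthly_premium(age=65, base_rate=100.0) == 150.0


def test_boundary_just_below_loading():
    # Boundary values are where pricing bugs tend to hide.
    assert monthly_premium(age=64, base_rate=100.0) == 100.0


def test_minor_is_rejected():
    with pytest.raises(ValueError):
        monthly_premium(age=17, base_rate=100.0)
```

Four focused tests on the pricing path arguably buy more real quality than dozens of tests padding coverage on cosmetic code.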
Testing in an empathetic frame of mind allows us to deliver for our users instead of for ourselves. The ‘assurance’ component of QA is not intended for the engineering team: quality is not for our own sake, but for the success of the product and the happiness of our users. Many of these metrics do instil confidence, which carries some value, but actual quality should be measured from our users’ experience of our systems. Sometimes it’s acceptable to have a nervous engineering team and a satisfied user base.
Still, we should aspire to confidence on both sides. That is an achievable goal if we start with a clear, outcomes-defined basis for quality, structured in a way that can be measured from both perspectives.