Software Quality Metrics

10 Software reliability: measurement and prediction

A. Törn

Software reliability is a key high-level attribute. The chapter addresses the software-reliability growth problem, i.e., estimating and predicting the reliability of a program as faults are identified and attempts are made to fix them. Only a summary of the topics treated is given here.
The quantitative methods used date back to the early 1970s and evolved from hardware reliability theory. Reliability theory rests on probability theory.
Models for estimating mean time to failure (MTTF) are presented. Reliability growth occurs if MTTF increases. The accuracy of the predictions is also treated. Recalibration of the model and the stability of the operational environment are key elements. The models should be applied only to systems with modest reliability requirements.
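As a rough illustration of the statement that reliability growth occurs if MTTF increases, the sketch below estimates MTTF as the sample mean of observed inter-failure times and compares an early window with a late one. This is not one of the models presented in the chapter. The function names, the window size and the data are all assumptions made up for the illustration.

```python
from statistics import mean


def mttf_estimate(interfailure_times):
    """Naive MTTF estimate: the sample mean of the observed times between failures."""
    return mean(interfailure_times)


def shows_growth(interfailure_times, window=3):
    """Return True if the MTTF estimate over the last `window` observations
    exceeds the estimate over the first `window` observations."""
    early = mttf_estimate(interfailure_times[:window])
    late = mttf_estimate(interfailure_times[-window:])
    return late > early


# Hypothetical inter-failure times in hours, invented for illustration only.
times = [5.0, 8.0, 12.0, 20.0, 31.0, 45.0]

print("early MTTF estimate:", mttf_estimate(times[:3]))
print("late MTTF estimate: ", mttf_estimate(times[-3:]))
print("growth observed:    ", shows_growth(times))
```

A comparison of two windows like this says nothing about prediction accuracy. It only makes the notion of an increasing MTTF concrete.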
Because the technique is probabilistic, we cannot guarantee a prescribed level of reliability in a particular case. Therefore other techniques have been considered for systems where ultra-high reliability is required:

Design diversity for fault tolerance: building the system in several independent ways and running the systems in parallel. Unfortunately, evidence suggests that independently-developed systems will not fail independently.
Testability: making the program highly testable. High testability here means a high probability that the program will fail if it is faulty. The criticism is that faults remaining after testing will then also have an increased probability of making the program fail.
Program self-checking: Blum et al. describe a theory of program self-checking restricted to a narrow class of mathematical-type problems. As an example, consider a program for evaluating the square root of x. If the computed result is y, then it is easy to check whether abs(y*y - x) is small enough.
Formal methods: the program is formally proved to be correct. This is a difficult and laborious technique, probably applicable only to "moderately large" systems if carried out to 100 %.
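The square-root example given under program self-checking can be sketched as follows. This is a minimal illustration of the idea only, not the theory of Blum et al. The function name `checked_sqrt`, the `sqrt_impl` parameter and the tolerance are assumptions chosen for the sketch.

```python
import math


def checked_sqrt(x, sqrt_impl=math.sqrt, tol=1e-9):
    """Compute y = sqrt(x) with `sqrt_impl` and verify that abs(y*y - x) is small.

    `sqrt_impl` and `tol` are hypothetical parameters for this sketch; the
    tolerance is scaled by max(1, x) so that large inputs are not rejected
    merely because of floating-point rounding.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    y = sqrt_impl(x)
    if abs(y * y - x) > tol * max(1.0, x):
        raise ArithmeticError("self-check failed: y*y is not close to x")
    return y


# A correct implementation passes the check.
print(checked_sqrt(2.0))

# A deliberately faulty implementation (halving instead of taking the root)
# is caught by the check.
try:
    checked_sqrt(9.0, sqrt_impl=lambda v: v / 2)
except ArithmeticError as err:
    print("detected:", err)
```

The check is far cheaper than the computation it guards, which is what makes self-checking attractive for this narrow class of problems.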
Fenton & Pfleeger: "In the absence of proven methods of assuring ultra-high software reliability levels, it is surprising to find that systems are being built whose safe functioning relies upon their achievement".