Much of the industry's present effort to reduce the probability of errors is based on the idea of rigorously observed bureaucratic procedures, intended to reduce software engineering and programming to a considerable number of mechanistic steps. No account is taken of the abundant evidence of remarkably great variations in performance between individual programmers. Much emphasis is placed on "qualification" of programmers without regard to the clear indications in the literature that formal training and even experience seem to have no relevance to programming competence.
These factors profoundly affect the execution of one of the bureaucratic steps, that from the program specification to the creation of the source code. Since all the steps are in series, they form a chain of which the strength cannot be greater than that of this weakest link.
Ferrentino gives the following typical ranges of variation between different individual programmers working on identical problems:
Coding Time 25 : 1
Error Removal Time 26 : 1
Size [lines of code] 5 : 1
He also presents a graph indicating that, over a large number of projects in the range from about 3,000 to 100,000 lines of source code, the range of total man-months of programming effort for a given number of lines was 200 : 1. It appears that the same range would also apply to smaller programs, but his graph "bottoms out" at 1 man-month.
Sackman et al. could not get significant information about the main subject of their experiment because of large differences in performance between individual programmers. All the programmers were "experienced". Addressing two different problems with 12 different programmers, the ranges of effort were:
Debug hours 28 : 1
Coding hours 25 : 1
Program size 6 : 1
Leveson et al. carried out a well-controlled experiment in which programs were written to the same functional specification by 27 different programmers who were variously candidates for bachelor's, master's and doctorate degrees in computer science and whose programming experience ranged from zero to over 10 years. The authors reported that "There appeared to be no correlation between the programmers' experience levels and the quality of their programs."
In fact, with two other reports, the following can be gleaned:
The ranges of time, in hours, taken by the 27 programmers to program the same task were:
Item                         Min   Max   Average
Reading the specification      1    35       5.4
Writing the program            4    50      15.7
Debugging                      4    70      26.6
Total                                       47.7
The end product was 27 corrected programs for the same task [in addition to the gold program], for a total effort of 47.7 x 27 = 1287.9, or about 1288, person-hours. The programs, written in Pascal, ranged from 327 to 1004 lines long, with an average of 666. They involved some complex geometrical manipulations. [It was later reported that 2 faults were found in the gold program!]
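The arithmetic behind the total can be checked with a short sketch. The figures are the averages quoted above; the variable names are only illustrative. It prints 47.7 hours per programmer and 1287.9 person-hours in total.

```python
# Average hours per programmer for each activity, as quoted in the text.
average_hours = {
    "Reading the specification": 5.4,
    "Writing the program": 15.7,
    "Debugging": 26.6,
}
programmers = 27  # number of programmers in the experiment

# The sum of the per-activity averages is the average total per programmer.
average_total = sum(average_hours.values())

# Multiplying by the head count gives the total effort in person-hours.
total_person_hours = average_total * programmers

print(f"Average total per programmer: {average_total:.1f} h")
print(f"Total effort: {total_person_hours:.1f} person-hours")
```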
McCabe was ".... delighted to find several programmers who never had formal training in structured programming but consistently write code in the 3 to 7 complexity range .... ", that is, what he regards as very good code.
In the present context of avoiding errors in programs, the figures and statements above speak for themselves. If in a particular project a programmer produces a program that is five times as long as it should have been, or requires 25 times the writing effort or 25 times the debugging effort, it is to be expected that the probability of errors being made and escaping detection, even though small, will be greater by a factor of 5^1.5 [about 11] or even 25 than it should be.
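The factor of about 11 quoted above is consistent with reading the printed figure as 5 raised to the power 1.5. That reading is an interpretation of the number as printed; the source gives no justification for the exponent, so the sketch below only reproduces the arithmetic. It prints "5^1.5 = 11.18".

```python
# Ratios taken from the text; the exponent is an assumed reading of the printed figure.
length_ratio = 5    # program five times longer than it should have been
effort_ratio = 25   # 25 times the writing or debugging effort
exponent = 1.5      # assumption: inferred because 5 ** 1.5 is about 11

# Lower of the two factors mentioned in the text.
error_factor = length_ratio ** exponent

print(f"{length_ratio}^{exponent} = {error_factor:.2f}")
print(f"Upper factor mentioned in the text: {effort_ratio}")
```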
This large source of potential uncertainty and trouble was not recognized in the Darlington programming and may be part of the explanation for the enormously long and complicated programs that were produced compared with the hardwired/generic equivalent and much less complex systems.
The problem is not in any way new. It is as old as engineering, as old as human organization, and probably as old as Homo sapiens. The answers are trial and error, experimentation, competition and selection, summed up as good management.
Boehm refers to the advantages of rapid prototyping over the "specification driven" approach. In fact, the latter approach serves to enhance the probability of errors. There are many documents which perpetuate these fallacies. For example, Littlewood et al. is typical of many similar reviews that show almost no understanding of the design of highly reliable systems. Specifically, it implies that computers cannot be used in important applications unless the programs are proved to be perfect.
Mills, Ref. 24, has exactly the right advice: "start with a design competition and keep the simplest one [program]."
Within the safety-critical realm of the nuclear power plant, the general approach is now to proceed with the following philosophy, taken from the aforementioned AECL Research Co. Report:
The most urgent immediate need in the design of CANDU shutdown systems is experimentation - it could well be called "research and development" - to find out the most reliable kind of programs in the presence of the great variability of individual programmers herein discussed.