Managing Custom Software Development Projects: The Productivity of Programmers Varies Quite a Bit

In “The Mythical Man-Month”, Brooks observed that the productivity of different programmers can vary by a factor of more than 10 to 1.

In other words, one programmer will need a full working week to do what another programmer can accomplish in half a day.

I submit that there is no other profession in which there is such an extreme difference in productivity levels.

Moreover, and possibly equally surprising, the quality of the more productive programmer's work will most likely exceed that of the less productive programmer! In most professions, if one worker is significantly more productive than another, you expect that the quality of the work will suffer as a result. If one cabinetmaker (for example) is twice as productive as another, you expect that the faster worker is taking shortcuts. But with computer programmers, generally speaking, the more productive the programmer is, the higher the quality of the work that programmer produces.

I actually witnessed an example of this disparity in productivity that is worth sharing.

The example involves string manipulation.

For the less technical readers, a “string” is a block of text. It can vary in length from an empty string (which consists of zero characters) to a block of text the length of the book “War and Peace” by Tolstoy.

In the vast majority of cases, a “string” is one or more words that make up a name (e.g. “George W. Bush”), a title and a name (“President George W. Bush”), an address (“1600 Pennsylvania Avenue”) or the name of some “thing” rather than some “one” (“The United States of America”).

Similarly, most strings consist of smaller pieces of information, also considered to be strings. “George W. Bush” consists of a string showing the first name (“George”), a shorter string for the middle initial or name (“W.”) and another string showing the last name (“Bush”, of course).

Most of the information stored in a database is stored as a string. Even information containing only digits is generally stored as a string. For example, zip codes should be stored as strings in databases even though the strings are all digits. This is because of problems handling zip codes beginning with a ‘0’ if those strings are stored as numbers. (Those are easily solvable problems, but there is no need to have those problems at all if the zip codes are stored directly as strings.)
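The leading-zero problem is easy to see in a few lines. The following is a minimal Python sketch, not taken from any particular database; the variable names are illustrative, and 02134 is used only as a sample zip code that begins with a zero.

```python
# A zip code held as a number silently loses its leading zero.
zip_as_number = int("02134")
zip_as_string = "02134"

print(zip_as_number)   # the leading zero is gone
print(zip_as_string)   # the text is preserved exactly

# Getting the original back means padding to a width you must already know.
restored = str(zip_as_number).zfill(5)
print(restored)
```

The padding step is the "easily solvable" fix, but storing the value as a string avoids needing it at all.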

Strings are often “manipulated” in computer software. They are separated into their component parts (first name, middle name, last name) or concatenated (appended one to another) to form larger strings.
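As a small illustration of both operations, here is a Python sketch using the name from the earlier example. The variable names are assumptions made for this example only.

```python
full_name = "George W. Bush"

# Separate the string into its component parts at each space.
first, middle, last = full_name.split(" ")

# Concatenate (append one to another) to form larger strings.
rebuilt = first + " " + middle + " " + last
title_and_name = "President" + " " + full_name

print(first, "|", middle, "|", last)
print(title_and_name)
```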

Strings are such common variables in computer programs that any (and I mean any) experienced computer programmer will have substantial experience writing string manipulation code.

With that background, I will describe my example of varying productivity.

I discovered, basically by accident, that two programmers working for the same company had both independently solved the same programming problem. But they had done so in significantly different ways and in significantly different amounts of time.

The specific problem to be solved was to write a software procedure (i.e. a block of code that performs a specific operation). That procedure was to take an input string, which might be short but could also be fairly long, and parse it into between one and five smaller strings of roughly equal size. A minimum size was required for each of the smaller strings. The rules to be applied to each of the smaller strings were the same.

Examples (using country names):

Sweden - parsed into a single string: “Sweden”
United States of America - parsed into three strings: “United”, “States of”, “America”

I could go into more details including the maximum lengths for the parsed string outputs from the procedure, but the discussion so far is hopefully sufficient to get an idea of the problem to be solved.

One programmer had written the procedure as an iterative loop (i.e. a loop that ran the same code multiple times with different inputs each time). The loop was run up to five times – once for each of the shorter strings, continuing to loop as long as there was some portion of the input string still to be parsed.

The procedure as written required about 20 lines of code. The programmer said that it took him about two hours to write.
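The first programmer's actual code is not reproduced here, so the following Python sketch only illustrates the general shape of a loop-based solution. Everything specific in it is an assumption: the name split_into_lines, the MAX_LINE_LEN value of 10, and the rule of cutting at the space nearest to an even share of the remaining text. The minimum-size rule mentioned above is not enforced in this sketch.

```python
import math

MAX_LINES = 5       # the limit described in the post
MAX_LINE_LEN = 10   # assumed value; the post does not give the real limit


def split_into_lines(text, max_lines=MAX_LINES, max_line_len=MAX_LINE_LEN):
    """Split text into at most max_lines pieces of roughly equal length."""
    remaining = " ".join(text.split())  # collapse runs of whitespace
    # Decide how many lines to aim for, capped at max_lines.
    lines_left = min(max_lines, max(1, math.ceil(len(remaining) / max_line_len)))
    lines = []
    # Keep looping while some portion of the input is still unparsed.
    while remaining:
        spaces = [i for i, ch in enumerate(remaining) if ch == " "]
        if lines_left <= 1 or not spaces:
            lines.append(remaining)  # last line takes whatever is left
            break
        # Cut at the space closest to an even share of what remains.
        target = len(remaining) / lines_left
        cut = min(spaces, key=lambda i: abs(i - target))
        lines.append(remaining[:cut])
        remaining = remaining[cut + 1:]
        lines_left -= 1
    return lines


print(split_into_lines("Sweden"))
print(split_into_lines("United States of America"))
```

Under these assumed rules the two country names split the same way as in the examples above. The line limit appears in exactly one place, the MAX_LINES constant.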

The second programmer had written the procedure to first pull out each word (i.e. characters separated by spaces) and put the words into an array whose size depended on the number of words in the input string. (The country name “Sweden” would require a single-element array. The name “United States of America” would require a four-element array because of the four words making up that name.) Then the one to five output lines were reconstructed from the array elements. The programmer had a difficult time getting the program to correctly handle the indices of the array elements, so rather than write the program as a loop, he copied and pasted the code for the first line four times, once for each of the other possible lines.

The second procedure took nearly three printed pages of code (probably about 125 lines) and I was told that it took three days to write.

Both procedures were tested and both worked.

However, the first programmer produced a working procedure in about 1/12 of the time that it took the other programmer (about two hours against three days, which is roughly 24 hours assuming eight-hour working days).

By most measures of software quality, the first program was also of higher quality.

It was shorter and therefore used fewer resources (such as memory). Because it was shorter, it would also load a bit faster: this was a web-based program that had to be downloaded over a network, and the shorter the program, the less time the download takes. The difference would be fairly insignificant, but it would still exist.

Most significantly, the first program would be easier to modify and reuse if necessary.

For example, if the requirements changed such that the maximum number of output lines went from five to four, only a single digit would need to be changed in the first program (requiring five minutes of programmer’s time at the most). Only a minimal amount of testing would be required to confirm that the modified program worked correctly.

However, in the second program, if that same change was required, a significant number of lines of code would have to be deleted and some variables within the program would also have to change. That would require a couple of hours, and a lot of different test cases would have to be used to ensure that it worked correctly.

This is an example in which one computer programmer created a higher quality program twelve times faster than another programmer. I doubt that similar examples can be found in other professions.
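The difference in maintenance cost can be illustrated with a toy Python sketch. Neither function is the code from the project: the names lines_copy_paste and lines_loop, the target_len parameter, and the three-line limit (chosen to keep the example short) are all assumptions made for illustration.

```python
def lines_copy_paste(words, target_len):
    # Hypothetical copy-and-paste style: one near-identical block per line.
    i = 0
    line1 = ""
    while i < len(words) and len(line1) < target_len:
        line1 = (line1 + " " + words[i]).strip()
        i += 1
    line2 = ""  # pasted copy of the block above, with the variable renamed
    while i < len(words) and len(line2) < target_len:
        line2 = (line2 + " " + words[i]).strip()
        i += 1
    line3 = " ".join(words[i:])  # last line takes whatever is left
    return [line for line in (line1, line2, line3) if line]


def lines_loop(words, target_len, max_lines):
    # Same logic written once; the line limit is just a parameter.
    lines = []
    i = 0
    for n in range(max_lines):
        if n == max_lines - 1:
            line = " ".join(words[i:])  # last line takes whatever is left
            i = len(words)
        else:
            line = ""
            while i < len(words) and len(line) < target_len:
                line = (line + " " + words[i]).strip()
                i += 1
        if line:
            lines.append(line)
    return lines


words = "United States of America".split()
print(lines_copy_paste(words, 6))
print(lines_loop(words, 6, 3))
print(lines_loop(words, 6, 2))  # changing the limit is a one-argument change
```

In the looped version, lowering the limit means changing one argument. In the copy-and-paste version, it means deleting a block, renaming variables, and editing the final return line, and each of those edits needs its own test cases.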

A single example such as this one could be something of an anomaly; sometimes programmers, like other people, just get “mind-blocks”. In fact, the “slower” programmer in this example had the reputation of being one of the better programmers working on the project. Or it could be the case that the first programmer was actually much more efficient than the second programmer.

When managing software development, it is obviously important to know which of your programmers are more productive and which are less productive.

Implications:

When managers first become aware that these productivity differences are real, their inclination is to try to hire only the most productive programmers. While there is nothing wrong with the idea of hiring the most productive people possible, it should be pointed out that even the less productive programmers have a place on development staffs.

I described earlier how programmers tend to like their jobs. But they only like their jobs if those jobs are challenging. When developing a complete solution for a customer, some of those jobs aren’t all that challenging.

For example, an integral part of an accounting program will be an “Aging” report (a report that shows all unpaid invoices more than 30 days old, sorted by customer and then generally by date). If the database design is done well, this is a fairly trivial exercise.

If you give the development to one of the more productive programmers, they will complete it quickly. But they won’t be challenged by the task.

If you give that same programmer a number of such jobs, they will lose interest in the project.

Or – in order to challenge themselves – they will add complexity and complications that aren’t really needed.

Also, I have seen many instances where a more productive programmer becomes impatient with a programmer who is somewhat less productive and a personal conflict results. That conflict diminishes their ability to work together.

Even more significantly, I have seen many instances where the more productive programmers have problems relating to users. Even when those programmers don’t have to relate directly to individual users (such as talking to them on the phone), they may not be able to relate to the users indirectly either. The programmer may argue that a problem reported by one (or even more than one) user is simply due to a lack of intelligence or sophistication on the part of the user.

So depending on the other strengths that they bring with them, even lower productivity programmers have their place within a project team.

It is also important for the programmers to have some sense of their own productivity levels.

Before actually writing the code for a specific task, the programmer should form his or her own estimate of the time that will be required. Depending on the size of the task those estimates may be included in the project plan.

If a programmer begins to work on such a task and about 10% of the allotted time has passed, the programmer should assess his or her progress. If progress on the task is better than or roughly in line with the time estimate, the programmer should probably continue down the path that they originally chose.

On the other hand, if the progress seems to be significantly less than expected at that point, I recommend that the programmer first put that particular task away and work on something else for a while. Then, when they come back to work on that task, they may be able to look at it with a fresher outlook (helping to eliminate the sort of “mind-block” that I mentioned earlier). In the specific productivity example that I described earlier, the second programmer might have realized that separating the words into an array was not an efficient way to attack the problem.

Alternatively, if nothing simpler seems to be apparent, I recommend that the programmer do some brainstorming with another programmer and try to determine whether it is the approach or the estimate that should be modified.

Wednesday, March 18, 2009