http://invisible-island.net/personal/
Copyright © 2009-2013,2014 by Thomas E. Dickey


Introduction

This is an overview of the guidelines which I use in maintaining change-logs and similar information for computer programs.

Background

One of the things that the maintainer does (or used to) is to keep the change-log up-to-date. Though I've been developing software for some time, it wasn't until 1992 that a combination of circumstances (declining in-house development opportunities, and the Internet) prompted me to provide fixes for "free" software.

By 1994, I had contributed changes to about 65 programs. In that process, I had of course encountered various personalities. But the worst of those were simply slow to incorporate the changes.

Starting in 1994, I arranged to have the programs which I had been developing for my personal use excluded from my employee agreement. These included ded (the motivation for the resizeterm function), vile and tin as well as related programs. One of the related programs was ncurses.

The case with ncurses was ... different. Rather than a single developer, there were two. And they used a mailing list, unlike most. The nominal maintainer was Zeyd Ben-Halim, who was rather nonresponsive. The result of submitting patches was not good—it seems that they intended to copyright everything for themselves. That's a workable situation if they wrote everything themselves. They did not.

For instance, they incorporated Juergen Pfeifer's libraries in 1995, which greatly increased the size of ncurses: the libraries added 11,183 lines of code (pcurses was just under 10,000 lines of code before it became ncurses). In 1.9.7a, Juergen's name appeared in 3 places in those libraries (two pro-forma README's and one comment in a makefile noting that optimization did not work properly). Zeyd and Eric's copyright notice appeared in 36 places in the same files.

The NEWS file notes:

* integrated Juergen Pfeifer's forms library.
* integrated Juergen Pfeifer's menu code into the distribution.

I noticed that patches were sent to the mailing list (including my own) and that the NEWS file would include the change, but not mention the contributor. My name appears in the NEWS file twice, as well, for that time period, though—as I pointed out later—I had done about half of the work. Not all of my changes were mentioned, and most of them were unattributed. The casual reader would assume that Eric and Zeyd did almost all of the work.

Zeyd, being the nominal maintainer, appears to have done most of the edits to NEWS. However ESR also sent changes to the mailing list incorporating changes from others without mentioning this in his announcements.

After I stopped sending patches to Eric and Zeyd in April 1996 (and began providing ncurses myself), I resolved to maintain the NEWS file with attribution for each contributor. That's the way we were doing it in vile and tin, for example. Philippe De Muyter suggested that I also note who reported the problem being fixed. I did that.

Keep it Simple

Of course you're keeping your project in some type of revision control system. You can extract that information with various tools and render it as a change-log. Any idiot can do that.

Unfortunately, many change-logs are automatically generated, and indeed appear to have been generated by "any idiot".

Just the Facts

What is missing in many automatically-generated change-logs is the information which is typically not supplied by developers:

One advantage of automatically-generated change-logs is that it is possible to get the dates on which changes were made. Not all automatically-generated logs show this, but it is a strong possibility.

Whether or not the change-logs are automatically-generated, there is an additional problem if changes are collected and applied by a project maintainer—recording the contributors consistently.

Contribution Categories

There are of course changes by primary contributors.

patch by
The patch is usable without rework required.

Often, for conciseness, the "patch by" is left out and only the name of the contributor given. They are equivalent.

As a rule, if I am applying a contributor's patch which (aside from formatting details) works properly, I use the rcs "-w" option to mark that revision as originating from that person. It is rare that patches good enough for this come from completely anonymous developers, so an appropriate string is seldom lacking.
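As an illustration of that check-in step, here is a minimal sketch that only builds the command line for the RCS ci program with the contributor recorded as author through "-w". The helper name ci_command, the use of "-u" and "-m", and the replacement of whitespace in the contributor's name are assumptions made for this example, not a description of the actual workflow.

```python
def ci_command(path, contributor, message):
    """Return the argv for an RCS check-in that credits `contributor`.

    Hypothetical helper: it builds the command but does not run it.
    """
    # Assumption: the author string is reduced to a single token by
    # replacing runs of whitespace with underscores.
    author = "_".join(contributor.split())
    # -u keeps a read-only working file, -w sets the author field of the
    # new revision, -m supplies the log message.
    return ["ci", "-u", f"-w{author}", f"-m{message}", path]


if __name__ == "__main__":
    # "Jane Example" is a made-up contributor used only for illustration.
    print(ci_command("NEWS", "Jane Example", "fix typo in heading"))
```

The resulting list could be handed to subprocess.run once the working file is locked; that step is left out here so the sketch does not depend on RCS being installed.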

Most patches require rework or adaptation.

integrated patch
The patch requires work, e.g., it is not ifdef'd as required for all optional features.
adapted from patch
The patch has some logic flaw, requires modification to build and work.
analysis by...
Someone explained how to go about fixing the problem, or else they provided a detailed enough report that the solution was apparent to the developer. This may or may not be the same person who reported the problem.
discussion with...
A discussion with someone brought out an idea, but it is unclear who was the source.
prompted by discussion with...
Talking to someone prompted me to realize a bug or solution. Without their input, the idea/fix would not have been apparent.

Occasionally their report and discussion is completely incorrect, but the "prompt" was useful. This does not apply to hostile or untruthful contributors of course.

reported by...
Someone reported the problem, but did not provide the solution.

These categories are oriented toward direct communication with the program's maintainers. Accounting for indirect contributions is not as straightforward.

Problems in Categorization

There are a few basic problems to address:

Bug-tracking systems

Bug-tracking systems are a major source of indirect contributions.

If all of the report is within the bug-tracking system, and there is no analysis by other people, nor proposed (useful) fixes, then I'll cite only the bug-tracking system and its number for the bug.

On the other hand, if there are useful direct contributions toward the solution (reports without analysis are indirect), then I'll cite those individuals in addition to the bug-tracking information.

Updates of Bundled Sources

A few files (such as config.guess and config.sub) are maintained by other developers. The changelog for these says "updated", and if the origin is volatile (the config.* scripts are a good example of this) or relatively obscure, says where it was found. Read their changelog for credits.

Hostile/untruthful contributors

Bear in mind that I'm not a public service.

I get some reports indirectly, via web-searches in various forums. Some of the comments are useful, others only partly so (because they point out details for an issue). However, it is not uncommon for those to be mixed in with secondhand comments. As is usual with hearsay, much of it is inaccurate, and much of the repetition in public forums is not intended to be constructive commentary.

Still, an occasional comment is useful.

Of course, in this case, I'll categorize it as "adapted from", etc., noting that this automatically makes it an indirect contribution rather than a direct one.

If the information is from a discussion between different individuals, none of whom appears to be knowledgeable about the issue, I will simply cite the group where the information was given.

Other problematic contributors

In general, we would assume that developers submit their own work. This is not always true.

When reviewing a change, I take the time to scrutinize it and attempt to determine a proper attribution for it. It happens that I may notice (or recall, if I'm subscribed to a given mailing list) that the change was originally developed by a different individual. In that case, I'll amend the description to cite the actual developer. If the code has a comment citing the developer, that suffices, though even that has been a matter of dispute on occasion, when the intermediary insists on sharing the credit.

Individuals who do this repeatedly (there are a few) will either be banned, or subject to scrutiny on every change. In either case, they generally go away and provide their services to a different project. Rather than leave, some of these use the public bug-tracking systems as a forum.

Dates, Timestamps, etc.

Change-logs should have dates, to establish when a change was made.

Examples

Not all of the change-logs are in the same textual format. I wrote a script which handles the most common cases, and have massaged some change-logs to follow the format which it recognizes, to collect information about contributors. Essentially, it reads the text, looking for the markers which I use to denote direct- and indirect-contributions, and gives totals and names for the direct contributions.
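To make the idea concrete, here is a minimal sketch of such a tally. It is not the actual script. It assumes one change per line beginning with "+", with credits given in parentheses such as "(patch by NAME)". The split of markers into direct and indirect, and the rule that an entry without a direct credit counts as the maintainer's own, are likewise assumptions made for illustration.

```python
import re
from collections import Counter

# Marker phrases follow the contribution categories listed earlier.
# Which ones count as "direct" is an assumption of this sketch.
DIRECT = ("patch by", "integrated patch by", "adapted from patch by",
          "analysis by")
INDIRECT = ("reported by", "discussion with", "prompted by discussion with")

# Longest markers first, so a longer phrase is never shadowed by a shorter one.
_MARKERS = sorted(DIRECT + INDIRECT, key=len, reverse=True)
_PATTERN = re.compile(
    r"\((" + "|".join(re.escape(m) for m in _MARKERS) + r")\s+([^)]+)\)",
    re.IGNORECASE,
)


def tally(changelog_text):
    """Count entries, direct credits per name, and indirect credits."""
    direct = Counter()
    entries = own = indirect = 0
    for line in changelog_text.splitlines():
        line = line.strip()
        if not line.startswith("+"):   # assumed entry prefix
            continue
        entries += 1
        credited = False
        for marker, name in _PATTERN.findall(line):
            if marker.lower() in DIRECT:
                direct[name.strip()] += 1
                credited = True
            else:
                indirect += 1
        if not credited:
            own += 1                   # no direct credit: maintainer's own
    percent_own = round(100 * own / entries) if entries else 0
    return {"entries": entries, "percent_own": percent_own,
            "direct": dict(direct), "indirect": indirect}


# Made-up entries and names, for illustration only.
SAMPLE = """\
+ fix a typo in the manual page.
+ add configure check for a missing function (patch by Alice Example).
+ correct off-by-one in buffer resize (reported by Bob Example).
+ rework color handling (adapted from patch by Alice Example).
"""

if __name__ == "__main__":
    print(tally(SAMPLE))
```

The number of other contributors is then the number of distinct names in the "direct" mapping. Real change-logs wrap entries across lines and vary their wording, which is why some had to be massaged into a recognizable format first.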

For some (lynx and vile) I have not reformatted the older change-logs. In those cases, the dates below correspond to the beginning of the change-logs that I have reformatted.

Here is a list (as of May 2010) of the change-logs for which I have useful metrics, noting the percentage for my own contributions, and the number of other contributors (disregarding "external", since there is no active involvement).

Program   Percent  Other  Date
diffstat       81     12  June 1994
xterm          83    150  January 1996
ncurses        76    176  April 1996
vttest         96      3  June 1996
lynx           45    136  February 1997
vile           76     36  November 1999
dialog         78     64  December 1997
cdk            85     24  May 1999
byacc          97      4  February 2002
luit           91      0  August 2006
mawk           73      6  September 2008

Other changelogs

I use rcs2log for a few programs (ded, byacc, autoconf macros, etc.), which did not have a history of other contributors, and/or which are very stable.

Other metrics

There are other ways to measure contributions. Not all of them work as well as inspecting the change-log.

For instance, the Orbiten survey several years ago ignored the change-logs and RCS identifiers in my projects, and credited virtually all of my work to other people. Some of those credited were never contributors. Rather, Orbiten noted the mention of various individuals and organizations in README's and comments, and credited them with the entire work.

Other people have pointed out that Orbiten also did not factor out programs such as libtool, which are bundled with other programs.

Any metric requires inspection and tuning to validate the results. Lacking that step, the metric is worthless.

Reciprocity

Unsurprisingly enough, my change-logs cite contributions from people who also maintain change-logs. They do not necessarily reciprocate; some developers, for example, borrow from my work without doing so. I don't work with those people.