Monday, June 4, 2007

Basic Source Control Using RCS

 
Applying RCS and SCCS: From Source Control to Project Control


The Revision Control System, or RCS, was developed by Walter F. Tichy at Purdue University in the early 1980s. Implemented later than SCCS, and with full knowledge of it, RCS is a more user-friendly system, and in most ways a more powerful one. In this chapter we present the most basic capabilities of RCS, by showing how you can apply it to the source file modification cycle.

Background

Traditionally, RCS has been included in BSD UNIX distributions; currently, it is also distributed by the Free Software Foundation [Egg91]. RCS has not traditionally been included in AT&T-derived UNIX distributions. Despite the technical merits of RCS, its absence from System V and earlier AT&T systems can present practical and political obstacles to those who would like to use it.

If RCS is of interest to you, make sure your system provides it. If not, you'll be obliged to obtain it from another source, such as the FSF. (We provide instructions for doing so in Appendix H, References.)[1] The FSF, of course, distributes RCS in source form only. Though it's normally trivial to configure and build RCS for a UNIX-like ("POSIX-compliant") platform, this is still something you would have to do yourself if you obtained the system in this way.

[1] A successor to RCS called RCE (for Revision Control Engine) has recently been announced by a group working in Germany with Walter Tichy [XCC95]. RCE is built atop a difference generator that works between arbitrary files, and is implemented as a library, permitting source control operations to be integrated with existing applications. A "stand-alone" command-line interface that is compatible with RCS is provided, as well as a graphical interface. We have not evaluated RCE; if it's of interest to you, see Appendix H for information on finding out more about it.

In this book we describe RCS version 5.6.0.1, the most recent one available as this book was being written.[2] This version of RCS contains the commands listed in Table 3.1.[3]

[2] Note that RCS 5.6.0.1 differs from 5.6 only in that it provides partial support for a new form of conflict output in doing three-way merges. (See Chapter 5, Extending Source Control to Multiple Releases, for a discussion of such merges.) Since the new support is incomplete, it's disabled, making 5.6.0.1 effectively identical to 5.6. Version 5.7 of RCS was released just after this book went into production. Though it changes nothing fundamental, 5.7 does introduce a few new features. We flag the most important or visible of these in footnotes at the relevant points in our presentation. Appendix G, Changes in RCS Version 5.7, gives a more complete summary of what changed.

[3] We include in Table 3.1 the rcsclean(1) command, which in older RCS releases was a useful but limited shell script. In the current release, the command is implemented in C and uses much of the same internal code as the other RCS commands. Though the RCS sources still flag rcsclean as experimental, we think it's "grown up" enough to warrant inclusion as part of the standard system.

Table 3.1: The RCS Command Set
Command     Description
ci          Check in RCS revisions
co          Check out RCS revisions
ident       Identify files
merge       Three-way file merge
rcs         Change RCS file attributes
rcsclean    Clean up working files
rcsdiff     Compare RCS revisions
rcsmerge    Merge RCS revisions
rlog        Print log messages and other info on RCS files

Conventions

Before describing the basic RCS commands, let's define some terms and take a look at command-line conventions, especially how you specify files to the system.

Nomenclature

When RCS creates an archive file, the name of the archive file is the source file name with ,v appended to it. Thus if you created an archive for the file xform.c, xform.c,v would become the archive's name. The ",v" nominally refers to the multiple "versions" of the source file stored in the archive file. Naturally enough, RCS calls its archive files "RCS files."

All of the terms that we introduced in prior chapters to talk about source control in fact come from RCS. Thus RCS uses the term "revision" to refer to each stored version of the source file. It also uses the term "check-in" to refer to the addition of a new revision to an RCS file and "check-out" to refer to the retrieval of an existing revision from an RCS file. And a source file that's been retrieved from an RCS file is known as a "working file."

RCS Command Lines

Like most programs with a UNIX heritage, all RCS commands expect a command line that consists of a command name followed by one or more file names. The file names may be (but don't have to be) preceded by one or more options. If given, options change how the command works. So to summarize, a command line looks like

command-name [options] files

Each option begins with a hyphen, which is what distinguishes it from a filename. After the hyphen comes a single letter that identifies the option; then (for some options) comes a string that serves as a value for the option. Never insert whitespace between an option letter and its value--let them appear as one argument on the command line. (The first argument not starting with a hyphen is assumed to begin the filename arguments for the command.) Each file named on a command line can be either a source file or an RCS file, as we explain below.

Thus a typical command might be

% rcsdiff -r1.2 xform.c

This invocation of the rcsdiff(1) command specifies one option (-r, which has 1.2 as its value) and specifies one file, xform.c.
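
The convention just described can be sketched as a small shell function. This is only an illustration of the rule as stated here (an option starts with a hyphen and carries its value in the same argument; the first argument without a leading hyphen begins the file names). It is not RCS's actual parser, and the name split_rcs_args is hypothetical.

```shell
# Hypothetical helper: classify arguments the way the text describes.
# Not RCS's real parser; it only mirrors the stated convention.
split_rcs_args() {
    in_files=no
    for arg in "$@"; do
        case $in_files:$arg in
            no:-?*)
                rest=${arg#-}            # drop the leading hyphen
                value=${rest#?}          # everything after the option letter
                letter=${rest%"$value"}  # the single option letter
                printf 'option=%s value=%s\n' "$letter" "$value"
                ;;
            *)
                in_files=yes             # from here on, everything is a file
                printf 'file=%s\n' "$arg"
                ;;
        esac
    done
}
```

Called as split_rcs_args -r1.2 xform.c, the function reports one option (r, with value 1.2) and one file, matching the description of the rcsdiff example.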

One final note is really not related to RCS, but to entering quoted strings on a shell command line. As we'll see, you sometimes have the choice of entering a description of an operation either at a prompt from the program you're running or directly on the program's command line. If you want to give the description on the command line, you'll need to enter it as a quoted string (because it will contain whitespace). And if you want to continue the description over more than one line, you'll have to use whatever convention your shell supports for continuing a command line.

For example, the -m option to the ci program specifies comments for a check-in operation. If you want to give a multiline comment and you're using csh(1) (or derivatives) as your shell, you need to precede each carriage return in the commentary with a backslash:

% ci -m"Fix CR 604: vary conditioning algorithm according to\
? range data supplied by caller." filter.sh

However, under the Bourne shell (or ksh(1) or bash(1)), as long as the value for -m is quoted, you don't need to do anything special to continue the comments onto a second line:

$ ci -m"Fix CR 604: vary conditioning algorithm according to
> range data supplied by caller." filter.sh

Naming Files

In running an RCS command, you can name either a working file or the corresponding RCS file on the command line; the command will automatically derive the other file name from the one you provide. This means, for instance, that these two command lines are equivalent:

% rcsdiff -r1.2 xform.c,v
% rcsdiff -r1.2 xform.c

Another feature provided by RCS is the automatic use of an RCS subdirectory for RCS files. If you create such a subdirectory beneath the directory where you're working, RCS will try to use it before trying to use the current directory. RCS will not, however, create a subdirectory if one doesn't already exist.

Let's examine naming in more detail. Say that your working file has the name workfile and that path1 and path2 are UNIX pathnames. Then the full set of rules for specifying names to an RCS command looks like this:

  • If you name only a working file (such as, say, path1/workfile), the command tries to use an RCS file with the name path1/RCS/workfile,v or path1/workfile,v (in that order). Naturally, this is also what happens in the simple case in which path1 is not present.

  • If you name only an RCS file without a pathname prefix (such as workfile,v), the command tries to use workfile,v first in any RCS subdirectory beneath the current directory, then in the current directory itself. If it can use an RCS file with one of those names, it tries to use a working file named workfile in the current directory.

  • If you name an RCS file with a pathname prefix (such as path2/workfile,v), the command expects to be able to use an RCS file with exactly that name. Then it tries to use a working file named workfile in the current directory.

  • If you name both a working file and an RCS file, then the command uses files with exactly those names during its execution. In this case the two files can be specified in either order, and can come from completely unrelated directories.
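
The first of these rules (only a working file is named) can be sketched in shell as follows. This is a minimal illustration of the search order described above, not code taken from RCS; the function names rcs_candidates and find_rcs_file are hypothetical.

```shell
# Hypothetical helper: list, in search order, the RCS file names that
# the first rule says a command tries for a given working file.
rcs_candidates() {
    work=$1
    case $work in
        */*) dir=${work%/*}/ ;;   # pathname prefix, kept with its slash
        *)   dir= ;;              # no prefix: the current directory
    esac
    base=${work##*/}
    printf '%s\n' "${dir}RCS/${base},v" "${dir}${base},v"
}

# Print the first candidate that exists; print nothing if none does.
find_rcs_file() {
    rcs_candidates "$1" | while IFS= read -r candidate; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            break
        fi
    done
}
```

For path1/workfile, rcs_candidates prints path1/RCS/workfile,v followed by path1/workfile,v, which is the order given in the first rule.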

Suppose, for instance, that in your current directory you had a source file xform.c, as well as an RCS subdirectory. Then any of these command lines would create an archive file named RCS/xform.c,v from the xform.c source file. (The command here is the "check-in" command we describe below.)

% ci xform.c
% ci xform.c,v
% ci RCS/xform.c,v
% ci RCS/xform.c,v xform.c
% ci xform.c RCS/xform.c,v

When the source file and the RCS file are in the same directory (or when the RCS file is in an RCS subdirectory), there's no need to give both file names on the command line. This becomes useful only if the two files are in unrelated directories. For example, if xform.c were in the directory /home/cullen/current/src/mathlib, but you wanted to create the RCS file in the directory /project/archive/mathlib, either of these command lines would do the trick:

% ci /home/cullen/current/src/mathlib/xform.c /project/archive/mathlib/xform.c,v
% ci /project/archive/mathlib/xform.c,v /home/cullen/current/src/mathlib/xform.c

Command lines like these become useful when you put files into separate trees according to their type. (We mentioned this possibility at the end of Chapter 2, The Basics of Source Control.) This is one of the key concepts behind project control, as we'll see time and again in later chapters. If you take this approach, though, you won't want to be typing horrendously long command lines all the time. It's far better to create some kind of "tree mapper" to manage the filenames for you. Such a mapper is fundamental to systems like TCCS.
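
To make the idea of a "tree mapper" concrete, here is a minimal sketch in shell. It is our own illustration and is not how TCCS works; the function name map_to_archive and the variables SRC_ROOT and ARCHIVE_ROOT are hypothetical, with default values borrowed from the example paths above.

```shell
# Hypothetical "tree mapper": translate a working file under SRC_ROOT
# into the matching RCS file name under ARCHIVE_ROOT.
# Both roots are illustrative values taken from the example paths.
SRC_ROOT=${SRC_ROOT:-/home/cullen/current/src}
ARCHIVE_ROOT=${ARCHIVE_ROOT:-/project/archive}

map_to_archive() {
    case $1 in
        "$SRC_ROOT"/*)
            rel=${1#"$SRC_ROOT"/}     # path relative to the source root
            printf '%s/%s,v\n' "$ARCHIVE_ROOT" "$rel"
            ;;
        *)
            echo "map_to_archive: $1 is not under $SRC_ROOT" >&2
            return 1
            ;;
    esac
}
```

With a mapper like this, a long check-in could be written as ci "$f" "$(map_to_archive "$f")", using the two-file form of the command line shown earlier.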

Naturally, for any RCS command, you can specify more than one file, and the command will process each file in turn. For your own sake, if you frequently process more than one file at a time, you'll probably want to use an RCS subdirectory to hold RCS files. This helps you avoid naming RCS files by mistake when you use wildcards to name groups of working files.

Note that RCS will take a command line of intermixed working filenames and RCS filenames and match them up using the rules we outlined earlier in this chapter. Though this may work all right for simple cases, the potential for ambiguity or erroneous file inclusion is great enough that you should avoid the situation altogether and just segregate your RCS files in an RCS subdirectory.

This is desirable for more general administrative reasons as well. Working files and RCS files are innately different, and it only makes sense to keep them in distinct places to make it easy to administer them appropriately. In particular, by segregating your RCS files, you make it harder to access them accidentally in any way other than through the RCS command set. An rm -rf will still remove them, of course, but the added safety of an RCS subdirectory shouldn't be neglected.
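
If RCS files do share a directory with working files, a wildcard such as * will match both. The following sketch shows the kind of filtering you would then need; it assumes only the ",v" naming convention described earlier, and the function name working_files_only is hypothetical. Keeping RCS files in an RCS subdirectory makes this step unnecessary.

```shell
# Hypothetical helper: print the names it is given, leaving out
# anything that looks like an RCS file (a name ending in ",v").
working_files_only() {
    for name in "$@"; do
        case $name in
            *,v) ;;                         # RCS file: skip it
            *)   printf '%s\n' "$name" ;;   # working file: keep it
        esac
    done
}
```

For example, working_files_only * lists the working files in the current directory even when their RCS files sit alongside them.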

Basic RCS Commands

Now we present one iteration of the source file modification cycle, using RCS commands to implement each operation. We also cover a few other basic commands that are not strictly part of the cycle. All of this involves only some of the RCS commands and few (if any) of their many options. Later chapters explore more of the potential of the full RCS command set.

Figure 3.1 depicts the basic source control operations. This is the same picture we presented as Figure 2-1, but with the "bubbles" annotated to show which RCS command actually implements each operation. So once again, the central part of the figure shows the modification cycle.

Figure 3.1: Basic source control operations in RCS


Let's cover each of the operations in more depth. Roughly speaking, we'll describe them in the order in which they appear in the figure, working from top to bottom.

Creating an RCS File

You create an RCS file with the command ci(1) (for "check-in"). The command line need not specify anything but the name of the source file you're checking in. So a simple example of creating an RCS file is

% ci xform.c

which will create the file xform.c,v. If you've already made an RCS subdirectory, then the file will be created there. Otherwise, it will be created in the current directory.

By default, when you create an RCS file, you're prompted for a short description of the file you're putting under source control. You may enter as much text as you like in response, ending your input with a line containing a period by itself. The interaction looks like this:

% ci xform.c
xform.c,v <-- xform.c
initial revision: 1.1
enter description, terminated with ^D or '.':
NOTE: This is NOT the log message!
>> Matrix transform routines, for single-precision data.
>> .
done

Once the RCS file is created, your source file is immediately deleted. It is, of course, now safely stored in the RCS file and can be extracted as a working file whenever you want it.

As the warning "NOTE: This is NOT the log message!" implies, you really create two descriptions when you check in the initial revision of an archive file. The first description, which is what ci prompts for by default, is for the file itself--this message is meant to describe the role of the file in your project. In addition, ci also creates a log message (a term we'll come back to later), to describe the first revision of the archive file--you can use this description to trace the origins of the source file you're checking in.

By default, ci creates a log message with the value "Initial revision". If you want to use the message actually to capture some useful data, you can use the -m option on the ci command line to specify it, like this:

% ci -m"As ported from 9000 engineering group sources,\
? version 4.13." xform.c

Of course, this message has to be quoted, and the usual rules apply if it extends across multiple lines.

The ci command also lets you specify the archive file description on the command line, instead of being prompted for it, via the -t flag. In fact, you can use -t on any check-in, not just the first one, to change an archive's description. You can give a description either as the value to -t or in a file, which you name using -t.

If the value of the option starts with a hyphen, it's taken to be the literal text of the description; otherwise, it's taken to be the name of a file containing the description. So either of these command sequences would be equivalent to the original ci command we showed above:

% cat > xform.desc
Matrix transform routines, for single-precision data.
^D
% ci -txform.desc xform.c

% ci -t-"Matrix transform routines, for single-precision data." xform.c
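
The rule for interpreting the value of -t can be sketched as a shell function. This mirrors the rule as we have stated it and is not code from ci itself; the name description_from_t_value is hypothetical.

```shell
# Hypothetical helper mirroring the rule for the value of -t:
# a leading hyphen means literal text, anything else names a file.
description_from_t_value() {
    case $1 in
        -*) printf '%s\n' "${1#-}" ;;   # literal description text
        *)  cat "$1" ;;                 # description stored in a file
    esac
}
```

Given the value -"Matrix transform routines, for single-precision data." the function prints the text after the hyphen; given xform.desc it prints the contents of that file.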

Getting a Working File for Reading

You extract a working file from an existing RCS file with the command co (for "check-out"). The co(1) command is designed to be the mirror-image of ci(1). So, once again, in the simplest case you specify nothing but a filename when you run the command. A simple example of creating a working file is

% co xform.c
xform.c,v --> xform.c
revision 1.1
done

This command will look for an RCS file with the name RCS/xform.c,v or xform.c,v and create a working file from it named xform.c. Here, the output from the command confirms that revision 1.1 of xform.c was extracted from the RCS file xform.c,v.

Since the command line doesn't explicitly say that you want to modify xform.c, the file is created read-only. This is a reminder that you shouldn't change it unless you coordinate your change with the RCS file (by locking the revision of xform.c that you want to modify).

If a writable file already exists with the same name as the working file that co is trying to create, the command will warn you of the file's presence and ask you whether you want to overwrite it. If a writable copy of xform.c existed, for instance, the exchange would look like this:

% co xform.c
RCS/xform.c,v --> xform.c
revision 1.1
writable xform.c exists; remove it? [ny](n):

At this point, co expects a response from you that starts with y or n--responding with n, or with anything other than a word beginning with y, will cause co to abort the check-out. If you abort the check-out, co confirms that with the additional message

co error: checkout aborted

This message, of course, doesn't really indicate an "error"--just that the check-out was aborted as you requested.

CAUTION: As we said in Chapter 2, co will silently overwrite any read-only copy of a file that already exists with the same name as a working file it wants to check out, on the assumption that the existing file is the result of a previous co for reading. So if a file is under source control, do not try to maintain changed copies of it manually (i.e., outside source control). If you do, then sooner or later you're likely to delete a file you wanted to save.

You can change the way co checks out a file if for some reason the usual safeguards it provides against overwriting a working file aren't appropriate. First of all, you can use the -f option to "force" co to create a working file even if a writable file of the same name already exists. You might use -f if you had copied some outdated copy of the file into your work area but now wanted to overwrite it with a current copy from the archive file.

At the other extreme, you can check out a revision without affecting any file that already exists with the same name, by using the -p option. With -p, co will "print" the checked-out revision to its standard output, rather than putting it into a working file. You can then redirect standard output to capture the file in whatever way is appropriate. You might use -p if you wanted to have more than one revision of a file checked out simultaneously--you could check out all but one revision with -p into files with special names. (Of course, -p is purely a convenience. You can always avoid using it by doing regular check-outs and renaming the working files afterward.)
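
As a sketch of the use of -p just described, the function below captures several revisions of one file side by side. It assumes the -r option of co(1) for choosing a revision, which we have not covered in this section; the function name and the file.revision naming scheme are our own choices, not RCS conventions.

```shell
# Sketch: capture several revisions side by side using co -p.
# The output naming scheme (file.rev) is our own choice.
checkout_side_by_side() {
    file=$1
    shift
    for rev in "$@"; do
        # -p writes the revision to standard output instead of a working file
        co -p -r"$rev" "$file" > "$file.$rev" || return 1
    done
}
```

For instance, checkout_side_by_side xform.c 1.1 1.2 would leave copies named xform.c.1.1 and xform.c.1.2 without touching any existing xform.c.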

Getting a Working File for Modification

If you want to change a source file for which you've created an RCS file, you need to get a writable working copy by adding the -l option to the co command line. To check out xform.c for modification, you use the command line:

% co -l xform.c
xform.c,v --> xform.c
revision 1.1 (locked)
done

Compare the output from this command to that from the last co we looked at. As you can see, the current output confirms that a lock has been set on the revision of xform.c you've checked out. Now that you have the lock, you have the exclusive right to change this revision (revision 1.1) of the file and eventually to check in your working file as the next revision of the RCS file.

If someone else already held the lock to revision 1.1, you would not be able to lock it yourself. However, even when you can't lock a given revision, you can still check it out for reading only (that is, without the -l option). The assumption here is that you won't modify the file when you obtain it for reading only. If, for example, you requested the lock for revision 1.1 of xform.c but couldn't get it, co would inform you with an error message like this one:

% co -l xform.c
RCS/xform.c,v --> xform.c
co error: revision 1.1 already locked by cullen

In this case you don't have the option of forcing the check-out to proceed, so co doesn't ask whether you want to. The check-out is aborted unconditionally. The error message points out what user owns the lock, which lets you contact him if you absolutely need to modify the file now. Perhaps he can check it back in. Even if he can't, waiting is better than circumventing RCS.

Occasionally, you may need to set a lock in an archive file without checking out a working file. Say, for instance, you're archiving sources distributed from outside your group, you've moved a new distribution into place, and now you want to check in the new version of each file. Checking out working files would overwrite the new sources with the older ones. To set a lock without creating a working file, use the command rcs -l.

Having a lock set in each archive file will enable you to check in the corresponding newly imported source file as a successor to the existing revision you locked.
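
The import scenario above might be scripted along these lines. This is a sketch under the assumption that the new sources are already in place as working files and that each has an existing RCS file; the function name import_new_sources and the log message are only examples.

```shell
# Sketch: check in freshly imported sources as successors to the
# existing revisions, without checking anything out first.
import_new_sources() {
    msg=$1
    shift
    for file in "$@"; do
        rcs -l "$file" || return 1        # set the lock; no working file is created
        ci -m"$msg" "$file" || return 1   # check in the imported file
    done
}
```

A call such as import_new_sources "Import new distribution" *.c would then lock and check in each file in turn. Remember that ci removes each working file once the check-in completes.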

Comparing a Working File to Its RCS File

To compare a working file against the RCS file it came from, use the rcsdiff(1) program. If, for instance, you want to compare the current contents of the working file against the original revision you checked out of the RCS file, just give the command with no options, as in

% rcsdiff xform.c

The rcsdiff command will output a line indicating what revision it is reading from the RCS file, then it will run the diff(1) program, comparing the revision it read against the current working file. The diff output will show the original revision as the "old" file being compared and the current working file as the "new" file being compared. Typical output might look like this:

% rcsdiff xform.c
===================================================================
RCS file: xform.c,v
retrieving revision 1.1
diff -r1.1 xform.c
4a5
> j_coord = i_coord - x;
11,12c12,13
< for (j = j_coord; j < j_max; ++j)
< if (a[j] < b[j]) {
---
> for (j = j_coord + 1; j <= j_max; ++j)
> if (a[j - 1] < b[j]) {
20d20
< j_coord = i_coord - x;

In other words, in the working file (the new file in the diff listing), an assignment to j_coord has moved from line 20 of the old file to line 5 of the new one, and the first two lines of the for loop currently at line 12 have been changed.

You can also use rcsdiff to compare a working file against some revision other than the one it started from or to compare two different RCS file revisions to each other. To compare your working file against any revision of the RCS file, add a -r option to the rcsdiff command line, naming the revision you're interested in. For instance, to compare the current contents of xform.c against revision 1.3 of its RCS file, you use the command

% rcsdiff -r1.3 xform.c

To compare two different revisions already checked in to the RCS file, just give two -r options, as in

% rcsdiff -r1.1 -r1.2 xform.c

This command produces a diff listing with revision 1.1 as the "old" file and revision 1.2 as the "new" file. This form of rcsdiff can be particularly useful for debugging, since it lets you see recent changes to the file other than your own.
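
When all you need to know is whether a working file has changed, the exit status of rcsdiff can be tested in a script. The sketch below assumes that rcsdiff follows the diff(1) convention (0 for no differences, 1 for differences, anything else for trouble) and that it accepts -q to suppress diagnostics; check rcsdiff(1) on your system before relying on either. The function name working_file_status is hypothetical.

```shell
# Sketch: report whether a working file differs from the revision
# rcsdiff compares it against by default.
# Assumes diff-style exit statuses: 0 same, 1 different, other = trouble.
working_file_status() {
    rcsdiff -q "$1" > /dev/null 2>&1
    case $? in
        0) echo unchanged ;;
        1) echo modified ;;
        *) echo error ;;
    esac
}
```

A script could use this to decide, for example, whether a check-in is worth attempting.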

Adding a Working File to Its RCS File

When you're satisfied with the current state of your working file and want to save it for future reference, use the ci command to add it to the corresponding RCS file. This is, of course, the same command you used to create the RCS file in the first place; ordinarily, to check in a working file, you give the same simple command line as you did then. For instance,

% ci xform.c

This command would check in the current contents of xform.c as a new revision in the corresponding RCS file, then delete the working file. When you run ci, you'll be prompted for a description of your changes to the working file, in the same way as ci originally asked you to describe the file itself. A typical interaction might be

% ci xform.c
xform.c,v <-- xform.c
new revision: 1.2; previous revision: 1.1
enter log message:
(terminate with ^D or single '.')
>> In function ff1(): move declaration of j_coord; fix
>> off-by-one error in traversal loop.
>> .
done

Again, you can enter as much text as you like, and you terminate your entry with a period on a line by itself.

Sometimes, you may prefer to give revision commentary directly on the ci command line. This can be handy when you're checking in more than one file and want all of the files to have the same commentary. You do this using the -m option to ci (as we've mentioned a few times in other contexts). For instance, the last check-in that we showed could be phrased as

% ci -m"In function ff1():  move declaration of j_coord; fix\
? off-by-one error in traversal loop." xform.c

Notice that we gave the comments as a quoted string, as they contain white space. Since in this example we're using a csh(1)-style shell, we had to type a backslash to extend the comments onto a second line.

Ordinarily, ci expects a newly checked-in revision to be different from its ancestor and will not complete the check-in if the two are identical. You can use the -f option to "force" a check-in to complete anyway in this case.

By default, ci deletes your working file when the check-in is complete. Often, you'll still want a copy of the file to exist afterward. To make ci do an immediate check-out of the working file after checking it in, you can add either of two options to the command line. The -u option will check out your working file unlocked, suitable for read-only use. The -l option will set a new lock on the revision that you just checked in and check out the working file for modification. Both of these options are simply shorthand for doing a separate co following the check-in.

Discarding a Working File

If you decide that you don't want to keep the changes that you've made, you can use the rcs(1) command to discard your changes by unlocking the RCS file revision you started with. Run rcs just by naming the file you want to discard, preceded by the option -u (for "unlock"):

% rcs -u xform.c
RCS file: xform.c,v
1.1 unlocked
done

This command will remove any lock you currently have set in the RCS file. However, it doesn't do anything to the working file you name and doesn't even require that the file exist. If you want to remove the working file, you have to do that yourself with rm(1).

If you've set more than one lock in a file under the same username, you need to tell rcs the revision you wish to unlock, by adding a revision ID to the -u option. Without it, the command can't tell which pending update to the archive file you want to cancel. This command, for instance, would unlock revision 1.1 of xform.c,v even if you had another revision locked:

% rcs -u1.1 xform.c
RCS file: xform.c,v
1.1 unlocked
done

If you want to discard a working file and replace it with the original revision it came from, it may be more convenient to use the command co -f -u. The -u option causes co to unlock the checked-out revision if it was locked by you, while -f forces co to overwrite your writable working file with the original revision from the archive file.
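
The two-step discard described above (unlock with rcs -u, then remove the working file yourself) can be wrapped in a small function. This is a convenience sketch of our own; the name discard_working_file is hypothetical.

```shell
# Sketch: abandon changes to a working file.
discard_working_file() {
    rcs -u "$1" || return 1   # remove our lock from the RCS file
    rm -f "$1"                # rcs -u leaves the working file alone
}
```

Because the function stops if rcs -u fails, the working file is removed only after the lock has actually been released.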

Viewing the History of an RCS File

As we've seen, the ci command asks you for a description when you create an RCS file, as well as when you add a revision to one. Together, these descriptions form a history, or log, of all that's happened to the RCS file since its creation. The descriptions can be displayed by using the rlog(1) command.

As usual, you simply give on the command line the name of the file you want to examine. Here's an example:

% rlog xform.c

RCS file:        xform.c,v;   Working file:    xform.c
head:            1.2
branch:
locks:           ;  strict
access list:
symbolic names:
comment leader:  " * "
total revisions: 2;    selected revisions: 2
description:
Matrix transform routines, for single-precision data.
----------------------------
revision 1.2
date: 95/05/10 14:34:02;  author: rully;  state: Exp;  lines added/del: 3/3
In function ff1():  move declaration of j_coord; fix
off-by-one error in traversal loop.
----------------------------
revision 1.1
date: 95/04/23 14:32:31;  author: rully;  state: Exp;
As ported from 9000 engineering group sources,
version 4.13.
=============================================================================

The output of rlog can be divided into three parts. First appears a summary of various characteristics of the RCS file, which is unrelated to what we've discussed in this chapter. Next, following the description: line, we find the text entered when the RCS file was first created. Last, a list of revision entries appears, one for each revision in the RCS file. These entries are output with the most recent first. Each one contains the description that was originally entered for that revision.
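
Because each revision entry has a regular shape, the list of entries is easy to summarize with a short filter. The sketch below relies only on the "revision" and "date: ...; author: ...;" lines as they appear in the sample rlog output in this section; it is not a complete parser, and a log message that happens to begin with the word "revision" would confuse it. The function name rlog_revisions is hypothetical.

```shell
# Sketch: read rlog-style output on standard input and print one
# "revision author" pair per revision entry.
rlog_revisions() {
    awk '
        /^revision / { rev = $2 }          # remember the revision number
        /^date: / && rev != "" {
            for (i = 1; i <= NF; i++)
                if ($i == "author:") {
                    a = $(i + 1)
                    sub(/;$/, "", a)       # drop the trailing semicolon
                    print rev, a
                }
            rev = ""
        }
    '
}
```

You would run it as rlog xform.c | rlog_revisions to get one line per revision, most recent first.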

Cleaning Up an RCS Source Directory

To help you tidy up a source directory when you're done working there, RCS provides a program called rcsclean. This program compares working files in the current directory against their archive files and removes working files that were checked out but never modified. More specifically:

  • A working file that was checked out for reading is removed only if it still matches the head revision on the default branch of the archive file.[4]

    [4] See Chapter 6, Applying RCS to Multiple Releases, for a discussion of RCS archive file branches.

  • If -u is given, a working file that was checked out for modification is removed if it still matches the original revision (that is, the one checked out locked by the user).

  • If a working file does not match the revision noted in the last two cases, then rcsclean will never remove it.

When -u is given, if rcsclean removes a working file, it also removes any lock corresponding to it. Any commands rcsclean executes are echoed to its standard output so you can see what's going on.

If you invoke rcsclean with no arguments, it will process all of the working files in the current directory. If you provide arguments, then only the working files you name will be processed. Needless to say, rcsclean has no effect on files other than working files checked out from an RCS file.

If you want to see what commands rcsclean would execute, if given a certain command line, you can use the -n flag. Then rcsclean will echo the commands it normally would run but will not actually execute them. Note that the output from rcsclean -n looks exactly like the normal output, so be careful not to confuse a -n run with the real McCoy.

Summary

You can put and keep files under source control with RCS by using only two commands, ci and co. This simplicity is a strong point of the system. We've also introduced the rcs command to abort a pending modification to an RCS file and informational commands rcsdiff and rlog, which give you detailed information about the contents of an RCS file. Finally, we presented rcsclean to remove unmodified working files.

Table 3.2 summarizes our presentation so far, by relating each operation in the source file modification cycle (plus a few other basic ones) to the RCS command that implements it.

Table 3.2: Basic RCS Commands
Command           Basic Operation
ci                Creating an archive file
co                Getting a working file for reading
co -l             Getting a working file for modification
rcsdiff           Comparing a working file to its RCS file
ci                Adding a working file to an RCS file
rcs -u plus rm    Discarding a working file
rlog              Viewing the history of an RCS file
rcsclean          Cleaning up a source directory

Remember, too, that all of these commands employ an intelligent command-line interface that fairly well balances simplicity and flexibility and can provide an advantage over SCCS. That said, let's see what SCCS has to offer.
