Software Configuration Management and Version Control
A User's Guide to Learning Version Control Practices
Copyright © 1997-2002 by Walt Stoneburner
All Rights Reserved.
In a moment we will focus upon just one aspect of software configuration management: version control. But first, let's step back from the buzzword.
Requirements Management. Every requirement for the system should be tracked from end to end. This prevents development time being spent on non-essential features; it also allows QA to do a better job of testing. Should changes be introduced mid-development, their impact can be measured. Additionally, capturing requirements facilitates communicating them in a coordinated manner.
Design. The architecture and implementation blueprints of the system can be captured, clearly communicated, and easily distributed among developers, managers, and other decision-makers.
Version Control. As the software is developed, changes in the source are tracked. This is useful to developers, especially for development and bug fixing; it is useful for build automation, QA reporting, providing information to help desks, obtaining past releases, and managing patches.
Build Tools. The build process can be made fairly automated, easily constructing different platforms and releases as desired. Some tools provide the ability to track the interface versions that make up a release.
Defect Tracking. Allows problem reports to be directed to the development team, with reporting to management of problem areas and identifying trends. Developers need to be able to communicate problems among themselves.
Automated Testing. Provides assurance for regression testing and identifies problems during stress testing. Automation can rigorously beat on software and run through permutations faster than a human.
Release Management. Captures the contents of a particular release and information to construct a consistent build environment. This includes software patches for post-releases. Certain releases may have their own histories or pre-installation requirements.
Installation Tools. Some packages assist or construct the install process, or provide options for what gets installed, including licensing.
Configuration Management. Captures the hardware and software configuration at the installation site. Some customers may have different platforms, licenses, or patch releases applied.
Help Desk. Customers need a point of contact where they can go to report a problem, check the status of a fix, ask for a new feature, or to learn about the product.
Impact Analysis. When a new requirement is levied on the system, the impact from development, to QA, to documentation, to distribution, and so on, should be measured. Sometimes it isn't cost effective to do something just because you can.
Process Management. The product's lifecycle can be defined, communicated, sometimes enforced,1 tracked, measured, and improved.
Document Tracking. Documentation consistency, changes, distribution, and store-housing are all managed under this branch.
...Software Configuration Management is when you combine all these things into practice.
Just because the help desk needs to be able to report a defect to the developers does not mean that the help desk should use the developers' defect tracking system, nor the other way around. And just because the developers ought to be tracking defects and requirements does not mean the requirements tool and the defect tracking tool should be integrated into version control or rely on an all-inclusive tool. Instead, the process should require that the information be captured, for example, "all bug fixes will include the unique bug number in the code-fix's description." This makes the development process tool (and vendor) independent.
The all-in-one stereo system, the combination TV/VCR, the toaster-oven / alarm-clock radio, even the common jack-knife all suffer from this problem in one way or another. What they have in perceived convenience and/or coolness-factor, they lack in power, simplicity, reliability, and usability. Try adding a new equalizer to the all-in-one stereo, fixing a broken gear in the embedded VCR, getting consistent toast from the toaster-oven, or chopping down a tree with the jack-knife and you'll agree. Software tools share the same problem.
Consequently, a software tool should be simple, reliable, and usable while covering all the features its users actually require, not riddled with poorly implemented marketing features and technical-sounding buzzwords. No matter how new, shiny, and feature-laden a jack-knife looks in the display case, when you want "the knife feature" the best tool in the world is a well-made, simple knife.
The solution is simple. Have a collection of tools, using the right tool for the right job. A screwdriver can screw, pry, and hammer, but it really does only one of those things well. Together we'll examine just one tool, version control, in the context of a whole toolbox of configuration management.
Version Control. Intersolv PVCS, Microsoft SourceSafe, TLIB, RCS, SCCS, TCCS, MKS Source Integrity
Build Tools. Make, NMAKE, OpusMAKE, Intersolv Configuration Builder
Defect Tracking. Elsinore's Visual Intercept, PVCS Tracker, MKS TrackIntegrity, Archimedes' BugBase
Install Tools. InstallSHIELD, Packager, RPM
...Ironically, for processes that require the recording of information, just having the discipline to capture information when you have it, even in a plain text file, can be quite useful3 until the need for a specific tool becomes apparent.
We'll now examine a progressing situation illustrating why version control is needed.
"How would you like a promotion to team lead? We've just signed a contract that requires major expansion of the software project you've been working on."
"Great!" you reply, "but doesn't a team lead require a team?"
"Why yes, meet Fred. He'll be working with you."
After short introductions, you explain the project to Fred, how the project directories are laid out, and what files do what. Fred asks for a copy of the project, and you give him one, which he installs on his machine.
Later, after a long lunch of techno-talk with Fred, you start work modifying the project. Fred does the same.
Soon it comes time to test the revision, and in walks Fred with a floppy. Unfortunately, to build the software you need some of your files and some of Fred's.
It takes a few minutes, but you get things under control and hand Fred a new copy. Clearly, this procedure isn't going to work.
Lesson: You need a single copy of the project in one place that you can both share.
After downing a Jolt-cola, you save your work and run off to show Fred your latest modification. Fred's just exiting his editor and has something to show you, too.
Unfortunately only Fred's stuff works, and oddly enough, you can't even find your source code - it doesn't even look like you've done a thing!
You're becoming suspicious that maybe Fred took out your modifications, but after some quick questioning it's clear there was no malice.
After running through the scenario and wondering why your editor possibly didn't save your changes, you piece together exactly what happened. You were both working on the same file.
You opened the file with your editor, and brought a copy into RAM.
And you modified it.
Near the same time, Fred opened the file with his editor and brought a copy into his RAM, where he modified it.
You saved your copy to the LAN.
Fred saved his, wiping out your copy.
...the last one to save "wins."
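The sequence above is the classic lost-update problem. As a minimal sketch, the Python below replays it with a dictionary standing in for the shared LAN drive; the file name and contents are made up for illustration.

```python
# A dictionary stands in for the shared LAN drive (an assumption for illustration).
lan = {"main.c": "line A\nline B\n"}

# Both editors read the same starting contents into their own memory.
your_copy = lan["main.c"]
freds_copy = lan["main.c"]

# Each person edits a private in-memory copy.
your_copy += "your change\n"
freds_copy += "Fred's change\n"

# You save first, then Fred saves over the top of your work.
lan["main.c"] = your_copy
lan["main.c"] = freds_copy

# Only Fred's change survives, and nothing warned either of you.
print(lan["main.c"])
```

Neither save fails, which is exactly why the loss goes unnoticed until someone looks for the missing work.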
Lesson: You need to coordinate which files you're editing with other team members.
Luckily, you found a backup copy your editor made, and you and Fred merged all the various changes together within a couple of hours.
You explain the situation to your boss and manage to get a public whiteboard for the hallway. On it you'll write the name of every file. You explain to Fred that before he's allowed to change a file, he first has to check, and possibly revise, the whiteboard. Whenever you, or Fred, want to change a file, you'll write your name next to the filename. When you are done, you'll erase your name.
You realize if you and Fred don't agree to use the whiteboard, work will be lost and tempers will flare.
But mistakes will happen. The whiteboard scheme isn't enforceable, nor is it scalable when the number of files increases, or when the number of developers increases. Plus, a multi-person merge would be horrific.
But you can manage; it's just you and Fred.
But no sooner have you finished explaining the procedure than your boss pops a head out into the hallway: "Just thought you should know, I've hired five more people to help you on the project."
Oh well, at least you got a whiteboard out of the deal.
Lesson: A person should have the ability to physically lock a file so that no one else can modify it concurrently.
However, when you want to edit a file, you have to check it out of the system. This gives you a copy that has read/write permissions. [See CHECK OUT] Your name is associated with the master copy, just like the whiteboard scheme. At this point the master copy is said to be "locked."
Once the file has been checked out, no one may check out another read/write copy until it is "unlocked," not even the person who has the file checked out.
Since you're no longer limited by whiteboard space, and the process is automated, it no longer matters how many files or how many people are working on a project. The system enforces mutual exclusion. To get a file with read/write permissions, you must use the system.
In addition, it now becomes possible to determine if anyone is working on anything, who has which files checked out, when the files were checked out, and even for how long.
When you have a checked-out copy, you may now make all the changes you desire. When you are done, you simply check the file back in to version control. [See CHECK IN] Your name is removed, thereby "unlocking" the file. Sometimes this operation is called a "commit" or doing a "put."
What happens to the checked-out file that you were editing? One of two things: it either becomes read-only or it gets automatically deleted. The choice is yours; usually it is the former.
Can Fred change the file's read-only status to read/write? Yes. Can Fred then modify the file? Yes. Can Fred check the modified file back in? No. Fred does not have a lock on the file, so the system won't let him do a check-in.
But what if there is no lock on the file? If no one has checked it out, can Fred then change the file permissions on his dated copy? Fred can do that. Can Fred replace the version control file with his new one? No. Fred didn't obtain an exclusive lock before he started.5
Though your local file still sits there with read-write permissions, you can't check it back in.6
If that bothers you, you can either do another get, turning it into a read-only file with the contents of what's in the master copy, or you can do another checkout, getting a read/write copy with the contents of what was in the master copy. At least you can't hurt anything. Suppose you don't make any changes and accidentally check in the file? No problem, the system will do the equivalent of an unlock. [See No Change Check-In]
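The locking rules described so far can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular product works: the `Archive` class, its method names, and `LockError` are all hypothetical.

```python
class LockError(Exception):
    """Raised when a check-out or check-in violates the locking rules."""


class Archive:
    """Toy master copy guarded by a single exclusive lock (hypothetical API)."""

    def __init__(self, contents):
        self.contents = contents
        self.locked_by = None  # name of the lock holder, or None when unlocked

    def get(self):
        # Anyone may fetch a read-only copy at any time.
        return self.contents

    def check_out(self, user):
        # Only one read/write copy may exist; even the holder cannot take a second.
        if self.locked_by is not None:
            raise LockError(f"locked by {self.locked_by}")
        self.locked_by = user
        return self.contents

    def check_in(self, user, new_contents):
        # A check-in is accepted only from whoever holds the lock.
        if self.locked_by != user:
            raise LockError(f"{user} does not hold the lock")
        self.contents = new_contents
        self.locked_by = None  # checking in releases the lock


archive = Archive("original text")
working = archive.check_out("you")

try:
    archive.check_in("Fred", "Fred's edit")  # Fred never obtained a lock
except LockError as err:
    print("rejected:", err)

archive.check_in("you", working + " plus your edit")
print(archive.contents)
```

Fred can edit whatever sits on his own disk, but the check-in is refused whether you hold the lock or nobody does, because the comparison is against the lock holder's name.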
In theory you and Fred could edit the same file. If you change the top half, and he changes the bottom, then merging isn't so bad. Where you get into trouble is when the changes you make overlap.
Version control systems can be made to allow concurrency by letting multiple people grab locks on the same file. When people want to check their files back in, the system forces them to merge their changes. Quite often an automated tool is provided to assist in the matter. Some tools are better than others, and once again we find that tools designed for merging, and merging only, are typically better than those that come with the version control systems.
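To make "overlapping changes" concrete, here is a rough Python sketch that compares two edited copies against their common starting point using the standard `difflib` module. It only detects whether the edits collide; it does not perform a merge, and the function names are illustrative rather than taken from any merge tool.

```python
import difflib


def changed_ranges(base, edited):
    """Return (start, end) line ranges of `base` that `edited` touches."""
    matcher = difflib.SequenceMatcher(None, base, edited, autojunk=False)
    return [(i1, i2) for tag, i1, i2, j1, j2 in matcher.get_opcodes() if tag != "equal"]


def overlaps(base, mine, theirs):
    """True when both edits touch a common (or directly adjacent) region of the base."""
    for a1, a2 in changed_ranges(base, mine):
        for b1, b2 in changed_ranges(base, theirs):
            # Pure insertions have start == end, so compare with inclusive bounds.
            # This is deliberately conservative: adjacent edits count as a collision.
            if a1 <= b2 and b1 <= a2:
                return True
    return False


base = ["top", "middle", "bottom"]
you = ["TOP", "middle", "bottom"]      # you change the top
fred = ["top", "middle", "BOTTOM"]     # Fred changes the bottom
clash = ["top", "MIDDLE!", "bottom"]   # a second edit to the middle

print(overlaps(base, you, fred))
print(overlaps(base, ["top", "MIDDLE", "bottom"], clash))
```

The first pair can be merged mechanically; the second needs a human to decide which middle line wins.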
While concurrency is a great marketing feature, most commercial software shops (whether right or wrong) tend to disable this feature in order to avoid all the troubles and heartaches that come from not understanding how to properly use it.
After a few test runs you get a sinking feeling in your stomach. It appears that a code change you made a bit ago has broken something else in your code and you just now noticed. You're now stuck trying to undo recent changes in hopes of recovering the old behavior to analyze what went wrong.
Lesson: Make frequent backups as you work, so you'll have a copy of things when they last worked.
With our newfound hindsight, let's go back in time7 and re-approach the same dilemma. No problem, you've been copying your project directory off to floppy every now and then as an external backup. The problem is you've made several iterations of changes to the code. You now have a floppy that contains broken code and one that contains working code, but the working copy is too early in time to be of much use.
Lesson: Make a copy of your project directory each time you make a major or minor change to a file.
Back in time again... No problem, this time you have the code you need. And you fixed the problem. But you're noticing that the stack of floppy diskettes is several feet high. This is getting expensive, and if stored on your hard disk, the free space would be eaten up in no time. If you put it on the LAN, your infrastructure team would bite your head off. To make matters worse, you're not the only person making changes to the files.
Even so, what if you needed one file from one point in time and another file from a different point in time?
Lesson: You need to store things more efficiently by storing just the changes to a file.
Then the next time you register a change, it compares the new file to the old one and records just the differences.9 The file that we've called a master file so far is actually called an archive file because it contains an archive of all the prior versions of that file in one place.10 In addition, version control also stores the date and time of the change along with your name.
Version control is very efficient in this way, storing only the information that is needed to reconstruct any prior version of the file.
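As a rough illustration of storing only the differences, the Python sketch below records a delta between two versions and rebuilds the newer one from the older one. It assumes forward deltas built with the standard `difflib` module; real archive formats differ, and the `make_delta` and `apply_delta` names are invented for this example.

```python
import difflib


def make_delta(old, new):
    """Record only what is needed to turn `old` into `new` (lists of lines)."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))        # reuse lines already stored
        else:
            delta.append(("insert", new[j1:j2]))  # new text; empty for a pure deletion
    return delta


def apply_delta(old, delta):
    """Rebuild the newer version from the older one plus its delta."""
    result = []
    for op in delta:
        if op[0] == "copy":
            result.extend(old[op[1]:op[2]])
        else:
            result.extend(op[1])
    return result


first = ["int main()", "{", "  return 0;", "}"]
second = ["int main()", "{", "  puts(\"hi\");", "  return 0;", "}"]

delta = make_delta(first, second)
print(delta)
print(apply_delta(first, delta) == second)
```

Only the one inserted line is stored as text; everything else is a reference back to lines the archive already holds.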
You also get to store a note about why you changed the file during the check in process, so later you can look back on history and tell what the change was for, even if you weren't the one who made the change.
Note that lines may have been added, deleted, or changed. It is now actually possible to see how substantial a change really was.
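One way such added, deleted, and changed counts could be derived is sketched below in Python with the standard `difflib` module. How a given product actually counts a "changed" line is not stated here, so treating a replaced block as changed lines is an assumption of this sketch.

```python
import difflib


def change_summary(old, new):
    """Count lines added, deleted, and changed between two versions."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    added = deleted = changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            added += j2 - j1
        elif tag == "delete":
            deleted += i2 - i1
        elif tag == "replace":
            # Assumption: a replaced block counts as its larger side in changed lines.
            changed += max(i2 - i1, j2 - j1)
    return {"added": added, "deleted": deleted, "changed": changed}


old = ["a", "b", "c"]
new = ["a", "B", "c", "d"]
print(change_summary(old, new))
```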
It is possible to look at the history of an archive, either in full or over a range of versions. You can now see what your team has been up to.
When someone tries to place a lock, and fails, they can consult the log and get a better understanding of what's happening.
Quite often you'll see that people don't make comments during the check out phase; this is okay.
But now what? There's no change to make to the source code because it doesn't use the evil compiler feature, but you want to record your findings. So you check out the file and immediately check it back in. "Why would this be a viable course of action?" you ask.
You can explicitly request to check in a file with no changes to force a comment entry.
Now this looks really odd since the logs will record no lines added, deleted, or changed. You would be adding a change explanation note for no change. A use for this is to let you add a note in the version's history.
Such no-change notes are very rare and should be used to call special attention to themselves, so be careful in answering prompts when doing a check-in with no changes. There's nothing worse than running through pages and pages of logs that tell you that nothing has happened for no reason.
If you do see such a log entry, note that the developer that forced a check in will have their name associated with that entry. You can then track them down and smack 'em silly, if they didn't leave an important comment.
Earlier revisions are obtained by specifying either an internal revision number or a date and time (for which version control will find the correct version for you). It is also possible to specify revisions in a relative form. For instance, getting revision -3 would retrieve the third most recent version.
As long as the history of the file is intact, you can go back to the point when the archive was first created! If you go back any further, the version control system won't retrieve the file because it didn't exist then.
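The three ways of naming an earlier revision can be sketched as a single lookup in Python. The check-in times below are invented, and the sketch assumes that -1 means the most recent revision (consistent with -3 being the third most recent) and that a date selects the latest revision checked in at or before that moment.

```python
import bisect
from datetime import datetime

# Hypothetical check-in times for revisions 0, 1, 2, 3 of one archive, oldest first.
checkin_times = [
    datetime(2002, 1, 5, 9, 0),
    datetime(2002, 1, 9, 14, 30),
    datetime(2002, 2, 1, 11, 15),
    datetime(2002, 2, 20, 16, 45),
]


def resolve(spec):
    """Map an absolute number, a negative relative number, or a datetime to a revision index."""
    if isinstance(spec, datetime):
        # Latest revision checked in at or before the requested moment.
        index = bisect.bisect_right(checkin_times, spec) - 1
        if index < 0:
            raise LookupError("the archive did not exist yet")
        return index
    if spec < 0:
        # -1 is the most recent revision, -3 the third most recent.
        index = len(checkin_times) + spec
        if index < 0:
            raise LookupError("the archive has no revision that far back")
        return index
    if spec >= len(checkin_times):
        raise LookupError("no such revision")
    return spec


print(resolve(2))
print(resolve(-3))
print(resolve(datetime(2002, 1, 20)))
```

Asking for a date before the first check-in raises an error, mirroring the point that the file cannot be retrieved from a time before the archive existed.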
The resulting copy will have the same date and time stamp as when the file was checked in. This behavior can be overridden to get the check-out time, but doing so isn't very useful.
There are three comparison features that are of importance to the version control user: comparing two different files on the disk, comparing a file on disk with a particular version in its archive, and comparing two versions in the same archive.
One other special case is comparing a version inside one archive with a version inside another.
Version control can do that, and it doesn't have to be just for binary files. The result would be that each stored version would have its own copy of the whole file. Clearly this eats disk space fast.
Version control has keywords that expand when a revision is pulled from version control. These work similarly to precompiler macros, and are usually just a keyword with dollar signs around it. Various keywords expand to different things: the filename, the archive name, the revision number, the check in time, the check out time, the person who made the lock, the person who got the file, the person who created the archive, the history log, and so on.
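Keyword expansion can be approximated with a regular-expression substitution. In the Python sketch below, the keyword names `Revision` and `Author` and the expanded form `$Keyword: value $` are assumptions chosen for illustration; each tool defines its own keyword set and output format.

```python
import re


def expand_keywords(text, values):
    """Replace $Keyword$ markers with $Keyword: value $ for known keywords."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            return match.group(0)  # leave unknown keywords untouched
        return f"${name}: {values[name]} $"

    # Also match an already expanded marker so repeated expansion stays stable.
    return re.sub(r"\$(\w+)(?::[^$\n]*)?\$", substitute, text)


source = "/* $Revision$ last touched by $Author$, see $Unknown$ */"
values = {"Revision": "1.4", "Author": "fred"}

expanded = expand_keywords(source, values)
print(expanded)
print(expand_keywords(expanded, values) == expanded)
```

Because the pattern also recognizes an expanded marker, checking the file out again refreshes the values instead of nesting them.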
But what if the file shouldn't expand keyword strings? It would be really bad if a binary file changed its content unexpectedly. The keyword expansion feature can be turned on or off on a per archive basis.
This option removes the changes and even the archive history, leaving no record that any other content was ever part of the archive.
It is possible to delete a version off the tip of the archive, thus "backing out" changes (also called a "roll back"). This is useful if you've checked in something and decide you don't want it in there. Note you can't delete the version if someone else has it checked out.
Compression is nothing more than a size / speed trade-off. With today's disk caches, the speed degradation is hardly perceptible. Common administrative practices are to leave compression off until disk space is needed, then to turn on compression (and immediately order hard disks with larger capacity).
Some version control systems can be made to keep only the last 'n' versions around.
By explicitly deleting a range of historical versions, the archive file can be made smaller. Yes, you lose the comments associated with the revisions along with the deltas. The delta information for the remaining versions is updated.
The version numbers never renumber or reorder. Ever. They just disappear. This method can be used to squish out versions you no longer care about, leaving only the important versions.
Because this operation results in the loss of information, it is
In its simplest form,12 a project is quite similar to a directory. It contains archive files and subprojects. Each project has associated with it an archive directory, which is usually shuffled away on the shared disk with permissions so mere mortals can't get to it directly, and a working directory which specifies where revisions that are checked out should be put by default. The user can override the working directory as needed. The directory the revision is actually checked out into is called the checked out directory.
Unless the user has reason to sprinkle checked out files all over, the checked out directory is the same as the working directory.
It is also possible to override this behavior and force version control to pull brand new copies.
The version control system keeps track of organizational changes so you can tell when this happened. [See AUDITING]
If a file (or project) is supposed to be removed permanently, it is deleted then purged. Once the purge operation has happened, the file is gone for good. Your only hope for recovery is from a backup.
Permissions can limit (or allow) a person's ability to do certain version control tasks. For example, no one but the administrator can purge deleted revisions.
Permissions can be granted on a per user, per project, or per archive basis by the administrator.
Since a project may contain other projects, it is possible for the version control system to contain multiple products. And this just happens to be the case when your phone rings...
You don't want to give them access to the master sources, so you
clone it for them into a project of their choosing. This
causes them to get an exact duplicate of your project's content and
its history. In the event that they decide to do a roll back to
creation, you won't lose a thing.
In this case, sharing is the solution. It works like a symbolic link. The projects point to the same physical file. Consequently, changes are immediately reflected in both projects at the time of a check-in.
This technique saves considerable disk space among projects that share
After an exchange of jokes about politics,14 you snap a mock salute and dial Fred, saving yourself a 15-foot trek to the next office. You explain that the encryption project is going to branch. One branch will have the current code; the other will have the revised code. Both will continue to be maintained from now on.
Since each started from a common history, you'd like to share that without wasting disk space. Future revisions are only unique to the corresponding project.
This kind of branching is great for when a product splits and becomes multiple products.
This illustration shows branched versions 188.8.131.52 and 184.108.40.206.
[See Version Numbers]
You need to go back and make a change to a prior revision in one file. So you want to CHECK OUT a previous release for modification.
Doing so creates a branch. You can now get and put releases to this branch until the bug is fixed. That particular release has its own change history. Branching, in theory, can nest indefinitely.
This kind of branching is ideal for patches.
When the trunk of a revision tree contains the same change as a branch, the branch may become obsolete. Branches may be merged (also called grafted) back into the revision tree.
An entry is recorded in the archive's history noting which trunk revision now has the update.
If he clones it, then when they want the revised version, code changes would have to be duplicated15 between projects.
If instead the project was deleted and recloned, they'd lose their project history.
If he shares the code, they'll be able to tweak his code (and Fred doesn't like that idea at all). And, branching certainly doesn't seem applicable.
What Fred wants is pinning. The sharing project points at a particular revision for each archive. Like a thumbtack, this points at the highest revision desired. So when they grab "the latest and greatest" they really get "their latest and greatest." Fred can keep developing, and when they trust some future version, the other group just moves the pin higher.
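Pinning can be pictured as a per-project cap on which revision counts as "latest." The Python below is a minimal sketch; `SharedArchive`, `ProjectView`, and the `pin` attribute are hypothetical names, not a real product's interface.

```python
class SharedArchive:
    """An archive shared between projects; revisions are numbered from zero."""

    def __init__(self):
        self.revisions = []

    def check_in(self, contents):
        self.revisions.append(contents)
        return len(self.revisions) - 1  # the new revision number


class ProjectView:
    """One project's view of a shared archive, optionally pinned (hypothetical API)."""

    def __init__(self, archive, pin=None):
        self.archive = archive
        self.pin = pin  # highest revision this project will accept, or None

    def latest(self):
        if self.pin is None:
            return self.archive.revisions[-1]
        return self.archive.revisions[self.pin]


archive = SharedArchive()
trusted = archive.check_in("encryption v1")

freds_project = ProjectView(archive)             # Fred always sees the tip
their_project = ProjectView(archive, pin=trusted)

newer = archive.check_in("encryption v2, still experimental")
print(freds_project.latest())
print(their_project.latest())

their_project.pin = newer  # they now trust the newer revision
print(their_project.latest())
```

Nothing is copied and nothing is branched; the other group simply moves the pin when they are ready.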
Can you explain the version number scheme to me? -Thanks
You compose a response and send it to everyone with a version control account:
Hi! I've been asked if I can explain the numbering system that the
version control system is using. Here goes:
Each revision in an archive file has a revision number associated with it. The first time a file is checked in, and the archive file is automatically made, the revision starts at zero.
Each version adds one to that number. The revision number is unique to that archive file only. So grabbing revision 17 from all archives will *not* give you the same point in time. Some files might not even have a revision 17 yet.
As our product goes through releases to the public, our version control system can track that as well. You'll note in our case all files start with "1." - so you get 1.0, 1.1, 1.2, and so on. This is not a decimal point, but a period used as a separator. After 1.9 comes 1.10, then 1.11.
The last number is the only important one.
If we wanted to, we could declare that the code is mature and increment the major number. Revision numbers would then start with "2." A file might have in its history 1.16, 1.17, 2.0, 2.1, and so on.
Anytime we branch off of a version, that version (known by the filename and unique revision number) gets its own set of revision numbers. Follow that? If we branched at 1.4, then the new branch would be 220.127.116.11, 18.104.22.168, 22.214.171.124, etc. Yes, you can increment the branch's major number if you want to: 126.96.36.199 And you can branch off of a branch: 188.8.131.52.1.0 ...and so on, though anyone doing something this wild should check with me first.
If you think about it, a project is really nothing more than a mechanism for masquerading the leading digits before the final version number for any given archive. (Not all archives in a project will have the same masqueraded number sequence.)
This lets us pretend that files have just one revision number (the x) when they're in a project. It's up to us whether we expose what's really happening or not. So revision 3 in "Some Project" (see above) is really version 1.3.
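The point in the message that the period is a separator rather than a decimal point (so 1.10 follows 1.9) is easy to trip over when sorting. A short Python sketch, with `parse_revision` as a made-up helper name, compares the two orderings:

```python
def parse_revision(text):
    """Split '1.10' into (1, 10) so each field compares as a whole number."""
    return tuple(int(part) for part in text.split("."))


history = ["1.10", "1.9", "2.0", "1.11", "1.2"]

print(sorted(history))                      # plain string sort
print(sorted(history, key=parse_revision))  # numeric, field by field
```

The string sort places 1.10 before 1.2; comparing field by field gives the order in which the revisions were actually created. The same helper handles longer branch numbers, since tuples of any length compare element by element.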
However, with labeling, this isn't a problem. Any revision in an archive can be given a label name. This is any name you make up. For example, "Alpha Release", "Code relating to bug 122C", "First cut to QA", "Release 42", "Beta 4", "Code Drop 2", etc.
A label name can only appear once in an archive; this doesn't pose a problem since a revision may have any number of labels. However, verging on extra-useful, the same label name can exist across multiple archives. This lets you tell version control to GET revisions in all archives that have a specified label in them; the corresponding revisions will be checked out. [See GET] For instance, a GET 'Beta 1 Release' from all your archives might grab revision 1.3 from file abc.c, revision 1.0 from file def.c, revision 2.64 from file main.c, and so on thereby reconstructing your "Beta 1 Release" code.
If you move a label to another revision, then the same GET command that references that label will grab the new revision specified!
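A label lookup across archives amounts to a small table per archive. The Python sketch below reuses the file names and revisions from the example in the text; `new.c`, the "Alpha Release" entry, and the `get_by_label` function are additions made up for illustration.

```python
# Each archive maps label names to one of its own revision numbers.
labels = {
    "abc.c": {"Beta 1 Release": "1.3"},
    "def.c": {"Beta 1 Release": "1.0"},
    "main.c": {"Beta 1 Release": "2.64", "Alpha Release": "2.10"},
    "new.c": {},  # hypothetical file added after Beta 1, so it has no such label
}


def get_by_label(label):
    """Return {archive: revision} for every archive carrying the label."""
    return {name: marks[label] for name, marks in labels.items() if label in marks}


print(get_by_label("Beta 1 Release"))

# Moving the label changes what the very same request retrieves.
labels["abc.c"]["Beta 1 Release"] = "1.4"
print(get_by_label("Beta 1 Release"))
```

Archives that lack the label are simply left out, which is how a file that did not exist at release time stays out of the reconstructed release.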
Labels let you logically mark associations between files and different points in time. By labeling all required archives at the appropriate revisions before doing a build, it is possible to start fresh, pull the applicable files and corresponding revisions (usually the most recent) from version control for a project.
Snapshots are not limited to just a single moment in time; it is possible to make snapshots that include revisions from different moments in time. Commonly this technique is used to say "This is what we put on tape 7" or "this is what went into internal build number 42." But it can be used to construct complex sets of archive revisions: "this is all the new code, with the old encryption module, plus customer X's patches."
Nor are snapshots limited to the whole product across all the archives; a snapshot can contain a subset. This is used when products are made up of several projects that are each released on their own schedules.
Imagine having predefined labels that impose a fixed sequence order. This prevents code from being developed and shipped without having passed through intermediate levels of confidence.
Instead of getting revisions based on some arbitrarily assigned name, you can get revisions based on confidence. The build team can get everything developers are ready to have tested by QA, or they might get everything QA has blessed for deployment, and so on.
If a label had an associated history, you could revert a moved label's association back to its prior revision. This is exactly what states are doing. States work like intelligent labels.17 Effectively they can remember where they once were, so code that is promoted to a level of confidence can be demoted back to the same revision if a defect is later found.
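The idea of a label that remembers where it used to point can be sketched as a list per state. This Python model is one possible reading of the paragraph above, not a description of any product; the `StateTracker` class and the state name "QA" are hypothetical.

```python
class StateTracker:
    """Remembers every revision a state has pointed at (hypothetical design)."""

    def __init__(self):
        self.history = {}  # state name -> list of revisions, oldest first

    def promote(self, state, revision):
        """Point the state at a newer revision, keeping the old one on record."""
        self.history.setdefault(state, []).append(revision)

    def current(self, state):
        return self.history[state][-1]

    def demote(self, state):
        """Drop the newest revision from a state, restoring the prior one."""
        revisions = self.history[state]
        if len(revisions) < 2:
            raise LookupError(f"{state} has no earlier revision to fall back to")
        revisions.pop()
        return revisions[-1]


states = StateTracker()
states.promote("QA", "1.3")
states.promote("QA", "1.5")
print(states.current("QA"))

# A defect turns up in 1.5, so QA falls back to exactly where it was.
print(states.demote("QA"))
```

A plain label that had been moved to 1.5 would have no memory of 1.3; the recorded history is what makes the rollback exact.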
Multiple life cycles may exist in parallel. You can define arbitrary state names and their relationships. Not only code, but documentation must also be shipped. Its lifecycle may go from editing, to ready to proof, to deployment.
What makes this interesting is that only files at the lowest-level states can be CHECKED OUT; other retrievals must use GET. This prevents a developer from sneaking in and changing production code.
A file must be demoted in its confidence to the very lowest level when changes are introduced.18
Generally, when a system allows this kind of behavior it is violating the purpose of states by using them as privileges.
Development, testing, and deployment continue, but another project pins to an older shipped revision that has a historical label.
States differ from pinning in that states change as the code confidence does, completely independent of label names.
Getting a revision based on a level of confidence may not yield the same revisions today as it did yesterday.
Technically, labels may also be moved, but typically that's infrequent.
Pinning indicates that a project wants to "borrow" a specific historic revision of code from another project.
Quite often, the revisions that are pinned-to have already passed
through some kind of delivery state and tend to also contain labels
indicating official code drops to a build team.
Version control allows auditing, also called journaling, of files, projects, people, and commands.19
Just like an archive contains a history of changes, the version control system retains a history of events. You can see what was done, and by whom, to any file or project. You can look at the things someone did and what they did it on. You can look at how various commands were used, on what, and by whom.
In the event something gets lost, damaged, or you are trying to establish how it evolved to present day, it is possible to go back and get a very granular level of understanding, which might help to reconstruct events.
Naturally, the auditing can be disabled. Small projects don't tend to worry much about monitoring; large ones depend on it.
The policies about to be discussed are manual disciplines that cannot be enforced by a tool. Either team members agree to follow them, or they will run into lots of development issues they weren't counting on.
You call the team to the conference room and point out that people have been saving files that are generated by the compile process. Icons, resource files, bitmaps, sound clips, and so on are okay. But if a compiler generates a file, then there's no reason to save the output in version control; you can always generate it again.
Someone in the front raises an excellent point, "but what if we change compilers? We may not get the same binaries."
"Correct," you point out, "but capturing the binaries themselves does not record information about the build process, does it? As a developer, your responsibility is not recreating master copies of prior builds. In fact, binaries you build are not even delivered to the customer; there is a formal process for generating The Golden Master copy. Consequently, those in charge of release management capture not only a snapshot of the source used and the binaries it produced, but the instructions, environment, the compilation messages, regression test results, and software needed to produce the distribution." (Release management, which is part of the Software Configuration Management process, is beyond the scope of this discussion, other than to recognize it exists.) "Any time real prior build contents are needed, they go back to the historical vault.
"Now if we want to, we can go to version control and pull a copy of the source we gave to them. Ideally the source base should match; this gives us at least one verification test. We'll either build fixes with our latest tools, or if absolutely necessary, use their records to recreate the old environment."
There are a few smiles at knowing there's another phase out there complementing version control. The object files are removed from the projects, and after a couple of weeks where no one misses them, they are purged for good. In the meanwhile you let infrastructure know the additional disk space isn't needed just yet.
You need the file to make an important change. Well, he's going to get a surprise when he returns. You instruct version control to remove his lock. [See Breaking Locks] That solves your problem. But this guy now has a file with read/write attributes on his disk. If he attempts to do a check in, it will fail. If he attempts to do a check out, it will tell him it will overwrite his file. That's enough to clue him into doing a comparison20 between what's on his disk and what's in version control.
You make your changes, and two weeks later he returns. You instruct him to check out the file to a temporary location, do a comparison and insert his changes.21
Lesson: Try your best not to leave things in a locked state if you're going to be unable to check them back in or release your lock yourself.22
Fred sticks his head in and says he wants to have a meeting. How odd. You get up, stop by the coke machine, and head to the conference room.
Fred starts to explain that he's noticed people are checking out whole projects at a time. It is making things difficult for more than one programmer to work in the same project.
You interject, "the solution is simple, only check out what you need when you need it."
People explain that they are doing bulk checkouts for convenience, but once others chime in about how they can't get to the files they need, everyone agrees to stop the madness. You show that it is possible to maintain a list of files in a simple text file, and have version control operate on the contents of that file. It is also possible to use the drag'n'drop features to pull files as well.
Lesson: Only CHECK OUT what you need to change, GET everything else.
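The file-list approach can be sketched in a few lines of Python. This is only an illustration under assumptions: the list format (one name per line, with # comments) and the file names are made up, and the `ss checkout` command is simply the form shown in the command listing later in this guide. The script builds the command lines and prints them; it does not run anything.

```python
def read_file_list(text):
    """Return the file names in a list file, skipping blank lines and # comments."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.append(line)
    return names


def build_checkout_commands(names, tool=("ss", "checkout")):
    """Build one check-out command (as an argument list) per file name."""
    return [list(tool) + [name] for name in names]


if __name__ == "__main__":
    # Hypothetical list file: only the files needed for one change.
    sample = """\
# files needed for the parser fix
parser.c
parser.h

lexer.c
"""
    for command in build_checkout_commands(read_file_list(sample)):
        print(" ".join(command))
```

Keeping the list short is the point: everything not named in it is fetched with a plain GET and stays unlocked for the rest of the team.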
It would appear the programmer didn't make any comment. It's empty!
You decide to visit this developer personally.
"I was trying to save time. I knew what the change was for, but I didn't feel like typing it all in."
It's times like these that there's a reason you're the team lead: "Listen. You may know why you made those changes, but will you remember in six months? Maybe. What if aliens abducted you and sucked your brain out?23 No one else on the rest of the team knows what was running through your mind; your job is to communicate that. There's a reason you can enter long comments. Use it, or find another team, because your little time saver is causing the rest of us to waste ours unnecessarily."
Naturally you think all this instead of saying it. But you explain the problem and all the reasoning behind it in kind, gentle words that demonstrate discretion is the better part of valor.
Lesson: Provide usable comments for all changes made to a file, no matter how simple the change.
Your team agrees to adopt a convention: any comment pertaining to a bug fix is prefixed with the word 'BUG' followed by the defect number and status level. An example would be: "BUG 302A: Added a check for the NULL pointer."
Lesson: If you keep useful information in your comments, you don't get tied to particular vendor implementations of Software Configuration Management tools... you can use the one(s) that best meet your needs.
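Because the convention is plain text, any small script can pull the defect information back out of the comments, whatever tool stored them. The sketch below assumes the trailing letter in "302A" is the status level and that the comment always starts with the word BUG; both are readings of the single example above, not a documented format.

```python
import re

# Assumed format: "BUG <number><optional status letter>: <free text>"
BUG_COMMENT = re.compile(r"^BUG\s+(\d+)([A-Z]?):\s*(.*)$")


def parse_bug_comment(comment):
    """Return (defect_number, status, text), or None if this is not a bug-fix comment."""
    match = BUG_COMMENT.match(comment.strip())
    if match is None:
        return None
    number, status, text = match.groups()
    return int(number), status, text


def comments_for_defect(comments, defect_number):
    """Return the check-in comments that mention the given defect number."""
    found = []
    for comment in comments:
        parsed = parse_bug_comment(comment)
        if parsed is not None and parsed[0] == defect_number:
            found.append(comment)
    return found


if __name__ == "__main__":
    history = [
        "BUG 302A: Added a check for the NULL pointer.",
        "Tidied up whitespace.",
        "BUG 17B: Corrected the loop bound.",
    ]
    print(comments_for_defect(history, 302))
```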
However, in the split second between check in and subsequent check out, someone else grabs the file!
There's no need to call a meeting for this one. The version control system allows a version to be checked in and immediately checked back out in one command, without losing the lock. You can proceed happily in the future.
Lesson: Don't check anything into version control that you haven't successfully compiled.24
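A developer can make this lesson a habit with a tiny wrapper that refuses to go further unless the build succeeds. The sketch below is an assumption-laden illustration, not a feature of any version control tool: the build command is whatever your project uses, and here a trivial Python one-liner stands in for the compiler.

```python
import subprocess
import sys


def builds_cleanly(build_command):
    """Run the build command and report whether it exited with status 0."""
    completed = subprocess.run(build_command, capture_output=True, text=True)
    return completed.returncode == 0


if __name__ == "__main__":
    # Stand-in for a real build command such as a make or compiler invocation.
    pretend_build = [sys.executable, "-c", "pass"]
    if builds_cleanly(pretend_build):
        print("Build succeeded; safe to check in.")
    else:
        print("Build failed; fix it before checking in.")
```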
A common threat is that if any developer deliberately violates policy (version control or coding), there will be no complaining allowed when a higher authority makes the offending version disappear.
So you call a meeting, set people straight, and nuke the evil code from the archive. There's a little bit of grumbling in the back of the room by one programmer, but it soon becomes clear this is for the good of all and everyone agrees to it.
Fred returns to his office enjoying that last bit of cruelty a little more than he should have. Meanwhile the quality of code in the version control system is now reliable enough you can always do builds.
You call another meeting immediately. "Someone altered a file I had checked out!" you glare.
Version control orchestrates mutual exclusion. A user deliberately overriding file ownerships, or a file system that does not support them, opens a gateway of potential trouble.
See, in the name of saving disk space, everyone has been working in a common area. It would appear that while you had a file with read/write permissions, someone accidentally loaded it into their editor using wild cards.25
It was an honest mistake, and everyone promises to be more careful.
Lesson: Only edit the files you explicitly checked out.
With the output still on the screen, you run the program again. This time it bombs.
"Better keep at it," your boss chuckles and heads for the door.
"No! Wait, this isn't my fault. It was working a second ago!" But, too late, your boss is gone.
A closer look reveals that while you were getting your boss, Fred was updating his module. He compiled and made a broken executable.
Now you see the error of your ways; time for another meeting.
You gather everyone together and stand at the front of the room with a whiteboard marker in hand. You begin, "we're all working in one common area. For the most part that's fine. But as I'm working on my module, I want, no, I expect the rest of the system to stay stable. When it was just Fred and I, we could coordinate this kind of thing. But there are now seven of us."
You draw on the whiteboard. "What we want is the ability to suck out of version control the system in a known condition. That's now possible since everyone is verifying their code before they put it in version control. By checking the code out to our local directories, we can effectively play in our own sandbox where nothing else will change.
"Once we know our modifications work, we check things back in. This also solves the problem of people 'accidentally' editing someone else's files they have checked out."
Should developers need to work together, they can make a temporary shared sandbox directory among themselves and coordinate efforts manually. This is different from a common area where all developers work because the developers who set it up know what is changing and when.
No longer do the side effects or activities of another developer impact a developer.
"No," you reply, "you can grab just the files you need, use a batch file or something."
"But I'd really just like to point at the top of the project and have it recursively get only the things I need."
As it turns out, you're in a great position to look good. Version control does have a feature called cloaking. It allows a developer to say 'start at any point in a project tree and work all the way down, but skip the projects I'm not interested in unless I explicitly ask for them.'
Your solution is met with nods and approvals.
Lesson: Cloaking can really shorten the time to grab stuff from version control -and- save disk space.
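The idea behind cloaking can be modeled with a short recursive walk. This is a toy model under assumptions, not how any particular tool implements it: the project tree is a nested dictionary, the file and project names are invented, and a cloaked project is just a path in a set.

```python
def files_to_get(tree, cloaked, path=""):
    """Recursively list files under `tree`, skipping cloaked subprojects.

    `tree` maps a name to None (a file) or to a nested dict (a subproject).
    `cloaked` is a set of project paths such as "docs" or "lib/old".
    """
    result = []
    for name in sorted(tree):
        full = f"{path}/{name}" if path else name
        child = tree[name]
        if child is None:
            result.append(full)          # a plain file: always fetched
        elif full not in cloaked:
            result.extend(files_to_get(child, cloaked, full))
        # cloaked subprojects are skipped during the recursive walk
    return result


if __name__ == "__main__":
    project = {
        "main.c": None,
        "docs": {"guide.txt": None},
        "lib": {"util.c": None, "old": {"legacy.c": None}},
    }
    hidden = {"docs", "lib/old"}
    print(files_to_get(project, hidden))            # recursive get from the top
    print(files_to_get(project["docs"], hidden, "docs"))  # explicit request
```

Starting the walk at a cloaked project, as the last line does, corresponds to asking for it explicitly: the cloak only applies when the project is reached by recursion from above.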
This sounds like a good idea, but what happens when people are working and that code base changes out from under them? What happens when someone makes a change and then you have to wait for the update process to roll around? You point out these issues, and what you really find out is that people just don't want to wait the two minutes it now takes to check everything out.
Today's your lucky day. Version control also has shadow directories.26 These are directories that always reflect the latest revision of the archive files. Anytime someone checks something in, or deletes a tip, the directory is automatically updated then and there — immediately.
You explain that the shadow directory is not the archive directory. It reflects the tips of all the archives.
You explain they are never to manually put stuff in there. If additional files are stuffed in this directory, it no longer reflects the archive, does it?
You explain that the shadow directory is never to be used to check stuff into. This is not a public workspace.
You explain that never is anyone to copy files from the shadow directory with intentions of modification; they are to always use check-out. This does not replace the check-out step just because recent files show up here.
You explain that no one is to ever modify the files that are in there. The files should reflect what's in the archives.
You explain the files will always be read-only. This way files can't get accidentally stomped on.
...everyone agrees and crosses their heart, swearing to stick needles in their eyes, and so on. So you enable shadow directories, and it becomes a little easier for developers to do personal builds. The date/time stamps on the files also happen to indicate when the files were last checked in.
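What the tool does for a shadow directory on each check-in can be approximated as "copy the new tip in and mark it read-only." The sketch below is a simplified model using a temporary directory; the function name and the idea of calling it from a check-in step are assumptions for illustration only.

```python
import os
import stat
import tempfile


def update_shadow(shadow_dir, name, tip_contents):
    """Refresh one file in the shadow directory with the latest tip, read-only."""
    target = os.path.join(shadow_dir, name)
    if os.path.exists(target):
        # Lift the read-only flag so the previous tip can be replaced.
        os.chmod(target, stat.S_IREAD | stat.S_IWRITE)
    with open(target, "w") as handle:
        handle.write(tip_contents)
    # Read-only again, so the file cannot be stomped on by accident.
    os.chmod(target, stat.S_IREAD)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as shadow:
        update_shadow(shadow, "file.cpp", "// revision 1.1\n")
        path = update_shadow(shadow, "file.cpp", "// revision 1.2\n")
        with open(path) as handle:
            print(handle.read(), end="")
        print("writable:", bool(os.stat(path).st_mode & stat.S_IWRITE))
```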
This technique should never, ever, ever be used because someone else is using the revision. It will prevent them from checking in their changes.
Either the administrator can allow concurrency, which will create multiple locks on the revision and then you have to go through the merge exercise, or you can simply contact the developer who has the lock and see how much longer it will be.
Here's how trouble brews and then is resolved.
Wanting to have the most shining performance review in the company, you decide to come in on the weekend. You start to do some work and realize you need one of the files that Fred has checked out.
Fred left his machine on, and you walk over wanting to do a comparison of what's in the file to what's in the archive. Unfortunately, Fred disabled his keyboard with the lock on the front of his computer (and you can't find the key). You're in luck, there's enough information on the screen to deduce that Fred hasn't touched the file.
Good enough, removing his lock won't cause him to lose a thing. [See Breaking Locks]
The problem is twofold. First, Fred didn't unlock his files before leaving. Second, you didn't let Fred know you took this course of action.
On Monday, Fred comes in and makes some changes to the file that he last remembered checking out. It has read-write access, so nothing clues Fred in that something's amiss.
Then Fred goes to check in the file and version control states that he can't because he no longer has a lock. Irked, Fred checks the archive log and sees you made the most recent change. Then Fred checks the audit logs and sees you removed his lock over the weekend.
Guess who just walked into your office...
Fred's response is less cheerful, because he knows he's got to redo everything he just did in your latest revision.
You apologize, buy Fred a Coke, and think over the problem. You both head back over to Fred's office.
"First we don't want anyone else making changes, so we'll put a lock on the file," you begin. "But that will pull a new copy over mine, losing my changes, won't it?" asks Fred.
"Ah," you smirk, "I said put a lock on the file, not do a check-out." Fred makes an interesting expression and watches over the keyboard to see how you do this. [See Creating a Lock]
"Next, we want a copy of the latest revision." Again Fred interrupts, "ok now that will step on my file, can we make a backup copy first?"27 "Nah, we'll be okay," you state, just trying to make Fred more nervous. You really should have made a backup before even starting, but it's more fun to watch him squirm.
"What we'll do is tell version control to get a copy, but we'll override the filename." "Override the filename?" asks Fred. "Yes, we'll have it extract the revision into XYZ.TMP." "You can do that?" "Yup," and your fingers tickle the keys. [See Overriding the Filename During Get]
"Now we just compare and merge changes between your file and our XYZ.TMP file." Within minutes you compare Fred's file to version control, then your file to version control, and once you know what pieces changed and where, you and Fred happily integrate code (just like the good old days), test it, and decide you're done.
"That was easier than I thought," states Fred, tossing the empty Coke can into the trash.28 You remind him to recycle, and point out all Fred has to do is check in his file with the merged changes. Because Fred holds the lock, he can do this.
You close the door. "Yes. Yes you can. You can do that by overriding the checked out directory or even just specifying the file. This is, to version control, exactly the same as if you just checked out a file and copied the contents of another file over it and checked it back in."
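The compare-then-merge step you and Fred just did by hand follows a simple pattern: compare each edited copy against the revision both started from, then see whether the two sets of changes touch the same lines. Here is a minimal sketch using Python's standard difflib module; the file contents are invented, and a real merge tool does considerably more than this.

```python
import difflib


def changed_regions(base, edited):
    """Return (start, end) line ranges of `base` that `edited` changed.

    A pure insertion has no width in the base, so it is widened to cover
    the line it precedes; that keeps the overlap test below simple.
    """
    matcher = difflib.SequenceMatcher(None, base, edited, autojunk=False)
    regions = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag != "equal":
            regions.append((i1, max(i2, i1 + 1)))
    return regions


def edits_conflict(base, mine, theirs):
    """True if the two edits touch overlapping regions of the base revision."""
    for a1, a2 in changed_regions(base, mine):
        for b1, b2 in changed_regions(base, theirs):
            if a1 < b2 and b1 < a2:
                return True
    return False


if __name__ == "__main__":
    base = ["int main() {", "  setup();", "  run();", "}"]
    fred = ["int main() {", "  setup(1);", "  run();", "}"]
    yours = ["int main() {", "  setup();", "  run_fast();", "}"]
    print(changed_regions(base, fred), changed_regions(base, yours))
    print("conflict:", edits_conflict(base, fred, yours))
```

When the regions do not overlap, integrating the two sets of changes is mechanical; when they do, the two developers need to sit down together, which is exactly what happened here.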
"But that could very well lose other people's changes!"
"Right. That is why I don't encourage it. People would work locally, check out a file, and immediately check the stuff they worked on locally back in. If someone had made changes since the time they first grabbed that file, a lot of work could be lost." You stand up and open the door. "In the professional field, this is what we call bad." Fred considers this and shudders. "But why have the ability to override directories if it can be abused?"
"It's not as bad as you might think. First, suppose you got something out of version control and then later it appeared that what you had didn't match. You could check out what's in version control to another directory and compare the two.29 Secondly, there's the case of what happened today; we would have to shuffle a lot more files than we did, and shuffling files makes the process more error prone. And finally, how did you know to come talk to me?"
"Easy, I checked the archive history and the audit files when things weren't behaving as I expected."
"Right. Version control keeps track of those activities, so if someone does abuse policy we can find out who, when, and what they did and then take actions to correct the process. So it isn't all bad."
Fred smiles and you leave.
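The whole weekend episode can be replayed with a toy model of locks and an audit trail. The class below is a hypothetical illustration, not the data structure of any real tool: it tracks only who holds each lock and records every action, which is enough to show why Fred's check-in failed and how he traced it back to you.

```python
class LockError(Exception):
    """Raised when a check-out or check-in violates the lock rules."""


class Archive:
    def __init__(self):
        self.locks = {}   # file name -> user holding the lock
        self.audit = []   # (user, action, file name), in the order they happened

    def check_out(self, user, name):
        holder = self.locks.get(name)
        if holder is not None:
            raise LockError(f"{name} is locked by {holder}")
        self.locks[name] = user
        self.audit.append((user, "check out", name))

    def check_in(self, user, name):
        if self.locks.get(name) != user:
            raise LockError(f"{user} does not hold the lock on {name}")
        del self.locks[name]
        self.audit.append((user, "check in", name))

    def break_lock(self, user, name):
        # Breaking a lock always succeeds, but it is always recorded.
        holder = self.locks.pop(name, None)
        self.audit.append((user, f"broke lock held by {holder}", name))


if __name__ == "__main__":
    archive = Archive()
    archive.check_out("Fred", "xyz.c")       # Friday
    archive.break_lock("You", "xyz.c")       # the weekend
    archive.check_out("You", "xyz.c")
    archive.check_in("You", "xyz.c")
    try:
        archive.check_in("Fred", "xyz.c")    # Monday
    except LockError as error:
        print("Fred's check in fails:", error)
    for entry in archive.audit:
        print(entry)
```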
|ss get filename(s)|
|co -l filename(s)|
|get -L filename(s)|
|ss checkout filename(s)|
|ss checkin filename(s)|
|rcs -u filename(s)|
|vcs -u filename(s)|
|ss undocheckout filename(s)|
|co filename(s) (change the file attribute manually)|
|get -W filename(s)|
|ss get filename(s) -W|
|co -r revision filename(s)|
|get -R revision filename(s)|
|ss get filename(s) -v revision|
|co -r label filename(s)|
|get -V label filename(s)|
|ss get filename(s) -vl label|
|co -L filename(s)|
|get -L filename(s)|
|ss checkout filename(s) -G|
|co -L filename(s)|
|vcs -L filename(s)|
|ss checkout filename(s)|
|co -p filename > destination|
|ss checkout filename destination|
|ci -L filename(s)|
|vlog -L filename(s)|
|ss status filename(s)|
|rcs -u filename(s)|
|vcs -U filename(s)|
|ss status filename(s)|
1 Never depend on a tool to enforce a process or discipline. Users can always subvert tools or avoid using them completely. Tools that try to enforce process typically fail for this reason. The best process control measure is education of why a process is needed (not how to use a tool). The benefits of applying process, the consequences for not, along with the promise of adaptation as needed inspire adherence. When a tool forces a person to alter the way he works, it often introduces inefficiency. Typically process-oriented tools aren't capable of adequately handling changes of process in mid-stream.
* No programmers were hurt during the writing of this paper.
Version control's purpose is not to keep developers honest; it exists to prevent them from stepping all over each other. If you're about to ask, "but what if Fred needs to" we'll get to that.
2 A good example is the editor SlickEdit. The Unix version costs hundreds to thousands, solely because of multi-user licensing issues. The Windows version is quite affordable and far more feature enriched. By using samba it is possible to mount Unix drives on NT; allowing use of the better (and cheaper) version of the editor on the desired file. This holds true in general for code checkers, pretty-printers, editors, compilers, version control tools, etc.
3 On a major project that lacked formal bug tracking at the start, the author managed to capture defects in a formatted text file, writing small AWK scripts to extract information as needed. This process worked for months successfully, until the data and process needed to be shared real-time, at which time the company knew exactly what tool to purchase because they understood what data and process they needed. The transition was a smooth and rapid one.
4 Like the dining philosophers problem, if you can't check out all the files you need, then consult the version control system about who has them checked out (and for how long), and make a trip down to that person's office and beat them up.* Or, uncheckout the files you did get, and go do something else for a while.
6 If the whole purpose was to get a read-write file without a lock, you can either do a regular get and change the permissions yourself, or there is a parameter to the get command to do this for you automatically. [See Get Writable Copy]
7 Yes, I know. If you could go back in time, you could see what the code looked like before. However, in the real world you can't go back in time, so let's just recognize this as the cute literary device it is and get back to the discussion at hand.
8 Version numbers, also called revision numbers, have nothing whatsoever to do with the product's marketing version number.
9 Most version control systems do just this; some store the most recent version of the file and the deltas to the prior versions (called "reverse delta"). It improves retrieval for recent changes.
10 The archive file typically hides somewhere on the shared disk where it is protected from mere mortals; only the version control system has access to it.
12 For an advanced look at projects, see the e-mail you are about to write. A product is a special case of a project. Version control treats everything as projects. We just recognize that a software product (in our eyes) can be made up of collections of sub-projects.
13 As in single unit, not E=mc²
15 Yuck. If you ever see code duplicated by hand, something's usually wrong with your tool, process, or programmer.
16 We're not talking about code that feels confident asking girls out on dates, but rather the degree of confidence that we as humans trust the code does what it's supposed to do without breaking anything else.
17 States are actually implemented internally with clever labeling. In systems without states, it is possible to emulate them with labels and a little discipline, or without labels and a lot of discipline.
18 Some developers may balk at this when it comes time to make a "quick fix." This really isn't a problem at all, since code can be promoted just as quickly. Remember, in the grand scheme it isn't just the fix that is being validated, but the system's reconstruction as a whole.
19 This is not a search engine, but rather somewhere version control makes an audit file that holds information like "8/29/97 10:32:22 AM Fred checked out file.cpp into C:\Fred\MyApp and made 3 changes. Then he checked it back in and added a label. After that he deleted the top three tip revisions from the archive."
20 Well, that and the nasty email you'll leave him for going on vacation and leaving a file locked...
22 There is no magic timeout device, because you can't make assumptions about how long a reasonable period is to have something checked out. It is also quite possible you checked out a file to your local machine and then turned it off.
23 And the thought runs through your mind that while this may actually be an improvement, it certainly is the raw end of the deal for the aliens...
24 Later in the development cycle it becomes possible to insist people have passed their code through a code analyzer, made a complete build, run unit tests, and walked through each line in a debugger.
25 Some editors let you load more than one file at once, by using wildcard notation such as *.*
26 ...that are totally optional...
27 Fred really has an excellent point here about making backups before you do dangerous actions, but we'll continue without a safety net for most examples.
28 If this paper were really environmentally aware, it would be printed double sided.