Earlier in this chapter (in the section called “Strategies for Repository Deployment”), we looked at some of the important decisions that should be made before creating and configuring your Subversion repository. Now we finally get our hands dirty! In this section, we'll see how to actually create a Subversion repository and configure it to perform custom actions when special repository events occur.
Subversion repository creation is an incredibly simple task. The svnadmin utility that comes with Subversion provides a subcommand (svnadmin create) for doing just that.
$ # Create a repository
$ svnadmin create /var/svn/repos
$
Assuming that the parent directory /var/svn exists and that you have sufficient permissions to modify that directory, the previous command creates a new repository in the directory /var/svn/repos, and with the default filesystem data store (FSFS). You can explicitly choose the filesystem type using the --fs-type argument, which accepts as a parameter either fsfs or bdb.
$ # Create an FSFS-backed repository
$ svnadmin create --fs-type fsfs /var/svn/repos
$

$ # Create a legacy Berkeley-DB-backed repository
$ svnadmin create --fs-type bdb /var/svn/repos
$
After running this simple command, you have a Subversion repository. Depending on how users will access this new repository, you might need to fiddle with its filesystem permissions. But since basic system administration is rather outside the scope of this text, we'll leave further exploration of that topic as an exercise to the reader.
The path argument to svnadmin is just a regular filesystem path and not a URL like the svn client program uses when referring to repositories. Both svnadmin and svnlook are considered server-side utilities: they are used on the machine where the repository resides to examine or modify aspects of the repository, and are in fact unable to perform tasks across a network. A common mistake made by Subversion newcomers is trying to pass URLs (even “local” file:// ones) to these two programs.
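As a rough illustration of the two points above, the following Python sketch assembles an svnadmin create command line and refuses anything that looks like a URL. The helper name svnadmin_create_command and its checks are our own assumptions for illustration; they are not part of Subversion.

```python
import re

# Hypothetical helper (not part of Subversion): build the argument list
# for "svnadmin create", refusing URLs because svnadmin wants a local path.
VALID_FS_TYPES = ("fsfs", "bdb")
URL_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def svnadmin_create_command(repos_path, fs_type=None):
    if URL_SCHEME.match(repos_path):
        raise ValueError("svnadmin needs a filesystem path, not a URL: %r"
                         % repos_path)
    command = ["svnadmin", "create"]
    if fs_type is not None:
        if fs_type not in VALID_FS_TYPES:
            raise ValueError("unknown filesystem type: %r" % fs_type)
        command += ["--fs-type", fs_type]
    command.append(repos_path)
    return command
```

The returned list could then be handed to something like subprocess.run(command, check=True) on the machine that hosts the repository.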
Present in the db/ subdirectory of your repository is the implementation of the versioned filesystem. Your new repository's versioned filesystem begins life at revision 0, which is defined to consist of nothing but the top-level root (/) directory. Initially, revision 0 also has a single revision property, svn:date, set to the time at which the repository was created.
Now that you have a repository, it's time to customize it.
While some parts of a Subversion repository—such as the configuration files and hook scripts—are meant to be examined and modified manually, you shouldn't (and shouldn't need to) tamper with the other parts of the repository “by hand.” The svnadmin tool should be sufficient for any changes necessary to your repository, or you can look to third-party tools for tweaking relevant subsections of the repository. Do not attempt manual manipulation of your version control history by poking and prodding around in your repository's data store files!
A hook is a program triggered by some repository event, such as the creation of a new revision or the modification of an unversioned property. Some hooks (the so-called “pre hooks”) run in advance of a repository operation and provide a means by which to both report what is about to happen and prevent it from happening at all. Other hooks (the “post hooks”) run after the completion of a repository event and are useful for performing tasks that examine—but don't modify—the repository. Each hook is handed enough information to tell what that event is (or was), the specific repository changes proposed (or completed), and the username of the person who triggered the event.
The hooks subdirectory is, by default, filled with templates for various repository hooks:
$ ls repos/hooks/
post-commit.tmpl          post-unlock.tmpl  pre-revprop-change.tmpl
post-lock.tmpl            pre-commit.tmpl   pre-unlock.tmpl
post-revprop-change.tmpl  pre-lock.tmpl     start-commit.tmpl
$
There is one template for each hook that the Subversion repository supports; by examining the contents of those template scripts, you can see what triggers each script to run and what data is passed to that script. Also present in many of these templates are examples of how one might use that script, in conjunction with other Subversion-supplied programs, to perform common useful tasks. To actually install a working hook, you need only place an executable program or script into the repos/hooks directory under the name of the hook (such as start-commit or post-commit).
On Unix platforms, this means supplying a script or program (which could be a shell script, a Python program, a compiled C binary, or any number of other things) named exactly like the name of the hook. Of course, the template files are present for more than just informational purposes: the easiest way to install a hook on Unix platforms is to simply copy the appropriate template file to a new file that lacks the .tmpl extension, customize the hook's contents, and ensure that the script is executable. Windows, however, uses file extensions to determine whether a program is executable, so you would need to supply a program whose basename is the name of the hook and whose extension is one of the special extensions recognized by Windows for executable programs, such as .exe for programs and .bat for batch files.
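The Unix installation steps just described (copy the template, drop the .tmpl extension, make the result executable) can be sketched in Python as follows. The function name install_hook_from_template is hypothetical, and the sketch assumes a Unix-style permission model; you would still need to edit the copied file to do something useful.

```python
import os
import shutil
import stat


def install_hook_from_template(repos_path, hook_name):
    """Copy hooks/<hook_name>.tmpl to hooks/<hook_name> and make it executable."""
    hooks_dir = os.path.join(repos_path, "hooks")
    template = os.path.join(hooks_dir, hook_name + ".tmpl")
    hook = os.path.join(hooks_dir, hook_name)
    if os.path.exists(hook):
        # Never clobber a hook that someone has already customized.
        raise FileExistsError("hook already installed: " + hook)
    shutil.copyfile(template, hook)
    # Add execute permission for user, group, and others (Unix only).
    mode = os.stat(hook).st_mode
    os.chmod(hook, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook
```

For example, install_hook_from_template("/var/svn/repos", "pre-commit") would create /var/svn/repos/hooks/pre-commit from its template.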
Subversion executes hooks as the same user who owns the process that is accessing the Subversion repository. In most cases, the repository is being accessed via a Subversion server, so this user is the one that the server process runs as. The hooks themselves will need to be configured with OS-level permissions that allow that user to execute them. Also, this means that any programs or files (including the Subversion repository) accessed directly or indirectly by the hook will be accessed as the same user. In other words, be alert to potential permission-related problems that could prevent the hook from performing the tasks it is designed to perform.
There are several hooks implemented by the Subversion repository, and you can get details about each of them in Subversion Repository Hook Reference. As a repository administrator, you'll need to decide which hooks you wish to implement (by way of providing an appropriately named and permissioned hook program), and how. When you make this decision, keep in mind the big picture of how your repository is deployed. For example, if you are using server configuration to determine which users are permitted to commit changes to your repository, you don't need to do this sort of access control via the hook system.
By default, Subversion executes hook scripts with an empty environment; that is, no environment variables are set at all, not even $PATH (or %PATH%, under Windows). Because of this, many administrators are baffled when their hook program runs fine by hand, but doesn't work when invoked by Subversion. Administrators have historically worked around this problem by manually setting all the environment variables their hook scripts need in the scripts themselves.
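A minimal sketch of that traditional workaround, for a hook written in Python: the script defines the environment it needs and passes it explicitly to every helper program it runs. The variable values and the function name run_with_explicit_env are assumptions chosen for illustration.

```python
import subprocess

# Assumed values for illustration; adjust them to your own system.
HOOK_ENV = {
    "PATH": "/usr/local/bin:/usr/bin:/bin",
    "LANG": "en_US.UTF-8",
}


def run_with_explicit_env(command):
    """Run a helper program with a known environment instead of relying on
    whatever (possibly empty) environment the hook itself inherited."""
    result = subprocess.run(command, env=HOOK_ENV,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
    return result.returncode, result.stdout, result.stderr
```

Because the environment is spelled out in the script, the hook behaves the same way whether it is run by hand or by the server.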
Subversion 1.8 introduces a new way to manage the environment of Subversion-executed hook scripts: the hook script environment configuration file. If a Subversion server finds a file named hooks-env in the repository's conf/ subdirectory, it parses that file as an INI-formatted configuration file and applies the option names and values found therein to the hook script's execution environment as environment variables.
The syntax of the hooks-env file is pretty straightforward: each section name is the name of a hook script (such as [pre-commit] or [post-revprop-change]), and the configuration items inside that section are treated as mappings of environment variable names to desired values. Additionally, there is a special [default] section, which can be used to configure environment variable mappings that should be applied to all hook scripts (unless explicitly overridden by per-hook-script settings). See Example 5.1, “hooks-env (custom hook script environment configuration)” for a sample hooks-env configuration file.
Example 5.1. hooks-env (custom hook script environment configuration)
# All scripts should use a UTF-8 locale and have our hook script
# utilities directory on the search path.
[default]
LANG = en_US.UTF-8
PATH = /usr/local/svn/tools:/usr/bin

# The post-commit and post-revprop-change scripts want to run
# programs from our custom synctools replication software suite, too.
[post-commit]
PATH = /usr/local/synctools-1.1/bin:%(PATH)s

[post-revprop-change]
PATH = /usr/local/synctools-1.1/bin:%(PATH)s
Example 5.1, “hooks-env (custom hook script environment configuration)” also demonstrates the nifty string substitution syntax found in Subversion's configuration file parser. In this example, the value of the PATH option (pulled from the [default] section of the file) is substituted in place of the %(PATH)s placeholder text in the per-hook sections. For more about this special syntax, see the README.txt file which lives in the Subversion runtime configuration directory. (And for more information about that directory, see the section called “Runtime Configuration Area”.)
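To make the substitution concrete, here is a small Python approximation of how the effective environment for one hook could be computed from such a file. It uses Python's configparser module, not Subversion's own parser, so treat it purely as a model of the behavior described above; the function name effective_hook_env is hypothetical.

```python
import configparser
import re

SAMPLE_HOOKS_ENV = """\
[default]
LANG = en_US.UTF-8
PATH = /usr/local/svn/tools:/usr/bin

[post-commit]
PATH = /usr/local/synctools-1.1/bin:%(PATH)s
"""

PLACEHOLDER = re.compile(r"%\((\w+)\)s")


def effective_hook_env(text, hook_name):
    """Approximate the environment one hook would see (illustration only)."""
    parser = configparser.ConfigParser(interpolation=None,
                                       default_section="default")
    parser.optionxform = str  # keep variable names case-sensitive
    parser.read_string(text)
    defaults = dict(parser.defaults())
    merged = dict(defaults)
    if parser.has_section(hook_name):
        # items() returns the [default] values overlaid with the hook's own.
        merged.update(parser.items(hook_name))
    # Replace each %(NAME)s placeholder with the value from [default].
    return {
        name: PLACEHOLDER.sub(
            lambda match: defaults.get(match.group(1), match.group(0)), value)
        for name, value in merged.items()
    }
```

With the sample text, effective_hook_env(SAMPLE_HOOKS_ENV, "post-commit") yields a PATH of /usr/local/synctools-1.1/bin:/usr/local/svn/tools:/usr/bin, while a hook with no section of its own simply receives the [default] values.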
Of course, having exact duplicates of your custom hook script environment configuration files in every single repository's conf/ directory could get cumbersome, especially when you need to make changes to them all. So Subversion's servers allow you to specify an alternate (possibly shared) location for this configuration information.
Repository hook scripts can offer a wide range of utility, but most tend to fall into a few basic categories: notification, validation, and replication.
Notification scripts are those which tell someone that something happened. The most common of these found in a Subversion service offering involve programs which send commit and revision property change notification emails to project members, driven by the post-commit and post-revprop-change hooks, respectively. There are numerous other notification approaches, from issue tracker integration scripts to scripts which operate as IRC bots to announce that something's changed in the repository.
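As a sketch of the email flavor of notification, the function below formats a commit notice with Python's standard email package. It assumes the post-commit hook has already gathered the author, log message, and changed paths (for instance with svnlook author, svnlook log, and svnlook changed); the function name, subject format, and addresses are all hypothetical.

```python
from email.message import EmailMessage


def build_commit_notice(repos_name, revision, author, log_message,
                        changed_paths, sender, recipients):
    """Format a commit notification email (delivery is left to the caller)."""
    message = EmailMessage()
    message["Subject"] = "[%s] r%s by %s" % (repos_name, revision, author)
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    body = "Log message:\n%s\n\nChanged paths:\n%s\n" % (
        log_message.strip(), "\n".join(changed_paths))
    message.set_content(body)
    return message
```

The resulting message could then be delivered with smtplib or whatever mail mechanism your site already uses.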
On the validation side of things, the start-commit and pre-commit hooks are widely used to allow or disallow commits based on various criteria: the author of the commit, the formatting and/or content of the log message which describes the commit, and even the low-level details of the changes made to files and directories in the commit. Likewise, the pre-revprop-change hook acts as the gateway to revision property changes, which is an especially valuable role considering the fact that revision properties are not themselves versioned, and can therefore only be modified destructively.
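A minimal sketch of log message validation in a pre-commit hook might look like the following. The pre-commit hook receives the repository path and the transaction name as its arguments, and svnlook log -t reads the pending log message; the length policy and function names are assumptions for illustration. Because hooks may run with an empty environment, a real script might need the full path to svnlook.

```python
import subprocess
import sys

MIN_LOG_LENGTH = 10  # assumed policy, purely illustrative


def log_message_problem(message):
    """Return a description of what is wrong with the log message, or None."""
    text = message.strip()
    if not text:
        return "Log message must not be empty."
    if len(text) < MIN_LOG_LENGTH:
        return "Log message must be at least %d characters." % MIN_LOG_LENGTH
    return None


def main(argv):
    # pre-commit receives the repository path and the transaction name.
    repos, txn = argv[1], argv[2]
    log = subprocess.check_output(["svnlook", "log", "-t", txn, repos],
                                  universal_newlines=True)
    problem = log_message_problem(log)
    if problem:
        sys.stderr.write(problem + "\n")  # reported back to the client
        return 1  # a non-zero exit status rejects the commit
    return 0

# A deployed hook file would end with: sys.exit(main(sys.argv))
```

Keeping the policy in its own function makes it easy to exercise without a repository at hand.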
One special class of change validation that has seen widespread use since Subversion 1.5 was released is validation of the committing client software itself. When Subversion's merge tracking feature (described extensively in Chapter 4, Branching and Merging) was introduced in that release, Subversion administrators needed a way to ensure that, once users of their repositories started using the new feature, all their merges were tracked. To reduce the chance of someone committing an untracked merge to the repository, they used start-commit hooks to examine the feature capabilities string advertised by Subversion clients. If the committing client didn't advertise support for merge tracking, the commit was denied with instructions to the user to immediately update their Subversion client! Example 5.2, “start-commit hook to require merge tracking support” provides an example of a start-commit script which does precisely this.
Example 5.2. start-commit hook to require merge tracking support
#!/usr/bin/env python
import sys

# sys.argv[3] is a colon-delimited capabilities list
if 'mergeinfo' not in sys.argv[3].split(':'):
    sys.stderr.write("""\
ERROR: Commits to this repository must be made using Subversion
clients which support the merge tracking feature.  Please upgrade
your client to at least Subversion 1.5.0.
""")
    sys.exit(1)
Beginning in Subversion 1.8, clients committing against a Subversion 1.8 server will still provide the feature capabilities string, but will also provide additional information about themselves by way of ephemeral transaction properties. Ephemeral transaction properties are essentially revision properties which are set on the commit transaction by the client at the earliest opportunity while committing, but which are automatically removed by the server immediately prior to the transaction becoming a finalized revision. You can inspect these properties using the same tools with which you'd inspect other unversioned properties set on commit transactions, during the window in which the start-commit and pre-commit repository hook scripts operate.
The following are the ephemeral transaction properties which Subversion currently provides and implements:
svn:txn-client-compat-version
Carries the Subversion library version string with which the committing client claims compatibility. This is useful for deciding whether the client supports the minimal feature set required for proper handling of the repository data.
svn:txn-user-agent
Carries the “user agent” string which describes the committing client program. Subversion's libraries define the initial portion of this string, but third-party consumers of the API (GUI clients, etc.) can append custom information to it.
While most clients will transmit ephemeral transaction properties early enough in the commit process that they may be inspected by the start-commit hook script, some configurations of Subversion will cause those properties to not be set on the transaction until later in the commit process. Administrators should consider performing any validation based on ephemeral transaction properties in both the start-commit and pre-commit hooks—the former to rule out invalid clients before those clients transmit the commit payload; the latter “just in case” the validation checks couldn't be performed by the start-commit hook.
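The following sketch shows one way such a check could be shared between the two hooks. It assumes that the svn:txn-client-compat-version value begins with a dotted version number such as 1.8.0, and that the property is read with svnlook propget --revprop -t; the minimum version and all function names are illustrative assumptions.

```python
import re
import subprocess

MINIMUM_COMPAT = (1, 8, 0)  # assumed policy, purely illustrative


def parse_version(text):
    """Turn a string such as '1.8.10' into (1, 8, 10); None if unparseable."""
    match = re.match(r"\s*(\d+)\.(\d+)(?:\.(\d+))?", text or "")
    if not match:
        return None
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def client_is_acceptable(compat_version_text, minimum=MINIMUM_COMPAT):
    version = parse_version(compat_version_text)
    return version is not None and version >= minimum


def read_txn_property(repos, txn, name):
    """Fetch an unversioned property from a commit transaction via svnlook.
    Returns None if svnlook fails (for example, the property is not set)."""
    result = subprocess.run(
        ["svnlook", "propget", "--revprop", "-t", txn, repos, name],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
        universal_newlines=True)
    return result.stdout if result.returncode == 0 else None
```

One possible arrangement is for the start-commit hook to reject only when the property is present and too old, and for the pre-commit hook to apply client_is_acceptable strictly, so that a missing value is treated as a failure there.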
As noted before, ephemeral transaction properties are removed from the transaction just before it is promoted to a new revision. Some administrators may wish to preserve the information in those properties indefinitely. We suggest that you do so by using the pre-commit hook script to copy the values of those properties to new property names. In fact, the Subversion source code distribution provides a persist-ephemeral-txnprops.py script (in the tools/hook-scripts/ subdirectory) for doing precisely that.
The third common type of hook script usage is for the purpose of replication. Whether you are driving a simple backup process or a more involved remote repository mirroring scenario, hook scripts can be critical. See the section called “Repository Backup” and the section called “Repository Replication” for more information about these aspects of repository maintenance.
As you might imagine, there is no shortage of Subversion hook programs and scripts that are freely available either from the Subversion community itself or elsewhere. In fact, the Subversion distribution provides several commonly used hook scripts in its tools/hook-scripts/ subdirectory. However, if you are unable to find one that meets your specific needs, you might consider writing your own. See Chapter 8, Embedding Subversion for information about developing software using Subversion's public APIs.
Hook scripts can do almost anything, but hook script authors should show restraint. It might be tempting to, say, use hook scripts to automatically correct errors, shortcomings, or policy violations present in the files being committed. Unfortunately, doing so can cause problems. Subversion keeps client-side caches of certain bits of repository data, and if you change a commit transaction in this way, those caches become undetectably stale, leading to surprising and unexpected behavior. While it is generally okay to add new commit transaction properties via a hook script, essentially everything else about a commit transaction should be considered read-only. Instead of modifying a transaction to polish its payload, simply validate the transaction in the pre-commit hook and reject the commit if it does not meet the desired requirements. As a bonus, your users will learn the value of careful, compliance-minded work habits.