Path-Based Authorization

Both Apache and svnserve are capable of granting (or denying) permissions to users. Typically this is done over the entire repository: a user can read the repository (or not), and she can write to the repository (or not).

It's also possible, however, to define finer-grained access rules. One set of users may have permission to write to a certain directory in the repository, but not to others; another directory might be readable only by a few special people. It's even possible to restrict access on a per-file basis.

Both Subversion servers use a common file format to describe these path-based access rules. In this section, we will explain that file format, as well as how to configure your Subversion server to use it for managing path-based authorization.

Do You Really Need Path-Based Access Control?

A lot of administrators setting up Subversion for the first time tend to jump into path-based access control without giving it a lot of thought. The administrator usually knows which teams of people are working on which projects, so it's easy to jump in and grant certain teams access to certain directories and not others. It seems like a natural thing, and it appeases the administrator's desire to maintain tight control of the repository.

Note, though, that there are often invisible (and visible!) costs associated with this feature. In the visible category, the server needs to do a lot more work to ensure that the user has the right to read or write each specific path; in certain situations, there's a very noticeable performance loss. In the invisible category, consider the culture you're creating. Most of the time, while certain users shouldn't be committing changes to certain parts of the repository, that social contract doesn't need to be technologically enforced. Teams can sometimes spontaneously collaborate with each other; someone may want to help someone else out by committing to an area she doesn't normally work on. By preventing this sort of thing at the server level, you're setting up barriers to unexpected collaboration. You're also creating a bunch of rules that need to be maintained as projects develop, new users are added, and so on, which adds up to a lot of extra work.

Remember that this is a version control system! Even if somebody accidentally commits a change to something she shouldn't, it's easy to undo the change. And if a user commits to the wrong place with deliberate malice, it's a social problem anyway, and the problem needs to be dealt with outside Subversion.

So, before you begin restricting users' access rights, ask yourself whether there's a real, honest need for this, or whether it's just something that “sounds good” to an administrator. Decide whether it's worth sacrificing some server speed, and remember that there's very little risk involved; it's bad to become dependent on technology as a crutch for social problems.^[70]

As an example to ponder, consider that the Subversion project itself has always had a notion of who is allowed to commit where, but it's always been enforced socially. This is a good model of community trust, especially for open source projects. Of course, sometimes there are truly legitimate needs for path-based access control; within corporations, for example, certain types of data really can be sensitive, and access needs to be genuinely restricted to small groups of people.

Getting Started with Path-Based Access Control

Subversion offers path-based access control in Apache via the mod_authz_svn module, which must be loaded using the LoadModule directive in httpd.conf in the same fashion that mod_dav_svn itself is loaded. To enable the use of this module for your repositories, you'll add either the AuthzSVNAccessFile or the AuthzSVNReposRelativeAccessFile directive (again within the httpd.conf file), pointing it to your own access rules file. (For a full explanation, see the section called “Per-directory access control”.)

To configure path-based authorization in svnserve, simply point the authz-db configuration variable (within your svnserve.conf file) to your access rules file.

Once your server knows where to look for your access rules, it's time to define those rules.

The syntax of the Subversion access file is the same familiar one used by svnserve.conf and the runtime configuration files. Lines that start with a hash (#) are ignored. In its simplest form, each section names a versioned path and, optionally, the repository in which that path is found. In other words, except for a few reserved sections, section names take one of two forms when AuthzSVNAccessFile is used: either [repos-name:path] or [path]. If you configured per-repository access files via the AuthzSVNReposRelativeAccessFile directive, always use the [path] form. Authenticated usernames are the option names within each section, and an option's value describes that user's level of access to the repository path: either r (read-only) or rw (read/write). A user who is not mentioned at all is allowed no access.

Note

Paths used in access file sections must be specified using Subversion's “internal style”, which mostly just means that they are encoded in UTF-8 and use forward slash (/) characters as directory separators (even on Windows systems). Note also that these paths do not employ any character escaping mechanism (such as URI-encoding): spaces in path names should be represented exactly as such in access file section names (for example, [repos-name:path with spaces]).

Here's a simple example demonstrating a piece of the access configuration which grants read access to Sally, and read/write access to Harry, for the path /branches/calc/bug-142 (and all its children) in the repository calc:

[calc:/branches/calc/bug-142]
harry = rw
sally = r

Warning

Prior to version 1.7, Subversion treated repository names and paths in a case-insensitive fashion for the purposes of access control, converting them to lower case internally before comparing them against the contents of your access file. It now does these comparisons case-sensitively. If you upgraded to Subversion 1.7 from an older version, you should review your access files for case correctness.

The name of a repository as evaluated by the authorization subsystem is derived directly from the repository's path. Exactly how this happens differs between the two server options. mod_dav_svn uses only the basename of the repository's root URL^[71], while svnserve uses the entire relative path from the serving root (as determined by its --root (-r) command-line option) to the repository.
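
The two naming rules can be illustrated with a rough Python sketch. The helper names, the host name, and the directory layout are all made up for illustration; this models the description above and is not code taken from either server.

```python
import posixpath
from urllib.parse import urlsplit


def repos_name_mod_dav_svn(repos_root_url):
    """Hypothetical helper: the basename of the repository's root URL."""
    path = urlsplit(repos_root_url).path.rstrip("/")
    return posixpath.basename(path)


def repos_name_svnserve(serving_root, repos_path):
    """Hypothetical helper: the repository path relative to the serving root."""
    return posixpath.relpath(repos_path, serving_root)


# Assumed layout: one repository stored on disk at /var/svn/projects/calc.
print(repos_name_mod_dav_svn("http://svn.example.com/repos/projects/calc"))  # calc
print(repos_name_svnserve("/var/svn", "/var/svn/projects/calc"))  # projects/calc
```

In this layout the same repository is known as calc to one server and projects/calc to the other. If the serving root were /var/svn/projects instead, the second helper would also return calc.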

Warning

The differences in the ways that a repository's name is determined by each of mod_dav_svn and svnserve can cause problems when trying to serve a repository via both servers simultaneously. Naturally, an administrator would prefer to point both servers' configurations toward a common access file. However, for this to work, you must ensure that the repository name portions of the file's section names are compatible with each server's idea of what the repository name should be. You can do this, for example, by configuring svnserve's root to be the same as mod_dav_svn's configured SVNParentPath, or by using a different access file per repository so that section names needn't reference the repository at all.

If you're using the SVNParentPath directive, it's important to specify the repository names in your sections. If you omit them, a section such as [/some/dir] will match the path /some/dir in every repository. If you're using the SVNPath directive, however, it's fine to provide only paths in your sections—after all, there's only one repository.

Permissions are inherited from a path's parent directory. That means we can specify a subdirectory with a different access policy for Sally. Let's continue our previous example, and grant Sally write access to a child of the branch that she's otherwise permitted only to read:

[calc:/branches/calc/bug-142]
harry = rw
sally = r

# give sally write access only to the 'testing' subdir
[calc:/branches/calc/bug-142/testing]
sally = rw

Now Sally can write to the testing subdirectory of the branch, but can still only read other parts. Harry, meanwhile, continues to have complete read/write access to the whole branch.

It's also possible to explicitly deny permission to someone via inheritance rules, by setting the username variable to nothing:

[calc:/branches/calc/bug-142]
harry = rw
sally = r

[calc:/branches/calc/bug-142/secret]
harry =

In this example, Harry has read/write access to the entire bug-142 tree, but has absolutely no access at all to the secret subdirectory within it.

Tip

The thing to remember is that the most specific path always matches first. The server tries to match the path itself, and then the parent of the path, then the parent of that, and so on. The net effect is that mentioning a specific path in the access file will always override any permissions inherited from parent directories.

Similarly, sections that specify a repository name have precedence over those that don't: if both [calc:/some/path] and [/some/path] are present, the former will be used and the latter ignored for calc.
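
The lookup order described here can be modeled in a few lines of Python. This is a simplified sketch, not Subversion's implementation: the function names are hypothetical, only plain usernames are handled (no groups, aliases, or *), and the first section that mentions the user is assumed to decide the outcome.

```python
import configparser
import posixpath

ACCESS_FILE = """\
[calc:/branches/calc/bug-142]
harry = rw
sally = r

[calc:/branches/calc/bug-142/testing]
sally = rw

[calc:/branches/calc/bug-142/secret]
harry =
"""


def load_rules(text):
    # Keep names verbatim: no interpolation, no lowercasing of option names.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str
    parser.read_string(text)
    return parser


def lookup(rules, repos, path, user):
    """Return 'rw', 'r', or '' (no access) for user at path in repos."""
    while True:
        # A section that names the repository is consulted before a bare one.
        for section in (f"{repos}:{path}", path):
            if rules.has_section(section) and user in rules[section]:
                return rules[section][user].strip()
        if path == "/":
            return ""  # never mentioned on the way up: no access
        path = posixpath.dirname(path)  # fall back to the parent directory


rules = load_rules(ACCESS_FILE)
print(repr(lookup(rules, "calc", "/branches/calc/bug-142/testing/notes.txt", "sally")))
print(repr(lookup(rules, "calc", "/branches/calc/bug-142/secret/key", "harry")))
```

The first call prints 'rw' and the second prints ''. Harry's empty entry on secret wins over his rw on the parent because the deeper section is consulted first.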

By default, nobody has any access to any repository at all. That means that if you're starting with an empty file, you'll probably want to give at least read permission to all users at the roots of the repositories. You can do this by using the asterisk variable (*), which means “all users”:

[/]
* = r

This is a common setup; notice that no repository name is mentioned in the section name. This makes all repositories world-readable to all users. Once all users have read access to the repositories, you can give explicit rw permission to certain users on specific subdirectories within specific repositories.

Note that while all of the previous examples use directories, that's only because defining access rules on directories is the most common case. You may restrict access on file paths in the same way:

[calendar:/projects/calendar/manager.ics]
harry = rw
sally = r

Access Control Groups

The access file also allows you to define whole groups of users, much like the Unix /etc/group file. To do this, create a groups section in your access file, and then describe your groups within that section: each variable's name defines the name of the group, and its value is a comma-delimited list of usernames which are part of that group.

[groups]
calc-developers = harry, sally, joe
paint-developers = frank, sally, jane
everyone = harry, sally, joe, frank, jane

Groups can be granted access control just like users. Distinguish them with an “at sign” (@) prefix:

[calc:/projects/calc]
@calc-developers = rw

[paint:/projects/paint]
jane = r
@paint-developers = rw

Another important fact is that group permissions are not overridden by individual user permissions. Rather, the combination of all matching permissions is granted. In the prior example, Jane is a member of the paint-developers group, which has read/write access. Combined with the jane = r rule, this still gives Jane read/write access. Permissions for group members can only be extended beyond the permissions the group already has. Restricting users who are part of a group to less than their group's permissions is impossible.
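
A small Python sketch may make the combination rule concrete. It assumes a single section represented as a dictionary and uses hypothetical function names; it models only the behavior described above.

```python
GROUPS = {"paint-developers": {"frank", "sally", "jane"}}

# The [paint:/projects/paint] section from the example above.
SECTION = {"jane": "r", "@paint-developers": "rw"}


def combine(grants):
    """Union of all granted permission letters, in 'rw' order."""
    letters = set("".join(grants))
    return "".join(ch for ch in "rw" if ch in letters)


def access_for(user):
    """Collect every entry that matches the user, then combine them."""
    grants = []
    for name, value in SECTION.items():
        in_group = name.startswith("@") and user in GROUPS.get(name[1:], set())
        if name == user or in_group:
            grants.append(value)
    return combine(grants)


print(access_for("jane"))   # rw
print(access_for("frank"))  # rw
```

Because the grants are combined rather than overridden, the jane = r line cannot take away the write access that Jane receives through her group.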

Groups can also be defined to contain other groups:

[groups]
calc-developers = harry, sally, joe
paint-developers = frank, sally, jane
everyone = @calc-developers, @paint-developers
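
One way to picture nested groups is as a recursive expansion into plain usernames. The Python sketch below is illustrative only: the function name is hypothetical, and silently skipping a group that was already visited is an assumption of this sketch rather than documented server behavior.

```python
GROUPS = {
    "calc-developers": ["harry", "sally", "joe"],
    "paint-developers": ["frank", "sally", "jane"],
    "everyone": ["@calc-developers", "@paint-developers"],
}


def members(group, groups, seen=None):
    """Expand a group into the set of usernames it ultimately contains."""
    seen = set() if seen is None else seen
    if group in seen:
        # Assumption: ignore repeats so a circular definition cannot
        # recurse forever. A real server may reject such a file instead.
        return set()
    seen.add(group)
    result = set()
    for entry in groups[group]:
        if entry.startswith("@"):
            result |= members(entry[1:], groups, seen)
        else:
            result.add(entry)
    return result


print(sorted(members("everyone", GROUPS)))
# ['frank', 'harry', 'jane', 'joe', 'sally']
```

Sally belongs to both developer groups but appears once in the result, which matches the explicit everyone list in the earlier example.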

Username Aliases

Some authentication systems expect and carry relatively short usernames of the sorts we've been describing here—harry, sally, joe, and so on. But other authentication systems—such as those which use LDAP stores or SSL client certificates—may carry much more complex usernames. For example, Harry's username in an LDAP-protected system might be CN=Harold Hacker,OU=Engineers,DC=red-bean,DC=com. With usernames like that, the access file can become quite bloated with long or obscure usernames that are easy to mistype.

Fortunately, Subversion 1.5 introduced username aliases to the access file syntax. Username aliases let you type the correct complex username only once, in a statement that assigns it a more easily digestible alias.

Username aliases are defined in the special aliases section of the access file, with each variable name in that section defining an alias, and the value of those variables carrying the real Subversion username which is being aliased.

[aliases]
harry = CN=Harold Hacker,OU=Engineers,DC=red-bean,DC=com
sally = CN=Sally Swatterbug,OU=Engineers,DC=red-bean,DC=com
joe = CN=Gerald I. Joseph,OU=Engineers,DC=red-bean,DC=com
…

Once you've defined a set of aliases, you can refer to the users elsewhere in the access file via their aliases in all the same places you could have instead used their actual usernames. Simply prepend an ampersand to the alias to distinguish it from a regular username:

[groups]
calc-developers = &harry, &sally, &joe
paint-developers = &frank, &sally, &jane
everyone = @calc-developers, @paint-developers

You might also choose to use aliases if your users' usernames change frequently. Doing so means you need to update only the aliases table when usernames change, instead of doing global search-and-replace operations on the whole access file.
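
Alias resolution amounts to a single table lookup, as this minimal Python sketch suggests. The function name is hypothetical, and the table holds only the three aliases shown in the example above.

```python
ALIASES = {
    "harry": "CN=Harold Hacker,OU=Engineers,DC=red-bean,DC=com",
    "sally": "CN=Sally Swatterbug,OU=Engineers,DC=red-bean,DC=com",
    "joe": "CN=Gerald I. Joseph,OU=Engineers,DC=red-bean,DC=com",
}


def resolve(entry, aliases=ALIASES):
    """Replace an &alias with the real username; pass other entries through."""
    if entry.startswith("&"):
        return aliases[entry[1:]]
    return entry


calc_developers = [resolve(e) for e in ("&harry", "&sally", "&joe")]
print(calc_developers[0])
```

If Harry's real username changed, only his entry in ALIASES would need editing; every rule written as &harry would pick up the new value.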

Advanced Access Control Features

Beginning with Subversion 1.5, the access file syntax also supports some “magic” tokens for helping you to make rule assignments based on the user's authentication class. One such token is the $authenticated token. Use this token where you would otherwise specify a username, alias, or group name in your authorization rules to declare the permissions granted to any user who has authenticated with any username at all. Similarly employed is the $anonymous token, except that it matches everyone who has not authenticated with a username.

[calendar:/projects/calendar]
$anonymous = r
$authenticated = rw

Another handy bit of access file syntax magic is the use of the tilde (~) character as an exclusion marker. In your authorization rules, prefixing a username, alias, group name, or authentication class token with a tilde character will cause Subversion to apply the rule to every user who does not match the name that follows the tilde. Though somewhat unnecessarily obfuscated, the following block is equivalent to the one in the previous example:

[calendar:/projects/calendar]
~$authenticated = r
~$anonymous = rw

A less obvious example might be as follows:

[groups]
calc-developers = &harry, &sally, &joe
calc-owners = &hewlett, &packard
calc = @calc-developers, @calc-owners

# Any calc participant has read-write access...
[calc:/projects/calc]
@calc = rw

# ...but only allow the owners to make and modify release tags.
[calc:/projects/calc/tags]
~@calc-owners = r
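
The matching rules for the special tokens and the tilde can be sketched in Python as follows. This is a hedged model of the behavior described in this section, with a hypothetical function name; it assumes aliases have already been replaced by plain usernames and represents an anonymous client as None.

```python
GROUPS = {
    "calc-developers": {"harry", "sally", "joe"},
    "calc-owners": {"hewlett", "packard"},
}
GROUPS["calc"] = GROUPS["calc-developers"] | GROUPS["calc-owners"]


def rule_applies(name, user):
    """Does the rule called name cover user? user is None when anonymous."""
    if name.startswith("~"):
        return not rule_applies(name[1:], user)  # the tilde inverts the match
    if name == "*":
        return True
    if name == "$anonymous":
        return user is None
    if name == "$authenticated":
        return user is not None
    if name.startswith("@"):
        return user in GROUPS.get(name[1:], set())
    return name == user


print(rule_applies("~@calc-owners", "harry"))    # True
print(rule_applies("~@calc-owners", "hewlett"))  # False
```

Harry is not an owner, so the read-only rule on the tags directory applies to him. The rule does not apply to hewlett, who therefore keeps the read/write access inherited from the parent section.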

Some Gotchas with Access Control

If you're using Apache as your Subversion server and have made certain subdirectories of your repository unreadable to certain users, you need to be aware of a possible nonoptimal behavior with svn checkout.

Depending on which HTTP communication library the Subversion client is using, it may request that the entire payload of a checkout or update be delivered in a single (often large) response to the primary checkout/update request. When this happens, this single request is the only opportunity Apache has to demand user authentication. This has some odd side effects. For example, if a certain subdirectory of the repository is readable only by user Sally, and user Harry checks out a parent directory, his client will respond to the initial authentication challenge as Harry. As the server generates the large response, there's no way it can resend an authentication challenge when it reaches the special subdirectory; thus the subdirectory is skipped altogether, rather than asking the user to reauthenticate as Sally at the right moment.

In a similar way, if the root of the repository is anonymously world-readable, the entire checkout will be done without authentication—again, skipping the unreadable directory, rather than asking for authentication partway through.^[72]

^[70]A common theme in this book!

^[71]Any human-readable name for a repository configured via the SVNReposName httpd.conf directive will be ignored by the authorization subsystem. Your access control file sections must refer to repositories by their server-sensitive paths as previously described.

^[72]For more on this, see the blog post Authz and Anon Authn Agony at http://blogs.collab.net/subversion/2007/03/authz_and_anon_/.

You are reading Version Control with Subversion (for Subversion 1.8), by Ben Collins-Sussman, Brian W. Fitzpatrick, and C. Michael Pilato.
This work is licensed under the Creative Commons Attribution License v2.0. To view a copy of this license, visit the Creative Commons site or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.