Securing Your Web Pages with Apache

By Ken Coar, Jun 29, 2000, 16:47 UTC

Maxwell's Demon and Hat Colour

"Long ago and far away
Maxwell felt the need one day
For a Demon, scarce as high
As the atoms going by.
Over heat he gave it sway,
Making warmth go either way
From the vector Nature gave.
Maxwell's Demon, come and save!"

     -- Christopher Stasheff, Her Majesty's Wizard

Chances are that your Web site has at least a few pages that you really don't want published to the Internet at large. How do you keep the Black Hats from seeing them, whilst not impeding the access of the White Hats that need the pages?

What Apache Security Won't Help

At the time I'm writing this (February 2000), there's a lot of current-events news about major Web sites being taken down temporarily by denial-of-service (DoS) attacks. The specific attack type in question cannot be stopped by Apache, even though it may be aimed at the Web site. Apache is just a software application running on the system; these attacks are aimed at the systems themselves. As someone has pointed out, "If you have 1GB/s heading for your server then the pipe is going to saturate before Apache even gets a chance to see the packets."

But for less extreme cases, Apache's implementation of the Web security mechanisms, when properly configured, should be more than adequate to protect your sensitive pages from exposure.

Assumptions in This Article

For the rest of this article, I'm going to make the following assumptions:

  1. your Apache source tree starts at ./apache-1.3/
  2. your Apache ServerRoot is /usr/local/web/apache
  3. your Apache DocumentRoot is /usr/local/web/htdocs
  4. the username under which Apache runs (the value of the User directive in your httpd.conf file) is nobody

All of the cd and other shell commands in this article that refer to directories use these locations.

Mandatory versus Discretionary Access Control

There are two basic types of access control: those that verify who you say you are, and those that verify who you really are. The three basic verification methods are to check

  1. what you have,
  2. what you know, or
  3. what you are

or even some combination of these. In common non-computer usage, an example of the 'what you have' method would be having the key to a padlock; you can get in if you do. 'What you know' is the method used to keep other people out of your account; if they don't know your password, tough luck for them. And 'what you are' is coming into prominent play in criminal investigations, as DNA patterns are admitted as evidence.

The best security systems use a combination. Your bank's teller machines, for instance, use a combination of the first two methods: you need to have the ATM card, and know the PIN associated with the card (or the account).

But what's all this noise about 'discretionary' and 'mandatory,' you ask? Put simply, discretionary control (DAC) mechanisms check the validity of the credentials given them at the discretion of the user, and mandatory access controls (MAC) validate aspects that the user cannot control. For instance, anyone can tell you their username and password and you can then log in with them; which username and password you supply is at your discretion, and the system can't tell you apart from the real owner. Your DNA is something you can't change, though, and a control system that only allowed access to your pattern would never work for anyone else -- and you couldn't pretend to be someone else, either. This makes such a system a mandatory (also called non-discretionary) access control system.

In Web terms, and Apache terms in particular, discretionary controls are based on usernames and passwords, and mandatory controls are based on things like the IP address of the requesting client.

Another way to keep discretionary versus non-discretionary controls straight is to think about the way failures are handled: if you fail a discretionary check (such as if you misspell your password), you get another chance -- but if a mandatory check fails, you get a 'forbidden' error rather than 'not authorised,' and there's no way to say "give me another chance" without starting from scratch and requesting the page again as though for the first time. And unless something's changed on the server, even retrying isn't going to make a difference; you'll still be locked out.

Authentication versus Authorisation

Authentication is the process of verifying that credentials are correct -- that is, that the username is in the database and the password is correct for the username. Authorisation is the process of checking to see if a validated client is permitted to access a particular resource. For instance, Bob may have correctly supplied his username and password, but still not be able to access Jane's file because she hasn't included him in the authorisation list for it.

In Apache, almost all of the security-related modules (see a later section for a list) actually do both. The main feature that distinguishes them from each other is their authentication aspect; mostly, they let you store the valid credential information in one format or another. mod_auth, for instance, looks in normal text files for the username and password info, and mod_auth_dbm looks in a DBM database for it. They handle the authorisation side of their task in essentially identical ways, however.

The security modules are passed the information about what authentication databases to use via directives, such as AuthUserFile or AuthDBMGroupFile. The resource being protected is determined from the placement of the directives in the configuration files; in this example:

    <Directory /home/johnson/public_html>
        <Files foo.bar>
            AuthName "Foo for Thought"
            AuthType Basic
            AuthUserFile /home/johnson/foo.htpasswd
            Require valid-user
        </Files>
    </Directory>
  

the resource being protected is "any file named foo.bar", in the /home/johnson/public_html directory or anywhere underneath it. Likewise, the identification of which credentials are authorised to access foo.bar is stated by the directives -- in this case, any user with valid credentials in the /home/johnson/foo.htpasswd file can access it.

Realms: Areas of Controlled Access

In terms of discretionary control mechanisms on the Web, each protected area, whether it be a single document or an entire server, is called a realm. When a server challenges a client for credentials, it provides the name of the realm so the client can figure out which credentials to send.

The name of a realm is specified in the Apache configuration files with the AuthName directive, which takes a single argument: the name of the realm.

Note: In older versions of Apache, the entire remainder of the line following the "AuthName" keyword was taken to be the realm name. This caused problems when someone embedded a quotation mark (") in the string, since in the actual HTTP protocol the realm name is quoted. So more recent versions of Apache accept only a single argument to the directive; if you want to use multiple words, like "This is my realm", you need to enclose the entire string within quotation marks so that it will look like a single 'word.'

Realm names are implicitly qualified by the URI to which they apply, and subordinate URIs are implicitly part of the same realm. This means that if <URL:http://foo.com/a/> is in realm "Augh", then <URL:http://foo.com/a/b/c/foo.html> is also in realm "Augh" unless it's been overridden.

The implicit qualification also means that even if <URL:http://foo.com/a/foo.html> and <URL:http://foo.com/b/foo.html> are declared in two separate statements as being in realm "Foo", they're actually two different realms named "Foo". The only way they'd both be in the same "Foo" realm is if they had a common ancestor that was (such as <URL:http://foo.com/>).

The qualification rules will cause the client to prompt for credentials whenever it requests a document in a realm it hasn't visited before -- even if it's visited a different realm with the same name.

There is no default for the AuthName directive, except what might be inherited from an upper-level directory.

The Client/Server Authentication Handshake

When a client first attempts to access a document that's under some sort of discretionary access control, a lot goes on behind the scenes that the end-user probably never sees. Since on the first attempt the client won't know that the resource is protected, it won't include any credentials. When the server receives the request, it will go through all the phases of access checking; when the credentials (none) don't match any that are valid for the resource, the server will return a 'not authorised' status.

In almost all cases, a client that receives such a 'not authorised' response will realise that it didn't send any credentials, and will pop up a dialogue box for the end-user to complete. This box will display the name of the realm in which the document resides, and ask the user for a username and password. Once obtained, the client will make the same request again, only this time it will include the credentials. But as far as the end-user is aware, that first request was completely invisible and never happened.

If the client gets a 'not authorised' status in response to a request that included credentials, it typically responds a little differently: it will probably tell the user 'those credentials weren't accepted, want to try again?' It didn't say that the first time because it hadn't sent any.

In either case, if the end-user opts to not fill in the dialogue and presses 'cancel,' the client typically just displays the error page that the server sent along with the 'not authorised' status, and goes back to waiting for instructions.

Apache Security Processing Phases

The preceding sections have been subtly leading up to this topic. Apache handles all requests by running them through phases. Each Apache module has an opportunity to deal with the request during each of the phases, though most modules only do so for one or possibly two of them.

Apache has three processing phases relating to security checking. They occur in the following order, and are given the following names:

  1. access_checker -- This phase is where mandatory access checks are applied, such as mod_access' check for whether the client's IP address is allowed to access the document or not
  2. check_user_id -- This is the authentication phase, during which a DAC module such as mod_auth checks the user credentials to see if they're even in the database it's been told to use
  3. auth_checker -- This is the phase during which authorisation occurs; modules like mod_auth check to see if the user (who has already been authenticated) is allowed to access the document

Modules that impose discretionary access checks usually participate in the latter two phases.

Basic Authentication versus Digest Auth

How do the username and password get transmitted across the network? Well, in early 2000 the answer is: not very well. It's not that there are technical problems with the transmission; rather, the issues are more philosophical.

There are currently two main methods of passing credentials, called Basic authentication and Digest authentication. The Digest method is considerably more secure, but unfortunately less widely deployed -- so most authentication on the Web is done using the less-secure Basic mechanism.

Basic authentication involves simply base64-encoding the username and password and transmitting the result to the server. This means that anyone who can intercept the transmission can determine the username and password. Of course, this is only useful if those values are valid and end up getting successfully authenticated. <grin> Digest authentication transmits the information in a manner that cannot be so easily decoded.

Since the username and password are so trivially protected in the Basic authentication mechanism, the same authentication database can be used to store user information for multiple realms. The Digest mechanism, though, includes an encoding of the realm for which the credentials are valid, so you must have a separate credentials database for each realm using the Digest method.

When setting up discretionary controls in your Apache configuration, remember that the AuthType directive is required. The setting can be inherited from a higher-level directory or location, but something must set the value to be inherited; there is no default.
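
As a rough sketch of how the choice of mechanism shows up in the configuration, the fragment below protects a directory with Digest rather than Basic authentication. The directory, the realm name, and the file location are made up for illustration. AuthDigestFile names the credentials file, and because Digest credentials are tied to a realm, that file must contain entries created for this exact realm name.

    <Directory /usr/local/web/htdocs/private>
        # Hypothetical realm name; entries in the digest file must match it
        AuthName "Private Area"
        AuthType Digest
        # Hypothetical path, kept outside the DocumentRoot
        AuthDigestFile /usr/local/web/apache/auth/.htdigest-private
        Require valid-user
    </Directory>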

Mixing Mandatory and Discretionary Controls -- The Satisfy Directive

Sometimes you want to mix and match discretionary and non-discretionary access controls, such as allowing anyone on the local network to see documents freely, but requiring anyone else to enter a username and password.

This can be done with the Satisfy directive, which takes a single keyword:

All
In order to gain access to documents within the scope of a Satisfy All directive, a client must pass both any applicable non-discretionary controls (such as Allow or Deny directives) and any discretionary ones (like Require directives).
Any
Documents within the scope of a Satisfy Any directive are accessible to any clients that pass either the non-discretionary checks (which occur first) or the discretionary ones.

To illustrate, the following would permit any client on the local network (IP addresses 10.*.*.*) to access the foo.html page without let or hindrance, but require a username and password for anyone else:

    <Files foo.html>
        Order Deny,Allow
        Deny from All
        Allow from 10.0.0.0/255.0.0.0
        AuthName "Insiders Only"
        AuthType Basic
        AuthUserFile /usr/local/web/apache/.htpasswd-foo
        Require valid-user
        Satisfy Any
    </Files>
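
For contrast, here is a sketch of the same scope with the other keyword. With Satisfy All (which is also what Apache assumes if no Satisfy directive is present), a client must be on the 10.*.*.* network and must also supply a valid username and password; passing only one of the two checks is not enough.

    <Files foo.html>
        Order Deny,Allow
        Deny from All
        Allow from 10.0.0.0/255.0.0.0
        AuthName "Insiders Only"
        AuthType Basic
        AuthUserFile /usr/local/web/apache/.htpasswd-foo
        Require valid-user
        # Both the address check and the credential check must pass
        Satisfy All
    </Files>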
  

Restricting by IP Address

Since the IP address is one of those aspects of a client-server HTTP relationship that cannot be changed mid-stream, and cannot be easily faked (without the cooperation of the intervening network systems), it's considered a non-discretionary control. The Apache distribution includes a module for limiting access thusly, called mod_access.

mod_access allows you to specify what domains or addresses should or should not be allowed access, and in which order the two lists (allowed and denied) should be evaluated. The basic syntax of the Allow and Deny directives is

    Allow from host-or-network
  

The host-or-network can be:

  • a host or domain name (www.foo.com),
  • an IP address (10.0.72.3),
  • an IP address and subnet mask (10.0.0.0/255.0.0.0), or
  • an IP address and CIDR mask size (10.73.128.0/18)

Whenever possible you should use IP addresses instead of domain names; using names means that the Apache server needs to do a double-reverse lookup on the client's IP address to find out whether it matches. (A double-reverse lookup, which is always done by Apache when dealing with host names in security-related situations, involves translating the client's IP address to a name, and then translating that name back to a list of IP addresses. If the translations don't work in both directions, Apache will consider the host/domain name match to have failed.)

As an added fillip, an alternate form of the Allow and Deny directives, "from env=[!]envariable-name", allows you to make the go/no-go decision based upon the presence (or absence) of an environment variable. The envariable may have been set for the entire server environment, or it may have been set just for the current request by a module such as mod_setenvif.
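
A minimal sketch of the env= form, assuming mod_setenvif is available: the SetEnvIf line sets a variable for requests whose User-Agent header matches a pattern, and the Allow directive admits only those requests. The variable name let_me_in, the user-agent string, and the directory are all hypothetical.

    # Server-wide: flag requests from one particular (made-up) client program
    SetEnvIf User-Agent "^KnockKnock/2\.0" let_me_in

    <Directory /usr/local/web/htdocs/knock>
        Order Deny,Allow
        Deny from All
        # Only requests for which the variable was set get through
        Allow from env=let_me_in
    </Directory>

Bear in mind that a header such as User-Agent is supplied by the client, so a check like this one is a convenience rather than strong protection.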

The Order directive controls how the cumulative lists of Allow and Deny directives are interpreted. If the order is Allow,Deny (note that no spaces are permitted between the keywords!), then the initial state is the equivalent of Deny from All, the Allow conditions are processed, and then the Deny list is. For Order Deny,Allow, the opposite is the case -- the initial state is 'allow everyone,' then denials are handled, and then the allows are used to override them.

The easy way to remember the default state is to recall that it matches the last keyword: Deny,Allow means 'allowed,' and Allow,Deny means 'denied.'

There is a third possibility for the Order directive: mutual-failure. With this keyword, there is no 'default state' -- the only clients that will be allowed in are those that don't appear on any Deny directive, but do appear on at least one Allow directive.
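
To make the evaluation order concrete, here is a small sketch using a hypothetical internal directory. With Order Allow,Deny the starting state is 'denied'; the Allow line admits the 10.*.*.* network, and the Deny line then shuts out one host within it:

    <Directory /usr/local/web/htdocs/internal>
        Order Allow,Deny
        # Processed first: let the local network in
        Allow from 10.0.0.0/255.0.0.0
        # Processed second: lock out one machine on that network
        Deny from 10.0.72.3
    </Directory>

Changing the Order line to Deny,Allow would reverse the outcome for 10.0.72.3 (the Allow would override the Deny), and would also let in every client outside the 10.*.*.* network, since the starting state would then be 'allowed.'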

Restricting by User Credentials

If you want to protect pages such that visitors need to enter a username and password, the mod_auth module is your tool. It is one of the simplest and easiest to use of the discretionary control modules.

The key directives in establishing access controls are those that define the location of the credential database and identify the authorised users. For mod_auth, the directives in question are AuthUserFile and Require. Other modules have similar directives.

The AuthUserFile directive simply takes a fully-specified filename path (such as /home/foo/.htpasswd-foo), which tells the module where to find the text authentication file to use in the current realm. Neither path-shortening nor relative file specifications are permitted.

The Require directive is actually part of the core server rather than being specific to mod_auth, so it's documented (however sparsely) at <URL:http://www.apache.org/docs/mod/core.html#require>. Require is covered in more detail shortly.

Labeling

Different URLs within a realm can be protected in different ways, with different sets of credentials being valid for different locations. However, since the realm is the key the client uses to remember which credentials to send, being egregious about using multiple sets of credentials within the same realm tends to annoy users when they have to re-authenticate repeatedly for what looks like (and in fact is) the same realm. It's generally a good idea to have a one-to-one relationship between realms and sets of authorised credentials.

But how do you turn on access control in the first place? Just as you apply any other Apache directive: by having the directives appear in the appropriate scope. For example:

    <Directory /usr/local/web/htdocs/finance>
        AuthName Finance
        AuthType Basic
        AuthUserFile /usr/local/web/apache/auth/.htpasswd-finance
        Require valid-user
    </Directory>
  

This will protect the finance subdirectory and all files and subdirectories in it and below it. Other directories, such as products, remain unaffected.

<Directory> containers are all very well, but what if you want to protect only a single file? Or perhaps a document that isn't mapped to the filesystem, like the output from mod_status? The answer remains the same: use the appropriate scoping directives (such as <Files> and <Location>) to apply the security measures to the items you want protected.
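
As a sketch of those other containers, the first block below protects the mod_status report, which exists only as a URL, and the second protects a single file wherever it appears. The realm names, the salaries.html filename, and the password file paths are invented for the example; /server-status is a conventional location for the status handler, but yours may differ.

    # A resource with no file behind it: scope by URL
    <Location /server-status>
        SetHandler server-status
        AuthName "Server Status"
        AuthType Basic
        AuthUserFile /usr/local/web/apache/auth/.htpasswd-admin
        Require valid-user
    </Location>

    # A single file: scope by filename
    <Files salaries.html>
        AuthName "Payroll"
        AuthType Basic
        AuthUserFile /usr/local/web/apache/auth/.htpasswd-payroll
        Require valid-user
    </Files>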

Inheritance

Like almost all other Apache configuration details, the security directives that apply to a particular document or directory may be inherited from the parent, or possibly even further up the tree. This means that at each level you need only supply those directives that are different. The following two fragments are equivalent:

    <Directory /usr/local/web/htdocs/finance>
        AuthName "Finance Department"
        AuthType Basic
        AuthUserFile /usr/local/web/apache/auth/.htpasswd-finance
        Require valid-user
    </Directory>
    <Directory /usr/local/web/htdocs/finance/strategy>
        AuthName "Finance Department"
        AuthType Basic
        AuthUserFile /usr/local/web/apache/auth/.htpasswd-finance
        Require user susan bob
    </Directory>

    <Directory /usr/local/web/htdocs/finance>
        AuthName "Finance Department"
        AuthType Basic
        AuthUserFile /usr/local/web/apache/auth/.htpasswd-finance
        Require valid-user
    </Directory>
    <Directory /usr/local/web/htdocs/finance/strategy>
        Require user susan bob
    </Directory>
  

The second fragment takes advantage of the inheritance of the values from the parent directory, and simply restricts the access list to only Bob and Susan.

It's generally not a good idea to make too many assumptions when dealing with security matters, so even though inheritance can seem to make your life easier by not requiring you to duplicate directives all over the place, this might be an illusion. Just wait until you see how complicated your life becomes when all the inherited values become compromised because of a single mistake at a higher level.

A related subject involves determining which of possibly several access control modules has the Final Say on whether access is granted or not. This is covered in a later section.

Requiring a Specific Username

Whereas the AuthUserFile directive and friends tell Apache (and the security modules) where to find the authentication databases, it's the Require directive that provides the instructions on how to use them. If a scope doesn't include (or inherit) a Require directive, then it isn't under discretionary access control regardless of whatever other directives may be present.

Multiple occurrences of Require are cumulative; each line gets added to the list of conditions. Whether processing stops at the first matching condition or if all of them need to be met is up to the module programmer; for mod_auth, for example, the first match satisfies the condition for access, even if the configuration contains something potentially confusing like:

    AuthUserFile /home/foo/.htpasswd-foo
    Require user foo
    Require user bar
  

In this case (and in most cases, in fact), the intended meaning is, "Require the username to be foo OR bar."

To avoid complicated configuration files when the access list is large, there's a shortcut notation: "Require valid-user". This means, "any of the usernames in the authentication database can access this realm." Obviously this won't work unless the database contains credentials only for users allowed access; if there are any users in it which aren't supposed to have access (such as might happen if you're sharing a single database across multiple realms), you'll need to use grouping or some other mechanism because the valid-user keyword won't grind finely enough.

Even though the Require directive isn't specific to any particular module, the syntax of the command is. That means that there aren't any restrictions on the syntax; "Require candy-type caramel" will be accepted, on the grounds that one of the security modules may understand what it means.

Most of the discretionary control modules also provide support for grouping users together, and granting access to groups rather than individuals. This can be done (for mod_auth) with the AuthGroupFile directive. Like the user file, the group file simply contains lines of text. Each line consists of a group name, a colon, and a list of usernames separated by spaces. When the username is decoded from the request credentials, the module can look it up in the group file to see to which group(s) it belongs. Here's an example group file:

    board: annette bill james gwynyth
    finance: susan steve phoebe zoe bill_s
    engineering: geekboy lisa melanie george j_johnson
  

To allow access by group, you simply change the Require directive to something like this:

    Require group board
  

As with normal Unix users, a single username may belong to multiple groups.
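
Putting the pieces together, a complete scope that grants access by group might look like the following sketch. The directory and the two file paths are assumptions made for the example; the group name matches the sample group file above.

    <Directory /usr/local/web/htdocs/board>
        AuthName "Board Members"
        AuthType Basic
        # Hypothetical locations for the user and group files
        AuthUserFile /usr/local/web/apache/auth/.htpasswd-staff
        AuthGroupFile /usr/local/web/apache/auth/.htgroup-staff
        Require group board
    </Directory>

The AuthUserFile line is still needed: the group file lists only names, and the passwords are checked against the user file.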

The Standard Apache Security Modules

Below is a list of the security-related modules that are included as part of the standard Apache distribution.

mod_access
This is the only module in the standard Apache distribution which applies mandatory controls. It allows you to list hosts, domains, and/or IP addresses or networks that are permitted or denied access to documents.

mod_auth
This is the basis for most Apache security modules; it uses ordinary text files for the authentication database. Entries are of the form "username:password"; additional fields may follow the password, separated from it by a colon, but they're ignored.

mod_auth_db
This module is essentially the same as mod_auth, except that the authentication credentials are stored in a Berkeley DB file format. The directives contain the additional letters "DB" (e.g., AuthDBUserFile).

mod_auth_dbm
Like mod_auth_db, save that credentials are stored in a DBM file.

mod_auth_anon
This module mimics the behaviour of anonymous FTP; rather than having a database of valid credentials, it recognises a list of valid usernames (i.e., the way an FTP server recognises ftp and anonymous) and grants access to any of those with essentially any passwords. This module is more useful for logging access to resources and keeping robots out than it is for actual access control.

mod_auth_digest
Whereas the other discretionary control modules supplied with Apache all support Basic authentication, mod_auth_digest is currently the sole supporter of the Digest mechanism. It underwent some serious revamping in 1999, and the new version is currently considered 'experimental,' but no problems have been identified with the new code and it's likely to be moved back into the standard stable soon. Like mod_auth, the credentials used by this module are stored in a text file. Digest database files are managed with the htdigest tool. Using mod_auth_digest is much more involved than setting up Basic authentication; please see the module documentation for details.
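
As an illustration of the mod_auth_anon entry in the list above, a guest-access area could be sketched along these lines. The directory and realm name are hypothetical, and the directives shown are only a subset of what the module offers; check the module documentation for your version before relying on them.

    <Directory /usr/local/web/htdocs/guest>
        AuthName "Guest Area (use 'anonymous' and your e-mail address)"
        AuthType Basic
        # Usernames that are accepted without any database lookup
        Anonymous anonymous guest
        # Insist on something in the password field, and log it
        Anonymous_MustGiveEmail On
        Anonymous_LogEmail On
        Require valid-user
    </Directory>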

Allowing Users to Control Access to Their Own Documents

All of the security-related module directives can be used in per-directory .htaccess files. However, in order for Apache to pay attention to them, the directories in question need to be within the scope of an AllowOverride directive that includes the AuthConfig (for discretionary controls) or Limit (for mandatory controls) keywords. For instance, a standard Linux installation of Apache can enable this with the following lines in the httpd.conf file:

    <Directory /home/*/public_html>
        AllowOverride AuthConfig Limit
    </Directory>
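
With that in place, a user could protect one of their own directories by dropping a .htaccess file into it. The following is only a sketch: the user johnson is borrowed from the earlier example, while the private subdirectory, the realm name, and the password file name are assumptions.

    # Contents of /home/johnson/public_html/private/.htaccess
    AuthName "Johnson's Private Pages"
    AuthType Basic
    # Password file lives in the home directory, not under public_html
    AuthUserFile /home/johnson/.htpasswd-private
    Require valid-user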
  

Using Your System passwd File

This is a common request, and an incredibly bad idea: "How can I use my system's /etc/passwd file as my Web authentication database?"

The simple answer is: you don't. I'll just list a couple of reasons:

  1. If someone manages to crack the username and password of someone accessing a Web page, that person can now log onto your system. (Remember, most of the Web authentication uses the Basic method, which is incredibly simple to crack.)
  2. Unlike your system's login system, which will probably kick you out, disconnect you, lock your account, or do something equally extroverted and paranoid (and log the fact!) if you misspell your password a few times in a row, there are no such controls on the Web. So someone could very easily write a script that just banged away on your system, trying endless combinations of usernames and passwords, and nothing would automatically perk up and make rude noises.

If you still want to do it after reading the above and the additional information in the Apache FAQ, well, on your own head be it. You can do it with mod_auth, and that's all I'm going to say about it. And that's probably already too much, too.

Which Database is Authoritative?

What if you want to mix and match and have multiple types of authentication database within a single realm? How does Apache figure out which one to check first, and how does it know to consult another if the first one fails to find the credentials?

The answer has to do with authoritativeness. Each of the discretionary control modules includes a directive named something like AuthAuthoritative. Each module's version of this directive is named differently, so that it can be associated with that module and no other, so we also have AuthDBAuthoritative, AuthDBMAuthoritative, and Anonymous_Authoritative.

If a module is considered authoritative, then when Apache gets a "I don't know this person" response, it won't look any further. If the module isn't authoritative, the server can proceed to consult another module.

Technical note: Actually, the decision isn't made by the server itself. Each module knows whether or not it's authoritative (based on the presence/absence/setting of its *Authoritative directive), and so in the case of a failure it signals the stop/continue answer to the server by returning either HTTP_UNAUTHORIZED or DECLINED respectively.

By default, the modules tend to consider themselves authoritative until you tell them otherwise, on the principle that it's better to be safe than sorry. You can make this explicit with an AuthAuthoritative On line, or allow responsibility sharing with AuthAuthoritative Off. (Use the appropriate directive for the module in question!)
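
For example, a server that keeps most of its users in a DBM database but a handful of administrators in a text file might be configured roughly as below. This is a sketch with made-up paths and realm name; it also assumes that mod_auth_dbm is consulted before mod_auth, which depends on how the modules were ordered when the server was built.

    <Directory /usr/local/web/htdocs/members>
        AuthName "Members"
        AuthType Basic
        # Bulk of the users; unknown names are passed along rather than rejected
        AuthDBMUserFile /usr/local/web/apache/auth/members
        AuthDBMAuthoritative Off
        # Small text file consulted next; this module has the final say
        AuthUserFile /usr/local/web/apache/auth/.htpasswd-admins
        AuthAuthoritative On
        Require valid-user
    </Directory>

Whichever module ends up being consulted last should stay authoritative, so that a username nobody recognises is turned away rather than left undecided.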

The htpasswd, htdigest, and dbmmanage Utilities

These three utilities are considered 'user' tools, since you don't need to be the Webmaster in order to use them to create access control files for your own Web directory. As user applications, their documentation is in the man/man1 subdirectory of your Apache server installation; you can read it with a command such as:

    % man /usr/local/web/apache/man/man1/htpasswd.1
  

Given the assumptions stated earlier, you should find all three of these applications in the /usr/local/web/apache/bin/ directory, and the source of their man pages in /usr/local/web/apache/man/man1/.

The htpasswd application is used to create and maintain text-based authentication databases for use with the mod_auth module. It gets the username and options from the command line, prompts for and reads the password from standard input (twice, for verification), and stores the username and the encrypted password in the specified text file. When the Apache server receives credentials to verify, it encrypts the submitted password using the same algorithm as the stored password, and then compares the results -- so the actual plaintext password doesn't live in a file on your system.

The syntax of the htpasswd command is:

    htpasswd [options] pwfile username [password]
  

htpasswd can encrypt the passwords using a variety of algorithms, indicated by the algorithm flag on the command line:

-m
Causes the password to be encrypted using an Apache-specific modified MD5 hash algorithm. Although no other application can understand passwords encrypted this way, they work on all Apache systems running 1.3.9 or later, and so you can transport your .htpasswd file from Linux to AIX to Solaris to Windows and have it work in each place without any changes. This is the default algorithm for the Windows and TPF platforms.

-d
Use the system's crypt() library routine to encrypt the password. This means that the encrypted passwords will be as safe as those in the system's user file -- but they're probably not transportable to any other system.

-s
This will cause the password to be encrypted using the SHA algorithm, which is used by Netscape servers. This is useful when migrating from one server to the other.

-p
The -p flag means 'plaintext -- don't encrypt the password at all.' This was added because of a problem in Apache 1.3.6 on Windows, which prevented MD5-encrypted passwords (the only other type supported on Windows by that version) from being correctly recognised. Don't use this option unless you're working with a password file for Apache 1.3.6 on Windows. Even then the vastly preferred remedy is to upgrade to a more recent version; 1.3.6 is from early 1999.

The encryption algorithm used is particular to each entry in the file, so it's entirely possible for a file to contain passwords encrypted in different ways.

The htpasswd tool understands two other flags, which control other aspects than encryption:

-b
Get the password from the command line rather than reading it from stdin. This flag is primarily intended to help Windows Webmasters, but it's useful on other platforms as well, as it allows script-based password management in a non-interactive environment (such as allowing a user to change his password with a CGI script). However, since the password appears in plaintext on the command line, it might be visible to another user in the output of a ps command, and there's no verification that it was spelt correctly. Use this option with caution.

-c
By default, htpasswd assumes that the pwfile authentication database file already exists, and will update it. To create a new one, or completely overwrite an existing one, add the -c flag to the command line.
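As an illustration of the script-based use that -b makes possible, here is a minimal sketch of a wrapper that builds the htpasswd command line and optionally runs it. The helper names and the password file location are hypothetical; the htpasswd path simply follows this article's ServerRoot assumption. The caveat about ps applies in full: the password is passed as an ordinary argument.

```python
import subprocess

# Assumed location, based on the ServerRoot used in this article.
HTPASSWD = "/usr/local/web/apache/bin/htpasswd"


def htpasswd_argv(pwfile, user, password, create=False):
    """Return the argument list for a non-interactive htpasswd run."""
    argv = [HTPASSWD, "-b"]      # -b: take the password from the command line
    if create:
        argv.append("-c")        # -c: create (or overwrite!) the file
    argv += [pwfile, user, password]
    return argv


def set_password(pwfile, user, password, create=False):
    """Run htpasswd; True if it exited with status 0."""
    return subprocess.call(htpasswd_argv(pwfile, user, password, create)) == 0
```

Passing the arguments as a list, rather than pasting them into a shell command string, keeps odd characters in a user-supplied password from being interpreted by a shell.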

The htdigest and dbmmanage tools, also in the /usr/local/web/apache/bin/ directory, are similar to the htpasswd application. htdigest allows you to maintain text database files for use with Digest authentication, and dbmmanage supports the DB, DBM, GDBM, and NDBM database formats. dbmmanage is a Perl script, so you will need to have the Perl interpreter (version 5 or later) installed on your system in order to use it.

Location of Your Authentication Database

Remember that one of the main things the Apache Web server does is serve up files to visitors from the Internet -- and don't put your authentication database files anyplace where that could happen to them!

For server-wide database files (that is, those managed by the Webmaster and listed in the httpd.conf file, rather than in users' .htaccess files), make sure you put them someplace where they're not under the DocumentRoot. Also make sure you don't put them someplace where they're under an Aliased or ScriptAliased directory.

For access control used by individual users to protect their own documents, the database files should not be under the directory listed in the UserDir directive in the server's httpd.conf file (typically public_html). Having your users put their database files in their home directory, or in another subdirectory (other than under public_html!), is a good idea.
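The placement rules above boil down to "the database file must not live under any directory the server maps to a URL". A small sketch of that check follows; the function names are invented for the example, and the list of served directories is something you would have to collect yourself from DocumentRoot, Alias, ScriptAlias, and UserDir settings. It compares path names only and does not follow symbolic links.

```python
import os.path


def is_under(path, root):
    """True if path is root itself or lies somewhere beneath it."""
    path = os.path.abspath(path)
    root = os.path.abspath(root)
    return os.path.commonpath([path, root]) == root


def exposed_by(pwfile, served_dirs):
    """Return the served directories that would expose pwfile."""
    return [d for d in served_dirs if is_under(pwfile, d)]
```

For example, with this article's DocumentRoot of /usr/local/web/htdocs, a file at /usr/local/web/htdocs/private/.htpasswd would be reported as exposed, while one kept under the ServerRoot would not.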

Recent versions of Apache (those newer than 1.3.4 or so) include a default limitation on the common filenames used for per-directory authentication databases:

    <Files ~ "^\.ht">
        Order allow,deny
        Deny from all
    </Files>
  

This will prevent the server from processing requests for files named .htpasswd, .htaccess, .htpasswd-foo.db, and so on. Note that if you upgraded your Apache server from an earlier version, your httpd.conf file may not include these lines, and you may want to add them yourself.
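To see which names that pattern covers, the same regular expression can be tried outside the server. This is just a sketch using Python's re module, on the assumption that it treats this simple pattern the same way Apache's regex library does and that the pattern is applied to the file name alone, not the full path; is_blocked is a made-up helper.

```python
import os.path
import re

# The pattern from the <Files ~ "^\.ht"> container.
HT_PATTERN = re.compile(r"^\.ht")


def is_blocked(filename):
    """True if the file's base name begins with '.ht'."""
    return HT_PATTERN.search(os.path.basename(filename)) is not None
```

Note what is not covered: a database called passwords.txt or htpasswd (no leading dot) gets no protection from these lines, which is one more reason to keep such files out of the served directories altogether.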

Frequently-Asked Apache Security Questions

I've tried to address most of the common questions about Apache's security mechanisms that keep cropping up, but here are a couple I didn't cover (but which are still common):

Q:
How do I invalidate credentials? Someone has logged in to a protected page, but now wants to 'log out' so no-one else can use his browser window to access the page without logging in again. How do I make his browser forget the credentials that worked the first time?
A:
The simplest way is to redirect the client to a script that always returns a '401 Unauthorised' status, no matter what. That tells the client its credentials are invalid, so it should throw them away. To make this work, the script needs to be in the realm for which the credentials are being invalidated. The big disadvantage to this method is that the default client behaviour on getting a 401 status is to ask the user for new credentials -- so it's not a seamless operation. For a truly invisible invalidation of credentials, you need to remove them from the authentication database -- which means the user won't be able to log back in again. {sigh} It's not an easy thing to do; read the various discussions about it on the www-talk mailing list archives at the W3C.

Q:
How can I use the dbmmanage tool to manage an AuthDBMGroupFile database?
A:
In a word, you can't. At some point in the Apache 1.3 development cycle, the dbmmanage script was altered in such a way that it can now only deal with user files, and not with group files any more. This is a known deficiency, though, and hopefully the ability to handle group files will be added again to a release in the not-too-distant future.
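For the first question above, the "always 401" script might look something like the following CGI sketch. It is only an outline under stated assumptions: the realm name is hypothetical and would have to match the AuthName of the area being protected, Basic authentication is assumed, and the script must itself be installed inside that protected area.

```python
#!/usr/bin/env python3
import sys

# Hypothetical realm; use the AuthName of the protected area.
REALM = "Members Only"


def logout_response(realm):
    """Build a CGI response that always reports 401 for the given realm."""
    headers = [
        "Status: 401 Unauthorized",
        'WWW-Authenticate: Basic realm="%s"' % realm,
        "Content-Type: text/plain",
    ]
    body = "Your credentials for this area have been discarded.\n"
    # A blank line separates the CGI headers from the body.
    return "\n".join(headers) + "\n\n" + body


if __name__ == "__main__":
    sys.stdout.write(logout_response(REALM))
```

As the answer notes, most browsers will react to this by prompting for new credentials, so the user still has to dismiss the dialogue box.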

Going Further

You can also find some documentation at the following URLs:

  • <URL:http://www.w3c.org/> (look for the archives of the www-talk mailing list)
  • <URL:http://www.apache.org/docs/mod/mod_access.html>
  • <URL:http://www.apache.org/docs/mod/mod_auth.html>
  • <URL:http://www.apache.org/docs/mod/mod_auth_db.html>
  • <URL:http://www.apache.org/docs/mod/mod_auth_dbm.html>
  • <URL:http://www.apache.org/docs/mod/mod_auth_digest.html>
  • <URL:http://www.apache.org/docs/mod/core.html> (see the Satisfy and Require directives)
  • <URL:http://modules.apache.org/> (the Apache modules registry, for third-party modules)

Conclusion

Apache provides a rich set of control mechanisms for protecting Web pages, and continues to track emerging standards, such as the Digest Authentication one, very closely. With care and a little creativity, you should be able to easily apply whatever protections you want to your Web site.


Got a Topic You Want Covered?

If you have a particular Apache-related topic that you'd like covered in a future article in this column, please let me know; drop me an email at <coar@Apache.Org>. I do read and answer my email, usually within a few hours (although a few days may pass if I'm travelling or my mail volume is 'way up). If I don't respond within what seems to be a reasonable amount of time, feel free to ping me again.


About the Author


Ken Coar is a member of the Apache Group and a director and vice president of the Apache Software Foundation. He is also a core member of the Jikes open-source Java compiler project, a contributor to the PHP project, the author of Apache Server for Dummies, and a contributing author to Apache Server Unleashed. He can be reached via email at <coar@apache.org>.


Copyright © 2000 by Ken A L Coar. Limited exclusive rights granted to Internet.Com. Duplication, republication, or redistribution, in whole or in part, is expressly forbidden without the permission of the author.