Apache Week
   
   Issue 37, 18th October 1996:  

Copyright 1996-2005
Red Hat, Inc.

Apache Status

Release: 1.1.1
Beta: None

Bugs reported in 1.1.1:
  • Report that if .htaccess is a directory, children go into a loop

Apache is being prepared for a public beta release. While there is currently no date set for the release, the development work has slowed down to allow the internal testing of a stable version.

The following items are under development and may be included in the next release of Apache. See our Apache 1.2 Sneak Preview for other new features in the next release.

Maximum Number of Clients Silently Enforced

The directive MaxClients tells Apache a maximum number of child processes to use. This is used to prevent a 'run-away' system from creating more and more processes, leading to higher load and (probably) eventual system overload. It is called MaxClients because it limits the number of browsers (clients) which can be connected concurrently to the server.

The maximum value that MaxClients can be set to is 150. If it is set any higher, a limit of 150 will be applied anyway. At present, there is no warning that this is being done, so the administrator might think that they have successfully set a limit above 150. The only way to set a higher limit is to edit the Apache source code and recompile. The code to alter is in the httpd.h file. Find the line which sets HARD_SERVER_LIMIT and change the 150 to the preferred maximum number.

The next Apache release will make this more obvious. It will probably issue a warning if an attempt is made to set a MaxClients limit higher than the compiled-in value. Alternatively, it might be possible to let the MaxClients value override the internal limit. The original reason for a compiled-in limit was the external scoreboard file, which needed to be created initially at the correct size. Now that most systems use a scoreboard in shared memory, it should be easier to resize it as required.
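
The warning behaviour under discussion can be modelled in a few lines. This is only an illustrative Python sketch, not Apache's C code: the function name effective_max_clients is hypothetical, and only HARD_SERVER_LIMIT and its value of 150 come from the text above.

```python
HARD_SERVER_LIMIT = 150  # compiled-in ceiling, as set in httpd.h


def effective_max_clients(requested, hard_limit=HARD_SERVER_LIMIT):
    """Return (value_used, warning) for a requested MaxClients setting.

    Hypothetical helper: it clamps the configured value to the
    compiled-in limit and reports the clamp instead of applying it silently.
    """
    if requested > hard_limit:
        warning = (
            "MaxClients %d exceeds the compiled-in limit of %d; using %d"
            % (requested, hard_limit, hard_limit)
        )
        return hard_limit, warning
    return requested, None
```

With this model, a request for 200 clients yields the value 150 together with a warning string, while a request for 100 is passed through unchanged with no warning.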

Better API Access to Status Codes

As part of a general cleanup of HTTP handling to get ready for HTTP/1.1, all the HTTP status codes that the server can return have been added into Apache, with appropriate error messages. In addition, the #define names for the statuses have been changed to reflect their HTTP/1.1 names. For example, in the current API, a module can return a document ok status (200) by using the name DOCUMENT_FOLLOWS. It is now available as HTTP_OK. All status values are defined starting with HTTP_, to prevent conflict with existing define names. The old names are still defined for backwards compatibility, but new modules written for the Apache 1.2 module API should use the new names.

In addition, other references to hard-coded HTTP status values have been removed. In some cases, a range of values has a particular meaning, for example, any code between 200 and 299 means the request is successful. In Apache 1.2, modules can use new macro definitions to get the meaning of response codes without hard-coding the response code numbers. For example, to see if a code means that the document is being returned, a module function can use the macro is_HTTP_OK().
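
The idea of testing a status code by its range rather than by a hard-coded number can be shown with a short Python sketch. The function names below are hypothetical and are not the Apache macro names; only HTTP_OK and the 200 to 299 success range come from the text, and the 4xx/5xx error ranges are the standard HTTP status classes.

```python
HTTP_OK = 200  # named constant instead of a bare 200


def is_success(status):
    """True for any 2xx code (200-299), meaning the request succeeded."""
    return 200 <= status <= 299


def is_error(status):
    """True for 4xx (client error) and 5xx (server error) codes."""
    return 400 <= status <= 599
```

A caller can then write is_success(status) and remain correct if the server starts returning a different 2xx code for the same situation.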


Feature: Using User Authentication

There are two ways of restricting access to documents: either by the hostname of the browser being used, or by asking for a username and password. The former can be used to, for example, restrict documents to use within a company. However if the people who are allowed to access the documents are widely dispersed, or the server administrator needs to be able to control access on an individual basis, it is possible to require a username and password before being allowed access to a document. This is called user authentication.

Setting up user authentication takes two steps: firstly, you create a file containing the usernames and passwords. Secondly, you tell the server what resources are to be protected and which users are allowed (after entering a valid password) to access them.

Creating a User Database

A list of users and passwords needs to be created in a file. For security reasons, this file should not be under the document root. The examples here will assume you want to use a file called users in your server root at /usr/local/etc/httpd.

The file will consist of a list of usernames and a password for each. The format is similar to the standard Unix password file, with the username and password being separated by a colon. However, you cannot just type in the usernames and passwords because the passwords are stored in an encrypted format. The program htpasswd is used to create a user file and to add or modify users.

htpasswd is a C program that is supplied in the support directory of the Apache distribution. If it is not already compiled, you will need to compile it first. Run

  make htpasswd

in the support directory to compile it (you might need to modify the Makefile first, since any configuration you did when compiling the server itself is not available to this makefile). After compilation, you can either leave the htpasswd binary where it is, or move it to a directory on your path (e.g. /usr/local/bin). In the former case, you will need to remember to give the full pathname to run it. The examples here will assume that it is installed somewhere on your path.

Using htpasswd

To create a new user file and add the username "martin" with the password "hampster" to the file /usr/local/etc/httpd/users:

   htpasswd -c /usr/local/etc/httpd/users martin

The -c argument tells htpasswd to create a new users file. When you run this command, you will be prompted to enter a password for martin, and to confirm it by entering it again. Other users can be added to the existing file in the same way, except that the -c argument is not needed. The same command can also be used to modify the password of an existing user.

After adding a few users, the /usr/local/etc/httpd/users file might look like this:

martin:WrU808BHQai36
jane:iABCQFQs40E8M
art:FAdHN3W753sSU

The first field is the username, and the second field is the encrypted password.
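
The two-field layout can be read with a very small parser. The following Python sketch is only an illustration of the file format (the function name parse_user_file is hypothetical); it does not verify passwords, and it is not how the server itself is implemented.

```python
def parse_user_file(text):
    """Map each username to its encrypted password field."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first colon only: username before, password after
        username, _, encrypted = line.partition(":")
        users[username] = encrypted
    return users


sample = """martin:WrU808BHQai36
jane:iABCQFQs40E8M
art:FAdHN3W753sSU
"""
users = parse_user_file(sample)
```

After this runs, users["martin"] holds the encrypted string from the first line of the sample file.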

Configuring the Server

To get the server to use the usernames and passwords in this file, you need to configure a realm. This is a section of your site that is to be restricted to some or all of the users listed in this file. This is typically done on a per-directory basis, with a directory (and all its subdirectories) being protected (Apache 1.2 will let you protect individual files). The directives to create the protected area can be placed in a .htaccess file in the directory concerned, or in a <Directory> section in the access.conf file.

To allow a directory to be restricted within a .htaccess file, you first need to ensure that the access.conf file allows user authentication to be set up in a .htaccess file. This is controlled by the AuthConfig override. The access.conf file should include AllowOverride AuthConfig to allow the authentication directives to be used in a .htaccess file.

To restrict a directory to any user listed in the users file just created, you should create a .htaccess file containing:

  AuthName "restricted stuff"
  AuthType Basic
  AuthUserFile /usr/local/etc/httpd/users

  require valid-user

The first directive, AuthName, specifies a realm name for this protection. Once a user has entered a valid username and password, any other resources within the same realm name can be accessed with the same username and password. This can be used to create two areas which share the same username and password.

The AuthType directive tells the server what protocol is to be used for authentication. At the moment, Basic is the only method available. However a new method, Digest, is about to be standardised, and once browsers start to implement it, digest authentication will provide more security than the basic authentication.

AuthUserFile tells the server the location of the user file created by htpasswd. A similar directive, AuthGroupFile, can be used to tell the server the location of a groups file (see below).

These four directives between them tell the server where to find the usernames and passwords and what authentication protocol to use. The server now knows that this resource is restricted to valid users. The final stage is to tell the server which usernames from the file are valid for particular access methods. This is done with the require directive. In this example, the argument valid-user tells the server that any username in the users file can be used. But it could be configured to allow only certain users in:

  require user martin jane

would only allow users martin and jane access (after they entered a correct password). If user art (or any other user) tried to access this directory - even with the correct password - they would be denied. This is useful to restrict different areas of your server to different people with the same users file. If a user is allowed to access the different areas, they only have to remember a single password. Note that if the realm name differs in the different areas, the user will have to re-enter their password.

Using Groups

If you want to allow only selected users from the users file into a particular area, you can list all the allowed usernames on the require line. However, this means you are building username information into your .htaccess files, and might not be convenient if there are a lot of users. Fortunately there is a way round this, using a group file. This operates in a similar way to standard Unix groups: any particular user can be a member of any number of groups. You can then use the require line to restrict users to one or more particular groups. For example, you could create a group called staff containing users who are allowed to access internal pages. To restrict access to just users in the staff group, you would use

  require group staff

Multiple groups can be listed, and require user can also be given, in which case any user in any of the listed groups, or any user listed explicitly, can access the resource. For example

  require group staff admin 
  require user adminuser

which would allow any user in group staff or group admin, or the user adminuser, to access this resource after entering a valid password.
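
The way several require lines combine can be sketched as a simple "any line matches" test. This Python model is an illustration only: the function name is_allowed is hypothetical, it assumes the user has already supplied a correct password, and it is not the server's actual code.

```python
def is_allowed(username, require_lines, groups):
    """Decide whether an already-authenticated user passes the require lines.

    require_lines: list of lines such as "require group staff admin".
    groups: dict mapping group name -> set of member usernames.
    """
    for line in require_lines:
        _, kind, *names = line.split()  # drop the leading "require"
        if kind == "valid-user":
            return True
        if kind == "user" and username in names:
            return True
        if kind == "group":
            if any(username in groups.get(name, set()) for name in names):
                return True
    return False


groups = {"staff": {"martin", "jane"}, "admin": {"art", "adminuser"}}
rules = ["require group staff admin", "require user adminuser"]
```

Here is_allowed("jane", rules, groups) is true because jane is in staff, while a user who is in neither group and is not named explicitly is refused.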

A group file consists of lines giving a group name followed by a space-separated list of users in that group. For example:

  staff:martin jane
  admin:art adminuser

The AuthGroupFile directive is used to tell the server the location of the group file. Note that the maximum line length within the group file is about 8000 characters (actually 8kB). If you have more users in a group than will fit within that line length, you can have more than one line with the same group name within the file.
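
Reading a group file, including the case where one group is split over several lines, can be sketched as follows. This is an illustrative Python parser with a hypothetical name (parse_group_file), and the user bob in the sample is invented to show a repeated group line.

```python
def parse_group_file(text):
    """Map each group name to the set of usernames listed for it."""
    groups = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        name, _, members = line.partition(":")
        # Lines that repeat a group name add to the same group
        groups.setdefault(name.strip(), set()).update(members.split())
    return groups


sample = """staff:martin jane
admin:art adminuser
staff:bob
"""
groups = parse_group_file(sample)
```

The two staff lines are merged, so groups["staff"] contains martin, jane and bob.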

Problems with Large Numbers of Users

Using htpasswd to create a text list of users, and maintaining a list of groups in a plain text file is relatively easy. However if the number of users becomes large, the server has a lot of processing to do to find a user's group and password details. This processing has to be done for every request inside the protected area (even though the user only enters their password once, the server has to re-authenticate them on every request). This can be slow with a lot of users, and adds to the server load. Much faster access is possible using DBM format files. This allows the server to do a very quick lookup of names, without having to read through a large text file. However managing DBM files is more complex. Apache Week will cover the use of DBM authentication in a future issue.
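
The difference between scanning a text file and looking up a key can be illustrated with Python's standard dbm module. This is only a sketch of the keyed-lookup idea: the helper names are hypothetical, and the file it writes is not claimed to be compatible with the DBM files that Apache reads.

```python
import dbm
import os
import tempfile


def build_user_dbm(path, users):
    """Store username -> encrypted password pairs in a DBM file."""
    with dbm.open(path, "c") as db:
        for username, encrypted in users.items():
            db[username.encode()] = encrypted.encode()


def lookup_user(path, username):
    """Fetch one user's encrypted password by key, or None if absent."""
    with dbm.open(path, "r") as db:
        key = username.encode()
        if key in db:
            return db[key].decode()
        return None


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "users")
    build_user_dbm(path, {"martin": "WrU808BHQai36", "jane": "iABCQFQs40E8M"})
    found = lookup_user(path, "jane")
    missing = lookup_user(path, "nobody")
```

The lookup goes straight to the requested key instead of reading every line, which is what makes this approach attractive when the user list is large.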

Other Ways of Storing User Details

While Apache by default can only access user details in plain text files, various add-on modules are available to allow user details to be stored in databases. Besides DBM format (available with the mod_auth_dbm module), user and group lists can be stored in DB format files (with mod_auth_db). Or full databases can be used, such as mSQL (with mod_auth_msql), Postgres95 (mod_auth_pg95) or any DBI-compatible database (mod_auth_dbi).

It is also possible to have an arbitrary external program check whether the given username and password are valid (this could be used to write an interface to check against any other database or authentication service). Modules are also available to check against the system password file, or to use a Kerberos system. See the feature on Adding Modules for more information.

Limiting Methods Differently

In the example .htaccess file above, the require directive is not given inside a <Limit> section. This is valid in Apache, and means it applies to all request methods. In other servers and most example .htaccess files, the require directive is given inside a <Limit> section, such as this:

  <Limit GET POST PUT>
  require valid-user
  </Limit>

In Apache it is better to omit the <Limit> and </Limit> lines, to ensure that the protection applies to all methods. However, this format can be used to limit particular methods. For example, to limit just the POST method, use

  AuthName "restrict posting"
  AuthType Basic
  AuthUserFile /usr/local/etc/httpd/users

  <Limit POST>
  require group staff
  </Limit>

Now only members of the group staff will be allowed to POST. Other users (unauthenticated) can use other methods, such as GET. This could be used to allow a CGI program to be accessed by anyone, but only authorised users can POST information to it.
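
The effect of a <Limit> section on which requests are challenged can be modelled in a few lines of Python. The function name needs_authentication is hypothetical, and the model covers only the method check described above.

```python
def needs_authentication(method, limited_methods=None):
    """Return True if a request with this method must present credentials.

    limited_methods is the list from a <Limit ...> line, or None when
    require is given outside any <Limit> section.
    """
    if limited_methods is None:
        return True  # no <Limit>: the protection covers every method
    return method.upper() in {m.upper() for m in limited_methods}
```

With limited_methods set to ["POST"], a GET request is not challenged but a POST is. With ["GET", "POST", "PUT"], a method outside the list is not challenged, which is why omitting <Limit> is the safer choice when the whole area should be protected.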

Restricting By Hostname or Username

One feature of the NCSA server is that it can accept a request if it comes from within a particular domain name, or otherwise ask for a valid username and password (using the satisfy directive). This is a combination of restricting by username and by the user's hostname. Unfortunately Apache currently cannot do this.

How WWW Authentication Works

The method used in HTTP for user authentication is quite simple. Since HTTP is a stateless protocol - that is, the server does not remember any information about a request once it has finished - the browser needs to resend the username and password on each request. Here is how it works.

On the first access to an authenticated resource, the server will return a 401 status ("Unauthorized") and include a WWW-Authenticate response header. This will contain the authentication scheme to use (at the moment, only Basic is allowed) and the realm name. The browser should then ask the user to enter a username and password. It then requests the same resource again, this time including an Authorization header which contains the scheme name ("Basic") and the username and password entered.
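
For the Basic scheme, the browser joins the username and password with a colon and base64-encodes the result. The Python sketch below shows that step and its reverse; the function names are hypothetical. Note that base64 is an encoding, not encryption, so anyone who sees the header can recover the password.

```python
import base64


def basic_authorization(username, password):
    """Build the value of the Authorization header for the Basic scheme."""
    token = base64.b64encode(("%s:%s" % (username, password)).encode("utf-8"))
    return "Basic " + token.decode("ascii")


def parse_basic_authorization(value):
    """Recover (username, password) from a Basic Authorization value."""
    scheme, _, token = value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


header = basic_authorization("martin", "hampster")
```

Passing header to parse_basic_authorization gives back the original username and password, which is exactly what the server (or an eavesdropper) can do.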

The server checks the username and password, and if they are valid, returns the page. If the password is not valid for that user, or the user is not allowed access because they are not listed on a require user line or in a suitable group, the server returns a 401 status as before. The browser can then ask the user to retry their username and password.
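
The server's side of this exchange can be sketched as a single decision function. This is an illustrative Python model, not Apache's implementation: handle_request is a hypothetical name, and the check_password and is_allowed callables are stand-ins supplied by the caller for the user file check and the require check.

```python
import base64


def handle_request(authorization, realm, check_password, is_allowed):
    """Return (status, headers) for a request to a protected resource.

    authorization: value of the Authorization header, or None if absent.
    check_password(user, password) and is_allowed(user) are hypothetical
    callables standing in for the password and require checks.
    """
    challenge = {"WWW-Authenticate": 'Basic realm="%s"' % realm}
    if not authorization or not authorization.startswith("Basic "):
        return 401, challenge  # first access: ask for credentials
    try:
        token = authorization[len("Basic "):]
        decoded = base64.b64decode(token).decode("utf-8")
    except ValueError:
        return 401, challenge  # malformed credentials
    user, _, password = decoded.partition(":")
    if check_password(user, password) and is_allowed(user):
        return 200, {}
    return 401, challenge  # wrong password, or user not permitted
```

Every failure path returns the same 401 challenge, which is what lets the browser prompt the user to try again.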

Assuming the username and password were valid, the user might next request another resource which is protected. In this case, the server would respond with a 401 status, and the browser could send the request again with the user and password details. However, this would be slow, so instead the browser sends the Authorization header on subsequent requests. Note that the browser must ensure that it only sends the username and password to further requests on the same server (it would be insecure to send those details if the user moved onto a different server).

The browser needs to remember the username and password entered, so it can send them with future requests from the same server. Note that this can cause problems when testing authentication, since the browser remembers the first username and password that works. It can be difficult to force the browser to ask for a new username and password.

Security and Digest Authentication

While authentication does allow resources to be restricted to particular users, there are potential security issues. Some of these are:

  • Care must be taken to ensure that the resource is restricted against all methods. Use of <Limit GET>, for instance, leaves POST and other request methods unprotected.
  • The username and password are stored in a plain text file. While the password is encrypted, it is not completely safe against decryption, so the file should not be accessible to other users on the system. More importantly, it should not be placed under the document root where users from other sites could access it.
  • The username and password are only as secure as any username/password system, in that end-users should not tell others their password, or write it down, or make it easily guessable.
  • The Basic authentication scheme transmits passwords across the Internet unencrypted, so they could be intercepted. The Digest method, see below, is intended to address this issue.

The Digest Authentication scheme will make the sending of passwords across the Internet more secure. It effectively encrypts the password before it is sent such that the server can decrypt it. It works exactly the same as Basic authentication as far as the end-user and server administrator are concerned. The use of Digest authentication will depend on whether browser authors write it into their products. Apache can already do Digest authentication, when compiled with the mod_digest module (supplied with the Apache distribution).
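
As a rough illustration of what the browser sends in place of the password, the following Python sketch computes a digest response, assuming the calculation as later published in RFC 2069 (the draft current at the time of writing may differ in detail). The function names, the nonce value and the URI are all hypothetical. In that scheme the server does not literally decrypt anything: it recomputes the same MD5 hash from the details it holds and compares the two values.

```python
import hashlib


def md5_hex(text):
    """Hex MD5 of a string, the building block of the digest calculation."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Compute the response value sent instead of the password itself."""
    ha1 = md5_hex("%s:%s:%s" % (username, realm, password))
    ha2 = md5_hex("%s:%s" % (method, uri))
    return md5_hex("%s:%s:%s" % (ha1, nonce, ha2))


# Hypothetical values: the nonce is issued by the server in its challenge
nonce = "abc123"
client_value = digest_response(
    "martin", "restricted stuff", "hampster", "GET", "/private/", nonce
)
server_value = digest_response(
    "martin", "restricted stuff", "hampster", "GET", "/private/", nonce
)
```

The two values match only if both sides used the same password, and because the nonce is part of the hash, a captured response is of limited use against a different challenge.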

More Information

For more information about how user authentication works on the Internet, see the HTTP/1.0 and HTTP/1.1 documents, available from the Apache Week links page. Also available there is a link to the draft Digest Authentication specification.

For basic information about setting up user authentication, see the NCSA Tutorial (most of which also applies to Apache).

For modules which allow usernames, groups and passwords to be stored in database format files, or databases themselves, see this Apache Week feature on Adding Modules.


Comments or criticisms? Please email us at editors@apacheweek.com