Apache Week

Issue 27, 9th August 1996

Copyright 1996-2005
Red Hat, Inc.

Apache Status

Release: 1.1.1
Beta: None
Bugs in 1.1.1:

  • Language negotiation doesn't match languages against sub-languages, i.e. it treats en and en-US as completely different languages.
  • Doesn't support HTTP continuation headers
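
The first bug above is about matching a language against its sub-languages. As a rough illustration only (this is not Apache's code, and the function name language_matches is made up for this sketch), a prefix rule that lets en cover en-US might look like this in Python:

```python
def language_matches(accepted: str, available: str) -> bool:
    """Return True if an accepted language range covers an available variant.

    Illustrative prefix rule: "en" covers "en" and "en-US", but "en-US"
    does not cover plain "en". Comparison is case-insensitive.
    """
    accepted = accepted.strip().lower()
    available = available.strip().lower()
    if accepted == "*":
        # A wildcard range accepts any language.
        return True
    # Require an exact match or a match up to a "-" boundary, so that
    # "en" does not accidentally cover an unrelated tag such as "eng".
    return available == accepted or available.startswith(accepted + "-")
```

With this rule a browser asking for en would be offered a document marked en-US, while a browser asking specifically for en-US would not be given a plain en document.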

Apache NT?

There has been some discussion over whether to make a Windows NT release of Apache. A couple of groups have ported Apache to NT, but these have consisted of rewrites rather than integrating NT support into the current Apache code. At present there is no source code released for these ports.

Currently, Apache supports a range of different Unix systems, and OS/2. Supporting NT would be a logical extension, and would fit in well with plans for a multi-threaded version since NT is a multi-threaded operating system. However, there are decisions to make about how the Apache source code should be structured to support multiple platforms. Ideally, the code should be as similar as possible across all systems, with changes confined to the lower-level system-dependent parts.


The following items are under development for the next release of Apache.

Configuration file simplified

The process of configuring Apache has been simplified for the next release. It still involves editing the Configuration file, but this file has been re-written to make it easier. Previously, selecting an operating system involved finding the appropriate section in the file, then uncommenting one or more lines. Now that is reduced to simply uncommenting a line like 'PLATFORM=SOLARIS2'.

In addition, previously the various compile time options, such as 'STATUS' for extra status information, required making manual additions to the CFLAGS line. These have been replaced by simple 'Rule' lines. For example, to use the extra status information, the line 'Rule STATUS=yes' would be included in the Configuration file.

This new configuration system will be in place for release 1.2. After that, the next major release (2.0) will probably have a significantly different configuration procedure. This might include fully automatic detection of the operating system and its capabilities.

Debugging CGI Scripts

Problems with CGI scripts can be notoriously difficult to identify. To help make it easier, the next release of Apache will let the administrator log the input and output of the script.

The ScriptLog directive sets a log file to receive the debugging information. Each time Apache has a problem with a CGI program it will log all the relevant details such as the URL, CGI filename, request headers, POST data, script output and script error output. Additional directives can limit the total log file size, and the size of POST data logged.

Config Log to be Default

Apache currently comes with two modules for doing logging: the 'common log' module logs in standard Common Log Format (CLF) and is compiled in by default, while the 'config log' module makes the log file format customizable (but defaulting to working exactly like the common log module unless configured otherwise). From the next release the config log module will be the default.

With the config log module, the LogFormat directive is used to set a format for the log file (it defaults to CLF format). Any of the CLF elements can be logged, along with any incoming or outgoing header. In addition, the module can perform simple tests before logging some items: for example, it can be told to log referer headers only when the request failed.
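
To make the idea of a conditional log item concrete, here is a small Python sketch of the decision involved. It is not the module's code: the function name conditional_field and the choice of which status codes count as "not failed" are assumptions for illustration.

```python
from typing import Iterable, Optional


def conditional_field(value: Optional[str], status: int,
                      skip_statuses: Iterable[int] = (200, 302, 304)) -> str:
    """Return a header value for the log only when the request failed.

    Requests whose status is in skip_statuses are treated as successful
    (an assumption of this sketch) and get "-" in place of the value,
    as does a request that did not send the header at all.
    """
    if status in skip_statuses or not value:
        return "-"
    return value
```

Called with a referer and a 404 status this returns the referer, so broken links can be traced; for a 200 response it returns "-" and the log stays compact.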

Besides becoming the default log module, the config log module can log some additional things: the host and port of the request, the duration of the request, the outgoing content-type, and a configurable format for the date and time.

New Default Modules

So far, there will be three major changes to the modules included in the next release of Apache:

The use of server-side-includes is covered in our feature.

Executing CGI as Other Users

The next release of Apache will include the ability to execute some scripts as users other than the main server owner. At present, when a CGI is executed, it runs as the user specified by the User directive in the configuration files. While this is fine on small sites, when a site offers Web space to different users (such as with multiple virtual hosts), the use of a single user means that any user can access (and potentially change) any other user's data. The way around this at present is to use a 'setuid' wrapper script (see Apache Week issue 18 for more information about wrapper scripts).

The next Apache release will include such a wrapper program. The server itself will also be updated to set the user that a script runs as in a couple of ways: firstly, the User directive can be used inside <VirtualHost> sections to set a user for that VHost, and secondly if a request comes in for a URL starting /~user the script will be run as the 'user' named in the URL. This also applies to other sub-processes, for example, commands run from server-side-includes. It is also planned to allow the user to be set for each directory in a future release, but this might not make it into Apache 1.2.

To enable this functionality, the Apache API will have a new function call, call_exec. Modules which run sub-programs (such as the CGI and includes modules) now call this function to run the program as the correct user.
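
The rules for picking the user can be sketched in a few lines of Python. This is only an illustration of the two cases described above, not the server's code: the function name exec_user is made up, and the order of precedence (a /~user URL first, then the virtual host's User setting, then the main server user) is an assumption.

```python
import re
from typing import Optional

# Matches a URL path beginning "/~user" and captures the user name.
_USERDIR = re.compile(r"^/~([^/]+)")


def exec_user(url_path: str, vhost_user: Optional[str], server_user: str) -> str:
    """Pick the user a script should run as for a given request."""
    match = _USERDIR.match(url_path)
    if match:
        # Requests for /~user/... run as the user named in the URL.
        return match.group(1)
    if vhost_user:
        # Otherwise use the User set inside the <VirtualHost> section.
        return vhost_user
    # Fall back to the main server's User directive.
    return server_user
```

For instance, a request for /~alice/cgi-bin/test.cgi would run as alice, while /cgi-bin/test.cgi on a virtual host with its own User setting would run as that user.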

Encoding and Content Type Duplication

The extensions .Z and .gz usually represent the 'encodings' for Unix compress and gzip. Apache can be configured to set the encoding on the transmitted reply with AddEncoding. This lets browsers decode the file before handling it. However, the default mime.types file with Apache also includes entries for .gz and .Z files, so their content type will be set to the encoding scheme, which is wrong. These lines should be removed from mime.types, and will not be included in the next release.
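
The difference between an encoding and a content type is easier to see with a worked example. The Python sketch below is purely illustrative: the lookup tables are tiny made-up samples and the function name classify is hypothetical. It strips the encoding extension first, then works out the content type from what is left.

```python
from typing import Optional, Tuple

# Small sample tables for illustration only; a real server reads these
# from its configuration and mime.types files.
ENCODINGS = {".gz": "x-gzip", ".Z": "x-compress"}
TYPES = {".html": "text/html", ".txt": "text/plain", ".tar": "application/x-tar"}


def classify(filename: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (content_type, content_encoding) for a filename."""
    encoding = None
    for ext, enc in ENCODINGS.items():
        if filename.endswith(ext):
            # The last extension names the encoding, not the type.
            encoding = enc
            filename = filename[: -len(ext)]
            break
    content_type = None
    for ext, ctype in TYPES.items():
        if filename.endswith(ext):
            content_type = ctype
            break
    return content_type, encoding
```

Here index.html.gz comes out as type text/html with encoding x-gzip, which is what a browser needs in order to decompress the file and then display it as HTML.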


Identifying Browser Capabilities

Each different browser implements a different range of HTML commands. Some extend HTML with their own additions. It is now common to see pages marked as 'designed for Netscape' or 'best viewed with Internet Explorer'. These pages are clearly indicating a browser preference and might be off-putting to people with other browsers. Even when designed for a single browser, different releases of the browser have different capabilities. Older versions of Netscape and MSIE do not support frames, tables or Java, for instance.

Content providers who want their content to be acceptable on a wide range of browsers of various ages need to write pages carefully, and often cannot take advantage of the latest features. It would be better to have a way that the server could somehow know what browser is being used, and what its capabilities are, and tailor the HTML it sends to the browser appropriately.

Of course, one way of tailoring responses is to have links to different pages (e.g. "click here for a non-frames version"), but that is intrusive and implies that the user knows exactly what their browser can do. A better alternative is to get the server to automatically output the correct HTML, using either CGI or server side includes (SSIs). Both methods use the User-Agent HTTP header (seen by CGI scripts as HTTP_USER_AGENT), which says what browser is being used. For example, a Perl CGI script might contain the code:

  # Assume Netscape 2 and 3 (Mozilla/2.x, Mozilla/3.x) can display tables
  if ($ENV{'HTTP_USER_AGENT'} =~ m|^Mozilla/[2-3]|) {
    $tables = 1;
  }

The trouble with this is that the knowledge about the browser capabilities needs to be repeated in every CGI script or SSI. As new browsers come out and others are updated, modifying every CGI or SSI which uses this would be tedious.

However, the next release of Apache will make it easier for both CGI and SSIs to know what capabilities the browser has. An additional module will set environment variables based on user-definable rules which match the USER_AGENT. The administrator only has to maintain the knowledge about the browser capabilities in one place: the Apache config file. Then every CGI and SSI can use the environment variables to tailor their output, if desired.

For example, the following rules could be used:

  BrowserMatch ^Mozilla/[2-3] tables java

The first argument is a regular expression to match against the USER_AGENT. If it matches, the rest of the line is treated as environment variables to set (and these can be specified with values, for instance "html=3.2").
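
The way such a rule is applied can be sketched in Python. This is not the module's code, only an illustration of the matching described above: the names RULES and browser_env are made up, and giving a bare variable name the value "1" is an assumption of the sketch.

```python
import re
from typing import Dict, List, Tuple

# Each rule pairs a regular expression with the variables to set on a match.
# "name=value" sets an explicit value; a bare name is set to "1" here.
Rule = Tuple[str, List[str]]

RULES: List[Rule] = [
    (r"^Mozilla/[2-3]", ["tables", "java"]),
]


def browser_env(user_agent: str, rules: List[Rule] = RULES) -> Dict[str, str]:
    """Return the environment variables implied by the browser's USER_AGENT."""
    env: Dict[str, str] = {}
    for pattern, assignments in rules:
        if re.search(pattern, user_agent):
            for item in assignments:
                name, sep, value = item.partition("=")
                env[name] = value if sep else "1"
    return env
```

A request from a browser identifying itself as Mozilla/2.0 would get both tables and java set, while a browser that matches no rule gets no extra variables at all.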

Now CGI scripts and SSI files can use the environment variable tables to determine if the browser supports tables. Using XSSI, it is easy to create tailored HTML:

  <!--#if expr="$tables" -->
    <table>
     <tr><td>Welcome to my Page!
     ...
    </table>
  <!--#else -->
   <h1>Welcome to my Page!</h1>
   ...
  <!--#endif -->

Creating Dynamic Content with Server-Side Includes

While standard HTML files are fine for storing pages, it is very useful to be able to create some content dynamically, for example to add a footer or header to all files, or to insert document information such as last modified times automatically. This can be done with CGI, but that can be complex and requires programming or scripting skills. For simple dynamic documents there is an alternative: server-side-includes (SSI).

SSI lets you embed a number of special 'commands' into the HTML itself. When the server reads an SSI document, it looks for these commands and performs the necessary action. For example, there is an SSI command which inserts the document's last modification time. When the server reads a file with this command in, it replaces the command with the appropriate time.

Apache includes a set of SSI commands based on those found in the NCSA server. This is implemented by the includes module (mod_include). An extension of the standard SSI commands is available in the XSSI module, which will be a standard part of the Apache distribution from the next release. XSSI adds the following abilities to the standard SSI:

  • Variables in commands: XSSI allows variables to be used in any SSI commands. For example, the last modification time of the current document could be obtained with <!--#flastmod file="$DOCUMENT_NAME" -->
  • Setting variables: the set command can be used within the SSI to set variables.
  • Conditionals: SSI commands if, else, elif and endif can be used to include parts of the file based on conditional tests. For example, the $HTTP_USER_AGENT variable could be tested to see the type of browser and different HTML codes output depending on the browser capabilities.
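
The first of these abilities, variables in commands, amounts to substituting values into the command's arguments before it is run. A minimal Python sketch of that substitution follows; it is not how the XSSI module is written, the function name expand is hypothetical, and replacing an unknown variable with an empty string is an assumption.

```python
import re
from typing import Dict


def expand(text: str, variables: Dict[str, str]) -> str:
    """Replace each $NAME in text with its value from variables."""
    return re.sub(
        r"\$(\w+)",
        # Unknown variables expand to an empty string in this sketch.
        lambda match: variables.get(match.group(1), ""),
        text,
    )
```

Given DOCUMENT_NAME set to index.html, the argument file="$DOCUMENT_NAME" becomes file="index.html" before the flastmod command looks up the modification time.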

For details of how to use SSI in your HTML documents, see our feature on Using Server Side Includes.


Ooops...

The feature on Apache and SSL in issue 25 contained some errors. The corrections are:

  • Microsoft Internet Explorer 3 beta still cannot handle arbitrary certificate authorities
  • SSL security makes use of randomly generated symmetric keys as well as public key encryption
  • To use RSA in the US, you must use the RSA libraries

Comments or criticisms? Please email us at editors@apacheweek.com