Apache Week
   
Issue 28, 16th August 1996

Copyright 1996-2005
Red Hat, Inc.

Apache Status

Release: 1.1.1
Beta: None
Bugs reported in 1.1.1:

  • Report of request "NULL" being logged in access log
  • Status module can give empty output on OSF
  • Status module can report a server start time of 1st Jan 1970
  • Problems with kill -TERM on IRIX not killing all children
  • Can core dump if DNS lookup fails

Bugs fixed in next release:

  • Language headers and types match on sub-tags (e.g. en-US

The following items are under development for the next release of Apache.

Preventing Mismatched Modules

The module API has changed a couple of times since Apache was first released. This should not have affected module authors, since the interface they program against has remained largely the same. However, some internal data structures have changed, which means that pre-compiled modules built against an older API might not work. Re-compiling the module should make it work again. To prevent any problems, Apache will report if a module compiled with an old version of the API is being used, and will stop.
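
The check described above can be sketched as follows. This is only an illustration in Python, not Apache's code: the version number, the Module class and the check_modules function are all hypothetical names invented for this example.

```python
# Hypothetical API version number; the real name and value in Apache may differ.
SERVER_API_VERSION = 19960725


class Module:
    """Stand-in for a compiled module that records the API version it was built with."""

    def __init__(self, name, api_version):
        self.name = name
        self.api_version = api_version


def check_modules(modules, server_version=SERVER_API_VERSION):
    """Report the first module built against an older API version and stop."""
    for module in modules:
        if module.api_version < server_version:
            raise RuntimeError(
                "module %s was compiled with API version %d, but the server "
                "uses version %d; please re-compile it"
                % (module.name, module.api_version, server_version)
            )
```

The point of the design is that the mismatch is caught when the server starts, rather than showing up later as a crash caused by a changed data structure.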

Fiddling with Content Negotiation

Content negotiation lets the server pick the best format of response to send back to the browser, based on what the browser says it can accept (see also the Apache Week feature on Content Negotiation). Unfortunately many current browsers do not implement this properly or fully. For example, Netscape Navigator lists the MIME types that it can accept like this (this is slightly simplified):

  Accept: image/gif, image/jpeg, */*

The final */* catches any media type. Apache deals with the request by sending back what it thinks is the best match - for example, it will send back a GIF or JPEG version in preference to any other type (since explicitly listed types are preferred over others matched by */*).

However, if the various media types on the server have quality values associated with them, these will be used in preference to checking whether the type was listed explicitly. For instance, if the server has a particular resource in GIF and TIFF format, with the TIFF having a source quality of 0.7 and the GIF having 0.3, the TIFF will be preferred. This is probably not what the browser wants, since it would list image/tiff explicitly if it could handle it. The browser should really be sending the type */* with a quality (or desirability) lower than that of the explicitly listed types, for example:

  Accept: image/gif, image/jpeg, */*; q=0.3

Here the variants which match the */* would have a desirability of 0.3, while those which match either the GIF or JPEG type would have a desirability of 1.0.

An older proposal was to force the */* type on the Accept: line to have a low priority, but this could confuse browsers which do the correct thing (as in the second example Accept: line, above). A better method would be to give */* a lower priority automatically only if none of the Accept: header types include a quality factor.
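
The better method can be illustrated with a short Python sketch. It is not Apache's implementation. It assumes that a variant is scored by multiplying the browser's quality for its type by the server's source quality, that 0.01 is a reasonable "low" value for */* (the value is made up for this example), and it ignores partial wildcards such as image/*. All function names are hypothetical.

```python
WILDCARD_Q = 0.01  # assumed low quality given to */* when the browser sends no q values


def parse_accept(header):
    """Return a list of (media_type, q, has_explicit_q) from an Accept header value."""
    entries = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue
        q, explicit = 1.0, False
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q, explicit = float(value), True
        entries.append((media_type, q, explicit))
    return entries


def effective_qualities(header):
    """Lower the quality of */* only if no type in the header carries a q value."""
    entries = parse_accept(header)
    any_explicit = any(explicit for _, _, explicit in entries)
    qualities = {}
    for media_type, q, _ in entries:
        if media_type == "*/*" and not any_explicit:
            q = WILDCARD_Q
        qualities[media_type] = q
    return qualities


def best_variant(header, variants):
    """variants maps media type to source quality; return the best-scoring type."""
    qualities = effective_qualities(header)

    def score(media_type):
        accept_q = qualities.get(media_type, qualities.get("*/*", 0.0))
        return accept_q * variants[media_type]

    return max(variants, key=score)
```

With the first example header, best_variant("image/gif, image/jpeg, */*", {"image/gif": 0.3, "image/tiff": 0.7}) chooses image/gif, because the TIFF only matches the down-weighted */*. A browser that sends "*/*; q=0.3" itself is left alone, since its header already contains a quality factor.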

Also, if the server has two (or more) representations that are equally good for sending back to the browser, it currently picks the one whose type is listed first on the Accept: header. This is not required by the protocol, and a better way of picking the best one to send back would be just to use the smallest (which will reduce transmission time and network bandwidth usage).
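
The proposed tie-break is easy to express in code. In this Python sketch each variant is a dictionary with a "score" (its negotiated quality) and a "size" in bytes; these field names and the pick_variant function are hypothetical and exist only for this example.

```python
def pick_variant(variants):
    """Return the variant with the highest score; equal scores go to the smallest size."""
    # Sorting key: a higher score comes first (hence the negation), then fewer bytes.
    return min(variants, key=lambda v: (-v["score"], v["size"]))
```

For example, given a 48000-byte GIF and a 21000-byte JPEG that both score 1.0, pick_variant returns the JPEG regardless of the order in which the types appeared on the Accept: header.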

Multiple Configurable Log Files

The config log module currently allows the format of the log file to be customised. In the next Apache release, this will replace the common log module.

Apache also comes with two special log modules: mod_log_referer and mod_log_agent. These both log specific parts of each request: the first logs the referrer information (giving the page which the user clicked on to get the current page), and the second logs the user agent (browser) being used. Both of these log modules use a fixed format, and to change the format of either log file requires editing the module source code and recompiling.

Under development is a version of the configurable log module that allows multiple log files, each of which can be customised. This lets the server administrator add log files which log referrers and user agents without having to compile in the special log modules, and lets the formats be changed easily in the config files. If any other special log files are needed these can be added equally easily (for example, it is simple to add a log file which logs the Accept-Language: header to see what language preferences readers have).
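
To show the idea, here is a small Python sketch of several log files, each with its own format. It is not the module's code: the log names, the log_request function and the two placeholders it understands (%{Name}i for a request header and %U for the requested path, loosely modelled on the config log module's format strings) are assumptions made for this example.

```python
import re

# Hypothetical log definitions: log name -> format string.
LOG_FORMATS = {
    "referer_log": "%{Referer}i -> %U",
    "agent_log": "%{User-Agent}i",
    "language_log": "%{Accept-Language}i",
}

# Matches either %{Header-Name}i or %U.
_TOKEN = re.compile(r"%\{([^}]+)\}i|%U")


def format_entry(fmt, path, headers):
    """Expand one format string; headers must use lower-case names."""

    def expand(match):
        header_name = match.group(1)
        if header_name is None:  # the token was %U
            return path
        return headers.get(header_name.lower(), "-")  # "-" when the header is absent

    return _TOKEN.sub(expand, fmt)


def log_request(path, headers, formats=LOG_FORMATS):
    """Return the line each configured log would record for one request."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {log: format_entry(fmt, path, lowered) for log, fmt in formats.items()}
```

Adding the Accept-Language: log mentioned above is then a one-line change to the table of formats, with no new code to compile.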


Further Ahead: Filtering

In this section, we look at one of the possible features in Apache release 2.0: filters. There is no guarantee that any of this will actually make it into any particular version of Apache, but the ideas outlined below have been discussed by the Apache developers.

What is Filtering?

For each request, Apache identifies the source of the response, which could be a document on the disk, the output of a CGI script, or the output of a module such as the server-side-includes module. The appropriate contents are then sent straight back to the user, with no opportunity to process the output further. For example, the output of a CGI script cannot be parsed for server-side-includes commands. Although Apache could be patched to allow specific cases such as this, a more general solution would be to implement arbitrary filtering of content.

Why Use Filtering?

At present, the source used to generate the content for a request is determined in one of a number of ways: direct from a file, processed by a module based on the file's extension, obtained from a CGI run from a given file name or based on the request method, or obtained from an internal handler. Each of these methods sends the result straight back to the browser. Filters would allow the output of any one of these to feed into any other content generator. For example, a CGI script could output server-side-include commands which the SSI module would process.

How it Could be Implemented

Implementing filtering in Apache requires the code to be re-organised in two ways:

  • The file input/output routines need to be abstracted so that modules can operate on a stream without knowing where the input comes from or where the output is going.
  • The core code needs to be modified to allow multiple handlers to be called for one request, each taking the output of the previous handler as input. This is much easier if the server can execute these in parallel, using multi-threading. Multi-threading is planned for 2.0.
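
Both points can be illustrated with a toy Python sketch. It is not a design for Apache: the handler and filter names are invented, Python generators stand in for the parallel execution that multi-threading would provide, and the filter assumes a directive never spans two chunks.

```python
def cgi_handler():
    """Stand-in content generator: yields output chunks as a CGI script might."""
    yield "<p>Hello</p>\n"
    yield '<!--#echo var="DATE" -->\n'


def ssi_filter(chunks, variables):
    """Replace SSI echo directives in a stream without knowing where it came from."""
    for chunk in chunks:
        for name, value in variables.items():
            chunk = chunk.replace('<!--#echo var="%s" -->' % name, value)
        yield chunk


def run_pipeline(source, filters):
    """Feed the output of each stage into the next and collect the final response."""
    stream = source
    for apply_filter in filters:
        stream = apply_filter(stream)
    return "".join(stream)
```

Calling run_pipeline(cgi_handler(), [lambda s: ssi_filter(s, {"DATE": "16 Aug 1996"})]) produces the CGI output with the echo directive replaced by the date. Because each stage only sees a stream of chunks, further filters can be added to the list without changing the existing ones.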

HTTP Protocol Version 1.1

A new version of the Hypertext Transfer Protocol (HTTP) is under final review by the Internet standards body. When released, HTTP/1.1 will be a major update to the current standard (called HTTP/1.0 and documented in RFC 1945).

HTTP determines how browsers and servers communicate over the network. The features provided in the specification determine what facilities are available to browsers for obtaining web pages. The specification also includes details about how Web proxies and caches are supposed to work.

The next version of Apache, 1.2, will fully support the HTTP/1.1 standard, except for the proxy part. In our feature article this week, we look at what HTTP/1.1 adds to the current protocol and how this will affect servers and browsers.

Go to Apache Week feature on HTTP/1.1.


Comments or criticisms? Please email us at editors@apacheweek.com