Apache Week
   
   Issue 93, 28th November 1997:  

Copyright 1996-2005
Red Hat, Inc.


Apache Status

Apache Site: www.apache.org
Release: 1.2.4 (Released 22nd August 1997) (local download sites)
Beta: 1.3b3 (Released 20th November 1997) (local download sites)

Apache 1.2.4 is the current stable release. Users of Apache 1.2.3 and earlier should upgrade to this version. The next release will be 1.3. A beta test release of 1.3 is available now for both Unix and Windows 95/NT systems.

Bugs fixed in 1.3b4

These bugs have been found and fixed in 1.3b4.

Because of the major differences between Windows and Unix, these are separated into bugs which affect Windows systems only, and other bugs (which may affect Windows as well). Unix users can ignore the bugs listed in the Windows section.

Windows-specific Bugs

  • Possible memory corruption related to use of the scoreboard.

Patches for bugs in Apache 1.2.4 may be made available in the "apply to 1.2.4" directory on the Apache site. Some new features and other unofficial patches are available in the "1.2 patches" directory. For details of all previously reported bugs, see the Apache bug database and known bugs pages. Many common configuration questions are also answered in the Apache FAQ.

Development has slowed down over the last couple of weeks to prepare for the release of Apache 1.3. Now that the first beta is out, Apache is in a "feature freeze" where no new features will be added. The only changes from now on will be bug-fixes.


Inside Apache on Windows

The last two betas of Apache 1.3 have been available for Windows NT and 95. The first was as source only, while a very early binary distribution is available for 1.3b3. This feature looks at how Apache on NT works, how it differs from Apache on Unix, how it can be installed and configured, and how modules can be written for it.

Apache on NT is still at an early stage, and may change before the final release. For the latest information, read the documentation on the Apache site (www.apache.org).

This feature will talk about "Apache NT" because Windows 95 is not a suitable operating system on which to run a web server. However, unless otherwise noted Apache NT will work on Windows 95 as well.

A Quick History

Support for Apache on NT was not planned for Apache 1.3. When Apache 1.2 was being developed, the plan was to do a major restructuring of the code after 1.2 to support various operating systems, and to call the next version 2.0. This would involve making the core of Apache independent of the underlying operating system, and providing an abstraction layer which mapped things Apache wanted to do onto the actual implementation in each supported operating system. It would also involve providing support for multi-threading, as well as the current multi-process mode of operation. Modern operating systems, including Windows, Solaris and Linux, would be able to take advantage of multi-threading. Doing all this correctly would take a long time, and at the same time a number of other fundamental changes would be implemented, such as restructuring the module callback system.

But soon after 1.2 was released, a full port of Apache to NT was supplied by an Apache user, Ambarish Malpani of ValiCert. This was a good opportunity to make Apache work on NT, since there was growing interest in using Apache on NT. So a new release, 1.3, was planned which would support Windows NT. It would also have some new features for both Unix and NT.

After 1.3 is finished, work will again start on version 2.0, which will have proper support for multiple operating systems and multiple process and thread models, as well as a new module callback interface and other important changes.

Multithreading

There are some obvious differences between any product on NT and the same product on Unix. For example, the NT version has to support both long and short filenames and cope with case insensitivity, while on Unix each filename is unique. However, the major area of operation that differs between Apache NT and Apache for Unix is that the former operates with multiple threads.

Using multiple threads on NT is required because NT does not support the multiple process model as used on Unix. There are two things that NT cannot do: firstly, it cannot duplicate a running process into two identical copies (on Unix, this is a fork), and secondly, it cannot give multiple processes access to the same incoming network socket (on Unix, the multiple processes all listen for incoming connections, with one process picking up each new connection). Because of these limitations, Apache on NT has to use a single process to handle all incoming requests.

Luckily, NT does support multiple threads, so Apache NT can use threads where the Unix Apache would use processes, with some slight differences. It creates a single "parent" thread, which listens for each new incoming connection. This thread accepts the connection but does nothing further with it. Instead it puts the connection onto a pile, and one of a number of "worker" threads picks it off the pile and handles it. Once the request (or, in the case of a kept-alive connection, the whole connection) is finished, the worker thread returns to see if there are any more outstanding connections on the pile being filled by the parent thread.

This is roughly similar to the multiple-process model as used by Unix, with a few subtle differences:

  • On Unix, the parent process never has anything to do with the incoming requests, but on NT, the parent thread does some initial work on each new connection before a worker thread takes over.
  • If something serious goes wrong during the processing of a request, the process handling it may die (for example, if there is a bug in a third-party module). Once it dies, Apache will restart a new child process. On Unix, the death of a process will not affect any other requests in progress. On NT, if a process dies it will take down all the other threads currently running, leaving requests unfinished. This is an unavoidable behaviour when using threads instead of processes.
  • Because of the preceding problem, Apache on NT does still use multiple processes, and if the one currently running dies, another one will take over. This is intended to provide some more reliability in the face of unexpected bugs or module problems.
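The parent-and-worker arrangement described above can be sketched in C roughly as follows. This is only an illustration, not Apache's code: it uses POSIX threads as a stand-in for the Win32 threads Apache NT actually uses, plain integers stand in for accepted sockets, and every name in it (conn_queue, run_pool, QUEUE_CAP and so on) is hypothetical.

```c
#include <pthread.h>
#include <string.h>

#define QUEUE_CAP 8            /* hypothetical size of the pending pile */
#define MAX_WORKERS 64         /* hypothetical upper limit on workers */

/* The "pile" of accepted connections (ints stand in for sockets). */
typedef struct {
    int items[QUEUE_CAP];
    int head, count;
    int closed;                /* set once the parent stops accepting */
    int handled;               /* connections processed by workers */
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
} conn_queue;

/* Parent side: queue one accepted connection, waiting if the pile is full. */
static void queue_push(conn_queue *q, int conn)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[(q->head + q->count) % QUEUE_CAP] = conn;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Worker side: take one connection; returns 0 once the pile is closed and empty. */
static int queue_pop(conn_queue *q, int *conn)
{
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && !q->closed)
        pthread_cond_wait(&q->not_empty, &q->lock);
    if (q->count > 0) {
        *conn = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        ok = 1;
        pthread_cond_signal(&q->not_full);
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

static void *worker_main(void *arg)
{
    conn_queue *q = arg;
    int conn;
    while (queue_pop(q, &conn)) {
        /* A real server would read the request and send a response here. */
        pthread_mutex_lock(&q->lock);
        q->handled++;
        pthread_mutex_unlock(&q->lock);
    }
    return NULL;
}

/* Start the workers, let the "parent" queue n_conns fake connections,
 * then shut down. Returns the number handled, or -1 if no worker started. */
int run_pool(int n_workers, int n_conns)
{
    conn_queue q;
    pthread_t workers[MAX_WORKERS];
    int i, started = 0;

    if (n_workers < 1) n_workers = 1;
    if (n_workers > MAX_WORKERS) n_workers = MAX_WORKERS;
    memset(&q, 0, sizeof q);
    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.not_empty, NULL);
    pthread_cond_init(&q.not_full, NULL);

    for (i = 0; i < n_workers; i++) {
        if (pthread_create(&workers[started], NULL, worker_main, &q) != 0)
            break;
        started++;
    }
    if (started > 0)
        for (i = 0; i < n_conns; i++)
            queue_push(&q, i);            /* parent accepts and queues */

    pthread_mutex_lock(&q.lock);
    q.closed = 1;                         /* tell idle workers to exit */
    pthread_cond_broadcast(&q.not_empty);
    pthread_mutex_unlock(&q.lock);
    for (i = 0; i < started; i++)
        pthread_join(workers[i], NULL);

    pthread_cond_destroy(&q.not_full);
    pthread_cond_destroy(&q.not_empty);
    pthread_mutex_destroy(&q.lock);
    return started > 0 ? q.handled : -1;
}
```

Calling run_pool(4, 20) queues twenty fake connections for four workers and returns once all of them have been picked up. Note that all the workers share one address space, which is why a crash in any one of them would end the whole process.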

Finally it is worth noting that multithreading is not used on Unix in 1.3. This is partly because the implementation of multithreading in Apache 1.3 is very specific to Windows, and partly because multithreading is implemented differently on different Unix systems. The next release after 1.3 (probably 2.0) should support multithreading on various Unix platforms.

Installing and Starting Apache NT

Apache NT is a console application. This means it uses a text window when it runs. On NT (not 95) it can also be run as a service, which is the preferred method of operation.

When Apache is installed (either from a binary distribution, or by compiling), it will by default expect its server root to be in the directory C:/APACHE. As with Unix, this can be changed when Apache starts with the -d command line option. If the filename has spaces in it (e.g. Program Files/Apache/ServerRoot), then enclose the argument in double quotes, e.g.

  apache -d "/Program Files/Apache/ServerRoot"

To install Apache as a service on NT, run it with the -i option. Then it can be started and stopped from the service manager, just like any other NT service. To remove it from the services list, run Apache with the -u option. Any problems installing, starting, stopping or removing the service will be logged to the error log (as defined by the ErrorLog directive). Note that errors will not be reported to the text window when you run Apache interactively. So always check the error log after running Apache with -i or -u.

Note that Apache NT does not use the registry at all to store configuration or path information. Apache is configured just like the Unix versions, using httpd.conf, srm.conf and access.conf files in the conf subdirectory under the server root directory. If this is not in the default location, C:/APACHE, then the -d option must be given when Apache starts (or, like on Unix, give the -f option and the full path to the httpd.conf file).

Windows does not support Unix-style signals, so you cannot get Apache to reread its configuration by sending it a HUP or USR1 signal. At present, the only way to get Apache to restart is to stop and start it manually.

Configuration Differences

When configuring Apache NT there are some differences from Unix. These include

  • In directory names (e.g. in ServerRoot, <Directory>, etc) always use Unix-style forward slashes to separate path components. Never use backslashes (\). If the path includes spaces, surround it with double quotes. Drive letters may be used at the start of paths, but if omitted Apache will assume the same drive that Apache was started from.
  • StartServers, like Unix, gives a number of processes to start. All apart from one are standby processes, so do not give a large number. 3 should be ok. MinSpareServers and MaxSpareServers are ignored.
  • The new directive ThreadsPerChild gives the number of worker threads to create within the currently active process. This determines the potential throughput of the server. Unlike Unix processes, the number of threads does not alter based on workload. A value of 25 to 50 should be okay here.
  • Piping error or access log messages to a child process is not supported.
  • Adding additional modules is easier with Apache NT. The module should be compiled into a DLL file and placed in the modules directory of the server root. Then the LoadModule directive can be used to load this module into Apache. This directive to load the module must occur before any directive which is defined within that module. The syntax of this directive is LoadModule structure DLL where the structure name is the internal module structure name, and DLL is the path to the DLL file, relative to the server root.
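Put together, the points above might lead to a configuration fragment along these lines. The values are only illustrative picks from the ranges suggested above, and the server root path and DLL file name are assumptions made for the example; check the names shipped with your distribution.

```
# Forward slashes, optional drive letter, quotes because of the space
ServerRoot "c:/Program Files/Apache"

# One active process plus two standbys
StartServers 3

# Fixed number of worker threads in the active process
ThreadsPerChild 50

# LoadModule <structure name> <DLL path relative to the server root>
# This line must come before any directive defined by the module.
LoadModule status_module modules/ApacheModuleStatus.dll
```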

Most other features of Apache from Unix are supported on NT. This includes regular expressions, server-side includes, status and info modules, CGI, access restrictions, proxy module and so on (with the additional modules supplied as DLL files ready for use with LoadModule).

Apache NT Modules

The core Apache executable contains the modules compiled in by default into Unix versions of Apache. These are the following modules:

  • core_module: core features
  • mime_module: MIME types
  • access_module: access restriction by client address
  • auth_module: access restriction by username/password
  • negotiation_module: server-side content negotiation
  • includes_module: SSI
  • autoindex_module: directory index files
  • dir_module: directory indexes
  • cgi_module: CGI
  • userdir_module: per-user directories
  • alias_module: URL aliasing
  • env_module: setting environment variables
  • config_log_module: logging
  • asis_module: asis files
  • imap_module: server-side imagemaps
  • action_module: mapping extensions to handlers
  • setenvif_module: conditional setting of environment variables

In addition, Apache NT contains the following two NT specific modules:

  • dll_module: supports loading of other modules as DLL files
  • isapi_module: supports server extensions using ISAPI

The isapi_module provides an alternative method of extending the server: instead of using the Apache module API, you can use the ISAPI protocol.

Configuring CGI

Using CGI programs on NT is slightly different from Unix. On Unix, a program is either run as a binary executable, or if it is a script, the shell runs it by looking first for a #! line to decide which interpreter to use. This allows scripts to be written in shell, perl, python, etc.

Using executable files for CGI (such as precompiled C programs) works the same way on NT as on Unix. The file must be placed in a ScriptAlias directory, or the extension (e.g. .EXE) must be mapped onto the cgi-script handler type with AddHandler and the ExecCGI option must be enabled. Typical directives to enable .EXE as CGI within a particular directory are:

  AddHandler cgi-script exe
  <Directory c:/apache/htdocs>
    Options +ExecCGI
  </Directory>

Executing scripts is a little more difficult. Windows NT's command line interpreter (CMD.EXE, or COMMAND.COM on 95) does not support the use of additional script languages or the #! special sequence. Apache NT does provide support for #!. To use this, first enable CGI execution for your CGI files (e.g. use a .CGI extension, enabled as for the .EXE example above), then create your CGI script. On the first line put #! followed by the full path to the interpreter to use. For example, to use C:\BIN\PERL, use

  #!C:/BIN/PERL
  
  print <<EOF;
  Content-Type: text/html

  <h1>My CGI Program in Perl!</h1>
  EOF

Scripts to be executed as batch files do not need the #! line. Note that there is no space between the #! and the pathname of the interpreter. This feature may make it easier to copy scripts from a Unix server onto NT.

Using and Developing Modules with NT Support

Many modules written for Unix will work fine on Apache NT, possibly with minor changes for the differences between Unix and NT. However, there may be a problem because of the multithreaded nature of Apache NT. A multithreaded program has to take extra care to ensure that it does not corrupt its data, because multiple threads are running with the same set of variables and local data. Modules written for Unix will probably not be designed to work when run multithreaded. If they are lucky, they will not use any of the C language features that are unsafe when run multithreaded. If they do, they may cause random and unpredictable results when run under Windows.

The main problem caused by multithreading is the use of static or global data (variables or memory) which could be updated by multiple threads. The Apache module API has been extended to cope with this. Global variables which can be updated by the module should be defined with the APACHE_TLS symbol, like this:

  APACHE_TLS int module_status;

"TLS" stands for "thread local storage". It ensures that each thread gets its own copy of the variable.

Besides threading, another difference between Unix and NT is that on Unix modules are compiled right into the Apache executable as object or library files. On NT modules are loaded from DLL files. The Apache executable needs to be able to access the module definition structure once it has been loaded. This is done with the use of MODULE_VAR_EXPORT, which makes any variable defined within the DLL module available to the Apache executable. You will normally only need to use this for the module structure itself. For example, here is the definition of mod_env's module structure:

  module MODULE_VAR_EXPORT env_module;

Going the other way, the DLL file needs to be able to access the internal Apache functions defined by the Apache module API. These functions are exported from the Apache executable with the API_EXPORT macro in the core Apache code (you will not need to use this in modules). Only functions defined with this macro are available to module DLLs.

API_EXPORT, APACHE_TLS and MODULE_VAR_EXPORT are #define values, which are set to the correct values on Unix, so modules can be written which compile on both Unix and Windows.
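As a rough sketch of how such macros let one source file build on both platforms, the fragment below defines stand-ins for APACHE_TLS and MODULE_VAR_EXPORT. The expansions shown (Microsoft compiler keywords on Windows, nothing elsewhere) are assumptions made for illustration; the real definitions come from the Apache headers. The simplified module type, example_module and bump_status are hypothetical names, and API_EXPORT is left out because modules do not need it.

```c
/* Assumed expansions, guarded so they would not clash with real headers. */
#if defined(_WIN32) && defined(_MSC_VER)
#  ifndef MODULE_VAR_EXPORT
#    define MODULE_VAR_EXPORT __declspec(dllexport)  /* visible outside the DLL */
#  endif
#  ifndef APACHE_TLS
#    define APACHE_TLS __declspec(thread)            /* one copy per thread */
#  endif
#else
#  ifndef MODULE_VAR_EXPORT
#    define MODULE_VAR_EXPORT                        /* nothing needed on Unix */
#  endif
#  ifndef APACHE_TLS
#    define APACHE_TLS
#  endif
#endif

/* Hypothetical, much simplified stand-in for Apache's module structure. */
typedef struct {
    const char *name;
    int version;
} module;

/* Module-wide state that the module updates: one copy per thread on NT. */
APACHE_TLS int module_status;

/* The module structure itself is the one variable that must be exported. */
module MODULE_VAR_EXPORT example_module = { "example", 1 };

/* Increment this thread's copy of the status counter and return it. */
int bump_status(void)
{
    return ++module_status;
}
```

On a Unix build both macros expand to nothing, so the declarations reduce to ordinary C globals and the same file compiles unchanged.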

Some Problems with Apache NT

There are still some problems with Apache on NT. It is at an early beta stage, and bugs are still being found. There are also some issues because it is the first NT version of a long-standing Unix program, and it does not work the way a native NT application might be expected to. A couple of examples: it does not use the registry, and log messages are written to a file which is held open all the time, making log rotation difficult.

This is also the first version of Apache to support multithreading, and there may be problems with the way that multiple threads are handled within the code, plus of course the inherent problems because threads can access each other's data. So modules have to be written to be thread-safe.


Apache in the News

Apache is number 10 in a C|Net feature on the top ten things to be thankful for. In the Builder.Com article top 10 things to be thankful for, Apache is commended for standing up to the "combined onslaught of Netscape and Microsoft". Netscape itself is at position 8 for standing up to Microsoft.


Comments or criticisms? Please email us at editors@apacheweek.com