Apache Week
   

Copyright 1996-2005
Red Hat, Inc.

Apache 1.2 API Guide

For module authors, a comprehensive list of changes to the Apache module API.

First published: 6th June 1997

Introduction

Apache 1.2 is now out. Here we list all the module API changes compared to the API in Apache 1.1.3. Anyone who has written a module for Apache 1.1.3 or earlier should read this to see if they need to make modifications for it to work with 1.2. In any case, Apache 1.2 adds many new features from HTTP/1.1, and modules might want to take advantage of them. See also our Guide to Apache 1.2.

First published in Apache Week issue 44 (6th December 1996), last updated 6th June 1997.


API Changes

API Version

The module API version is now 19970526.

New Parse Headers Phase

A new phase of request processing is available, to allow modules to process the request headers early on in the request.

Defining and Processing Directives

The functions which handle directives should now return type "const char *" instead of "char *". If this is not done, compiling the module might result in type-mismatch warnings, although it will still work.

Directives can now be defined in more than one module at once. Each module is given the chance to handle the directive, and can decline it by returning DECLINE_CMD. This gives other modules the chance to handle the directive. This is used in Apache in mod_auth.c and mod_auth_dbm.c, which both support AuthUserFile, but only handle it if they recognise the file type argument.
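As a rough illustration of declining a directive, the sketch below shows a handler that accepts a file name and an optional type, and returns DECLINE_CMD when it does not recognise the type. It is written to compile outside Apache, so cmd_parms and DECLINE_CMD are local stand-ins (a real module gets them from the Apache headers), and example_config, example_set_user_file() and the "standard" type name are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins so the sketch compiles outside Apache; a real module takes
 * these from the Apache headers instead. */
#define DECLINE_CMD "\a\b"
typedef struct cmd_parms cmd_parms;

/* Hypothetical per-directory configuration for this example. */
typedef struct {
    const char *user_file;
} example_config;

/* Directive handler: note the "const char *" return type.
 * Returns NULL on success, or DECLINE_CMD so that another module
 * defining the same directive gets a chance to handle it. */
const char *example_set_user_file(cmd_parms *cmd, void *mconfig,
                                  const char *file, const char *type)
{
    example_config *conf = (example_config *)mconfig;

    (void)cmd; /* unused in this sketch */

    /* This hypothetical module only understands the "standard" type
     * (or no type at all); anything else is left to other modules. */
    if (type != NULL && strcmp(type, "standard") != 0)
        return DECLINE_CMD;

    conf->user_file = file;
    return NULL;
}
```

If every module declines, the directive goes unhandled, so a module should only decline when it genuinely cannot process the arguments.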

Directives can now take up to three arguments, and can take optional arguments. The number of arguments is specified in the module's command table, with values such as TAKE2 (for two arguments). Possible values are now:

  • TAKE3: takes 3 arguments
  • TAKE12, TAKE23, TAKE123, TAKE13: takes a variable number of arguments (1 or 2, 2 or 3, 1 or 2 or 3, 1 or 3 respectively). The function called should declare arguments for the maximum number of arguments the directive can take. Arguments not set on the directive will be passed to the function as NULL.

Finally, the cmd_parms structure has been updated (this is passed in as the cmd argument to directive handlers). A new 'cmd' element is now available, pointing at the directive's command table definition (command_rec).
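A handler for a directive with optional arguments might look something like the sketch below, which assumes a TAKE123-style directive. The command_rec and cmd_parms types here are cut-down stand-ins containing only the fields the sketch uses, and example_args and example_take123() are hypothetical names. A real module would normally build its error string from a pool rather than a static buffer.

```c
#include <stddef.h>
#include <stdio.h>

/* Cut-down stand-ins for the Apache structures, holding only what the
 * sketch needs. */
typedef struct { const char *name; } command_rec;
typedef struct { const command_rec *cmd; } cmd_parms;

/* Hypothetical module configuration recording what was supplied. */
typedef struct {
    const char *first, *second, *third;
    int count;
} example_args;

/* Declares the maximum of three arguments; any argument not given on
 * the directive arrives as NULL. */
const char *example_take123(cmd_parms *cmd, void *mconfig,
                            const char *a1, const char *a2, const char *a3)
{
    static char err[128]; /* a real module would allocate from a pool */
    example_args *out = (example_args *)mconfig;

    if (a1 == NULL || a1[0] == '\0') {
        /* cmd->cmd points at the command table entry, so the error
         * can name the directive being processed. */
        snprintf(err, sizeof err, "%s requires a non-empty first argument",
                 cmd->cmd->name);
        return err;
    }

    out->first = a1;
    out->second = a2;
    out->third = a3;
    out->count = 1 + (a2 != NULL) + (a3 != NULL);
    return NULL;
}
```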

Supporting New Request Methods

Apache now supports the additional OPTIONS and TRACE request methods. Two new defines are available for these methods, M_OPTIONS and M_TRACE, and the request_rec's method_number element can be set to either of them. A handler can send an OPTIONS response using send_http_options() (although it could also decline the request and let the default handler send the response). Handlers can also set the new allowed element of the request_rec to enable the creation of a proper HTTP/1.1 Allow header. The value is a bitmask built by shifting 1 left by the value of each permitted method's M_ define. For example, to specify that only GET and POST are allowed for a particular resource, the following could be used:

  r->allowed = (1 << M_GET) | (1 << M_POST);
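To make the bitmask idea concrete, here is a small standalone sketch that builds such a mask and tests a method against it. The M_ values are stand-ins chosen for the example (a real module uses the defines from the Apache headers), and the two function names are hypothetical.

```c
/* Stand-in method numbers for illustration only; real modules use the
 * M_ defines from the Apache headers. */
enum { M_GET = 0, M_PUT = 1, M_POST = 2, M_DELETE = 3 };

/* Build a mask permitting GET and POST only: one bit per method,
 * at the position given by the method's number. */
int example_allowed_mask(void)
{
    return (1 << M_GET) | (1 << M_POST);
}

/* Non-zero if the given method's bit is set in the mask. */
int example_method_allowed(int allowed, int method_number)
{
    return (allowed & (1 << method_number)) != 0;
}
```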

Reading PUT or POST data

The way that a module reads PUT or POST data has been completely changed. This is necessary to support HTTP/1.1, which can send this data in a 'chunked' encoding. Modules can request that they get the data after it has been 'dechunked', or they can get the raw data. Any module which handled PUT or POST data by using the old read_client_block() will need to be modified before it will compile with 1.2.

The way to read a request body in 1.2 involves several steps:

  1. Call setup_client_block() to prepare to handle the data. The second argument to this function tells Apache how to process the body (if at all). It can be one of: REQUEST_NO_BODY (issue a 413 error if any body is present), REQUEST_CHUNKED_ERROR (issue a 411 error if the body was sent chunked, that is, without a Content-Length), REQUEST_CHUNKED_DECHUNK (if the body is chunked, process it to remove the chunking), REQUEST_CHUNKED_PASS (pass on the chunks unchanged).
  2. Call should_client_block() when ready to read the data. This sends a "100 Continue" status to the client (new in HTTP/1.1) and tells the module whether it is ok to read the data.
  3. Repeatedly call get_client_block() to get the data (possibly all in one go, but possibly also a bit at a time).
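The three steps above can be sketched as a read loop. So that it compiles and runs outside Apache, this example supplies its own stubs for setup_client_block(), should_client_block() and get_client_block(), plus a toy request_rec that serves the body from a string. Their behaviour here is an assumption for demonstration, not Apache's implementation, and the REQUEST_CHUNKED_ERROR and OK values are stand-ins. Only example_read_body() represents what a module would write.

```c
#include <string.h>

#define OK 0                     /* stand-in value */
#define REQUEST_CHUNKED_ERROR 1  /* stand-in value */

/* Toy request_rec: the "client" body is served from a string. */
typedef struct {
    const char *body;
    long remaining;
} request_rec;

/* Stubs standing in for the Apache functions of the same names. */
static int setup_client_block(request_rec *r, int read_policy)
{
    (void)read_policy;
    r->remaining = (long)strlen(r->body);
    return OK;
}

static int should_client_block(request_rec *r)
{
    return r->remaining > 0;
}

static long get_client_block(request_rec *r, char *buffer, int bufsiz)
{
    long n = r->remaining < bufsiz ? r->remaining : bufsiz;
    memcpy(buffer, r->body, (size_t)n);
    r->body += n;
    r->remaining -= n;
    return n;
}

/* Module-side logic: copies up to cap bytes of the body into out.
 * Returns the number of bytes stored, or -1 if setup fails. */
long example_read_body(request_rec *r, char *out, long cap)
{
    char buf[8]; /* deliberately small so the loop runs several times */
    long total = 0, n;

    if (setup_client_block(r, REQUEST_CHUNKED_ERROR) != OK)
        return -1;
    if (!should_client_block(r))
        return 0; /* nothing to read */

    while (total < cap && (n = get_client_block(r, buf, (int)sizeof buf)) > 0) {
        if (n > cap - total)
            n = cap - total; /* truncate rather than overflow */
        memcpy(out + total, buf, (size_t)n);
        total += n;
    }
    return total;
}
```

In a real handler the return value of setup_client_block() would typically be passed straight back to Apache when it is not OK, so that the appropriate error response is sent.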

Returning Responses to Prevent Caching

An HTTP response can include headers to indicate to the client that this response should not be cached at all. In previous versions of Apache, this was done by setting the no_cache element of the request_rec. This also had the effect of always sending the response, even if a "304 Not Modified" response could be returned. Now a new element has been added, no_local_copy. When this is set, a 304 response will never be generated. Setting no_cache will send a response that cannot be cached.

Data Structure Changes

New in the request_rec

  • no_local_copy and no_cache replace 'no_cache' (type int)
  • request_time - time request was received (type time_t)
  • boundary - boundary string for multipart/byteranges (type char *)
  • range - range header text (type char *)
  • content_language deprecated. Use content_languages array instead (array of char*)
  • allowed - set to allowed methods (returned on Allow: header by send_http_header()) (type int)
  • byterange - number of byte ranges (type int)
  • chunked - if sending chunked encoding (type int)
  • read_length (bytes read so far) (type long)
  • read_body (read_body can take REQUEST_NO_BODY, REQUEST_CHUNKED_ERROR, REQUEST_CHUNKED_DECHUNK, REQUEST_CHUNKED_PASS) - set by handler (type int)
  • clength - real content length (type long)
  • remaining - bytes left to read (type long)

There are some other elements used internally within Apache. In addition, the existing port element is now an unsigned int rather than a (signed) int.

New in the server_rec

  • send_buffer_size - sets the TCP send buffer size
  • addrs - list of addresses for this vhost (type server_addr_rec *)
  • server_uid and server_gid contain the euid/egid to run the suexec wrapper as (types uid_t, gid_t)

The server_rec no longer contains host_addr, host_port or virthost. Instead, because the server could be responding to multiple server addresses, a new list (addrs) is created, with each entry of type server_addr_rec. The server_addr_rec contains the IP address, port and name of the server.
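Code that used to test host_port directly now has to look through addrs. The sketch below assumes the entries are chained through a next pointer and uses simplified stand-in structures with only the fields it needs. The field names host_port and virthost are taken from the elements mentioned above, and example_listens_on_port() is a hypothetical helper.

```c
#include <stddef.h>

/* Simplified stand-ins: the real structures carry more fields
 * (including the IP address). The next pointer is an assumption
 * about how the entries are linked. */
typedef struct server_addr_rec {
    struct server_addr_rec *next;
    unsigned short host_port;
    const char *virthost;
} server_addr_rec;

typedef struct {
    server_addr_rec *addrs;
} server_rec;

/* Returns 1 if any address of this server uses the given port. */
int example_listens_on_port(const server_rec *s, unsigned short port)
{
    const server_addr_rec *sar;

    for (sar = s->addrs; sar != NULL; sar = sar->next) {
        if (sar->host_port == port)
            return 1;
    }
    return 0;
}
```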

Regular Expressions

Apache is now compiled with a regular expression library. Modules can use the function calls provided by this library to make use of regular expressions. Note that on systems which provide a stable and bug-free regular expression library, the one supplied with Apache is not used. The library is available in the src/regex directory of the Apache distribution. The only thing to note when using these regular expression functions is that regsub() should not be used. This is because it returns a string allocated internally, not using Apache's pool allocation system. A new API function, pregsub(), is provided instead, which does the same as regsub() but allocates space in the pool passed in as an argument.
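For readers unfamiliar with the match-then-substitute pattern, the following sketch uses the standard POSIX regcomp(), regexec() and regfree() calls and does the substitution by hand into a caller-supplied buffer. It is not pregsub() itself: the helper name example_regex_subst() and its $0 to $9 template syntax are assumptions made for the example, and a module would let pregsub() allocate the result from a pool instead.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: match pattern against source, then copy templ
 * into out with $0..$9 replaced by the corresponding matched group.
 * Returns 0 on success, or -1 if the pattern does not compile, the
 * source does not match, or out is too small. */
int example_regex_subst(const char *pattern, const char *source,
                        const char *templ, char *out, size_t cap)
{
    regex_t re;
    regmatch_t m[10];
    size_t len = 0;
    int rc = -1;

    if (cap == 0 || regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, source, 10, m, 0) == 0) {
        rc = 0;
        for (; *templ != '\0' && rc == 0; templ++) {
            if (templ[0] == '$' && templ[1] >= '0' && templ[1] <= '9') {
                const regmatch_t *g = &m[templ[1] - '0'];
                templ++; /* skip the digit as well */
                if (g->rm_so >= 0) { /* group took part in the match */
                    size_t n = (size_t)(g->rm_eo - g->rm_so);
                    if (len + n >= cap) {
                        rc = -1;
                    } else {
                        memcpy(out + len, source + g->rm_so, n);
                        len += n;
                    }
                }
            } else if (len + 1 >= cap) {
                rc = -1;
            } else {
                out[len++] = *templ;
            }
        }
        if (rc == 0)
            out[len] = '\0';
    }

    regfree(&re);
    return rc;
}
```

Note the explicit regfree() call; with the pool-based functions the intent is that cleanup is tied to the pool rather than left to each caller.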

Multiple Language Support

Resources can be associated with multiple languages. Typically, mod_mime obtains information about which languages a file is in from its extensions, but modules can also set the language of their response. Previously, the language was set as a string called content_language in the request_rec. That is still available for backwards compatibility, but will only hold the last language that mod_mime set. To get all the languages of a file, or to set a response with multiple languages, the new element content_languages should be used instead. This is an array (created using the standard Apache array functions such as make_array()), with each element being a "char *" string containing a language tag.

For example, if a module wants to output a response in English and German, it should set content_languages with:

  char **new;
  r->content_languages = make_array (r->pool, 2, sizeof(char*));
  new = (char **)push_array (r->content_languages);
  *new = "en";
  new = (char **)push_array (r->content_languages);
  *new = "de";

Other New API Functions

The following API functions are new in Apache 1.2, and have not already been mentioned above.

  • blookc() can be used to look ahead one character in a BUFF* stream.
  • call_exec() to run sub-programs, possibly as a different user.
  • clear_table() to empty a table
  • construct_server() returns a string giving the "hostname:port" for a given hostname and port (:port is omitted if it is 80).
  • find_last_token() checks whether a given token appears as the last part of a string.
  • find_token() looks to see if a given token exists in a comma-separated list of tokens
  • getword_white() available to get a word, skipping white space
  • is_table_empty() check if a table has any contents (this is a macro)
  • pregcomp() to compile a regular expression, allocating it from a pool.
  • pregfree() to mark memory used by a compiled regular expression as available.
  • pregsub() is used after a regular expression match to substitute matching parts.
  • scan_script_header_err() can be used instead of scan_script_header() to return error information from the headers
  • send_fd_length() sends a part of an open file.
  • send_header_field() sends a single header to the client.
  • set_flag_slot() sets an on/off flag in a module's config (complements existing set_string_slot()).
  • rflush() can be used when sending a response to force output to be flushed to the client.
  • table_do() to call a function for each item in a table
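As one small example from this list, the behaviour described for construct_server() can be approximated as below. This is only an illustration of the "hostname:port, unless the port is 80" rule: example_host_port() is a hypothetical helper that writes into a caller-supplied buffer, whereas the real function returns a string.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper mirroring the rule described for
 * construct_server(): omit ":port" when the port is 80.
 * Returns the length of the full string, as snprintf() does. */
int example_host_port(char *out, size_t cap,
                      const char *hostname, unsigned int port)
{
    if (port == 80)
        return snprintf(out, cap, "%s", hostname);
    return snprintf(out, cap, "%s:%u", hostname, port);
}
```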

API functions that use a port number previously used a signed int and now use an unsigned int. File descriptors are now passed as long instead of int to functions such as pclosef() and note_cleanups_for_fd().

All the HTTP status codes have been renamed to start with HTTP_, and the new codes from HTTP/1.1 have been added. Macros are now available to check status codes, such as is_HTTP_REDIRECT(status).
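A module might use the status macros along these lines. The definitions below are stand-ins written for this sketch so that it compiles on its own: they follow the HTTP_ naming scheme and the usual 3xx and 4xx/5xx ranges, but a real module should rely on the definitions in the Apache headers. example_status_class() is a hypothetical function.

```c
/* Stand-in definitions for illustration; real modules get these from
 * the Apache headers. */
#define HTTP_MOVED_PERMANENTLY 301
#define HTTP_NOT_FOUND         404
#define is_HTTP_REDIRECT(x) (((x) >= 300) && ((x) < 400))
#define is_HTTP_ERROR(x)    (((x) >= 400) && ((x) < 600))

/* Classify a status code using the range-checking macros rather than
 * comparing against individual codes. */
const char *example_status_class(int status)
{
    if (is_HTTP_REDIRECT(status))
        return "redirect";
    if (is_HTTP_ERROR(status))
        return "error";
    return "other";
}
```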


Internal Changes

People writing modules might also be interested in how the core Apache code works. This list, provided for information only, is a summary of the major changes to the source code which have not been reported elsewhere (as new features, for example).

  • CookieLog now handled by mod_log_config
  • Code to do some of transparent content negotiation (see #define HOLTMAN in mod_negotiation.c)
  • Configure updated to handle new simpler configuration file format
  • Date-related functions are now in util_date.c
  • mod_include calls can_exec() for sub-processes
  • Modules can be compiled in but inactive. The compiled in modules are listed in preloaded_modules[] array, while the active modules are stored in prelinked_modules[].
  • Modules will be moving into the src/modules directory (only mod_proxy has moved so far)
  • Proxy code moved to src/modules/proxy directory, within the new modules directory
  • Regular expression library has been added in src/regex directory
  • Returns 100 Continue before reading request entity
  • Scoreboard now contains the name of the vhost processing the request.
  • The #define names for OS-specific functions have been simplified and made consistent: HAS_GMTOFF is now HAVE_GMTOFF, HAVE_SYS_SELECT_H and HAVE_SYS_RESOURCE_H have been added, and USE_* is used to select preferred options on particular OSes (USE_FCNTL_SERIALIZED_ACCEPT; USE_FLOCK_SERIALIZED_ACCEPT; USE_LONGJMP)
  • The fd for each listener is stored to allow graceful restarts
  • To support graceful restarts, scoreboard records a 'generation' number
  • Various function arguments and return values are declared as const.

Comments or criticisms? Please email us at editors@apacheweek.com