Apache Week
   

Copyright 1996-2005
Red Hat, Inc.

O'Reilly Open Source convention in San Diego:

Apache Week visited the five-day O'Reilly Open Source Conference in San Diego this week and found an overwhelming source of Apache information.

First published: 27th July 2001

O'Reilly Open Source Conference: Day 1

It is exactly a year since we had the pleasure of visiting Monterey, California to report on the 4th O'Reilly Open Source software convention (Apache Week issue #208). When we managed to get invited back, this time to San Diego in July 2001, we thought we'd been given the ideal assignment: we get to fly to California in July, avoiding the British rain, and spend a week right on the West Coast with other open source gurus and advocates. In fact, with only one direct flight a day from England, we were unsurprised to find a large number of delegates on the plane, wearing Penguin badges and snapping pictures of the clear views over Greenland with a variety of digital cameras.

To accommodate feeding over a thousand delegates, the conference had erected a huge tent outside the hotel with views overlooking the harbour. It was there we started off Monday morning with the complimentary breakfast. The conference was split over two buildings, with a 10-15 minute walk between the two. With 16 simultaneous tutorial sessions on the first day and with only two Apache Week staff we found it really hard to choose between the talks. We spoke to other delegates who had been similarly overwhelmed by the choice.

Apache Week has reported on the ApacheCon and O'Reilly conferences over the last few years, so this time we wanted to avoid the talks that were copies of ones we've already covered. We decided to mix Apache talks with others that seemed new or interesting.

AxKit

Matt Sergeant gave the first tutorial we visited on his XML application server for Apache, AxKit. AxKit performs a similar function to the Apache Cocoon project, but is written in Perl and C rather than Java. Matt even describes AxKit as "the C version of Cocoon". AxKit was born as a way of collecting together the various Perl XML technologies and using them to deliver the same XML data in different formats. The use of XML allows for the separation of content, presentation, and logical site management.

The tutorial focussed on the various Perl XML tools available, the evolution of AxKit, and ways to use the result to power both static and dynamic sites. Matt highlighted some exciting and powerful features of AxKit: the intelligent compression of pages being returned to the client (gzip), the ability to parse and serve OpenOffice files on the fly, and AxPoint which powered his presentation by converting an XML outline to PDF.

AxKit allows any number of ways to process the XML for output, from the well-known XSLT (with its steep learning curve) to XPathScript, which has been designed to allow easy dynamic functionality and is also found within Cocoon.

Future plans for AxKit were covered; these included a port to Apache 2.0 and a complete Content Management System.

Perl for System Administrators

After the provided lunch we headed over to the Perl for System Administrators talk. The presenter, David Blank-Edelman, played music and danced around the hall to get into the mood for the tutorial. The talk had a heavy bias towards security, giving reasons why administrators should be paranoid and numerous stories and anecdotes about hacks and security vulnerabilities. David suggested some best practices that can help protect your scripts; for example there is no need to run a log analysis script as root. Other areas where users can overlook potential security problems are when appending to files, or creating temporary files in Perl. Although this talk was primarily about Perl, David made the important point that "a cutting sysadmin is platform agnostic", and his tips applied as much to sysadmin scripts as to CGI programs.
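The two pitfalls mentioned above, temporary files and appending to files, can be illustrated with a short sketch. The talk was about Perl; the version below uses Python purely for illustration, and the function names and the "logreport-" prefix are our own assumptions rather than anything shown in the tutorial.

```python
import os
import tempfile


def write_report(lines, directory=None):
    """Write lines to a freshly created temporary file and return its path.

    mkstemp() creates the file exclusively and with owner-only permissions,
    so a predictable name cannot be pre-created or swapped for a symlink.
    """
    fd, path = tempfile.mkstemp(prefix="logreport-", suffix=".txt", dir=directory)
    with os.fdopen(fd, "w") as handle:
        for line in lines:
            handle.write(line + "\n")
    return path


def append_line(path, line):
    """Append one line to an existing file, refusing symlinks where supported."""
    flags = os.O_WRONLY | os.O_APPEND
    # O_NOFOLLOW is not defined on every platform, so fall back to 0.
    flags |= getattr(os, "O_NOFOLLOW", 0)
    fd = os.open(path, flags)  # no O_CREAT: the file must already exist
    with os.fdopen(fd, "a") as handle:
        handle.write(line + "\n")
```

Neither function needs root privileges, which matches the advice that a log analysis script should run as an ordinary user.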

WebDAV and Apache

Also that afternoon, Jim Whitehead presented a tutorial on WebDAV and Apache. Jim, the chair of the IETF's WebDAV working group, began by giving a brief overview of authoring over HTTP, and gave examples of how collaborative web authoring can take place using WebDAV. The current state of client and server support was described, and an insight into some of the future extensions of the DAV protocol was given (including versioning, searching and access control). The talk continued giving a detailed description of the DAV protocol, explaining the support for properties, and the overwrite prevention mechanisms.

The tutorial finished up with a guide to setting up the WebDAV module for Apache, mod_dav, covering the basic operation of the module and the usual configuration issues. Jim noted that Apache 2.0 bundles mod_dav inside the source tree, making it easier to set up than Apache 1.3, where mod_dav must be compiled as an external module.

Film: Revolution OS

In the evening we took a coach to a local multiplex cinema for the west coast premiere of the film "Revolution OS" by director J.T.S. Moore. The aim of the film was to document the history of the open source movement from Richard Stallman's founding of the GNU project, through the VA Linux IPO, to events taking place today. The film focussed on the key people responsible for a few of the historical turning points in the movement.

Early into the film, Eric Raymond said that "Apache was the killer app[lication]" and was responsible for the mass adoption of the Linux operating system. A number of other key people were interviewed including Brian Behlendorf from the Apache Project and Michael Tiemann from Red Hat.

We were impressed at the balance and accuracy of the film, especially the positive way the people interviewed were portrayed. The film would be interesting to engineers as well as outsiders.

At the end of the film the director took questions from the audience aided by Eric Raymond and Bruce Perens. They explained that the film took two years to make and was planned to be shown in the future at film festivals and other conferences.


O'Reilly Open Source Conference: Day 2

We kicked off the second day much as the first, spending our breakfast trying to decide amongst the 17 simultaneous tutorials. Amongst the sessions we didn't get to see was Ryan Bloom's "Writing an Apache 2.0 Filter", which was given to a small but enthusiastic group of developers.

Introduction to Zope

We hear a lot of positive comments from people using the Python-based Zope application server, so we decided to attend the tutorial "Introduction to Zope" given by Mike Homyack. Mike ran through what Zope is, and its architecture, telling us that "Zope is full Object-Orientated" and "really good at dynamic stuff". Zope has a built in server, z-server, that handles access to the internal content via a number of mechanisms including HTTP, FTP, and DAV. It is usual to let Zope handle all your web site content, but in most situations another server such as Apache or a reverse proxy such as Squid is placed in front in order to accelerate any static content. The main zope.org site itself uses Zope together with Apache, using Rewrite rules to proxy and cache requests to a Zope backend.

Zope currently has its own license but we were told that there was "motivation to give Zope some license like Python" to make it GPL compatible. Zope is in production use by some major companies including CBS New York.

Introduction to PostgreSQL

At the same time as the Zope tutorial, Bruce Momjian gave an introductory tutorial on the PostgreSQL database. Attendees received a complimentary copy of Bruce's book, which the tutorial was based upon. Only a small amount of database expertise was presumed so this talk was very open to beginners. The half-day session allowed many chapters of the book to be covered in reasonable detail, starting with the basic architecture of a database, how to input data, modify data, and make simple queries. The talk then progressed to describe the construction of more complex queries, joins, and how to utilize the relational database capabilities of PostgreSQL. Bruce also presented a follow-up tutorial in the afternoon, covering some of the more advanced features.

Tuesday afternoon

In the afternoon we visited a talk on "Secure Internet Servers and Firewalls with OpenBSD". Although not directly related to Apache, it was interesting to see how much security had been added into the OpenBSD system by default. OpenBSD ships with an SSL-enabled version of Apache by default.

We were also lucky to catch the second of a pair of tutorials by Mark-Jason Dominus, entitled "Stolen Secrets of the Wizards of the Ivory Tower". In an enigmatic talk, a set of Perl programming techniques was described, including memoization and the use of iterators, with particular attention drawn to closures and anonymous subroutines. The obscure title alludes to the LISP heritage of many of these ideas.
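For readers who have not met these techniques, here is a minimal sketch of memoization and of a closure used as an iterator. The tutorial used Perl; this illustration is written in Python, and the names memoize, fib and make_counter are ours, not taken from the talk.

```python
def memoize(func):
    """Return a wrapper that caches func's results, keyed by its arguments."""
    cache = {}  # captured by the closure below and kept between calls

    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


@memoize
def fib(n):
    """Naive recursive Fibonacci; memoization makes it run in linear time."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def make_counter(start=0):
    """Closure used as an iterator: each call returns the next integer."""
    value = start - 1

    def next_value():
        nonlocal value
        value += 1
        return value

    return next_value
```

Calling make_counter(5) returns an anonymous function that yields 5, then 6, and so on, each counter holding its own private state. The memoize wrapper works the same way: the cache dictionary lives on only because the inner function refers to it.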

In the evening Larry Wall gave an entertaining and enlightening talk on the new features in Perl 6. Larry's talk didn't touch on anything Apache related, so if you are interested read all about it in "The State of the Onion 5" at perl.com.


O'Reilly Open Source Conference: Day 3

Wednesday started as usual with the complimentary breakfast. With 14 simultaneous talks split across the two hotel blocks we spent most of our breakfast choosing which to visit. Four of the day's tracks were dedicated to Perl, two to XML, and the remainder split across Tcl/Tk, Mozilla, mod_perl, Java, MySQL, Python, and Emerging Topics. The dedicated Apache track was due to start on Thursday. We noticed that the number of Perl tracks had shrunk slightly this year, with other open-source technology tracks becoming more prominent. In particular we were pleased to see the two XML tracks, something we said was missing from last year.

Before the keynotes of the day a short film was shown which was made up from interviews of the various conference attendees during the tutorial days. Tim O'Reilly appeared on stage and reminded the packed ballroom that we should "think the Internet" and think of "technologies such as Apache, PHP" and not just Linux.

Keynotes

Fred Baker, previous chair of the IETF, gave his keynote presentation titled "Will the next Internet generation still depend on open source?". He explained that Linux was the only real technology that could threaten Windows, and that successful open source is "all about getting good documentation and predictable quality". He welcomed the involvement of commercial interests in open source: "Once the open source technology has to be used by real people then real companies have to do code freezes and manage the development in a way that makes a quality product". He predicted that in the coming years we'll see more open source projects in partnership with the business world. Open source leads to rapid prototyping and exploratory code, with the business partnerships being able to productise them.

W. Phillip Moore from Morgan Stanley Dean Witter then took the stage to show "an open source success story on Wall Street". He showed why open source was important to their business, allowing them to tailor existing applications to their complex environment with a bit of Perl glue thrown in. MSDW are an enterprise class business that have decided to slowly migrate from using Sun hardware with Solaris to using commodity hardware and Linux, with Apache as their primary web server. They've also made contributions back to open source, and have been covertly submitting patches back into the community as well as funding open source development. "It all comes down to vendor risk management", he said, with proprietary software "you're placing a bet on the security of that company and the security of their product, a bet you're not always aware you're making". With open source this dependency is removed and it's possible to get enterprise level support for open source software from a number of vendors.

Open Source Strategies Summit

Also taking place at the convention was the O'Reilly summit on Open Source strategies, aimed at CTOs, CIOs, and CEOs who want to find out how to use open source as a strategic advantage. Although this summit was separate to the main conference we decided to take a look at the opening talk given by Tim O'Reilly, and the subsequent panel discussion with the economist Hal Varian, Brian Behlendorf, and Michael Olson from Sleepycat.

To begin the session, Tim O'Reilly discussed the reasons underlying the success of the Internet and Open Source software, finding many common themes. The highlights were the emphasis on decentralisation, the combination of many small modules into large complex systems, and the ability to easily extend existing technologies - all important to the wide adoption seen in both arenas. By looking at current trends, Tim talked about some emerging projects which may prove key to the Next Generation Internet.

One of the biggest challenges for Open Source and Internet companies is the search for an appropriate business model. The panel discussion which followed the talk gave many interesting insights from those who have been successful in that search. Brian Behlendorf spoke about the need to identify which intellectual property is released freely, and which is "owned" by the company generating it. All speakers noted that embedded systems would be increasingly important.

Real World Performance Tuning

After lunch, Apache Software Foundation member Ask Bjoern Hansen gave a talk on how to use mod_perl in an efficient way. He explained that it is generally preferable to use mod_perl statically compiled into Apache instead of as a dynamic (shared object) module. However, by doing this you end up with a server that has a much larger memory footprint, and since the server spends the majority of its time buffering data to slow clients, this is wasted overhead.

The solution presented was to run a separate server that has mod_perl compiled into it behind a reverse proxy. Apache can also be used as this reverse proxy and can serve static content as well as cache the content created by the dedicated Apache+mod_perl server. In this way the memory usage can be decreased and performance increased.

The slides from the full presentation are available online.

Why SOAP sucks, Why SOAP rocks

There were a large number of talks throughout the conference on SOAP and XML-RPC. Matt Sergeant took a step back to examine what all the fuss was about in a short talk renamed "Why SOAP sucks, Why SOAP rocks". He started out by asking why we are using SOAP when we could use HTTP instead, since HTTP already has all the features that are normally needed, and more. Using HTTP natively allows caching and logging for example. The talk then showed how to do SOAP without SOAP; using mod_perl to control the URL space and using Perl HTTP modules for the transport. The current major advantage of SOAP is that modules such as the Perl SOAP::Lite module exist which allow applications to be developed quickly and easily. There currently is no simple library that would do the equivalent directly over HTTP.

Finally we were shown some services that are already doing the equivalent of a SOAP transaction without SOAP, such as the ability to get search results from Google in XML format (for example try http://www.google.com/xml?q=apacheweek). The slides to this talk are available online.
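To make the "SOAP without SOAP" idea concrete, here is a rough sketch of a remote call expressed as an ordinary HTTP GET with an XML reply. The talk used mod_perl and Perl HTTP modules; this version is in Python, and the URL layout, the example.invalid host and the <result>/<item> reply format are all invented for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_call_url(base_url, method, **params):
    """Encode a remote call as a plain GET URL.

    The path names the method and the query string carries the arguments,
    so ordinary HTTP caches and access logs can see the whole call.
    """
    query = urlencode(sorted(params.items()))
    return "%s/%s?%s" % (base_url.rstrip("/"), method, query)


def parse_result(xml_text):
    """Decode a hypothetical <result><item name="...">value</item></result> reply."""
    root = ET.fromstring(xml_text)
    return {item.get("name"): item.text for item in root.findall("item")}
```

A client would fetch the URL returned by build_call_url with any HTTP library and hand the response body to parse_result. Sorting the parameters keeps the URL stable for identical calls, which helps a cache recognise repeats.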

XML Content management

For the remainder of the afternoon we visited the XML track; in particular we were interested in XML application servers. The first session, "XML Content management using XSLT, Schematron and Ant", showed one extensible way of serving XML content to browsers. Following that talk a panel discussion, "XML-based Application Frameworks", took place. The basic idea of an XML application server is that you create all the content for your site in XML. The use of XML allows the separation of content from presentation, a useful extra abstraction layer. The XML content can come from static files, from a database, or be dynamically generated content from scripts. In its simplest form you take your XML content then apply a style-sheet to generate HTML for a browser. Application servers usually perform this style-sheet conversion on the fly, caching the results for speed. XSLT is one language that is used to transform XML data in this way. Tools also exist that will take XML and generate PDF, PostScript, presentations (and more) on the fly.
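The simplest form described above, XML content in and cached HTML out, can be sketched in a few lines. This is not XSLT and not how any of the servers mentioned work internally: it is a stand-in written in Python with the standard library, and the <article>, <title> and <para> element names are made up for the example.

```python
import functools
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content document; the element names are invented for illustration.
ARTICLE_XML = """<article>
  <title>Conference report</title>
  <para>Content lives in XML.</para>
  <para>Presentation is applied later.</para>
</article>"""


@functools.lru_cache(maxsize=128)
def render_html(xml_text):
    """Turn the article XML into an HTML fragment; results are cached per input."""
    root = ET.fromstring(xml_text)
    parts = ["<h1>%s</h1>" % escape(root.findtext("title", default=""))]
    for para in root.findall("para"):
        parts.append("<p>%s</p>" % escape(para.text or ""))
    return "\n".join(parts)
```

Swapping render_html for a function that emits a different format leaves the XML untouched, which is the separation of content from presentation that the panel described. The lru_cache decorator plays the part of the server-side cache: a second request for the same content skips the conversion.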

The most well-known open source XML application server is Apache Cocoon, which relies on Java. Other solutions such as AxKit (C/Perl/mod_perl) and Charlie (C/C++/Perl/mod_perl), and technologies such as Xerces/Xalan (Java), Sablotron (C++), and LibXML/LibXSLT (C), are also available. Even scripting languages such as PHP now have their own XML solutions, although during his tutorial earlier in the week mod_perl guru Matt Sergeant said that the "PHP XML solutions are not very strong".

When the attendees were asked which application server they were using for their applications, the majority said they were using a system they developed themselves (home grown) from the underlying technologies. The rest were a pretty even split between the application frameworks listed. However, having such a wide choice of technologies and servers is no bad thing. As one panel member said "no matter what, if your content is in XML you win".

Pathologically Polluting Perl with Inline.pm

Brian Ingerson presented this talk on the award-winning Inline module (which only celebrated its 1st birthday a few days before the conference). Inline.pm allows programmers to embed code from a variety of programming languages directly inside a Perl script, from C, C++, and assembler through to Java and Python. Brian covered some of the advanced features available when using embedded C, notably caching of compiled object files.

A demonstration was given showing some "one-liners" using Inline.pm, including an ASCII Mandelbrot set generator. The talk went on to discuss some of the different ways to use Inline.pm: replacing the traditional usage of XS and MakeMaker, and also explained how to extend the module to support new languages.


O'Reilly Open Source Conference: Day 4

Microsoft and Open Source

Wednesday had ended with a night of Mexican food and drink in the conference tent, followed by a party from Stonehenge. Even with all the free drink and food the night before, by 8.45am on Thursday the ballroom was packed for the much anticipated debate between Craig Mundie of Microsoft and Michael Tiemann of Red Hat.

The details of the debate have been covered in a number of other articles. However, we were interested in the comments with relevance to Apache made during the panel discussion. Craig Mundie stated that Microsoft's concern was not about open source but "about the GPL" as it "creates its own closed community". Tim O'Reilly commented that University licenses (like the BSD License and Apache Software License) "give the best balance between freedom and the right to make money". Also on the panel was Apache Software Foundation member Brian Behlendorf, who said the Apache model has worked well to build up momentum. Although with the Apache license there are no obligations placed on commercial users, history has shown that the companies involved do re-invest and give back to the community.

Apache 2.0; where is it?

With the provocative title "Apache 2.0; where is it?", Ryan Bloom proved a popular start to the Apache track, with over 80 attendees packed in to hear his session. The aim of the talk was to cover what was new in Apache 2.0 but also answer the question of why Apache 2.0 is taking so long.

Ryan explained that since Apache is now so big there are "only three or four people who know 100% of Apache 2.0", and that fortunately he was one of them. The new features of 2.0 were then explained, stopping at Layered IO which is "the Holy Grail" of Apache. Ryan then gave a demonstration of Apache 2.0 acting as a POP3 server to show that it is easy to have Apache serve up other protocols as well as HTTP.

Apache Week asked Ryan if he was correct in using the name "Apache 2.0" throughout his talk given that the Apache group have a number of other products and that the binary downloads have been renamed to "httpd". Ryan said that the name was officially "Apache httpd 2.0" but hinted that there was talk of changing the name to something other than httpd in the future.

To answer the question of Apache 2.0 availability Ryan said that he expected to see a full release "next year."

PostgreSQL & The Web

After attending the PostgreSQL tutorial on Tuesday, we decided to follow up with this talk from Gavin Roy, which gave a practical guide to using PostgreSQL in web applications. Gavin gave an overview of which web platforms could make use of a PostgreSQL database (for instance, PHP and Perl), and gave testimony to the product's reliability and performance in large scale web applications.

The talk proceeded to discuss the architecture of systems using a web server together with a PostgreSQL database, covering the advantages and disadvantages of using a single machine or two separate machines. Some tips on optimising performance in a production database were also given, emphasizing the use of database indices, and regularly vacuuming the database.

In closing, Gavin briefly covered security, authentication and authorization issues when using Postgres in a web environment.

mod_perl 2.0

To end the day we were expecting good things from Doug MacEachern's talk on "mod_perl 2.0". We were not disappointed, as over 50 people packed into the last mod_perl session to hear a heavily technical talk about Apache 2.0 and mod_perl.

Doug showed Apache 2.0.22-dev working with both mod_ssl and Perl/mod_perl. This is perhaps the first demonstration of its kind, as mod_ssl is only just becoming usable in the Apache 2.0 tree. He continued and took a program that communicated entirely using stdin and stdout (in this case an NNTP server) and showed how it was easy to make this function as an Apache protocol handler. This allowed Apache to serve newsgroups to his news reader, whilst still allowing other filters to be included such as SSL and authentication.

Future plans for mod_perl 2.0 include the ability to write an MPM completely in Perl, and to continue with the Apache-TestKit, a package not tied to mod_perl that has been designed to test Apache. Doug said that there was still plenty left to do on mod_perl even though it currently seems stable, and that there would be "probably a release of some sort at the end of the summer."

Web Security for Business: Introduction to mod_ssl

At the same time as the talk on mod_perl, Paul Weinstein was giving his popular introduction to mod_ssl in the Apache track. The history of mod_ssl for Apache 1.3 was discussed together with some of the decision making process for including mod_ssl in Apache 2.0. The slides to this talk are available online.

Extreme Programming and Open Source Software

In a pair of talks which attracted 60 people into a room designed for 40, the speaker known as "chromatic" described the basics of the Extreme Programming (XP) software development method, and in particular its application in the Open Source world.

The first talk gave an introduction to XP, its differences from more traditional software development, and the motivations behind the techniques it uses to promote the development of high quality software. The talk highlighted that the most important aspect of XP is the emphasis on writing unit tests, and also covered the principles of incremental change, and pair programming.

The room remained packed into the second half of the session, where chromatic discussed how XP can be used within Open Source software (OSS) development. Some elements of XP are already employed in many OSS projects, for instance, the tight feedback loop between users and developers. Many other XP techniques could also be usefully employed, but some, such as pair programming, were considered inappropriate in the majority of Open Source development.


O'Reilly Open Source Conference: Day 5

Last year, the conference sessions were held over just two days and we were pleased to see they were extended to a third day to fit in more presentations. Friday consisted of the extension of tracks from previous days together with tracks dedicated to PHP, Zope, and Open Source Speech.

After breakfast, Michael Tiemann was the moderator for the morning keynote looking at the "big hairy problems: open source challenges in the enterprise". The first speaker was from DreamWorks, the animation company behind such epics as Antz, Chicken Run, and now Shrek. He told us how DreamWorks were slowly switching thousands of machines from SGI to Linux giving them increased performance and value for money. When working on their strategy for adopting Linux they analysed six key factors: performance, scalability, stability, software, support, and transition.

W. Phillip Moore from Morgan Stanley Dean Witter took the stage and built upon his previous keynote. He explained that it was important that the enterprise customers have a support number they can call with problems, the ability to get fixes to existing problems, and the ability to get enhancements. He complimented Covalent and Red Hat specifically but said that there was a need to see more companies providing commercial support for open source software: "you need to know there is a 800 number and a staff of people that will be able to solve the problem."

Apache Portable Run-time: Why?

Ryan Bloom gave this talk on APR, the Apache Portable Run-Time, which began with a quick history lesson explaining how Apache 1.3 addressed portability issues, and how APR and Apache 2.0 grew out of that experience. Ryan explained what the initial goals for the library were, and showed how it provides an abstraction layer for commonly used operating system interfaces which has been ported to a range of 50 Unix platforms, BeOS, Windows, and OS/2.

The talk gave a breakdown of the different components which make up APR: from file and network I/O, memory handling, through to some of the more complex interfaces providing threading support. For each component an overview of the API was given, showing how it could be used in applications. Ryan also gave an insight into why various OS interfaces (such as POSIX) cannot be used portably, justifying the need for the abstraction layer which APR provides.

To give a more in-depth look at the API, the talk gave a walk-through of a code sample using the threading interface, and took a look at some of the test code present in APR which exercises most of the library's capabilities. Although APR's primary user is the Apache httpd server, the library is also used by a number of other projects such as Subversion.

Web Security for Business

Paul Weinstein closed off the afternoon with his talk all about private certificate authorities. The session showed the basics of how to create and then use a private certificate authority, then went into the more advanced details. The examples were based around the OpenSSL toolkit; showing which parameters to use on the OpenSSL command line, and how to integrate the certificates into Apache with mod_ssl. Finally the tricky subject of certificate revocation was covered. The slides to this talk are available online.

Exhibition

The vendor exhibition area was very popular with a large number of companies attending. We didn't find much information specific to Apache at the exhibition: NuSphere were giving out MySQL CDs that come with a packaged version of Apache, and Red Hat had some information on their Apache services. However there were plenty of free promotional t-shirts to add to our collection, as well as more of the flashing clear rubber bouncy balls we picked up last year from collab.net. Oh, and let's not forget the "Apache by night" Apache Week postcards of course.


Overall Impressions

Even if you were not interested in any of the other tracks there were plenty of talks and tutorials relevant to Apache users, although a number of them were direct copies or updates of talks given at previous Apache conferences such as ApacheCon 2001.

Apache Week talked to a large number of the attendees of the conference and the overall impression was very positive. One attendee said that "the keynotes alone were worth the trip". We were also particularly impressed by the child care facilities, which allowed conference speakers and participants to bring their families and enjoy a mini holiday in San Diego. The night-time activities and the food were also excellent. The only complaint we heard repeated by a number of attendees was that lunch was not included on Friday, even though there was a full day of sessions.

With 802.11b wireless internet connectivity to most of the conference rooms it was hard to escape from work; and with five intensive days packed with new material we found ourselves tired and in need of a holiday by the end of the week. Next time we'll bring our swimming trunks and sun cream.

Please note that although Apache Week is an O'Reilly Network affiliate, O'Reilly had no editorial control over this review of their conference, even though they did give us free beer. Apache Week will give you our unbiased opinion of all the conferences we attend that have things of interest to Apache users and developers. For more coverage of the rest of the conference visit the O'Reilly Network web site.


This feature brought to you by: Mark J Cox, Joe Orton
Comments or criticisms? Please email us at editors@apacheweek.com