NAME
    Apache::HeavyCGI - Framework to run complex CGI tasks on an
    Apache server

SYNOPSIS
     use Apache::HeavyCGI; # see eg/ directory of the distribution
                           # for a complete example/template

WARNING UNSUPPORTED ALPHA CODE RELEASED FOR DEMO ONLY
    The release of this software is only for evaluation purposes to
    people who are actively writing code that deals with Web
    Application Frameworks. This package is probably just another
    Web Application Framework and may be worth using or may not be
    worth using. As of this writing (July 1999) it is by no means
    clear if this software will be developed further in the future.
    The author has written it over many years and is deploying it in
    several places, e.g. http://www.stadtplandienst.de,
    http://netlexikon.akademie.de and really soon on
    http://pause.perl.org too. It has turned out to be useful for
    him. YMMV.

    There is no official support for this software. If you find it
    useful or even if you find it useless, please mail the author
    directly.

    But please make sure you remember: THE RELEASE IS FOR
    DEMONSTRATION PURPOSES ONLY.

DESCRIPTION
    The Apache::HeavyCGI framework is intended to provide a couple
    of simple tricks that make it easier to write complex CGI
    solutions. It has been developed on a site that runs all
    requests through a single mod_perl handler that in turn uses
    CGI.pm or Apache::Request as the query interface. So
    Apache::HeavyCGI is -- as the name implies -- not merely for
    multi-page CGI scripts (for which there are other solutions),
    but it is for the integration of many different pages into a
    single solution. The many different pages can then conveniently
    share common tasks.

    The approach taken by Apache::HeavyCGI is a components-driven
    one with all components being pure perl. So if you're not
    looking for yet another embedded perl solution, and aren't
    intimidated by perl, please read on.

  Stacked handlers suck

    If you have had a look at stacked handlers, you might have
    noticed that the model for stacking handlers often is too
    primitive. The model supposes that the final form of a document
    can be found by running several passes over a single entity,
    each pass refining the entity, manipulating some headers, maybe
    even passing some notes to the next handler, and in the most
    advanced form passing pnotes between handlers. A lot of Web
    pages may fit into that model, even complex ones, but it doesn't
    scale well for pages that result out of a structure that's more
    complicated than adjacent items. The more complexity you add to
    a page, the more overhead is generated by the model, because for
    every handler you push onto the stack, the whole document has to
    be parsed and recomposed again and headers have to be re-
    examined and possibly changed.

  Why not subclass Apache

    Inheritance provokes namespace conflicts. Besides this, I see
    little reason why one should favor inheritance over a using
    relationship. The current implementation of Apache::HeavyCGI is
    very closely coupled with the Apache class anyway, so we could
    do inheritance too. No big deal I suppose. The downside of the
    current way of doing it is that we have to write

        my $r = $obj->{R};

    very often, but that's about it. The upside is that we know
    which manpage to read for the different methods provided by
    `$obj->{R}', `$obj->{CGI}', and `$obj' itself.

  Composing applications

    Apache::HeavyCGI takes an approach that is more ambitious for
    handling complex tasks. The underlying model for the production
    of a document is that of a puzzle. An HTML (or XML or SGML or
    whatever) page is regarded as a sequence of static and dynamic
    parts, each of which has some influence on the final output.
    Typically, in today's Webpages, the dynamic parts are filled
    into table cells, i.e. contents between some `<TD></TD>' tokens.
    But this is not necessarily so. The static parts in between
    typically are some HTML markup, but this also isn't forced by
    the model. The model simply expects a sequence of static and
    dynamic parts. Static and dynamic parts can appear in random
    order. In the extreme case of a picture you would only have one
    part, either static or dynamic. HeavyCGI could handle this, but
    I don't see a particular advantage of HeavyCGI over a simple
    single handler.

    In addition to the task of generating the contents of the page,
    there is the other task of producing correct headers. Header
    composition is an often neglected task in the CGI world. Because
    pages are generated dynamically, people believe that pages
    without a Last-Modified header are fine, and that an If-
    Modified-Since header in the browser's request can go by
    unnoticed. This laissez-faire principle gets in the way when you
    try to establish a server that is entirely driven by dynamic
    components and the number of hits is significant.

  Header Composition, Parameter Processing, and Content Creation

    The three big tasks a CGI script has to master are Headers,
    Parameters and the Content. In general one can say, content
    creation SHOULD not start before all parameters are processed.
    In complex scenarios you MUST expect that the whole layout may
    depend on one parameter. Additionally we can say that some
    header related data SHOULD be processed very early because they
    might result in a shortcut that saves us a lot of processing.

    Consequently, Apache::HeavyCGI divides the tasks to be done for
    a request into four phases and distributes the four phases among
    an arbitrary number of modules. Which modules are participating
    in the creation of a page is the design decision of the
    programmer.

    The perl model that maps (at least IMHO) ideally to this task
    description is an object oriented approach that identifies a
    couple of phases by method names and a couple of components by
    class names. To create an application with Apache::HeavyCGI, the
    programmer specifies the names of all classes that are involved.
    All classes are singleton classes, i.e. they have no identity of
    their own but can be used to do something useful by working on
    an object that is passed to them. Singletons have an @ISA
    relation to the Class::Singleton manpage which can be found on
    CPAN. As such, the classes can only have a single instance which
    can be found by calling the `CLASS->instance' method. Following
    the mod_perl convention, we'll call these objects *handlers*.
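
    The handler idea can be sketched in a few lines. This is only an
    illustration under stated assumptions: the class name
    My::Handler::Greeting, the attributes `name' and `greeting', and
    the hand-rolled instance() method (standing in for the one that
    Class::Singleton provides) are all hypothetical, and a plain
    hash stands in for the request object.

```perl
use strict;
use warnings;

# Hypothetical handler class. The hand-rolled instance() below is a
# stand-in for the method a real handler inherits from Class::Singleton.
{
    package My::Handler::Greeting;
    my $instance;

    sub instance {
        my $class = shift;
        $instance ||= bless {}, $class;
        return $instance;
    }

    # The handler keeps no per-request state of its own; it only
    # works on the request object that is passed to it.
    sub header {
        my ($self, $req) = @_;
        $req->{header_seen} = 1;
    }

    sub parameter {
        my ($self, $req) = @_;
        $req->{greeting} = 'Hello, ' . ($req->{name} || 'world');
    }
}

my $req = { name => 'Andreas' };    # plain hash standing in for the request
my $handler = My::Handler::Greeting->instance;
$handler->header($req);
$handler->parameter($req);
```

    Because the handler holds no request data, the same instance can
    serve every request in the life of the server process.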

    Every request maps to exactly one Apache::HeavyCGI object. The
    programmer uses the methods of this object by subclassing. The
    HeavyCGI constructor creates objects of the AVHV type (pseudo-
    hashes). If the inheriting class needs its own constructor, this
    needs to be an AVHV compatible constructor. A description of
    AVHV can be found in the fields manpage. An Apache::HeavyCGI
    object usually is constructed with the `new' method and after
    that the programmer calls the `dispatch' method on this object.
    HeavyCGI will then perform various initializations and then ask
    all nominated handlers in turn to perform the *header* method
    and in a second round to perform the *parameter* method. In most
    cases the availability of a method can be determined at compile
    time of the handler. If this is true, it is possible to create
    an execution plan at compile time that determines the sequence
    of calls, so that no runtime is lost checking method
    availability. Such an execution plan can be created with the
    Apache::HeavyCGI::ExePlan module. All of
    the called methods will get the HeavyCGI request object passed
    as the second parameter.
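
    The idea behind an execution plan can be illustrated as follows.
    This is a minimal sketch and not the Apache::HeavyCGI::ExePlan
    implementation: make_plan(), run_plan(), My::Singleton and the
    two handler classes are hypothetical names. The plan is a flat
    list of object and code reference pairs that is computed once,
    so that running it needs no further can() lookups.

```perl
use strict;
use warnings;

{
    package My::Singleton;    # hypothetical stand-in for Class::Singleton
    my %instance;
    sub instance { my $class = shift; return $instance{$class} ||= bless {}, $class }
}
{
    package My::Auth;         # implements both phases
    our @ISA = ('My::Singleton');
    sub header    { my ($self, $req) = @_; push @{ $req->{log} }, 'Auth::header' }
    sub parameter { my ($self, $req) = @_; push @{ $req->{log} }, 'Auth::parameter' }
}
{
    package My::Search;       # implements only the parameter phase
    our @ISA = ('My::Singleton');
    sub parameter { my ($self, $req) = @_; push @{ $req->{log} }, 'Search::parameter' }
}

# Build the plan once: all header() calls first, then all
# parameter() calls, skipping handlers that lack the method.
sub make_plan {
    my @classes = @_;
    my @plan;
    for my $method (qw(header parameter)) {
        for my $class (@classes) {
            my $code = $class->can($method) or next;
            push @plan, [ $class->instance, $code ];
        }
    }
    return \@plan;
}

# Run the plan for one request; no method lookup happens here.
sub run_plan {
    my ($plan, $req) = @_;
    $_->[1]->($_->[0], $req) for @$plan;
}

my $plan = make_plan(qw(My::Auth My::Search));
my $req  = { log => [] };
run_plan($plan, $req);
```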

    There are no fixed rules as to what has to happen within the
    `header' and `parameter' methods. As a rule of thumb it is
    recommended to determine and set the object attributes
    LAST_MODIFIED and EXPIRES (see below) within the header()
    method. It is also recommended to inject the
    Apache::HeavyCGI::IfModified module as the last header handler,
    so that the application can abort early with a Not Modified
    response. I would recommend that in the header phase you do as
    little parameter processing as possible, except for those
    parameters that are related to the last modification date of the
    generated page.
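
    The following sketch shows one way the header phase could
    arrive at a modification date and an early exit. It is an
    illustration only: note_modified() and not_modified() are
    hypothetical helpers, the `if_modified_since' key is invented,
    and plain epoch seconds stand in for the Apache::HeavyCGI::Date
    objects that the real attributes hold. Keeping the newest of
    all reported timestamps is an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: each header handler reports the newest change
# it knows about, and the request keeps the maximum.
sub note_modified {
    my ($req, $epoch) = @_;
    $req->{LAST_MODIFIED} = $epoch
        if !defined $req->{LAST_MODIFIED} || $epoch > $req->{LAST_MODIFIED};
}

# Hypothetical last header step: true if the client's copy is
# still current, so content creation could be skipped.
sub not_modified {
    my ($req) = @_;
    my $since = $req->{if_modified_since};    # assumed already parsed to epoch seconds
    return 0 unless defined $since && defined $req->{LAST_MODIFIED};
    return $req->{LAST_MODIFIED} <= $since ? 1 : 0;
}

my $req = { if_modified_since => 1_000 };
note_modified($req, 400);      # e.g. a template changed
note_modified($req, 900);      # e.g. a database row changed
note_modified($req, 700);      # older than 900, so it is ignored
my $fresh = not_modified($req);    # 900 <= 1000

note_modified($req, 1_500);
my $stale = not_modified($req);    # 1500 > 1000
```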

  Terminating the handler calls or triggering errors.

    Sometimes you want to stop calling the handlers, because you
    think that processing the request is already done. In that case
    you can do a

     die Apache::HeavyCGI::Exception->new(HTTP_STATUS => status);

    at any point within prepare() and the specified status will be
    returned to the Apache handler. This is useful for example for
    the Apache::HeavyCGI::IfModified module which sends the response
    headers and then dies with HTTP_STATUS set to
    Apache::Constants::DONE. Redirectors presumably would set up
    their headers and set it to
    Apache::Constants::HTTP_MOVED_TEMPORARILY.

    Perl exceptions are also used to report errors. In case of an
    error within the prepare loop, all you need to do is

     die Apache::HeavyCGI::Exception->new(ERROR=>[array_of_error_messages]);

    The error is caught at the end of the prepare loop and the
    anonymous array that is being passed to $@ will then be appended
    to `@{$self->{ERROR}}'. You should check for $self->{ERROR}
    within your layout method to return an appropriate response to
    the client.
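
    Both uses of the exception can be sketched with a stand-in
    class. This is an illustration under assumptions, not the
    module's code: My::Exception and the free-standing prepare()
    function are hypothetical, the handlers are plain code
    references, and the sketch simply stops at the first exception.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

{
    package My::Exception;    # hypothetical stand-in for Apache::HeavyCGI::Exception
    sub new { my ($class, %arg) = @_; return bless {%arg}, $class }
}

# Run all steps; return an HTTP status if one was thrown, else undef.
# Error messages carried by the exception are appended to ERROR.
sub prepare {
    my ($req, @steps) = @_;
    my $ok = eval { $_->($req) for @steps; 1 };
    return undef if $ok;
    my $err = $@;
    die $err unless blessed($err) && $err->isa('My::Exception');    # rethrow foreign errors
    push @{ $req->{ERROR} }, @{ $err->{ERROR} } if $err->{ERROR};
    return $err->{HTTP_STATUS};
}

# A redirect-like shortcut: the second step is never reached.
my $req1    = {};
my $status1 = prepare(
    $req1,
    sub { die My::Exception->new(HTTP_STATUS => 302) },
    sub { $_[0]{reached} = 1 },
);

# An error: no status, but the message is collected for the layout.
my $req2    = {};
my $status2 = prepare(
    $req2,
    sub { die My::Exception->new(ERROR => ['name is required']) },
);
```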

  Layout and Text Composition

    After the header and the parameter phase, the application should
    have set up the object that is able to characterize the complete
    application and its status. No changes to the object should
    happen from now on.

    In the next phase Apache::HeavyCGI will ask this object to
    perform the `layout' method that has the duty to generate an
    Apache::HeavyCGI::Layout (or compatible) object. Please read
    more about this object in the Apache::HeavyCGI::Layout manpage.
    For our HeavyCGI object it is only relevant that this Layout
    object can compose itself as a string in the as_string() method.
    As a layout object can be composed as an abstraction of a layout
    and independent of request-specific contents, it is recommended
    to cache the most important layouts. Caching is the
    responsibility of the programmer.

    In the next step HeavyCGI stores a string representation of the
    current request by calling the as_string() method on the layout
    object and passing itself to it as the first argument. By
    passing itself to the Layout object, all the request-specific
    data get married to the layout-specific data. We thus reach the
    stage where stacked handlers usually start: we have composed
    content that is ready for shipping.
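
    A layout as a sequence of static and dynamic parts can be
    sketched like this. My::Layout is a hypothetical stand-in and
    not Apache::HeavyCGI::Layout; representing the dynamic parts as
    code references is an assumption made for the example. Note that
    the layout itself holds no request data, which is why it can be
    built once and reused.

```perl
use strict;
use warnings;

{
    package My::Layout;    # hypothetical stand-in for Apache::HeavyCGI::Layout
    sub new {
        my ($class, @parts) = @_;
        return bless { parts => [@parts] }, $class;
    }

    # Static parts are plain strings; dynamic parts are code
    # references that receive the request object.
    sub as_string {
        my ($self, $req) = @_;
        return join '', map { ref($_) eq 'CODE' ? $_->($req) : $_ } @{ $self->{parts} };
    }
}

# Built once, independent of any request.
my $layout = My::Layout->new(
    '<html><body><h1>',
    sub { $_[0]{title} },
    '</h1><p>',
    sub { "Hits: $_[0]{hits}" },
    '</p></body></html>',
);

# Reused for two different requests.
my $page_a = $layout->as_string({ title => 'Map',   hits => 3 });
my $page_b = $layout->as_string({ title => 'Index', hits => 0 });
```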

    The last phase deals with setting up the yet unfinished headers,
    possibly compressing, recoding and measuring the content, and
    delivering the response to the browser. The two methods finish()
    and deliver() are responsible for that phase. The default
    deliver() method is pretty generic: it calls finish(), then
    sends the headers, and sends the content only if the request
    method wasn't HEAD. It then returns Apache's constant DONE to
    the caller, so that Apache won't do anything except logging on
    this request. The finish() method is more likely to be
    overridden. The default finish() method sets the content type to
    text/html and compresses the content if the browser understands
    compressed data and Compress::Zlib is available. It also sets
    the headers Vary, Expires, Last-Modified, and Content-Length.
    You will most probably want to override the finish() method.
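
    What an overriding finish() might do can be sketched without
    Apache. This is an illustration under assumptions, not the
    default implementation: the free-standing finish() and
    http_date() functions are hypothetical, headers are collected in
    a plain hash instead of being set on the Apache request, epoch
    seconds stand in for Apache::HeavyCGI::Date objects, and the
    Vary header is left out.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Format epoch seconds as an HTTP date.
sub http_date {
    my ($epoch) = @_;
    return strftime('%a, %d %b %Y %H:%M:%S GMT', gmtime($epoch));
}

# Hypothetical finish(): returns a hash of response headers and may
# replace CONTENT with its gzipped form.
sub finish {
    my ($req) = @_;
    my %header;
    $header{'Content-Type'} = 'text/html';
    $header{'Content-Type'} .= "; charset=$req->{CHARSET}" if $req->{CHARSET};

    # Compress only if the client can take it and the module loads.
    if ($req->{CAN_GZIP} && eval { require Compress::Zlib; 1 }) {
        $req->{CONTENT} = Compress::Zlib::memGzip($req->{CONTENT});
        $header{'Content-Encoding'} = 'gzip';
    }

    my $modified = defined $req->{LAST_MODIFIED} ? $req->{LAST_MODIFIED} : time;
    $header{'Last-Modified'}  = http_date($modified);
    $header{'Expires'}        = http_date($req->{EXPIRES}) if defined $req->{EXPIRES};
    $header{'Content-Length'} = length $req->{CONTENT};    # measured after compression
    return \%header;
}

my $req = {
    CONTENT       => '<p>hello</p>',
    CHARSET       => 'ISO-8859-1',
    CAN_GZIP      => 0,
    LAST_MODIFIED => 0,
};
my $header = finish($req);
```

    The content length is measured last because compression changes
    the number of bytes that are actually sent.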

  Summing up

                                            +-------------------+
                                            | sub handler {...} |
     +--------------------+                 | (sub init {...})  |
     |Your::Class         |---defines------>|                   |
     |ISA Apache::HeavyCGI|                 | sub layout {...}  |
     +--------------------+                 | sub finish {...}  |
                                            +-------------------+

                                            +-------------------+
                                            | sub new {...}     |
     +--------------------+                 | sub dispatch {...}|
     |Apache::HeavyCGI    |---defines------>| sub prepare {...} |
     +--------------------+                 | sub deliver {...} |
                                            +-------------------+

     +----------------------+               +--------------------+
     |Handler_1 .. Handler_N|               | sub header {...}   |
     |ISA Class::Singleton  |---define----->| sub parameter {...}|
     +----------------------+               +--------------------+

                                                                           +----+
                                                                           |Your|
                                                                           |Duty|
     +----------------------------+----------------------------------------+----+
     |Apache                      | calls Your::Class::handler()           |    |
     +----------------------------+----------------------------------------+----+
     |                            | nominates the handlers,                |    |
     |Your::Class::handler()      | constructs $self,                      | ** |
     |                            | and calls $self->dispatch              |    |
     +----------------------------+----------------------------------------+----+
     |                            |        $self->init     (does nothing)  | ?? |
     |                            |        $self->prepare  (see below)     |    |
     |Apache::HeavyCGI::dispatch()| calls  $self->layout   (sets up layout)| ** |
     |                            |        $self->finish   (headers and    | ** |
     |                            |                         gross content) |    |
     |                            |        $self->deliver  (delivers)      | ?? |
     +----------------------------+----------------------------------------+----+
     |Apache::HeavyCGI::prepare() | calls HANDLER->instance->header($self) | ** |
     |                            | and HANDLER->instance->parameter($self)| ** |
     |                            | on all of your nominated handlers      |    |
     +----------------------------+----------------------------------------+----+
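
    The call sequence in the table can be mimicked with a toy
    framework. This is a sketch under assumptions and not the
    Apache::HeavyCGI source: My::Mini, My::Clock and My::Site are
    hypothetical, the layout is a code reference instead of a Layout
    object, and deliver() returns the content instead of sending it
    and returning Apache's DONE. The order of calls in dispatch()
    follows the table above.

```perl
use strict;
use warnings;

{
    package My::Mini;    # hypothetical stand-in for Apache::HeavyCGI
    sub new { my ($class, %arg) = @_; return bless {%arg}, $class }
    sub init { }         # does nothing by default

    # Ask every nominated handler for header(), then for parameter().
    sub prepare {
        my $self = shift;
        for my $method (qw(header parameter)) {
            for my $class (@{ $self->{HANDLER} }) {
                $class->instance->$method($self) if $class->can($method);
            }
        }
    }

    sub deliver { my $self = shift; return $self->{CONTENT} }

    sub dispatch {
        my $self = shift;
        $self->init;
        $self->prepare;
        my $layout = $self->layout;
        $self->{CONTENT} = $layout->($self);
        $self->finish;
        return $self->deliver;
    }
}
{
    package My::Clock;   # a handler; hand-rolled singleton
    my $instance;
    sub instance  { my $class = shift; return $instance ||= bless {}, $class }
    sub parameter { my ($self, $req) = @_; $req->{hour} = 12 }
}
{
    package My::Site;    # the application class: your duty
    our @ISA = ('My::Mini');
    sub layout { return sub { "Hour: $_[0]{hour}" } }
    sub finish { my $self = shift; $self->{length} = length $self->{CONTENT} }
}

my $site = My::Site->new(HANDLER => ['My::Clock']);
my $out  = $site->dispatch;
```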

Object Attributes
    As already mentioned, the HeavyCGI object is a pseudo-hash, i.e.
    it can be treated like a HASH, but all attributes that are being
    used must be predeclared at compile time with a `use fields'
    clause.

    The convention regarding attributes is as simple as it can be:
    uppercase attributes are reserved for the Apache::HeavyCGI
    class, all other attribute names are at your disposition if you
    write a subclass.
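
    The attribute convention can be sketched with the fields and
    base pragmas. The class names My::Framework and My::App and the
    attribute `session_id' are hypothetical. Note that pseudo-hashes
    were removed in perl 5.10; on newer perls fields::new() returns
    a restricted hash instead, which is what this sketch relies on
    when it checks that an undeclared attribute is rejected.

```perl
use strict;
use warnings;

{
    package My::Framework;    # hypothetical stand-in for Apache::HeavyCGI
    use fields qw(R CGI CONTENT ERROR);    # uppercase: reserved for the framework

    sub new {
        my ($class, %arg) = @_;
        my $self = fields::new($class);
        $self->{$_} = $arg{$_} for keys %arg;
        return $self;
    }
}
{
    package My::App;
    use base 'My::Framework';      # inherits the declared fields
    use fields qw(session_id);     # lowercase: free for the subclass
}

my $app = My::App->new(R => 'request', session_id => 'abc123');

# An attribute that was never declared is refused.
my $ok  = eval { $app->{no_such_field} = 1; 1 };
my $err = $@;
```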

    The following attributes are currently defined. The module
    author's production environment has a few more attributes that
    seem to work well but most probably need more thought to be
    implemented in a generic way.

    CAN_GZIP
        Set by the can_gzip method. True if client is able to handle
        gzipped data.

    CAN_PNG
        Set by the can_png method. True if client is able to handle
        PNG.

    CAN_UTF8
        Set by the can_utf8 method. True if client is able to handle
        UTF-8 encoded data.

    CGI An object that handles GET and POST parameters and offers the
        methods param() and upload() in a manner compatible with
        Apache::Request. Needs to be constructed and set by the user,
        typically in the constructor.

    CHARSET
        Optional attribute to denote the charset in which the
        outgoing data are being encoded. Only used within the finish
        method. If it is set, the finish() method will set the
        content type to text/html with this charset.

    CONTENT
        Scalar that contains the content that should be sent to the
        user uncompressed. During the finish() method the content
        may become compressed.

    DOCUMENT_ROOT
        Unused.

    ERROR
        Anonymous array that accumulates error messages. HeavyCGI
        doesn't handle the error though. It is left to the user to
        set up a proper response to the user.

    EXECUTION_PLAN
        Object of type Apache::HeavyCGI::ExePlan. It is
        recommended to compute the object at startup time and always
        pass the same execution plan into the constructor.

    EXPIRES
        Optional Attribute set by the expires() method. If set,
        HeavyCGI will send an Expires header. The EXPIRES attribute
        needs to contain an Apache::HeavyCGI::Date object.

    HANDLER
        If there is an EXECUTION_PLAN, this attribute is ignored.
        Without an EXECUTION_PLAN, it must be an array of package
        names. HeavyCGI treats the packages as Class::Singleton
        classes. During the prepare() method HeavyCGI calls
        HANDLER->instance->header($self) and
        HANDLER->instance->parameter($self) on all of your nominated
        handlers.

    LAST_MODIFIED
        Optional Attribute set by the last_modified() method. If
        set, HeavyCGI will send a Last-Modified header of the
        specified time, otherwise it sends a Last-Modified header of
        the current time. The attribute needs to contain an
        Apache::HeavyCGI::Date object.

    MYURL
        The URL of the running request, set by the myurl() method.
        Contains a URI::URL object.

    R   The Apache Request object for the running request. Needs to be
        set up in the constructor by the user.

    REFERER
        Unused.

    SERVERROOT_URL
        The URL of the running request's server-root, set by the
        serverroot_url() method. Contains a URI::URL object.

    SERVER_ADMIN
        Unused.

    TIME
        The time when this request started set by the time() method.

    TODAY
        Today's date in the format 9999-99-99 set by the today()
        method.

  Performance

    Don't expect Apache::HeavyCGI to serve 10 million page
    impressions a day. The server I have developed it for is a
    dual-processor 233 MHz machine, and each request is
    handled by about 30 different handlers: a few trigonometric,
    database, formatting, and recoding routines. With this overhead
    each request takes about a tenth of a second which in many
    environments will be regarded as slow.

BUGS
    The fields pragma doesn't mix very well with Apache::StatINC.
    When working with HeavyCGI you have to restart your server quite
    often when you change your main class. I believe this could be
    fixed in fields.pm, but I haven't tried. A workaround is to
    avoid changing the main class, e.g. by delegating the layout()
    method to a different class.

AUTHOR
    Andreas Koenig <andreas.koenig@anima.de>. Thanks to Jochen
    Wiedmann for heavy debates about the code and crucial
    performance enhancement suggestions. The development of this
    code was sponsored by www.speed-link.de.