HTMLPP - By Pieter Hintjens <ph@imatix.com>

NAME

    htmlpp - HTML preprocessor

SYNOPSIS

    perl htmlpp.pl document-file

DESCRIPTION

    Htmlpp is a preprocessor for HTML files, and is intended to simplify the
    task of maintaining large sets of HTML documents.  You provide htmlpp
    with a document that is a mix of HTML-tagged text and htmlpp commands.
    Htmlpp generates a set of HTML files from that document.

    During this process, htmlpp replaces symbols, reads include files, builds
    tables of contents, and generally does a lot of otherwise tedious and
    error-prone manual work.  To use htmlpp you should be happy writing HTML
    with a simple text editor.

    Htmlpp replaces symbols in command lines and HTML text.  You can specify
    a symbol in various ways:

    $(name)
        Inserts the symbol 'name'.  If the symbol is not defined (see .define
        command below) you get an error message.

    $(*name)
        Inserts an anchor for the symbol 'name'.  This is shorthand for:
        <A HREF="$(name)">name</A>.

    $(*name="label")
        Inserts an anchor for the symbol 'name', with label as specified.
        This is shorthand for: <A HREF="$(name)">label</A>.

    You can define symbols in terms of symbols: $($(name)) is quite okay, if
    you know what you are doing.  Htmlpp inserts symbols in the above order,
    so it will translate all $(name)'s before looking at $(*name)'s.
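
    As a small illustration of the three forms (the symbol names 'home' and
    'current' are made up for this example, and the commands are indented
    here only for display):

        .define home     http://www.imatix.com
        .define current  home
        Plain value:   $(home)
        Anchor:        $(*home)
        Labelled:      $(*home="iMatix home page")
        Nested:        $($(current))

    Going by the shorthand rules above, the 'Anchor' line should come out
    as <A HREF="http://www.imatix.com">home</A> and the 'Labelled' line as
    <A HREF="http://www.imatix.com">iMatix home page</A>.  The nested form
    presumably resolves $(current) to 'home' first and then inserts
    $(home).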

    Symbols are of three types: htmlpp provides various symbols when building
    certain blocks like the table of contents; you can define symbols using
    the .define command; and you can define symbols using the .build anchor
    command.

    Htmlpp provides these standard symbols for use at any point in the
    document:

    $(DATE)
        The current date, formatted as an 8-character string: YY/MM/DD.

    $(TIME)
        The current time, formatted as an 8-character string: HH:MM:SS.

    $(INC)
        A counter that starts at 1 and is incremented each time you refer to
        it.  Its value is printed in hexadecimal.  I use this to number
        filenames in the .page command.

    $(PAGE)
        After a .page command, this holds the page filename, exactly as
        specified in the .page command.

    $(TITLE)
        After a .page command, this holds the page title.  It is nice to use
        this in the header .block.
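
    A sketch of how these symbols might be combined; the filename stem
    'doc' and the page titles are invented for the example:

        .page doc$(INC).htm=Introduction
        <H1>$(TITLE)</H1>
        <P>Generated on $(DATE) at $(TIME).

        .page doc$(INC).htm=Installation
        <H1>$(TITLE)</H1>

    Each reference to $(INC) yields the next counter value, so the two
    pages get different filenames without any manual numbering.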

    Unless you use .ignore pages, these symbols are available in header and
    footer blocks (you can use them elsewhere, but will get warnings):

    $(FIRST_PAGE)
        The filename for the first page of the document.

    $(LAST_PAGE)
        The filename for the last page of the document.

    $(NEXT_PAGE)
        The filename for the next page of the document.

    $(PREV_PAGE)
        The filename for the previous page of the document.

    $(FIRST_TITLE)
        The title for the first page of the document.

    $(LAST_TITLE)
        The title for the last page of the document.

    $(NEXT_TITLE)
        The title for the next page of the document.

    $(PREV_TITLE)
        The title for the previous page of the document.
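
    These symbols lend themselves to a navigation bar.  A possible footer
    block, with the layout chosen only for illustration:

        .block footer
        <HR>
        <A HREF="$(FIRST_PAGE)">$(FIRST_TITLE)</A> |
        <A HREF="$(PREV_PAGE)">$(PREV_TITLE)</A> |
        <A HREF="$(NEXT_PAGE)">$(NEXT_TITLE)</A> |
        <A HREF="$(LAST_PAGE)">$(LAST_TITLE)</A>
        </BODY></HTML>
        .end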

    An htmlpp command starts with a dot, in column 1.  These are the commands
    that htmlpp understands (values like <this> indicate arguments; you don't
    type "<" or ">"):

    .define <symbol> [<value>]

        Define a symbol with the specified value.  The symbol name can
        consist of letters, digits, -, ., and _.  The value is everything
        else up to the end of the line.  If you omit the value, the symbol
        becomes undefined.  You can redefine a symbol as often as you like
        simply by repeating the .define command.  Use lowercase for your own
        symbols.  Predefined htmlpp symbols are uppercase.  Case is
        significant.
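
        For example, with made-up symbol names:

            .define author  Pieter Hintjens
            .define version 1.0
            <P>Version $(version), written by $(author).
            .define version 1.1
            <P>Now at version $(version).
            .define author

        After the last line 'author' is undefined again, so a later
        $(author) would produce an error message.  Because case is
        significant, $(Version) would not match 'version' either.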

    .include <filename>

        Start reading from the specified file.  You can nest .include files
        as much as you like.  Htmlpp checks for circular references.  If the
        same file was already included earlier, htmlpp ignores the command,
        like the Perl 'require' operator.

    .include <filename>!

        Include the file in any case, like a C #include directive.
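
        The two forms side by side; the filenames are placeholders:

            .include symbols.def
            .include symbols.def
            .include banner.htm!
            .include banner.htm!

        Following the rules above, the second .include of symbols.def is
        ignored, while banner.htm is read both times.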

    .page <filename>=<title>

        Start writing a new HTML file.  The title is required.  At any point
        after the .page, you can refer to $(PAGE) and $(TITLE) for the
        current file name and title.  For instance, you'll often see this:
        <H1>$(TITLE)</H1>.

    .ignore header

        Ignore the next header line as far as the table of contents is
        concerned.  This is good for headers like <H2>Table of Contents</H2>.

    .ignore header <level>

        Ignore all headers whose level is greater than or equal to <level>.
        This is useful if a section has many H3 and H4 headers that you
        don't want in the table of contents.  Use .ignore header 99 to
        re-include all further headers.
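
        A sketch of both forms in use; the header texts are invented:

            .ignore header
            <H2>Table of Contents</H2>
            .build toc

            <H2>Command Reference</H2>
            .ignore header 3
            <H3>Minor Detail</H3>
            <H4>Even More Detail</H4>
            .ignore header 99
            <H3>Back in the Contents</H3>

        Here the table of contents should list 'Command Reference' and
        'Back in the Contents', but not the title of the contents page
        itself or the two headers between the .ignore header 3 and
        .ignore header 99 lines.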

    .ignore pages

        Ignore all .page commands except to pick up the page titles.  Use
        this when you want to create a super-document.  When you use .ignore
        pages, htmlpp also ignores the .build toc and .build index commands.
        So, if you want a table of contents, do the .build toc before you say
        .ignore pages.
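
        One way a super-document might be laid out.  The filenames are
        hypothetical, and each included file is assumed to contain its
        own .page commands:

            .page manual.htm=Complete Manual
            .build toc
            .ignore pages
            .include intro.txt
            .include usage.txt

        The .build toc comes first because htmlpp ignores it once
        .ignore pages is in effect.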

    .block <blockname>

        Define a piece of HTML text to be output as part of a .build command.
        You can end the .block with an .end command or another .block command.
        Htmlpp understands these block names:

        header
            Output at the start of each new HTML page; i.e. whenever you use
            a .page command.

        footer
            Output at the end of each HTML page.

        toc_open
            Output at the start of a .build toc block (see below), and
            whenever htmlpp decides to indent a new level.

        toc_entry
            Output for each entry in the table of contents.  Use these
            symbols: $(TOC_HREF) - the local URL for the file and section;
            $(TOC_TITLE) - the title for the section, taken from the header
            line.

        toc_close
            Output whenever htmlpp decides to outdent a level, and at the end
            of the table of contents.

        dir_open
            Output at the start of a .build dir block (see below).

        dir_entry
            Output for each entry in a .build dir block.  Use these symbols:
            $(DIR_HREF) - URL for the file; $(DIR_NAME) - the filename, left-
            justified; $(DIR_EXT) - the file extension, always put into
            lowercase; $(DIR_SIZE) - the file size, right-justified;
            $(DIR_DATE) - the file date; $(DIR_TIME) - the file time.

        dir_close
            Output at the end of a .build dir block (see below).

        index
            Output for each entry in a .build index block.  Use these
            symbols: $(INDEX_PAGE) - the filename; $(INDEX_TITLE) - the file
            title.

        anchor
            Output whenever you use a .build anchor.  Use this symbol:
            $(ANCHOR) - name of anchor.
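
        A minimal set of block definitions might look like this.  The
        HTML is only a suggestion; each .block ends the one before it,
        so a single .end closes the last block:

            .block header
            <HTML><HEAD><TITLE>$(TITLE)</TITLE></HEAD>
            <BODY>
            .block toc_open
            <UL>
            .block toc_entry
            <LI><A HREF="$(TOC_HREF)">$(TOC_TITLE)</A>
            .block toc_close
            </UL>
            .end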

    .end

        End the previous .block.  You can end a .block with an .end or a
        further .block command.  Any other command within a .block is a
        syntax error.

    .build toc

        Build table of contents for document.  Htmlpp scans the document and
        all include files once to collect titles (<Hn>...</Hn>) and once to
        create the HTML pages.  Titles (<Hn>...</Hn>) must be entirely on a
        single line, or htmlpp will not find them.  You can manage the
        contents of the table of contents through the .ignore header command.
        You will normally use a .build toc at the start of a document.
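
        To illustrate the single-line rule (the header text is invented):

            <H2>Installing htmlpp</H2>

            <H2>Installing
            htmlpp</H2>

        The first header is found and listed; the second spans two lines,
        so htmlpp does not see it.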

    .build dir <directory> [<filespec>...]

        Build directory listing as specified.  The .build dir command only
        works if you mirror the server directory on some local disk that
        htmlpp can access.  This is a Good Idea in any case.  Before you can
        use .build dir you must define LOCAL and SERVER.  I define these like
        this:

            .define LOCAL   I:
            .define SERVER  http://www.imatix.com

        The directory must be relative to either of these two.  It should
        start with '/' but not end with '/'.  You can specify zero or more
        filenames or wildcards (htmlpp accepts * and ?, according to UNIX
        rules).  If you specify no filespecs, htmlpp assumes you mean '*'.
        The filespecs can include Perl regular expressions: place the
        filespec between double quotes.
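
        Putting this together, a listing of the zip files in one
        directory could be sketched as follows.  The directory /pub/htmlpp
        and the list markup are assumptions made for the example:

            .define LOCAL   I:
            .define SERVER  http://www.imatix.com
            .block dir_open
            <UL>
            .block dir_entry
            <LI><A HREF="$(DIR_HREF)">$(DIR_NAME)</A>
                ($(DIR_SIZE), $(DIR_DATE) $(DIR_TIME))
            .block dir_close
            </UL>
            .end
            .build dir /pub/htmlpp *.zip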

    .build index

        Build file index for document.  This is basically a list of all pages
        in the document with their titles.  You may find a use for this; I
        put it in for completeness.
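
        A possible use, with an index block defined first (the list markup
        is just one option):

            .block index
            <LI><A HREF="$(INDEX_PAGE)">$(INDEX_TITLE)</A>
            .end
            <UL>
            .build index
            </UL>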

    .build anchor <anchor-name>

        Build an anchor definition.  This is really useful.  Basically you do a
        .build anchor somename in a document, then do a $(*somename) or
        $(*somename="label") anywhere in any other document.  Htmlpp saves
        anchor symbols in the file anchor.def; otherwise anchor symbols are
        treated much like normal .define'd symbols.  One difference: anchor
        symbols and normal symbols do not share the same namespace; if you
        .define a symbol with the same name as the anchor symbol, the
        .define'd symbol takes precedence.  If you undefine the symbol, the
        anchor symbol reappears by magic.  This may or may not be useful, but
        it is the way it works.  If you change the file structure of your
        document, run everything through htmlpp *TWICE*, so that all anchor
        references can get really solidly updated.
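
        A sketch of the round trip, using a made-up anchor name 'install'
        and assuming an anchor block that emits a plain <A NAME> tag:

            .block anchor
            <A NAME="$(ANCHOR)"></A>
            .end

            .build anchor install
            <H2>Installation</H2>

        Then, in this or any other document:

            See $(*install="the installation notes") for details.

        This should expand to a link of the form
        <A HREF="...">the installation notes</A>, where the HREF is
        whatever location htmlpp recorded for 'install'.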

SEE ALSO

    The most recent version of htmlpp is located at http://www.imatix.com.
    You can download htmlpp from: http://www.imatix.com/pub/htmlpp/htmlpp.zip.

    The Libero documentation is one example of htmlpp in action.  Download
    the archive http://www.imatix.com/pub/htmlpp/lrdoc.zip to get hold of a
    set of text and definition files for htmlpp.

    If you want to change htmlpp, get hold of the camel book (Programming
    Perl), and the Libero documentation.

BUGS

    None that I know of.  Htmlpp does not scan over line breaks, so all
    commands and symbols must fit onto a single line.

AUTHOR

    Pieter Hintjens <ph@imatix.com>.  Version 1.0 written 3 April, 1996.

COPYRIGHT

    Distributed according to the GNU General Public License.  Copyright
    (c) 1996 Pieter Hintjens.

PORTABILITY

    Uses Perl 4, should be portable to Perl 5.  Does not use any system-
    specific features, so will run on any box (it was developed on MS-DOS).
    When processing large documents under MS-DOS I use Big Perl (bperl).
