                   INSTALLATION AND MAINTENANCE OF GN
                         Version 0.6

INSTALLING THE SOFTWARE

1.  Get the file gn.0.X.tar.Z, then uncompress and untar it to make
the gn source directory hierarchy.  The top level contains 4
directories: gn, mkcache, uncache, and docs.

2.  Edit the file "config.h" in the top level directory.  You should
enter the host name of the computer on which you plan to run gn and
the complete path name of your gopher data directory.  If you want to
run at a port other than 70 also edit the DEFAULTPORT entry.  Other
customizations are possible but should not be needed.

3.  In the directory gn-0.X do a "make" to produce the server "gn" and
the two utilities "mkcache" and "uncache".  The utility mkcache
produces "cache files" for use by the server (it is described below)
and uncache is used to convert from the Minnesota server gopherd to gn.
If you want a C compiler other than cc you will need to edit the
Makefile in each directory.  The binary gn is the server and can be
installed anywhere you choose.  The binaries mkcache and uncache are
utility programs for maintainers and should be installed somewhere in
your path, e.g. /usr/local/bin.

4. You must set up gn to run under inetd.  There are surely variations
on how this works from system to system so you may need to look at the
man page for inetd.conf(5).  Here's how it works under many systems,
e.g. Suns: Edit the file /etc/services and create the line

     gn  70/tcp

(or replace 70 by the port you wish to use).  Then edit the file
/etc/inetd.conf and insert the line

gn    stream    tcp nowait    nobody    /full/path/for/gn    gn

After the last gn you can have optional arguments to turn on logging
or use a different data directory (see the man page gn.8).

It is important to run gn as "nobody" (the fifth field in the
inetd.conf line above) or some other user with no access
privileges.  It should *never* be necessary to run gn as root and to
do so would be a serious security mistake.  Every attempt has been
made to make gn as secure as possible; however, no program
accessible to remote users on the internet can be assumed perfectly
secure.

After editing the inetd.conf and services files you should find the
process id number of the inetd process and do the command "kill -HUP
process_id".  This must be done as root.  If you have never done this
before, get someone who has to help you.
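
As a quick sanity check after editing inetd.conf, something like the
following sketch can confirm which user gn will run as.  It is only an
illustration: the function name gn_inetd_user is made up here, and it
assumes the service is named "gn" in the first field and the user is
in the fifth field, as in the example line above.

```shell
# gn_inetd_user CONF
# Hypothetical helper (not part of gn): print the user, i.e. the fifth
# field, of the first uncommented "gn" entry in an inetd.conf-style file.
gn_inetd_user() {
    awk '$1 == "gn" { print $5; exit }' "$1"
}
```

Running "gn_inetd_user /etc/inetd.conf" should print "nobody" (or
whatever unprivileged user you chose).  If it prints "root", fix the
entry before sending inetd the HUP signal.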


SETTING UP THE DATA DIRECTORY

1.  In each directory of your data hierarchy create a file called menu
with one item for each file or directory you want gn to publish.
Items in this file have a format like the following:

Name=This will be on the client menu
Path=0/path/relative/to/data_directory
Type=0
Host=yourhost.yourU.edu
Port=70

The Name field must be first.  The Path field consists of a single
character designating a gopher type (see the gopher protocol), followed
by the path name of the file or directory relative to the top level
data directory.  The Type, Host, and Port fields are optional.  If they
are not present, the Type will be taken from the first character of
the Path field and the Host and Port fields will be those specified
in config.h (or on the command line of mkcache).  For more details see
the man page mkcache.1 and the sample menu files in each directory of
the source hierarchy.

2.  After the menu files have been created you must run the mkcache
program to produce a .cache file.  This can be done once for each
directory or once in the top data directory with the "-r" option to
make all the .cache files for the hierarchy.  You might want to look
at a .cache file to see what it is like.
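
The defaulting rule for the optional fields in step 1 can be sketched
as follows.  This is an illustration only, not mkcache itself: the
function name resolve_menu and its tab-separated output are invented
for the example (real .cache files have their own format), and the
default host and port are passed as arguments since the sketch cannot
read config.h.

```shell
# resolve_menu MENUFILE [DEFAULT_HOST] [DEFAULT_PORT]
# Hypothetical helper: print one line per menu item as
#   type <TAB> name <TAB> path <TAB> host <TAB> port
# filling in omitted Type, Host and Port fields.
resolve_menu() {
    awk -v host="${2:-localhost}" -v port="${3:-70}" '
        function flush() {
            if (name == "") return
            # Type defaults to the first character of the Path field.
            if (type == "") type = substr(path, 1, 1)
            if (h == "") h = host
            if (p == "") p = port
            printf "%s\t%s\t%s\t%s\t%s\n", type, name, path, h, p
        }
        # A Name= line starts a new item, so emit the previous one.
        /^Name=/ { flush(); name = substr($0, 6); path = type = h = p = ""; next }
        /^Path=/ { path = substr($0, 6) }
        /^Type=/ { type = substr($0, 6) }
        /^Host=/ { h = substr($0, 6) }
        /^Port=/ { p = substr($0, 6) }
        END { flush() }
    ' "$1"
}
```

For instance "resolve_menu menu yourhost.yourU.edu 70" lists every
item in the file menu with its effective type, host and port, which
can be handy for checking a menu file before running mkcache.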

LIMITING ACCESS TO YOUR GN HIERARCHY

If you have opted to limit access to your gopher there are two ways to
do this.  For the first you use the "-a" option to gn (in the
inetd.conf file).  This will limit access to the server to those
clients with an IP address or subnet address listed (and not excluded)
in the file .access in the root data directory.  The format of the
.access file is one address per line, each line consisting of an IP
address like 129.111.222.123 or a subnet address like 129.111.222 or
129.111.  In case a subnet address is listed, any client with an IP
address beginning with that subnet address will be allowed access.

You may also list the domain names of the machines using wildcards provided
the machines all have proper PTR domain name records.  To allow access to
all machines under nwu.edu, use the line *.nwu.edu.  Note that this
will not allow access to a machine called nwu.edu if it exists.  One would
need to add the line nwu.edu to allow access to it.

You can also exclude IP addresses or domain names by prefixing them
with an '!', so if .access contained only the lines

!boombox.micro.umn.edu
*

access would be permitted to every machine *except* boombox.  Likewise

!129.111
*

would allow access to everyone except those on subnet 129.111.  It
is important to note that in determining access gn reads the .access
file only until it finds a match (with or without '!') and then 
quits.  So if .access consisted of the two lines

*
!129.111

then access would be granted to everyone since the * comes first and
it matches everyone.
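
The "first match wins" rule can be sketched in a few lines of shell.
This is not gn's code, just a model of the behavior described above.
The function name access_allowed is invented, and two details are
assumptions of the sketch: a subnet entry is matched only at a dot
boundary (so 129.111 does not match 129.1110.x.x), and a client which
matches no line at all is denied.

```shell
# access_allowed ACCESS_FILE CLIENT_IP [CLIENT_HOST]
# Hypothetical model of the .access rules: returns 0 if the client is
# allowed and 1 if it is denied.  The first matching line decides.
access_allowed() {
    file=$1 ip=$2 host=${3:-}
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue
        verdict=0
        case $line in
            '!'*) verdict=1; line=${line#?} ;;   # leading ! means exclude
        esac
        # Exact IP address, or a subnet prefix followed by a dot.
        case $ip in
            "$line"|"$line".*) return $verdict ;;
        esac
        # Domain names; here $line is used as a wildcard pattern, so
        # "*" matches everyone and "*.nwu.edu" matches hosts under nwu.edu.
        case $host in
            $line) return $verdict ;;
        esac
    done < "$file"
    return 1   # assumption: no matching line means no access
}
```

With a .access file containing the lines "!129.111" and "*" the call
"access_allowed .access 129.111.222.123" fails, while the same call
with the two lines in the other order succeeds, as explained above.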

The "-A" option is similar to the -a option except access is allowed
on a per directory basis.  Each client request is processed by first
looking for a .access file in the directory containing the requested
item and comparing the IP address of the client with the addresses in
this file.  If no .access file exists in this directory, one is sought
in the parent directory and then if necessary the parent of the
parent, etc. up to the root data directory.  If no .access file is
found by this process access is allowed to all clients provided the
item requested exists in a .cache file.
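
The search for the governing .access file under the -A option amounts
to walking up the directory tree.  A minimal sketch, assuming the
requested directory is given as a path under the root data directory
(the function name nearest_access_file is made up for this example):

```shell
# nearest_access_file ROOT DIR
# Hypothetical helper: print the path of the closest .access file at or
# above DIR, never looking above ROOT.  Returns 1 if there is none.
nearest_access_file() {
    root=${1%/} dir=${2%/}
    while :; do
        if [ -f "$dir/.access" ]; then
            printf '%s\n' "$dir/.access"
            return 0
        fi
        # Stop once the root data directory itself has been checked.
        [ "$dir" = "$root" ] && return 1
        case $dir in
            "$root"/*) dir=${dir%/*} ;;   # move up to the parent
            *) return 1 ;;                # DIR is not below ROOT
        esac
    done
}
```

If the function prints nothing, no .access file applies and, as
described above, the item is available to all clients.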

It is possible with gn to attain even finer access discrimination than
on a per directory basis, though it is somewhat cumbersome to do so.
Nevertheless if you have a need to make certain menu items visible
(and accessible) to a select group of hosts, this is possible.
Details on how to do it are in the last section of the document on
decoupling menus from the file system (/docs/decoupling).

SEARCHING A NUMBER OF FILES

Gn has the builtin capability to offer menu items allowing the
user to create a small menu consisting of those items on a 
large menu whose files contain a particular search term.  Thus
in the example gn server on hopf.math.nwu.edu there is a top 
level menu item which is called "Full text search of gn documentation."
When a user selects this item the client will prompt for a search 
term and then return a menu of all the available files in the
documentation directory which contain the search term.  The searches
are case insensitive.  In fact they use grep like regular expressions
(see the UNIX man page for grep(1)).

This feature is enabled by putting two entries in appropriate
menu files.  The first looks like this:

Name=Documentation for the gn server
Path=1s/docs

This is just a minor modification of the normal entry for the
directory "docs" which contains the documentation files we want to
make available by gn.  The only difference is that the path is
"1s/docs" instead of "1/docs".  The "s" indicates that we are making
the directory searchable, i.e. giving permission to run grep-like searches
of all the files which are listed in the menu file for the docs
directory. Of course, docs must be a directory in the gn hierarchy and
it must contain a .cache file allowing files to be served by gn
in the usual way.  

As yet, however, we have no menu item for the search.  This is
achieved with the second menu item:

Name=Full text search of gn documentation
Path=7g/docs

The "7" indicates that this is a search type and the "g" indicates
that this is a "grep" type search.  In this example docs is a
directory in the gn root directory, but it could be lower in which
case we would have the line Path=7g/foo/bar/docs.  Both of these menu
items might, as in this example, go in the same menu, but this is not
necessary.  You might, for example, want to have the "Full text search
of gn documentation" item occur in the menu which lists the
documentation files, rather than the menu which lists the documentation
directory.  Or you might want to have it occur in both menus, which is
fine.  This menu item can occur in any menu of your gn hierarchy.
But, of course, the "Documentation for the gn server" menu item
corresponds to the physical directory "docs", so it must be in the
menu corresponding to the directory containing "docs".  In this
example that is the gn root directory.

These searches are fairly efficient because gn contains its own
regular expression matching routines rather than externally calling
the grep function.  Regular expressions which the user can enter as
search terms are essentially the same as those allowed by grep (see
the man page grep(1)) with the addition of the special character ~
which matches word boundaries.
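
To get a feel for what a 7g search returns, here is a rough model in
shell.  It is only a sketch: gn uses its own matching routines (and
its own ~ operator), whereas this uses the system grep, and the
function name grep_menu is invented.  It looks only at text items
(Path beginning with "0/") in the menu file of the given directory.

```shell
# grep_menu ROOT DIR TERM
# Hypothetical model of a 7g search: print the Name of every text item
# in ROOT/DIR/menu whose file contains TERM (case insensitive).
grep_menu() {
    root=${1%/} dir=$2 term=$3
    name=
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            Name=*) name=${line#Name=} ;;
            Path=0/*)
                # Drop "Path=0/" to get the path relative to ROOT.
                file=$root/${line#Path=0/}
                if [ -n "$name" ] && grep -i -q -e "$term" -- "$file" 2>/dev/null; then
                    printf '%s\n' "$name"
                fi
                name= ;;
        esac
    done < "$root/$dir/menu"
}
```

So "grep_menu /your/data/dir docs install" would list the titles of
the documentation files mentioning "install", which is roughly the
menu a client would see.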


SERVING AND SEARCHING STRUCTURED FILES

Gn has the capability to serve a single large file consisting of a
number of sections so that each section appears to the client as a
separate file with its own title.  This is a generalization of the
"mailfile" feature available on the Minnesota server.  To use this
feature requires two additional fields in the menu file, called
"Separator" and "Section".  These are regular expressions as in grep
which are used to match lines which will be used as separators of the
parts of the large file and lines which will be used for menu section
items.  Thus for a mail file one would use the lines

Separator=^From<space> 
Section=^Subject:

The first line, which should have a literal space at the end not the
word <space>, means that sections (in this case mail messages) are
separated by lines starting with From and a space.  The ^ matches the
start of a line and the space is necessary because some lines begin
with From and a colon.

Here's another example.  This document consists of sections with
section headings lines written all in caps.  Since I want to make a
menu with each section a separate item I use the following entry in my
menu file

Name=Installation/Maintenance Guide Sections
Path=1m/docs/Install
Separator=^[A-Z][A-Z" ]*$
Section=^
Type=1

The Separator is ^[A-Z][A-Z" ]*$.  This matches any line starting with
a letter from A to Z (i.e. caps) followed by any number of characters
which are between A and Z or equal to space or the quotation mark, and
then the end of the line.  This describes the section headings of this
document.  I need the initial [A-Z] so blank lines won't be matched.

When the separator field is matched a new section is started which
will have its own menu item.  The title of the menu item is determined
by the Section regular expression.  In fact the section is searched,
starting with the separator line, for a match for this second regular
expression.  When a match is found, everything on the line *after* the
matching pattern is taken as the title.  Thus for mail everything
after the word "Subject:" becomes the title.  In the example of this
document, the expression ^ matches the beginning of the separator line
so that whole line becomes the menu title.  To see this in use gopher
to hopf.math.nwu.edu and look in the documentation directory for this
document.
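
The way titles are derived from the two expressions can be imitated
with awk.  This is a sketch under assumptions, not what mkcache
actually runs: the function name section_titles is invented, awk's
regular expressions stand in for gn's own, and only the titles are
printed (no byte ranges).

```shell
# section_titles FILE SEPARATOR SECTION
# Hypothetical helper: print one title per section of FILE.  A line
# matching SEPARATOR starts a section; the title is whatever follows the
# first match of SECTION at or after that line.
section_titles() {
    # Pass the expressions through the environment so that awk does not
    # interpret backslashes in them.
    SEP=$2 SEC=$3 awk '
        BEGIN { sep = ENVIRON["SEP"]; sec = ENVIRON["SEC"] }
        $0 ~ sep { in_section = 1; titled = 0 }
        in_section && !titled && match($0, sec) {
            print substr($0, RSTART + RLENGTH)
            titled = 1
        }
    ' "$1"
}
```

For a mail file, "section_titles mbox '^From ' '^Subject:'" prints the
subject of each message, including the space which follows the colon.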

Another example of how this might be used is for a directory.  If a
file consists of entries like

Name: Franks, John
Address:  Department of Mathematics, Northwestern University
Phone: 708-491-5548
etc., etc.

then Separator=^Name: and Section=^Name: would give a menu with
an item    

	1.  Franks, John

which when selected would give the multiline record with my
name, address, etc.  In this example it would be even better to use
the search feature for structured files, described next.

In the example above the "1m" at the beginning of the Path field
indicates that this is a structured file.  It is Type 1 because to
clients it will look like a directory.  If we add an additional menu
entry like

Name=Search Installation Guide
Path=7m/docs/Install
Type=7

which is Type 7 and has a path beginning with "7m" the client will
prompt the user for a search term which can be a regular expression.
The gn server will return a menu with only those sections containing a
match for the regular expression.  Thus for the directory example if
the user searched for Northwestern she would get only those directory
entries containing that word.

Here's how this works.  When mkcache is run with a menu file
containing the "1m" entry above it produces the regular .cache file
but also produces another file (in this case called Install..cache)
which is a cache file for the sections of the file Install specified
in this menu item.  The lines in this cache file contain the menu
titles obtained from the file by matching regular expressions and
contain a selector which designates a range of bytes corresponding to
a section of the document.  Gn knows how to serve a single section of
document when given one of these byte range selectors.

Since the file Install..cache was made when the item with path
1m/docs/Install was encountered, it is not necessary to remake it when
the item with path 7m/docs/Install is reached.  We signal this by
omitting the Separator and Section fields from this menu item.  If
these fields were in both items the cache file Install..cache would be
made twice and the one done last would take effect if there was a
difference in the regular expressions given.  Of course if the regular
expressions are omitted from both then the cache file will not be made
and attempts to access either item will result in an error
(cryptically reported as "Access denied").  For this reason, whenever an
item of type 1m or 7m with no regular expressions is encountered by
mkcache, a warning message is printed.

It is easy to effectively use two different separator regular
expressions or two different section expressions for the same file.
You might for example want to have a mail file with menu by subject
and another menu by author.  To do this you must make a UNIX link (see
the man page ln(1)) to give the mail file an additional name and use
the two different names in the menu file Path entries.  This is
necessary so the cache files created will have different names.

The two regular expressions for the separator and the menu titles are
not put into the selector string.  Thus they are not available to the
client to change.  This has a slightly unfortunate side effect when
uncache is used to produce a menu file.  Since there is no information
about these regular expressions in the .cache file there is no way for
the uncache program to put it in the menu file it makes and they must
be added by hand.  This is the only way that uncache fails to be a 
complete inverse for mkcache.

Note: All regular expressions given as search terms and all lines in
which a match is sought are converted to lower case before the
matching is attempted.  This has the (desirable) effect of making all
searches case insensitive.  By contrast the regular expressions used
to define separators and menu lines are case sensitive.  Regular
expressions which can be used for the separator and section strings
are essentially the same as those allowed by grep with the addition of
the special character ~ which matches word boundaries.  To make
special characters (including ^ ~ [ ] ( ) * . \ and $) match
themselves literally they must be escaped with a \.

SETTING UP A "SEARCH ALL MENUS" ITEM

A builtin feature of gn is the ability to have a menu item which
when selected prompts the user for a search term and returns a 
"virtual menu" of all menu items which contain that term.  In fact
such an item can occur at any level and return either all matches
from all menus on that server or all matches at or below some 
chosen level.  

Here's how to set it up.  Create an entry like this in the menu file
where you want the search item to occur.

Name=Search all menus on this server
Type=7
Path=7c/.cache
Host=your.gn.host.edu
Port=70

(If you want the search to cover only those items in directory
/foo/bar, then the path line should be Path=7c/foo/bar/.cache.)  Now run
"mkcache" to translate the new menu file to a .cache file and you are
done.  The Type, Host and Port lines are optional -- if they are
omitted mkcache will use the default value or the value supplied on
the command line.  When you change any of the menus in your server and
remake the .cache files gn will automatically reflect this in menu
searches.  There is a maximum depth which gn will search into the gn
hierarchy.  Its value can be changed by editing the config.h file and
re-compiling.


COMPRESSED FILES

If you wish you can keep files on your server in a compressed format
and uncompress them on the fly as a client requests them.  You need
a program to compress the files and a companion program to decompress
them.  I recommend "gzip" and "zcat" from the GNU project.  They are
considerably more efficient than the UNIX standard "compress."

When configuring "gn" for compilation, be sure to set the 
#define DECOMPRESS in the file config.h to the path name of
the program which will decompress the files you have compressed.
The default value for this is "/usr/local/bin/zcat".  Another
possibility would be "/usr/ucb/uncompress -c".

If the file you want to make available is "rootdir/dir/bigfile",
first you must compress it with gzip or compress, which will
replace it with the file bigfile.gz or bigfile.Z.  You then make
a menu entry like the following (assuming bigfile was a text file
and you have produced bigfile.gz).

Name=All the text in Bigfile
Path=0Z/dir/bigfile.gz
Type=0

The key here is the 'Z' which is the second character of the Path
field.  It indicates that the file is compressed.  The Path would
start with "0Z" (that's zero Z) for any compressed text file.
It doesn't matter how the file was compressed or whether its name
is bigfile.Z or bigfile.gz or something else.  You have already
told gn how to decompress the file by specifying the DECOMPRESS
program in config.h.

Of course, if bigfile is a binary the Path field would be
9Z/dir/bigfile.gz and the Type would be 9.  For a sound file
Path=sZ/dir/bigfile.gz, Type=s, etc.  Files of types 0, 4, 5, 9, s,
and I can be compressed.  Structured files (type 1m) cannot be
compressed.

You might want to let users download the file in compressed format.
You could give them the option by having the menu item as above
with Path=0Z/dir/bigfile.gz and also having a menu item

Name=Bigfile in compressed format
Path=9/dir/bigfile.gz
Type=9

Note that the Type=9 since compressed files are binaries (even though
bigfile is text) and there is no 'Z' as the second character of the
Path, because now we do not want to decompress.  Also note that two
versions of bigfile show up on your menu (text and compressed binary)
but there is only one file bigfile.gz on your disk.
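
The round trip can be tried by hand before touching the server.  The
sketch below assumes gzip is installed and works in a scratch
directory; "gzip -dc" is used because it behaves like the GNU zcat
suggested above for DECOMPRESS.  The variable names are made up.

```shell
# Make a scratch copy of "bigfile" and compress it the way you would
# in the data directory.
workdir=$(mktemp -d)
printf 'All the text in Bigfile\n' > "$workdir/bigfile"
gzip "$workdir/bigfile"        # bigfile is replaced by bigfile.gz

# What a 0Z item delivers: the decompressed text on standard output.
served=$(gzip -dc "$workdir/bigfile.gz")
printf '%s\n' "$served"
```

Only bigfile.gz remains on disk afterwards, which is the single file
behind both the 0Z (text) and the 9 (compressed binary) menu items.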


SERVING THE OUTPUT OF A PROGRAM OR SCRIPT

Sometimes it is convenient to have the server return the output
of a program or script.  This capability is built into gn.  Assuming
you have a program in a file "prog" which returns some text you
can make its output be an item on your server's menu with a menu
entry like

Name=Program output
Type=0
Path=exec0::/dir/prog
Host=your.gn.host.edu
Port=70

The phrase "exec" says to run the program "prog" which must be
executable by the gn userid (probably "nobody").  The "0" after the
exec says this is a text file.  exec can return most types, including
0 (text), 1 (menus), 9 (binaries), s (sound), I (image).  To specify a
type the single character type is appended to the word exec in the
path.  Thus if you wanted to return the output of a program which is
in the format of an image file you would have an entry like

Name=Image program output
Type=I
Path=execI::/dir/prog
Host=your.gn.host.edu
Port=70


The pair of colons in the path can contain arguments to the program.
The arguments are primarily for the use of "Forms" (see below) and if
you want to run a program which takes arguments it is better to wrap it
in a shell script.  For security reasons none of the characters

	; ` ' | \ * ? - ~ > < ^ ( ) [ ] { } $ / or \ 

are allowed in the arguments to programs.  Thus, if you want to run a
command like "prog -u <somefile", you must create a script like

	#!/bin/sh
	exec /fullpath/prog -u </fullpath/somefile

and make this script be what gn executes.
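
If you want to predict whether an argument will be accepted, the
character restriction above can be modeled like this.  It is a sketch
of the rule as stated in this document, not gn's actual code, and the
function name arg_is_safe is invented.

```shell
# arg_is_safe ARG
# Hypothetical check: returns 0 if ARG contains none of the characters
# listed above as forbidden in program arguments, and 1 otherwise.
arg_is_safe() {
    for c in ';' '`' "'" '|' '\' '*' '?' '-' '~' '>' '<' '^' \
             '(' ')' '[' ']' '{' '}' '$' '/'; do
        case $1 in
            *"$c"*) return 1 ;;   # quoted, so $c is matched literally
        esac
    done
    return 0
}
```

Thus "arg_is_safe 'John Doe'" succeeds while "arg_is_safe -u" and
"arg_is_safe '</fullpath/somefile'" both fail, which is why such
options have to go inside a wrapper script.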

It would be nice to have the client query the user for a word or
phrase and have this passed to the program as an argument.  Unfortunately
the gopher protocol designers chose not to allow this.  There is a 
workaround however which involves the use of Interactive Forms (see below).
A simple form with a single "Field" entry will give the functionality
of running a script and passing it an argument entered by the user.

INTERACTIVE FORMS AND A DATA BASE INTERFACE

As of version 0.6 gn supports an interactive form facility designed
to be used, for example, as a front end to a data base query program.
Here's how it works.

You must provide a perl script, shell script, C program or whatever
which takes some number of arguments and based on those arguments
performs any task you wish, including writing some text to standard
output.  The arguments might be the fields for a data base query and
the output the results of the query.  Or the arguments might be a
name, address, e-mail address, and phone number, date and time for
users who want to make a dinner reservation at your restaurant.
For security reasons none of the characters

	; ` ' | \ * ? - ~ > < ^ ( ) [ ] { } $ / or \ 

are allowed in the arguments to programs. 

For the following example we assume that such a program exists and
it is in the file "rootdir/dir/script".  In the menu file for the
directory rootdir/dir you should place whatever entries you want
together with an entry like the following

##################################
Form=Fill out this form
Path=1form/dir/script

	Name=Select an item to enter information
	Path=0/dir/instructions

	Field=Your name:

	Field=Your email address:

	Choose=Your favorite color:
	Choices=Red	Green	Blue	Black	White

	Done=Done: submit this form

Endform=
##################################


When used the client will be presented with a menu item
"Fill out this form".  When it is selected the client gets a menu like

	1. Select an item to enter information
	2. Your name:
	3. Your email address:
	4. Your favorite color:
	5. Done: submit this form

If the user picks 1. he gets the contents of the text file
/dir/instructions as a text document.  If he picks 2. he is prompted for a
string and enters his name.  The next menu looks the same as above but
fills in the entered item, i.e. the new line 2 looks like

	2. Your name: [John Doe]

If 4. is picked the user is presented with a menu of items (all the
choices listed above in the menu file).  

The user can change any entered values, but when he is satisfied
he selects 5. and the shell script (or perl script or C program)
in the file /dir/script is executed with arguments the name,
email address, and color entered by the user in that order.
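
A script for the form above might be built around a function like the
following.  This is a minimal sketch: the name handle_form and the
output layout are invented, and it assumes only what is stated above,
namely that the name, email address and color arrive as the first,
second and third arguments.

```shell
# handle_form NAME EMAIL COLOR
# Hypothetical form handler: echo the submitted values back as text.
handle_form() {
    if [ "$#" -ne 3 ]; then
        printf 'Expected 3 fields, got %s\n' "$#"
        return 1
    fi
    printf 'Name:  %s\nEmail: %s\nColor: %s\n' "$1" "$2" "$3"
}
```

In the real file /dir/script you would start with #!/bin/sh, define
the function, and finish with the line: handle_form "$@".  Whatever
the function writes to standard output is what the client receives.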

You can try this yourself by gophering to hopf.math.nwu.edu 70
and selecting the item "Experimental interactive form".  It
will send you an email message.  There is also a calendar server
example you can try.  The scripts which the server uses can be
viewed there.

Some comments on the format of the menu entry are in order:

The first line is

	Form=Fill out this form

It says we are starting a form and whatever is after the "=" sign
will appear on the menu, just like the Name= field for an 
ordinary entry.  The second line

	Path=1form/dir/script

gives the Path field for this item.  It must start with "1form" and
then give the path relative to the gn root of your program which
will do the search, send the e-mail, or whatever.

At this point one can put any number of entries in the form menu.
I like to indent for readability but this is optional (all leading
whitespace is discarded by mkcache).  The menu entries are of four
types and can occur in any order.  Here is a description of each type.

I.  A normal link entry given with Name, Path, Type, Host and Port
    fields.  If, as above, the Host and Port fields are omitted the
    default values set when you compiled gn are used.  As with ordinary
    link entries, if the Type field is omitted the first character of the
    Path entry is used.

II. The second kind of entry is a Field to be filled in by
    the user.  For example, the menu above contains the line

	Field=Your name:

    This creates the menu item "2. Your name:" as described above.
    A default value for this Field can be specified by making the
    entry 

	Field=Your name:<TAB>John Doe

    The menu item will then look like "2. Your name: [John Doe]".  You
    might also want to put instructions like "optional" or "required"
    as default values for fields.  Notice that the use of the <TAB>
    character (that's just the tab, don't put in the angle brackets)
    as a separator means you cannot put tabs in your Field menu entries!

III. The third possibility for a form entry is a Choose item followed
    by a Choices item.  Thus the two lines 

	Choose=Your favorite color:
	Choices=Red	Green	Blue	Black	White

    in the menu above cause gn to produce an item on the menu labeled
    "Your favorite color:" which when selected offers the client user
    a list of choices to pick from to set the value of this field.  As
    with the Field entry a default value can be set by making this entry

	Choose=Your favorite color:<TAB>none

    for example.  The default value need not be one of the choices.  The list
    of choices in the Choices line are separated by tabs, i.e. this line
    really is

	Choices=Red<TAB>Green<TAB>Blue<TAB>Black<TAB>White

    All the choices for each Choose item must be on one line in the
    menu file.  This Choices line must be the first (non-blank) line after
    the Choose line.

IV. The final form menu entry is a Done entry.  The one above creates the
    menu line

	Done: submit this form

    When selected the script is executed with the values of the arguments
    as they are at that time.  Up until that time it is possible for the
    user to change her mind and re-enter any field or choice.  It is not
    currently possible to have two done entries which execute different
    scripts -- maybe in the future :)

Finally you signal the end of a form with the line

	Endform=

The "=" is necessary, but it shouldn't be.

By default a Form returns text when the "Done" entry is selected.
However, it is possible to return any of the types 0 (text), 1
(menus), 9 (binaries), s (sound), or I (image).  To do this set the
field "Returntype" to the desired single character value for the
type you want.  Thus a form which returns an image could have a menu
entry like

Form=Select an Image
Path=1form/dir/script
Returntype=I

	Choose=Pick your favorite:
	Choices=img1.gif	img2.gif	img3.gif

	Done=Send selection

Endform=

And the script could be 

#!/bin/sh
cat /complete/image/dir/$1

The first and only argument to this script would be the name
of the image file chosen by the user and the script would just
cat it.  The Returntype=I line says it is an image.


DECOUPLING THE GN AND FILESYSTEM HIERARCHIES

It is possible to do this, but generally recommended only if you have
a reason to do so and are fairly familiar with how gn works and the
syntax of .cache files.  Information on how to do it can be found in
the file docs/decoupling.


THANKS

I would like to thank the many people who have aided in the creation
of the gn package, either through writing code or finding and fixing
bugs.  They include Earle Ake, Henry Cejtin, Paul DuBois, Jishnu
Mukerji, Marko Nordberg, Stephen Trier, Ed Vielmetti, and Rico Tudor.


John Franks 	Dept of Math. Northwestern University
		john@math.nwu.edu









