                   INSTALLATION AND MAINTENANCE OF GN
                         Version 1.0

INSTALLING THE SOFTWARE

1.  Get the file gn.0.X.tar.Z and uncompress it and untar it to make
the GN source directory hierarchy.  The top level contains 4
directories: GN, mkcache, uncache, and docs.

2.  Edit the file "config.h" in the top level directory.  You should
enter the host name of the computer on which you plan to run GN and
the complete path name of your gopher data directory.  If you want to
run at a port other than 70 also edit the DEFAULTPORT entry.  Other
customizations are possible but should not be needed.  If you are
using a SysV version of UNIX you will need to edit the toplevel
Makefile also.

3.  In the directory gn-1.X do a "make" to produce the server "GN" and
the two utilities "mkcache" and "uncache".  The utility mkcache
produces "cache files" for use by the server (it is described below)
and uncache is used to convert from the Minnesota server gopherd to GN.
If you want a C compiler other than cc you will need to edit the
Makefile in each directory.  The binary GN is the server and can be
installed anywhere you choose.  The binaries mkcache and uncache are
utility programs for maintainers and should be installed somewhere in
your path, e.g. /usr/local/bin.

4.  You must set up GN to run under inetd.  There are surely
variations on how this works from system to system so you may need to
look at the man page for inetd.conf(5).  Here's how it works under
many systems, e.g. Suns: Edit the file /etc/services and create the
line

     gn  70/tcp

(or replace 70 by the port you wish to use).  Then edit the file
/etc/inetd.conf and insert the line

gn    stream    tcp nowait    nobody    /full/path/for/gn    gn

After the last GN you can have optional arguments to turn on logging
or use a different data directory (see the man page gn.8).

It is important to run GN as "nobody" (the fifth field in the
inetd.conf line above) or some other user with no access
privileges.  It should *never* be necessary to run GN as root and to
do so would be a serious mistake for maintaining security.  Every
attempt has been made to make GN as secure as possible; however, no
program accessible to remote users on the internet can be assumed
perfectly secure.

After editing the inetd.conf and services files you should find the
process id number of the inetd process and do the command "kill -HUP
process_id#".  This must be done as root.  If you have never done this
before get someone who has to help you.


SETTING UP THE DATA DIRECTORY

1.  In each directory of your data hierarchy create a file called menu
with one item for each file or directory you want GN to publish.
Items in this file have a format like the following:

Name=This will be on the client menu
Path=0/path/relative/to/data_directory
Type=0
Host=yourhost.yourU.edu
Port=70

The Name field must be first.  The Path field consists of a single
character designating a gopher type (see the gopher protocol), followed
by the path name of the file or directory relative to the top level
data directory.  The Type, Host, and Port fields are optional.  If they
are not present, the Type will be taken from the first character of
the Path field and the Host and Port fields will be those specified
in config.h (or on the command line of mkcache).  For more details see
the man page mkcache.1 and the sample menu files in each directory of
the source hierarchy.
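
To make the defaulting rules concrete, here is a minimal sketch in
Python of how a menu file could be read.  It is not the mkcache code.
The names parse_menu, DEFAULT_HOST and DEFAULT_PORT are made up for
this illustration, and the sketch only understands items that begin
with a Name= line.

```python
# Placeholders standing in for the values compiled in from config.h.
DEFAULT_HOST = "yourhost.yourU.edu"
DEFAULT_PORT = 70

def parse_menu(text, host=DEFAULT_HOST, port=DEFAULT_PORT):
    """Return a list of dicts, one per menu item."""
    items = []
    for line in text.splitlines():
        line = line.lstrip()          # keep trailing spaces: they can matter
        if "=" not in line:
            continue                  # blank line or something unrecognized
        key, value = line.split("=", 1)
        if key == "Name":             # a Name field always starts a new item
            items.append({"Name": value})
        elif items:
            items[-1][key] = value
    for item in items:
        if "Path" not in item:
            raise ValueError("item %r has no Path field" % item["Name"])
        # Optional fields fall back to the documented defaults.
        item.setdefault("Type", item["Path"][0])
        item.setdefault("Host", host)
        item["Port"] = int(item.get("Port", port))
    return items
```

An item that gives only Name and Path=0/some/file comes back with
Type "0" and the default Host and Port filled in.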

2.  After the menu files have been created you must run the mkcache
program to produce a .cache file.  This can be done once for each
directory or once in the top data directory with the "-r" option to
make all the .cache files for the hierarchy.  You might want to look
at a .cache file to see what it is like.

LIMITING ACCESS TO YOUR GN HIERARCHY

If you have opted to limit access to your gopher there are two ways to
do this.  For the first you use the "-a" option to GN (in the
inetd.conf file).  This will limit access to the server to those
clients with an IP address or subnet address listed (and not excluded)
in the file .access in the root data directory.  The format of the
.access file is one address per line, each line consisting of an IP
address like 129.111.222.123 or a subnet address like 129.111.222 or
129.111.  In case a subnet address is listed, any client with an IP
address beginning with that subnet address will be allowed access.

You may also list the domain names of the machines using wildcards provided
the machines all have proper PTR domain name records.  To allow access to
all machines under nwu.edu, use the line *.nwu.edu.  Note that this
will not allow access to a machine called nwu.edu if it exists.  One would
need to add in the record nwu.edu to allow access.

You can also exclude IP addresses or domain names by prefixing them
with an '!', so if .access contained only the lines

!boombox.micro.umn.edu
*

Access would be permitted to every machine *except* boombox.  Likewise

!129.111
*

would allow access to everyone except those on subnet 129.111.  It
is important to note that in determining access GN reads the .access
file only until it finds a match (with or without '!') and then 
quits.  So if .access consisted of the two lines

*
!129.111

then access would be granted to everyone since the * comes first and
it matches everyone.
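
The first-match rule can be sketched in a few lines of Python.  This
is an illustration only, not the GN source.  It makes three
assumptions: subnet addresses are compared one dotted component at a
time, wildcards behave like shell patterns, and a client that matches
no line is refused.  The function names are invented for the example.

```python
from fnmatch import fnmatch

def entry_matches(entry, ip, hostname):
    """True if one .access entry (with any '!' removed) matches."""
    if entry and entry[0].isdigit():
        # IP or subnet address: compare whole dotted components.
        want = entry.rstrip(".").split(".")
        return ip.split(".")[:len(want)] == want
    # Otherwise treat the entry as a domain name, possibly with wildcards.
    return fnmatch(hostname.lower(), entry.lower()) or fnmatch(ip, entry)

def access_allowed(access_lines, ip, hostname=""):
    """Scan the entries in order; the first match decides."""
    for raw in access_lines:
        entry = raw.strip()
        if not entry:
            continue
        excluded = entry.startswith("!")
        if excluded:
            entry = entry[1:]
        if entry_matches(entry, ip, hostname):
            return not excluded
    return False   # assumption: no matching line means no access
```

With the lines "*" and "!129.111" in that order, access_allowed
returns True for 129.111.222.123 because the scan stops at the "*".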

The "-A" option is similar to the -a option except access is allowed
on a per directory basis.  Each client request is processed by first
looking for a .access file in the directory containing the requested
item and comparing the IP address of the client with the addresses in
this file.  If no .access file exists in this directory, one is sought
in the parent directory and then if necessary the parent of the
parent, etc. up to the root data directory.  If no .access file is
found by this process access is allowed to all clients provided the
item requested exists in a .cache file.
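
The upward search for a .access file might look like the following
Python sketch.  The function name find_access_file is hypothetical
and the real server may differ in detail.  The sketch assumes the
requested directory lies inside the root data directory.

```python
import os

def find_access_file(root, item_dir):
    """Return the nearest .access file at or above item_dir, or None."""
    root = os.path.abspath(root)
    current = os.path.abspath(item_dir)
    # Refuse to look outside the data directory.
    if os.path.commonpath([root, current]) != root:
        raise ValueError("item is outside the data directory")
    while True:
        candidate = os.path.join(current, ".access")
        if os.path.isfile(candidate):
            return candidate
        if current == root:
            return None        # nothing found: access is not restricted
        current = os.path.dirname(current)
```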

It is possible with GN to attain even finer access discrimination than
on a per directory basis, though it is somewhat cumbersome to do so.
Nevertheless if you have a need to make certain menu items visible
(and accessible) to a select group of hosts, this is possible.
Details on how to do it are in the last section of the document on
decoupling menus from the file system (/docs/decoupling).

SEARCHING A NUMBER OF FILES

Gn has the builtin capability to offer menu items allowing the
user to create a small menu consisting of those items on a 
large menu whose files contain a particular search term.  Thus
in the example GN server on hopf.math.nwu.edu there is a top 
level menu item which is called "Full text search of GN documentation."
When a user selects this item the client will prompt for a search 
term and then return a menu of all the available files in the
documentation directory which contain the search term.  The searches
are case insensitive.  In fact they use grep like regular expressions
(see the UNIX man page for grep(1)).

This feature is enabled by putting two entries in appropriate
menu files.  The first looks like this:

Name=Documentation for the GN server
Path=1s/docs

This is just a minor modification of the normal entry for the
directory "docs" which contains the documentation files we want to
make available by GN.  The only difference is that the path is
"1s/docs" instead of "1/docs".  The "s" indicates that we are making the
directory searchable, i.e. giving permission to run grep like searches
of all the files which are listed in the menu file for the docs
directory. Of course, docs must be a directory in the GN hierarchy and
it must contain a .cache file allowing files to be served by GN
in the usual way.  

As yet, however, we have no menu item for the search.  This is
achieved with the second menu item:

Name=Full text search of GN documentation
Path=7g/docs

The "7" indicates that this is a search type and the "g" indicates
that this is a "grep" type search.  In this example docs is a
directory in the GN root directory, but it could be lower in which
case we would have the line Path=7g/foo/bar/docs.  Both of these menu
items might, as in this example, go in the same menu, but this is not
necessary.  You might, for example, want to have the "Full text search
of GN documentation" item occur in the menu which lists the
documentation files, rather than the menu which lists the
documentation directory.  Or you might want to have it occur in both
menus, which is fine.  This menu item can occur in any menu of your GN
hierarchy.  But, of course, the "Documentation for the GN server" menu
item corresponds to the physical directory "docs", so it must be in
the menu corresponding to the directory containing "docs".  In this
example that is the GN root directory.

These searches are fairly efficient because GN contains its own
regular expression matching routines rather than externally calling
the grep function.  Regular expressions which the user can enter as
search terms are essentially the same as those allowed by grep (see
the man page grep(1)) with the addition of the special character ~
which matches word boundaries.
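
For readers who want to experiment with search terms, here is a rough
Python approximation of such a search.  It is only a sketch: it uses
Python's re module in place of GN's own matching routines, so the
syntax agrees with GN only approximately.  The names to_python_regex
and search_files are invented for the example.

```python
import re

def to_python_regex(term):
    """Approximate a GN search term as a Python pattern."""
    out = []
    i = 0
    while i < len(term):
        ch = term[i]
        if ch == "\\" and i + 1 < len(term):
            out.append(re.escape(term[i + 1]))   # escaped: match literally
            i += 2
            continue
        out.append(r"\b" if ch == "~" else ch)   # ~ marks a word boundary
        i += 1
    return "".join(out)

def search_files(files, term):
    """files maps a menu title to file text; return the matching titles."""
    pattern = re.compile(to_python_regex(term.lower()))
    hits = []
    for title, text in files.items():
        # Lower-casing both sides makes the search case insensitive.
        if any(pattern.search(line.lower()) for line in text.splitlines()):
            hits.append(title)
    return hits
```

Here the term ~cache~ matches a line containing the word "cache" but
not one that contains only "mkcache".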


SERVING AND SEARCHING STRUCTURED FILES

Gn has the capability to serve a single large file consisting of a
number of sections so that each section appears to the client as a
separate file with its own title.  This is a generalization of the
"mailfile" feature available on the Minnesota server.  To use this
feature requires two additional fields in the menu file, called
"Separator" and "Section".  These are regular expressions as in grep
which are used to match lines which will be used as separators of the
parts of the large file and lines which will be used for menu section
items.  Thus for a mail file one would use the lines

Separator=^From<space> 
Section=^Subject:

The first line, which should have a literal space at the end not the
word <space>, means that sections (in this case mail messages) are
separated by lines starting with From and a space.  The ^ matches the
start of a line and the space is necessary because some lines begin
with From and a colon.

Here's another example.  This document consists of sections with
section heading lines written all in caps.  Since I want to make a
menu with each section a separate item I use the following entry in my
menu file

Name=Installation/Maintenance Guide Sections
Path=1m/docs/Install
Separator=^[A-Z][A-Z" ]*$
Section=^
Type=1

The Separator is ^[A-Z][A-Z" ]*$.  This matches any line starting with
a letter from A to Z (i.e. caps) followed by any number of characters
which are between A and Z or equal to space or the quotation mark, and
then the end of the line.  This describes the section headings of this
document.  I need the initial [A-Z] so blank lines won't be matched.

When the separator field is matched a new section is started which
will have its own menu item.  The title of the menu item is determined
by the Section regular expression.  In fact the section is searched,
starting with the separator line, for a match for this second regular
expression.  When a match is found, everything on the line *after* the
matching pattern is taken as the title.  Thus for mail everything
after the word "Subject:" becomes the title.  In the example of this
document, the expression ^ matches the beginning of the separator line
so that whole line becomes the menu title.  To see this in use gopher
to hopf.math.nwu.edu and look in the documentation directory for this
document.
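
The way Separator and Section interact can be illustrated with a
short Python sketch.  It is not the mkcache implementation.  The
function name split_sections is hypothetical, Python's re module
stands in for GN's matcher, and the sketch assumes that text before
the first separator and sections with no title line are dropped.

```python
import re

def split_sections(data, separator, section):
    """Return (title, start, end) byte ranges for the sections of data."""
    sep_re = re.compile(separator.encode())
    sec_re = re.compile(section.encode())
    sections = []              # each entry: [title or None, start, end]
    offset = 0
    for line in data.splitlines(keepends=True):
        text = line.rstrip(b"\r\n")
        if sep_re.search(text):
            sections.append([None, offset, offset])   # new section starts
        if sections:
            current = sections[-1]
            if current[0] is None:
                m = sec_re.search(text)
                if m:
                    # The title is everything after the matched pattern.
                    current[0] = text[m.end():].strip().decode("latin-1")
            current[2] = offset + len(line)
        offset += len(line)
    return [tuple(s) for s in sections if s[0] is not None]
```

Called on a mail file with separator "^From " and section
"^Subject:", this yields one (title, start, end) triple per message,
with the subject as the title.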

Another example of how this might be used is for a directory.  If a
file consists of entries like

Name: Franks, John
Address:  Department of Mathematics, Northwestern University
Phone: 708-491-5548
etc., etc.

then Separator=^Name: and Section=^Name: would give a menu with
an item    

	1.  Franks, John

which when selected would give the multiline record with my
name, address, etc.  In this example it would be even better to use
the search feature for structured files.

In the example above the "1m" at the beginning of the Path field
indicates that this is a structured file.  It is Type 1 because to
clients it will look like a directory.  If we add an additional menu
entry like

Name=Search Installation Guide
Path=7m/docs/Install
Type=7

which is Type 7 and has a path beginning with "7m" the client will
prompt the user for a search term which can be a regular expression.
The GN server will return a menu with only those sections containing a
match for the regular expression.  Thus for the directory example if
the user searched for Northwestern she would get only those directory
entries containing that word.

Here's how this works.  When mkcache is run with a menu file
containing the "1m" entry above it produces the regular .cache file
but also produces another file (in this case called Install..cache)
which is a cache file for the sections of the file Install specified
in this menu item.  The lines in this cache file contain the menu
titles obtained from the file by matching regular expressions and
contain a selector which designates a range of bytes corresponding to
a section of the document.  Gn knows how to serve a single section of
document when given one of these byte range selectors.

Since the file Install..cache was made when the item with path
1m/docs/Install was encountered it is not necessary to remake it when
the item with path 7m/docs/Install is reached.  We signal this by
omitting the Separator and Section fields from this menu item.  If
these fields were in both items the cache file Install..cache would be
made twice and the one done last would take effect if there was a
difference in the regular expressions given.  Of course if the regular
expressions are omitted from both then the cache file will not be made
and attempts to access either item will result in an error
(cryptically reported as "Access denied").  For this reason, whenever an
item of type 1m or 7m with no regular expressions is encountered by
mkcache, a warning message is printed.

It is easy to effectively use two different separator regular
expressions or two different section expressions for the same file.
You might for example want to have a mail file with menu by subject
and another menu by author.  To do this you must make a UNIX link (see
the man page ln(1)) to give the mail file an additional name and use
the two different names in the menu file Path entries.  This is
necessary so the cache files created will have different names.

The two regular expressions for the separator and the menu titles are
not put into the selector string.  Thus they are not available to the
client to change.  This has a slightly unfortunate side effect when
uncache is used to produce a menu file.  Since there is no information
about these regular expressions in the .cache file there is no way for
the uncache program to put it in the menu file it makes and they must
be added by hand.  This is the only way that uncache fails to be a 
complete inverse for mkcache.

Note: All regular expressions given as search terms and all lines in
which a match is sought are converted to lower case before the
matching is attempted.  This has the (desirable) effect of making all
searches case insensitive.  By contrast the regular expressions used
to define separators and menu lines are case sensitive.  Regular
expressions which can be used for the separator and section strings
are essentially the same as those allowed by grep with the addition of
the special character ~ which matches word boundaries.  To give
special characters (including ^ ~ [ ] ( ) * . \ and $) their literal
meaning, that is to match the character itself, they must be escaped
with a \.

USING HTML -- GN AS A WWW SERVER

Starting with release 1.0, the GN server became a multi-protocol
server. It will accept either gopher requests or HTTP requests and
respond appropriately.  To the maintainer this takes place
automatically with no action necessary on his or her part.

For those not familiar with it HTTP stands for Hyper Text Transfer
Protocol and it is the underlying protocol used by WWW (World Wide
Web) browsers such as the Mosaic family.  Gopher and HTTP each have
some advantages not shared by the other.  Making GN a multi-protocol
server is an attempt to let us have our cake and eat it too.  

While it is correct that as soon as you start up GN you are serving
documents via HTTP, in order to take advantage of some of the 
really nice features, like images in menus, you do have to put
some information in your menu file.

HTTP is a protocol designed for use with HTML (Hyper Text Markup
Language) and the usual HTTP server consists of a collection of
documents written in this markup language with internal hypertext
links between them rather than any menus, as such.  The GN server
works with HTTP clients by translating menus into HTML and serving
them in accordance with the protocol that HTTP browsers understand.

In addition you can, of course, create your own HTML documents and
make them available on your server.  You can learn about the format
of an HTML document from an online beginner's guide by Marc Andreessen
at  http://www.ncsa.uiuc.edu/demoweb/html-primer.html  (This is a URL
or Universal Resource Locator which says the document is available
via HTTP at www.ncsa.uiuc.edu in the file demoweb/html-primer.html).
This document is an excellent introduction to HTML documents and 
gives references for further reading. 

Once you have created a document you can serve it with GN by giving
it a file name ending in .html and making it available in the usual
way as a text document.  For example,

	Name=A Sample of Hypertext Markup
	Path=0/dir/dir2/sample.html

If this document is viewed with an HTTP browser it will be displayed
with the capabilities of that browser (i.e. nicely formatted in the
ways prescribed by your HTML document).  If it is viewed by a gopher
client the HTML source, i.e. the unformatted document with markup
tags, will be displayed.  If you want to create two versions of a
document -- one in plain text and the other in HTML, this is easily
handled by GN.  Simply give the plain text file a name, say "sample,"
and use the name "sample.html" for the HTML version.  Then use the
plain text name, but with a Path starting with "0h" (that's zero h).
For example a menu entry like

	Name=A Sample of Plain/Hyper text
	Path=0h/dir/dir2/sample

will provide the file "sample" to gopher clients and "sample.html" to
HTTP clients.  One note of warning: don't name the plain file
"something.txt", because the .txt suffix indicates to HTTP clients
that this is *not* an HTML file and it will do the wrong thing (the
client, gopher or HTTP, only sees the plaintext file name.)
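
The choice between the two files can be pictured with this small
Python sketch.  The function file_for_client and its protocol
argument are made up for illustration.  How GN actually tells the
two kinds of client apart is not shown.

```python
def file_for_client(path, protocol):
    """Map a menu Path to the file served; protocol is "gopher" or "http"."""
    if path.startswith("0h"):
        name = path[2:]
        # HTTP clients get the .html twin, gopher clients the plain file.
        return name + ".html" if protocol == "http" else name
    return path[1:]    # ordinary item: drop the one-character type
```

For Path=0h/dir/dir2/sample an HTTP client is given
/dir/dir2/sample.html and a gopher client /dir/dir2/sample.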

Adding HTML text to menus works slightly differently.  You simply include
the source in the menu file beginning with the keyword "httpText="
on a line by itself and ending with the keyword "endText=" on a line
by itself.  Here is an example from the main menu of the GN server
at hopf.math.nwu.edu.  It illustrates how to put graphic images into
a menu.

	httpText=
	<title>The GN Server</title>
	<img src="http://hopf.math.nwu.edu:70/I/image/fract2.gif">
	<p>
	This is the home of the GN Gopher/HTTP server.  It contains
	documentation on GN, the source, and several examples of how GN
	can be used.  To get the source distribution select the compressed
	tar file listed below.
	<p>
	endText=

	Name=Announcement of GN version 1.0
	Path=0/announce-1.0

	etc.

After the keyword httpText=, the first line creates a title for the
document.  All HTML "tags" which do the markup are contained in angle
brackets <>.  The line starting <img src=...  says to insert the
graphic image on hopf at port 70 with Path=I/image/fract2.gif at this
point in the document.  The tag <p> indicates a paragraph break.  See
the document mentioned above for more details on HTML.  Any HTML text
can be inserted in this way in a menu.  There can be multiple
insertions and they can be anywhere in the text.  

If you use the keyword Text= in place of httpText= then GN will serve
the text to HTTP clients exactly as with httpText=, but will also put
the text (with all HTML tags deleted) in the gopher menus using the
'i' or comment type supported by many clients.  For the gopher clients
no text formatting is done.  The lines will have the same length they
do in your menu file.
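
As a rough picture of what gopher clients receive from a Text= block,
the following Python sketch deletes the tags and keeps the line
breaks.  It is an approximation, not GN's code.  It assumes tags do
not span lines and that lines left empty after stripping are skipped.
The name gopher_comment_lines is invented here.

```python
import re

TAG = re.compile(r"<[^>]*>")    # anything between angle brackets

def gopher_comment_lines(html_text):
    """Strip HTML tags, keeping the original line breaks."""
    lines = []
    for line in html_text.splitlines():
        plain = TAG.sub("", line).strip()
        if plain:               # assumption: skip lines left empty
            lines.append(plain)
    return lines
```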

One important note: normally HTML allows hypertext references to be
relative to the current document.  Thus on most servers one could have
said simply <img src="/image/fract2.gif"> rather than the complete URL
as above.  This is possible with GN only if both the document
containing the reference and the document referred to are of the same
"gopher type", e.g. 0 for text, 1 for directories, etc.  Thus to refer
to the image in the file fract2.gif which is type 'I', the complete
URL starting with http: is necessary.  Equally good would be <img
src="gopher://hopf.math.nwu.edu/I/image/fract2.gif">.  Here the ":70"
specifying the port is not necessary (but is ok) since the default
port for gopher is 70.

Remote links are slightly problematical for GN.  If a link to a remote
server is made in the usual way by specifying Name, Path, Type, Host
and Port then the GN server assumes by default that this is a link to
a server capable of dealing only with the gopher protocol and will
present it as such.  The determination of whether or not a link is
remote is done at the time that mkcache is run and a link is
considered remote unless the Host and Port fields in the menu are
omitted or agree exactly with the default values as specified on the
mkcache command line or at compile time in the file config.h.  For
this reason it is important that whenever you run mkcache you specify
the host on the command line, unless you have placed that name in the
config.h file as HOSTNAME.
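
The remote test can be sketched in Python as follows.  This reads the
rule as an exact comparison of the Host and Port fields with the
defaults, which is an interpretation of the paragraph above rather
than a transcription of mkcache.  The function name is_remote is
hypothetical.

```python
def is_remote(item, default_host, default_port=70):
    """item is a dict of menu fields; Host and Port may be absent."""
    host = item.get("Host", default_host)
    port = int(item.get("Port", default_port))
    # The comparison is exact: an alias of the same machine counts as remote.
    return host != default_host or port != default_port
```

Note that under this reading an item with Host=gopher.math.nwu.edu is
remote when the default host is hopf.math.nwu.edu, even if both names
refer to the same machine.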

Of course, you may know that a remote link is running the GN server
and therefore capable of handling HTTP requests as well as gopher
requests.  In this case, to allow HTTP clients to get the best link,
simply use "GNLink=" instead of "Name=" in your menu file.  For
example, a link to the Northwestern University Math server would look
like:

	GNLink=Northwestern University Mathematics Department
	Path=1/
	Type=1
	Host=gopher.math.nwu.edu
	Port=70

Of course, it is also possible to handle links to servers that can only 
handle HTTP requests.  This is done by placing them in the menu as
HTML documents, bracketed by the httpText= and endText= keywords.

Finally, it is possible to use GN as a server to serve only HTTP clients
and have no menus.  Well, there would have to be one menu, the root
menu, but it could contain nothing but HTML surrounded by the keywords
httpText= and endText=.  This document could have hypertext links to
other HTML documents which in turn have hypertext links, etc.  It 
will still be necessary to create "dummy" menu files in each directory
with the Path of each of the HTML files and to run mkcache to create
.cache files.  This is for security reasons.  


SETTING UP A "SEARCH ALL MENUS" ITEM

A builtin feature of GN is the ability to have a menu item which
when selected prompts the user for a search term and returns a 
"virtual menu" of all menu items which contain that term.  In fact
such an item can occur at any level and return either all matches
from all menus on that server or all matches at or below some 
chosen level.  

Here's how to set it up.  Create an entry like this in the menu file
where you want the search item to occur.

Name=Search all menus on this server
Type=7
Path=7c/.cache
Host=your.gn.host.edu
Port=70

(If you want the search to cover only those items in directory
/foo/bar, then the path line should be Path=7c/foo/bar/.cache.)  Now
run "mkcache" to translate the new menu file to a .cache file and you
are done.  The Type, Host and Port lines are optional -- if they are
omitted mkcache will use the default value or the value supplied on
the command line.  When you change any of the menus in your server and
remake the .cache files GN will automatically reflect this in menu
searches.  There is a maximum depth which GN will search into the GN
hierarchy.  Its value can be changed by editing the config.h file and
re-compiling.
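
The idea of a depth-limited search through the hierarchy can be shown
with a Python sketch.  Several things here are assumptions made for
the illustration: the sketch reads the "menu" source files rather
than the .cache files (whose format is not described in this
document), it does a plain substring match on the Name lines, and
MAX_DEPTH is a stand-in value, not the one in config.h.

```python
import os

MAX_DEPTH = 5    # placeholder; the real limit is set in config.h

def search_menus(top, term, max_depth=MAX_DEPTH):
    """Return (directory, title) pairs whose title contains term."""
    term = term.lower()
    top = os.path.abspath(top)
    hits = []
    for dirpath, dirnames, filenames in os.walk(top):
        rel = os.path.relpath(dirpath, top)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth >= max_depth:
            dirnames[:] = []     # do not descend any further
        if "menu" not in filenames:
            continue
        with open(os.path.join(dirpath, "menu")) as f:
            for line in f:
                if line.startswith("Name=") and term in line[5:].lower():
                    hits.append((rel, line[5:].strip()))
    return hits
```

Passing a subdirectory as top corresponds to restricting the search
to the items at or below that level.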


COMPRESSED FILES

If you wish you can keep files on your server in a compressed format
and uncompress them on the fly as a client requests them.  You need
a program to compress the files and a companion program to decompress
them.  I recommend "gzip" and "zcat" from the GNU project.  They are
considerably more efficient than the UNIX standard "compress."

When configuring "GN" for compilation, be sure to set the 
#define DECOMPRESS in the file config.h to the path name of
the program which will decompress the files you have compressed.
The default value for this is "/usr/local/bin/zcat".  Another
possibility would be "/usr/ucb/uncompress -c".

If the file you want to make available is "rootdir/dir/bigfile,"
first you must compress it with gzip or compress, which will
replace it with the file bigfile.gz or bigfile.Z.  You then make
a menu entry like the following (assuming bigfile was a text file
and you have produced bigfile.gz).

Name=All the text in Bigfile
Path=0Z/dir/bigfile.gz
Type=0

The key here is the 'Z' which is the second character of the Path
field.  It indicates that the file is compressed.  The Path would
start with "0Z" (that's zero Z) for any compressed text file.
It doesn't matter how the file was compressed or whether its name
is bigfile.Z or bigfile.gz or something else.  You have already
told GN how to decompress the file by specifying the DECOMPRESS
program in config.h.

Of course, if bigfile is a binary the Path field would be
9Z/dir/bigfile.gz and the Type would be 9.  For a sound file
Path=sZ/dir/bigfile.gz, Type=s, etc.  Files of types 0, 4, 5, 9, s,
and I can be compressed.  Structured files (type 1m) cannot be
compressed.

You might want to let users download the file in compressed format.
You could give them the option by having the menu item as above
with Path=0Z/dir/bigfile.gz and also having a menu item

Name=Bigfile in compressed format
Path=9/dir/bigfile.gz
Type=9

Note that the Type=9 since compressed files are binaries (even though
bigfile is text) and there is no 'Z' as the second character of the
Path, because now we do not want to decompress.  Also note that two
versions of bigfile show up on your menu (text and compressed binary)
but there is only one file bigfile.gz on your disk.
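
The role of the second character of the Path can be summarized in a
short Python sketch.  It is an illustration only.  The names
split_path, describe and COMPRESSIBLE are invented, and the sketch
does not handle the "exec" paths described in the next section.

```python
# Types that this document says can be served compressed.
COMPRESSIBLE = {"0", "4", "5", "9", "s", "I"}

def split_path(path):
    """Split a menu Path into (type, modifier, file path)."""
    slash = path.find("/")
    if slash < 1:
        raise ValueError("expected a type character followed by a path")
    return path[0], path[1:slash], path[slash:]

def describe(path):
    """Say in words what a server would do with this Path."""
    gtype, modifier, name = split_path(path)
    if modifier == "Z":
        if gtype not in COMPRESSIBLE:
            raise ValueError("type %s cannot be served compressed" % gtype)
        return "decompress %s, serve as type %s" % (name, gtype)
    return "serve %s as type %s" % (name, gtype)
```

Thus 0Z/dir/bigfile.gz and 9/dir/bigfile.gz name the same file on
disk but are handled differently.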


SERVING THE OUTPUT OF A PROGRAM OR SCRIPT

Sometimes it is convenient to have the server return the output
of a program or script.  This capability is built into GN.  Assuming
you have a program in a file "prog" which returns some text you
can make its output be an item on your server's menu with a menu
entry like

Name=Program output
Type=0
Path=exec0::/dir/prog
Host=your.gn.host.edu
Port=70

The phrase "exec" says to run the program "prog" which must be
executable by the GN userid (probably "nobody").  The "0" after the
exec says this is a text file.  exec can return most types, including
0 (text), 1 (menus), 9 (binaries), s (sound), I (image).  To specify a
type the single character type is appended to the word exec in the
path.  Thus if you wanted to return the output of a program which is
in the format of an image file you would have an entry like

Name=Image program output
Type=I
Path=execI::/dir/prog
Host=your.gn.host.edu
Port=70


The pair of colons in the path can contain arguments to the program.
The arguments are primarily for the use of "Forms" (see below) and if
you want to run a program which takes arguments it is better to wrap it
in a shell script.  For security reasons none of the characters

	; ` ' | \ * ? - ~ > < ^ ( ) [ ] { } $ / or \ 

are allowed in the arguments to programs.  Thus, if you want to run a
command like "prog -u <somefile", you must create a script like

	#!/bin/sh
	exec /fullpath/prog -u </fullpath/somefile

and make this script be what GN executes.
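
The check on arguments can be sketched in Python.  This is not the GN
source.  The layout assumed here, "exec" followed by the type
character, a colon, the arguments, a second colon and the program
path, is inferred from the examples above.  The names FORBIDDEN and
parse_exec_path are made up for the illustration.

```python
# The characters listed above as not allowed in arguments.
FORBIDDEN = set(";`'|\\*?-~><^()[]{}$/")

def parse_exec_path(path):
    """Split "exec<type>:<args>:<program>" into its three parts."""
    if not path.startswith("exec") or len(path) < 5:
        raise ValueError("not an exec path")
    gtype = path[4]
    if path[5:6] != ":":
        raise ValueError("missing ':' after the type character")
    args, sep, program = path[6:].partition(":")
    if not sep:
        raise ValueError("missing second ':'")
    bad = sorted(FORBIDDEN.intersection(args))
    if bad:
        raise ValueError("forbidden characters in arguments: " + " ".join(bad))
    return gtype, args, program
```

An attempt to pass "-u <somefile" as arguments is rejected by this
check, which is why the wrapper script is needed.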

It would be nice to have the client query the user for a word or
phrase and have this passed to the program as an argument.  Unfortunately
the gopher protocol designers chose not to allow this.  There is a 
workaround however which involves the use of Interactive Forms (see below).
A simple form with a single "Field" entry will give the functionality
of running a script and passing it an argument entered by the user.


DECOUPLING THE GN AND FILESYSTEM HIERARCHIES

It is possible to do this, but generally recommended only if you have
a reason to do so and are fairly familiar with how GN works and the
syntax of .cache files.  Information on how to do it can be found in
the file docs/decoupling.


THANKS

I would like to thank the many people who have aided in the creation
of the GN package, either through writing code or finding and fixing
bugs.  They include Earle Ake, Henry Cejtin, Paul DuBois, Jishnu
Mukerji, Marko Nordberg, Stephen Trier, Ed Vielmetti, and Rico Tudor.


John Franks 	Dept of Math. Northwestern University
		john@math.nwu.edu









