
                 USENET Sources Archiver             

                 @(#)README	1.1 6/1/89

             Copyright (c) 1989, by Kent Landfield.

   Permission is hereby granted to copy, distribute or otherwise 
   use any part of this package as long as you do not try to make 
   money from it or pretend that you wrote it.  The copyright 
   notice must be maintained in any copy made.
  
   If you make modifications to this software that you feel 
   increase its usefulness for the rest of the community, please 
   email the changes, enhancements, and bug fixes, as well as any 
   and all ideas, to me. This software is going to be maintained 
   and enhanced as deemed necessary by the community.
  		
			-Kent+
  		   uunet!ssbell!kent
  
------------------------------------------------------------------
                       DISCLAIMER
------------------------------------------------------------------
Use of this software constitutes acceptance for use in an AS IS 
condition. There are NO warranties with regard to this software.  
In no event shall the author be liable for any damages whatsoever 
arising out of or in connection with the use or performance of this 
software.  Any use of this software is at the user's own risk.
-------------------------------------------------------------------


When made, this package currently contains 3 executables:

	o  rkive    - a USENET newsgroup archiver,
	o  article  - print formatted news article header information, and
        o  ckconfig - an rkive configuration file check program.


This package was initially designed for archiving comp.sources.all newsgroups.
It does, however, support archiving of non-moderated, non-sources newsgroups.


                        -----
                        rkive 
                        -----

rkive reads a configuration file to determine such things as:

	o where the news directory resides,
	o where each newsgroup is to be archived,
	o the type of archiving to be done for each newsgroup,
	o the ownership and modes of the archived members,

as well as additional optional features such as:

        o which users/accounts to mail the archived member information to,
	o the location and format of log files, 
	o the location and format of index files,
	o the compression program to use (if desired). 

It is intended that rkive be run by cron on a daily basis. In this manner,
software is archived and available for retrieval from the archives on the
day it reaches the machine.  It allows for the archives to be managed by
the same or different people (or accounts).  It supports the building
of indexes for later review or to interface to the netlib type of mail
retrieval software. It also supports mailing notifications of the archiving
to a specified list of users or aliases. The indexes and log file formats
are specifiable by the person configuring the rkive configuration file.

-------------------------------------------------------------------
The following defines are possible. Please note that the Directory
Creation defines are specified in Makefile while the rest are specified
in rkive.h.

***************************
rkive.h - General Defines
***************************

-D REDUCE_HEADERS   :   Archived article header reduction code.
	Disk space is saved by removing header lines that have no 
	further use after the article is stored in the archive.
	As currently defined, all headers *except* for From:, Newsgroups:, 
	Subject:, Message-ID: and Date: are removed if this is defined.
	The list of headers to be saved can be added to or reduced by
	modifying the table "hdrs" in news_arc.c.

-D SUBJECT_LINE     :   Specify that the local mailer has a -s option,
        such as /usr/bin/mailx or /usr/ucb/Mail.

*************************************
Makefile - Directory Creation Defines
*************************************

-D HAVE_MKDIR       : use the mkdir() function in the system library. 
	(AT&T 5.2 or earlier systems are probably out of luck..)

-D USE_SYSMKDIR     : have rkive system off /bin/mkdir.
	(not recommended for *real* use...)

If you do not define either, the function makedir() will create the 
directory itself. I suggest that if you do not have mkdir() in your
system libraries, use the builtin if you can. *Please* verify you can
use it *first*.
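
The choice between these defines amounts to a compile-time switch. The
following is only a sketch of how such a switch might look. The name
create_dir() and its body are assumptions made for illustration and are
not the code shipped in this package. The builtin makedir() (which uses
mknod()) is not reproduced; in this sketch the mkdir() branch stands in
for it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * create_dir() is a hypothetical helper, not a function from rkive.
 * Returns 0 on success and -1 on failure.
 */
int create_dir(const char *path, mode_t mode)
{
    assert(path != NULL);
#if defined(USE_SYSMKDIR)
    {
        /* Shell out to /bin/mkdir. Slow, and unsafe if path contains
         * a single quote, which is one reason to avoid this choice. */
        char cmd[1024];
        int n = snprintf(cmd, sizeof cmd, "/bin/mkdir '%s'", path);

        (void)mode;  /* /bin/mkdir applies the process umask instead */
        if (n < 0 || (size_t)n >= sizeof cmd)
            return -1;
        return system(cmd) == 0 ? 0 : -1;
    }
#else
    /* HAVE_MKDIR case: call the library function directly. */
    return mkdir(path, mode);
#endif
}
```

A caller would use it as create_dir("/usenet/archives/volume23", 0755),
where the path shown is made up for the example.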

---------------------------
Archive Member Compression:
---------------------------
If you wish to have your archived articles compressed, you may do so by
specifying the disk path to the compression program as the value for
COMPRESS in the rkive configuration file. It is important that *if* you
use a compression program other than "compress" or "pack", you add
an entry to the compression routine table just above the function
suffix() in news_arc.c. Currently, this program recognizes just ".z" and 
".Z" suffixes.

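A compression routine table of the kind described above could look
roughly like the sketch below. The struct layout and the names
comp_table and find_suffix() are assumptions made for illustration;
check news_arc.c for the real table before editing it. The only facts
relied on are that compress appends ".Z" and pack appends ".z".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: program basename and the suffix it adds. */
struct comp_entry {
    const char *program;
    const char *suffix;
};

static const struct comp_entry comp_table[] = {
    { "compress", ".Z" },
    { "pack",     ".z" },
    /* a site using another compressor would add its entry here */
};

/*
 * Return the suffix for the program named by compress_path (the value
 * of COMPRESS), or NULL if the program is not in the table.
 */
const char *find_suffix(const char *compress_path)
{
    const char *base;
    size_t i;

    assert(compress_path != NULL);
    base = strrchr(compress_path, '/');
    base = (base != NULL) ? base + 1 : compress_path;

    for (i = 0; i < sizeof comp_table / sizeof comp_table[0]; i++) {
        if (strcmp(base, comp_table[i].program) == 0)
            return comp_table[i].suffix;
    }
    return NULL;
}
```

Looking the program up by basename means the same table works whether
COMPRESS is given as /usr/bin/compress or /usr/ucb/compress.
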
----------------
REPOST Handling:
----------------
Warning:
	Repost handling is not a configurable parameter within the 
	rkive configuration file at this time.

ADD_REPOST_SUFFIX define added.
    This define allows the administrator to configure the software to
    add "-repost" (or whatever is defined in REPOST_SUFFIX) to the
    end of all files that are marked as REPOST by the newsgroup moderator.
    The suffix is added prior to compression. This feature should only be 
    configured on systems whose filename length limit is greater than 14
    characters.

MV_ORIGINAL define added.
    This define allows the administrator to configure the software to
    move the original article into an "originals" directory in the 
    problems directory. The inbound reposted article is placed into 
    the archive in the correct position.

If neither define is specified then the inbound article is placed into 
the archive in the correct position only if the initial article is not 
in the archive.  Otherwise the reposted article is placed in the problems 
directory as normal duplicate articles are now.
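
To see why the suffix matters on systems with short filename limits,
here is a small sketch of appending it. The function name
add_repost_suffix() and the default value given to REPOST_SUFFIX here
are assumptions for illustration, not the package's own code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifndef REPOST_SUFFIX
#define REPOST_SUFFIX "-repost"   /* assumed default for this sketch */
#endif

/*
 * Append REPOST_SUFFIX to name in place. size is the capacity of the
 * buffer holding name. Returns 0 on success, or -1 (leaving name
 * untouched) if the result would not fit.
 */
int add_repost_suffix(char *name, size_t size)
{
    size_t len;

    assert(name != NULL);
    len = strlen(name);
    if (len + strlen(REPOST_SUFFIX) + 1 > size)
        return -1;
    strcpy(name + len, REPOST_SUFFIX);
    return 0;
}
```

A volume/issue name such as v23i014 becomes v23i014-repost, which is
already 14 characters long before any compression suffix is added.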

-----------------
PATCHES Handling:
-----------------
rkive supports the new auxiliary header "Patch-To:". The Patch-To: line
exists in articles that are patches to previously posted software. It
appears only in articles that are posted as "Official" patches; the
initial postings do not contain the Patch-To: auxiliary header line.

Auxiliary Headers For Patch Postings:

	Submitted-by: Kent Landfield <kent@ssbell.UUCP>
	Posting-number: Volume 23, Issue 14
->	Patch-To: Volume 22, Issue 122
	Archive-name: rkive/patch1

There are two different types of handling with regard to patches. 

	Package     - This type of archiving of patches places the patches
                      in the same directory that the initial source was
                      posted to. This type of archiving is only available
                      to newsgroup archives that are using Archive-Name
                      archiving.
                   
	Historical  - This type of archiving of patches is done by sites that 
                      want to place the patches in the volume/issue in 
                      which the patch originally arrived.

rkive recognizes that the Patch-To: line indicates the article is 
a patch.  For Archive-Name archiving with "Package" patches archiving 
specified in the configuration file, rkive puts the article into the 
directory that contained the initial posting (volume22/rkive). 
For Archive-Name archiving without Package archiving specified, or for 
Volume/Issue archiving, the article would still be labeled as
volume23/rkive/patch01 or volume23/v23i014 respectively.
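
Recognizing the Patch-To: line comes down to matching one header and
pulling out two numbers. A minimal sketch, assuming the header always
has the exact "Volume N, Issue M" shape shown in the example above;
parse_patch_to() is a hypothetical name and the match is case
sensitive, which the real code may or may not be.

```c
#include <assert.h>
#include <stdio.h>

/*
 * If line is a Patch-To: header, store the volume and issue of the
 * initial posting and return 1. Otherwise return 0.
 */
int parse_patch_to(const char *line, int *volume, int *issue)
{
    assert(line != NULL && volume != NULL && issue != NULL);
    return sscanf(line, "Patch-To: Volume %d, Issue %d",
                  volume, issue) == 2;
}
```

For the example header this yields volume 22 and issue 122, which is
what a Package-style archiver would need to find volume22/rkive.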

rkive also writes a .patchlog file in the BASEDIR for the newsgroup
that is used to track patches to originally posted software. The
.patchlog is going to be used for the "random software downloader :-)"
so that complete software packages (sources and patches) can be requested
from sites that do not use combined Archive-Name and Package archiving.
The format of the .patchlog file is:
#
# Patchlog for comp.sources.whoknows
#
# Path To         Initial  Initial     Current Current 
# Patchfile       Volume   Issue       Volume  Issue
#
bb/patch01          22     105           23    77
            or if volume issue format..
v47i022             22     105           23    77
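
A program that consumes the .patchlog could read each entry with a
single sscanf() call. This is a sketch under the assumption that the
five columns are separated by whitespace and that lines starting with
'#' are comments, as in the sample above. The struct and function
names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* One .patchlog entry, field names chosen for this sketch. */
struct patch_entry {
    char path[256];    /* path to the patch file                */
    int  init_volume;  /* volume of the initial posting         */
    int  init_issue;   /* issue of the initial posting          */
    int  cur_volume;   /* volume in which the patch arrived     */
    int  cur_issue;    /* issue in which the patch arrived      */
};

/*
 * Parse one line of a .patchlog file. Returns 1 if an entry was
 * read, 0 for comment lines, blank lines and malformed lines.
 */
int parse_patchlog_line(const char *line, struct patch_entry *e)
{
    assert(line != NULL && e != NULL);
    if (line[0] == '#')
        return 0;
    return sscanf(line, "%255s %d %d %d %d", e->path,
                  &e->init_volume, &e->init_issue,
                  &e->cur_volume, &e->cur_issue) == 5;
}
```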

-------------------------
Article Header Reduction:
-------------------------
Articles that are stored just as they arrived on your system are potentially
wasting disk space. Certain rfc822/rfc1036 header lines are of little use
after the article is archived.  If you wish to have the headers "trimmed" 
when the file is archived, ensure that REDUCE_HEADERS is defined. Currently 
all header lines other than

    From:, Newsgroups:, Subject:, Message-ID:, and Date:

will be removed. This can produce a savings of as much as 200 to 500 
bytes per archived article.

See news_arc.c if you wish to add or subtract header lines to be kept.
The modifications need to be made to the hdrstokeep table just above the
keep_line() function.
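
For readers who want a feel for the change before opening news_arc.c,
the idea is a table of header names plus a prefix test. The sketch
below uses the hypothetical names kept_headers and should_keep() and
is not a copy of the real table or of keep_line(). It compares case
sensitively, which is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Headers to retain; add or remove names here to change the set. */
static const char *const kept_headers[] = {
    "From:", "Newsgroups:", "Subject:", "Message-ID:", "Date:"
};

/*
 * Return 1 if the header line starts with one of the kept header
 * names, 0 otherwise. Intended only for lines in the header section,
 * that is, before the first blank line of the article.
 */
int should_keep(const char *line)
{
    size_t i;

    assert(line != NULL);
    for (i = 0; i < sizeof kept_headers / sizeof kept_headers[0]; i++) {
        if (strncmp(line, kept_headers[i], strlen(kept_headers[i])) == 0)
            return 1;
    }
    return 0;
}
```

Because the colon is part of each table entry, a header such as
Date-Received: is not mistaken for Date:.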

---------
Security:
---------
rkive sets the ownership, group and modes on the archived members according
to the information specified in the configuration file. Currently though,
rkive uses the default umask for creating the log and index files.

rkive will not archive files outside of the BASEDIR specified in the 
configuration file, so a "prankster" cannot do nasty things to your
system files by having an Archive-name line like:
	Archive-name: ../../../../../../etc/passwd

It will also not overwrite duplicate files. They are stored underneath
the problems directory specified in the configuration file. The admin 
is alerted to the fact and it then becomes a manual cleanup problem.
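
One simple way to enforce the BASEDIR restriction is to refuse any
Archive-name that is absolute or that contains a ".." component. The
sketch below shows that check only. It is an assumption about how such
a test could be written, not a description of what rkive actually
does, and is_safe_archive_name() is a hypothetical name. It does not
consider symbolic links inside the archive tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if name is a non-empty relative path with no ".."
 * component, 0 otherwise.
 */
int is_safe_archive_name(const char *name)
{
    const char *p;

    assert(name != NULL);
    p = name;
    if (*p == '\0' || *p == '/')
        return 0;                       /* empty or absolute path */

    while (*p != '\0') {
        const char *end = strchr(p, '/');
        size_t len = (end != NULL) ? (size_t)(end - p) : strlen(p);

        if (len == 2 && p[0] == '.' && p[1] == '.')
            return 0;                   /* would climb out of BASEDIR */
        p += len;
        if (*p == '/')
            p++;                        /* step over the separator */
    }
    return 1;
}
```

Checking whole components rather than searching for the string ".."
keeps legitimate names such as a..b/part01 acceptable.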

                        -------
                        article 
                        -------

Article allows you to view the article headers in much the same manner
that you use a printf statement.  This was initially done for debugging
purposes but I quickly found that it was extremely useful in dealing
with news articles in general. It works great in shell scripts to view
articles that need to be read.... Also super for perusing the archives
directly and generating indexes to the archives in *many* different 
ways...:-)

                        --------
                        ckconfig 
                        --------

This program is used by the admin to verify just how rkive will 
interpret the variable specifications in an archive configuration
file. If you have problems, it will bomb out when it encounters
the problem. Not real smart but it does the job..

------------------------------------------------------------------------
This software set was developed under an archiving model similar to
that maintained currently on uunet. The archiving facilities were
intended to be more of a "site" facility than an individual's
facility. (That is unless the individual owned the site :-)). I have
not tried to use rkive for maintaining a private (many on a single machine)
archive. There does not seem to be any reason why it would not work. It
just hasn't been done. rkive will accept an rkive.cf file specified on the
command line so it would be possible for an individual to have their own
mini archive directory structure. This is not recommended if the site is
doing archiving since the software will store multiple copies thus wasting
more disk space than it is worth.  Aside from that, if someone does try it,
let me know how it turns out. :-) :-)
------------------------------------------------------------------------
Credits:
--------
I have to give credit where credit is due.

I used the code in header.c of News 2.11 as the basis of ideas for
dealing with the article headers. The code I have written is not the same
but most of the concepts and some of the flow control resulted from reviewing
how it was "supposed to be done". (rfcs only go so far.. :-)) For that I
thank Rick Adams and the authors of news for the excellent code to study
from.. :-)

I would also like to thank my beta testers for the headaches of dealing
with me, for forcing different ideas on me at a time when I was "almost"
willing to listen :-) and for the many different "full redistribution of 
sources" every time I had a new version. Specifically I want to thank 
eric@amperif (Eric Johnson) and denny@mcmi (Dennis Page) for putting up 
with me.. :-)
------------------------------------------------------------------------

Please read all the directions below before you proceed any 
further, and then follow them carefully.  

                    --------------
                     Installation
                    --------------

This package uses Doug Gwyn's directory access routines posted in
comp.sources.unix/volume9 (with the bug fix as well). You may need
to get a copy if you don't already have one and your system does 
not support POSIX-compatible directory access routines.

1)  Take the time to format and read the man pages prior to continuing.
		make man | less/pg/more

2)  Review/modify rkive.h to make sure system defines are correct.  

3)  Determine the method for directory creation and edit the Makefile
    accordingly.

4)  make

    This will attempt to make the software in the current directory.

5)  Put rkive, ckconfig, and article into a public directory 
    (normally /usr/local/bin), and put a template of the rkive
    configuration file  (if one does not exist) into a library directory
    (normally as /usr/local/lib/rkive.cf).  Place the man pages in the
    appropriate man directories for your site.


6)  I have set up an account for the source archives.  This is not really 
    necessary but is a personal preference. rkive needs to be run as 
    root *if* you do not have mkdir() and wish to use the builtin,
    since it needs to use mknod() to create directories.

    ---x--x--x  1 root archive    43048 Apr  9 16:38 /usr/local/bin/rkive
    ---x--x--x  1 src  archive    14836 Apr  9 16:38 /usr/local/bin/article
    ---x--x--x  1 src  archive    27448 Apr  9 16:38 /usr/local/bin/ckconfig
    -r--r--r--  1 src  archive     6173 Apr  9 16:40 /usr/local/lib/rkive.cf

7)  Re-read the manual entry for rkive.1 and rkive.5.

8)  Modify the template rkive configuration file to reflect the local
    archive conditions. ckconfig should be used in order to check the
    information that you have just entered/modified in the rkive.cf file. 

9)  VERY IMPORTANT! If you have a problem, there's someone else out there 
    who either has had or will have the same problem.  Please send all 
    patches, ideas, etc to kent@ssbell (or uunet!ssbell!kent) so that I 
    can continue to improve the functionality and portability of this 
    package. 
