EXPIRE
======

Syntax: EXPIRE

Some news articles have to be deleted from time to time, or they will
eventually fill your disk. Most of the time it is useful to delete the
oldest articles. This is the task of the program EXPIRE, which takes
its parameters from the file EXPIRE.DEF. This file should be in the
same directory as NEWS.DEF, that is, the directory given in CONFIG.SYS
as the directory for news articles.

Each line of EXPIRE.DEF contains a set of data for a single news group
or for a whole set of news groups; a line may also be empty or be a
comment introduced by '#'. EXPIRE reads the lines of this file until
it finds a match. If one set of data is to apply to one news group (or
group of news groups) and another to a larger set of news groups which
contains the first one, the line for the more specific news group(s)
must appear in the file before the line for the more general case.

Each line starts with the name of a news group. If a name ends with a
'*', the line is valid for all news groups starting with that partial
name. NB: This is NOT the pattern matching algorithm described in
RNEWS(1), because the EXPIRE program is much older.

Name and set of data are separated by a '|', and data items are
separated by spaces. Each data item consists of a number and a single
key letter.

If no matching line is found for a news group, _all_ lines that have
no news group name and separator (only data items) are used. In
contrast to the other lines (where the search stops after a match),
these lines are all applied by EXPIRE one after the other. A line
'*|...' matches ALL newsgroups and therefore terminates the search in
EXPIRE.DEF.
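
The lookup rules described above can be illustrated with a small
Python sketch. This is not the code of EXPIRE itself; the function
names (parse_expire_def, matches, rules_for) are made up for this
illustration. It assumes that group names are compared
case-sensitively and that a data item is always a number immediately
followed by one letter. Items are picked out with a regular
expression, so the sketch does not depend on the spaces between them.

```python
import re

# One data item is a number followed by a single key letter, e.g. "200a".
ITEM = re.compile(r"(\d+)([A-Za-z])")


def parse_expire_def(text):
    """Split EXPIRE.DEF text into (named_rules, default_rules)."""
    named, defaults = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # empty line or comment
        name, sep, data = line.partition("|")
        if not sep:
            name, data = None, line  # data items only, no group name
        items = [(int(n), key) for n, key in ITEM.findall(data)]
        if name is None:
            defaults.append(items)
        else:
            named.append((name.strip(), items))
    return named, defaults


def matches(pattern, group):
    """A trailing '*' means 'starts with'; otherwise names must be equal."""
    if pattern.endswith("*"):
        return group.startswith(pattern[:-1])
    return pattern == group


def rules_for(group, named, defaults):
    """Return the data-item lists that apply to one news group."""
    for pattern, items in named:
        if matches(pattern, group):
            return [items]  # the search stops at the first match
    return defaults  # no match: every name-less line is applied in turn
```

Because the first matching line wins, rules_for only gives the
intended result when the more specific group names come before the
more general ones, exactly as required for EXPIRE.DEF itself.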

    # example:
    news.group.name|20a 48h 30c 10k
    news.group.*|100a 7d50a3d

    # and now as second example my own EXPIRE.DEF:
    # (from a time when I only had two megabytes for news :-)
    #
    # there are big discussions in sub.config -> big limit of
    # 300 articles
    sub.config|300a
    # the group junk may be deleted almost immediately (0a :-)
    # if the system is left to itself without an operator watching,
    # it might be better to delete articles only after at least half a
    # day. To be sure, there are two limits: 200 articles and max.
    # 10 KB per article
    junk|200a 12h 10c
    # some 'high-traffic' newsgroups:
    # minix only without big source postings
    comp.os.minix|200a 22h 10c
    # culture 'normal', but not as many days as other groups
    soc.culture.*|200a 22h 10c
    # and now the data for _all_ the other newsgroups:
    # normally 3 days should suffice
    # just to be sure, no more than 200 articles
    200a 70h 20c
    # I always use 2 hours less than whole days to enable EXPIRE
    # to delete the articles just before the next poll at some time
    # exactly n days later.

Possible key letters:

a: Number of [A]rticles which are to remain

   The most recent n articles are kept. It does not matter whether
   these articles really exist or whether there are gaps in the
   sequence of articles (e.g. because some were deleted by another
   condition). EXPIRE only looks at the article numbers. With this
   method, no directories or single articles have to be searched,
   which makes it the fastest method in EXPIRE. If several methods
   are to be used for a newsgroup, it should always be the first one,
   because it reduces the number of articles dramatically before the
   other, slower methods are applied.

k: [K]ilobytes which are to remain

   The most recent n kilobytes are kept. For this method, EXPIRE
   computes the disk space needed for the articles (that is the space
   which is used on disk, counted in clusters, and _not_ the size of
   the articles themselves, which may only partially fill a cluster).
   EXPIRE sums up the sizes of the files, starting at the most recent
   file, and keeps all files up to and including the file which
   pushes the total size over the given limit. All older files are
   deleted. Because all sizes must be read from the directory, this
   method is rather slow and should be used as one of the last
   conditions in the list (if used at all).

h: [H]ours which are to remain

   Articles which were received by RNEWS during the last n hours
   before EXPIRE is called are kept. As with sizes, the directories
   have to be searched for the time stamps, so this method, too, is
   somewhat slow. However, the search starts at the oldest article
   (smallest article number), and in most cases only a small part (#)
   of all articles has to be examined before a file is found which is
   recent enough to be kept. All other articles (numbers between that
   article and the highest article number) are kept. (Hint for
   experts: because the time is rounded, "1h" will keep articles
   which are newer than 2h, including articles which arrived 1 hour
   and 59 minutes ago.)

   All articles which have higher numbers than the article found are
   kept no matter what time stamp they have. This will be important
   if some article has been modified by an editor or by some other
   means (which is definitely not done, is it?).

   (#) small part = (time since last EXPIRE run) / (number n)
       If EXPIRE is called once a day and articles are kept for 50
       days, only 1/50 = 2% have to be searched each time.

d: [D]ays which are to remain

   This method is almost the same as the one above ('h'). Giving
   n days has the same effect as giving n*24 hours. (Hint for
   experts: only hours are rounded, so articles of age "n days plus
   59 minutes" are kept, but articles of age "n days plus 1 hour"
   are deleted.) (Another hint: this method is older than the 'h'
   method.)

c: [C]ut long articles

   There are some low-volume newsgroups for which a rather small
   threshold for the 'k' method would be sensible.
   But if a small value for 'k' is given, a single _big_ article
   might result in all of the other articles being deleted. With
   method 'c', it is possible to delete those articles which are way
   too big for that group. It also makes it possible to delete big
   source postings in groups like comp.os.minix, which has many
   normal articles but sometimes sees three or more big articles in a
   row (with a volume about equal to that of the rest of the whole
   group).

   This method requires a search of _all_ entries in a directory and
   therefore takes longer than any of the other methods. To be of any
   use, however, it should not be used as the last method, but
   immediately before a check with method 'k'. If both 'h' (or 'd')
   and 'c' are used, 'k' will probably have no further effect and
   should be omitted to save time.

The following method is NOT available (yet). I don't know whether
anyone would find it useful. Please feel free to comment on this and
on other methods you would like to have by writing mail to
anson@akb.in-berlin.de or martini@heaven7.in-berlin.de.

s: delete short articles

   As a counterpart to method 'c', this method deletes all those
   articles which are shorter than a limit. This could be useful if
   short tests (in test groups) should not be kept (a reflector could
   take care of them :-), but lists of failed articles or other
   interesting stuff (articles longer than n KB) should be kept. The
   same applies to newsgroups with much traffic which consists only
   of 'me too' postings (e.g. discussions in story, binary or source
   groups). This method would delete the discussions from those
   groups, but not the real postings.

AUTHOR:

Andreas K. Bewersdorff contributed heavily to the code of this
program, as well as to the documentation and even to the translation.
But I did do _some_ work on all of those as well :-)

--Martin
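
As an illustration of the key letters 'a', 'k' and 'h' described
above, here is a minimal Python sketch working on an in-memory list
of articles. It is not taken from EXPIRE. The Article class, the
function names and the cluster size of 2048 bytes are assumptions
made for the example, as is the reading of "rounded" as "the age is
cut down to whole hours". Each function returns the article numbers
that would be deleted.

```python
from dataclasses import dataclass


@dataclass
class Article:
    number: int       # article number within the group
    size: int         # file size in bytes
    age_minutes: int  # time since RNEWS received the article


def on_disk(size, cluster):
    """Space a file occupies on disk: whole clusters only."""
    return -(-size // cluster) * cluster


def expire_a(articles, n):
    """'a': keep the n highest article numbers, whether they exist or not."""
    if not articles:
        return []
    cutoff = max(a.number for a in articles) - n
    return sorted(a.number for a in articles if a.number <= cutoff)


def expire_k(articles, n_kb, cluster=2048):
    """'k': sum on-disk sizes from the newest file; the file that pushes
    the total over the limit is still kept, all older ones are deleted."""
    limit = n_kb * 1024
    total, over, doomed = 0, False, []
    for art in sorted(articles, key=lambda a: a.number, reverse=True):
        if over:
            doomed.append(art.number)
            continue
        total += on_disk(art.size, cluster)
        over = total > limit
    return sorted(doomed)


def expire_h(articles, n_hours):
    """'h': scan from the oldest number and stop at the first article that
    is recent enough; everything with a higher number is kept unchecked."""
    doomed = []
    for art in sorted(articles, key=lambda a: a.number):
        if art.age_minutes // 60 <= n_hours:  # age cut to whole hours
            break
        doomed.append(art.number)
    return doomed
```

For 'd', the same scan could be called as expire_h(articles, n * 24).
Note that expire_a never needs sizes or time stamps, which is why 'a'
is recommended as the first method in a line.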