Documentation for dbl_kill v1.20                               10.8.90
Public-Domain - see Point 3
---------------------------------------

1. What's this program for ?
----------------------------
Dbl_kill is a FIDONET/The-Box/QBBS/ACS-compatible double-message-killer.
That means: if there are some identical msgs in one of your areas, dbl_kill
will mark all of them but one as deleted.

2. Usage of the program
-----------------------
If you call the program with option -h, you will get a help screen that
tells you what you can do with the various options:

  dbl_kill -h      <-- enter this from a command line interpreter

I won't describe these options here because you can read about them in the
help screen 8-)

3. Public Domain
----------------
This program is PD. That means it costs you *nothing*; you can use and copy
it as you like (but always including this text !!!), but you MUST NOT modify
this text or the program.

If you want to say 'Thank You!' because it works fine or '@%#%+~/ !!!'
because it doesn't, you can send a message (or error report) to:

  Thomas Waldmann
  FidoNet address: 2:509/27.1

4. Disclaimer
=============
You use this program at YOUR *OWN* RISK !

This program was tested on 2 FidoNet nodes and 1 point for several versions
and worked fine, so I suppose that I have removed most bugs (incl. those
from Turbo-C ;-), but remember: There's always one more bug!

That means, if your system crashes, thousands of messages are sent to
nirvana and your computer explodes - and all this because of a nasty bug in
this program - this is *YOUR* problem, not mine.

5. System environment expected by the program
---------------------------------------------
Although very UNcritical (other programs need *MUCH* more), here is the
memory and (hard-)disk usage of the program:

Memory usage:
  statically:   60KB
  dynamically:
    minimum: msg_count_in_largest_area*10Bytes + two_largest_msg_txts,
             but please: Don't try this !
    best:    because of the built-in header/msg-cache, the program works
             faster if the *.HDR file and some of the msgs of each area
             fit in memory

(Hard-)Disk usage:
  Additional space on disk is only needed if there is not enough memory
  available for shortening the logfile. Then there must be Kbytes free
  space on disk ! Also the few bytes written into the logfile should be
  available!

Environment variable and file access (access r=read, w=write):

  Environment variable needed: MAILER (r)

  MAILER-path must contain the files:
    "areas.bbs"     (r)
    "#areas.bbs"    (r - only if -a option not set)
    "tb.cfg"        (r)
    "acs.cfg"       (r - only if -a option set)

  Further:
    logfile         (r/w, path from entry "statuslog" in "tb.cfg",
                     e.g. "system.log")
    echolist file   (only if -a option set - path from "echolist" in
                     "acs.cfg")
    all *.HDR files from "areas.bbs" (r/w)
    all *.MSG files from "areas.bbs" (r)

If you are still reading and interested in the algorithm used by the
program - read on (warning: C-Pseudo-Code contained!)...

6. Algorithm used by the program
--------------------------------
Here are a few introductory words:

The program uses (for speed reasons) hashcode(*)-like values of the message
headers - these values are calculated only once for all msgs in an area,
and then all comparisons for double msgs are made on these values. If the
values are the same, there is a high probability that the msgs are double,
but to be sure, the msg-headers and the msg itself are then compared in
full. If there's even one difference, the msgs aren't double and therefore
will not be deleted. I think this algorithm is very safe - it kills a lot,
but not too much!

(*) a hashcode is a characteristic value for a data object, similar to a
    checksum, used by hashing algorithms [in this program no hashing is
    done, I only took the name...]

The program was written with (Atari ST) Turbo C v1.1, an ANSI C compiler,
which produces fast && compact code. The source code of this program is
written in pure ANSI C and should be easily portable.
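
As a rough illustration of the "compare hashcodes first, then verify"
idea from section 6, here is a minimal C sketch. It is NOT the original
source: the record layout (struct msg), the helper names (hash_str,
header_hash, body_len) and the checksum formula are assumptions made up
for this example, and the -d destination address test is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified message record; the real *.HDR layout differs. */
struct msg {
    const char *from;
    const char *to;
    const char *topic;
    const char *text;
    unsigned long hash;   /* computed once per area */
    int deleted;          /* stands in for the 'DELETED' header flag */
};

/* Assumed checksum; the original hashcode function is not documented. */
unsigned long hash_str(unsigned long h, const char *s)
{
    while (*s)
        h = h * 31UL + (unsigned char)*s++;
    return h;
}

/* Hashcode-like value built from the header fields only. */
unsigned long header_hash(const struct msg *m)
{
    unsigned long h = 0;
    h = hash_str(h, m->from);
    h = hash_str(h, m->to);
    h = hash_str(h, m->topic);
    return h;
}

/* Length of the text up to "\n---", or of the whole text if absent. */
size_t body_len(const char *t)
{
    const char *tear = strstr(t, "\n---");
    return tear ? (size_t)(tear - t) : strlen(t);
}

/* Full comparison, done only when the hashcodes were the same. */
int is_double(const struct msg *a, const struct msg *b)
{
    size_t la = body_len(a->text), lb = body_len(b->text);
    if (strcmp(a->to, b->to) != 0) return 0;
    if (strcmp(a->from, b->from) != 0) return 0;
    if (strcmp(a->topic, b->topic) != 0) return 0;
    return la == lb && memcmp(a->text, b->text, la) == 0;
}

/* Marks the later msg of each double pair; returns the number killed. */
int scan_area(struct msg *m, size_t n)
{
    size_t i, j;
    int killed = 0;
    assert(m != NULL || n == 0);
    for (i = 0; i < n; i++)
        m[i].hash = header_hash(&m[i]);
    for (i = 0; i < n; i++) {
        if (m[i].deleted) continue;
        for (j = i + 1; j < n; j++) {
            if (m[j].deleted) continue;
            if (m[i].hash == m[j].hash && is_double(&m[i], &m[j])) {
                m[j].deleted = 1;
                killed++;
            }
        }
    }
    return killed;
}
```

In this sketch two msgs that differ only after the "\n---" line (for
example in SEEN-BY or PATH) count as double, while a single changed
character in the text before it keeps both msgs alive. The cheap hashcode
test only decides whether the expensive full comparison is worth doing.
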
If there's anyone out there who wants to port it to other machines,
contact me !

Return codes of dbl_kill v1.20:
------------------------------
#define OK         0   /* no error - all ok               */
#define EDIRNF    -1   /* File areas.bbs not found        */
#define EENVNF    -2   /* env_var MAILER not found        */
#define ELENNF    -3   /* File #areas.bbs not found       */
#define ECFGNF    -4   /* File tb.cfg not found           */
#define ELOGNFC   -5   /* statuslog file not found        */
#define ESTLNF    -6   /* STATUSLOG entry not found       */
#define ELOGNC    -7   /* Logfile not creatable           */
#define ETMPNC    -8   /* temporary file not creatable    */
#define EMALLOC   -9   /* malloc failed                   */
#define EELNF    -10   /* ECHOLIST entry not found        */
#define EELFNF   -11   /* ECHOLIST file not found         */

Here is the pseudo-C-code of the program, derived from the original program
text - best read from the end. Almost everything concerning file access,
logfile and screen message output, error handling, dynamic memory
allocation, 'C' implementation etc. has been left out here.

/*** dbl_kill v1.20 ***/

/* hashcodes were the same, now to be sure: */
is_double(message1,message2){
  if(-d option set)
    if(destaddr1!=destaddr2) return FALSE;
  if(messagehdr1.to!=messagehdr2.to) return FALSE;
  if(messagehdr1.from!=messagehdr2.from) return FALSE;
  if(messagehdr1.topic!=messagehdr2.topic) return FALSE;
  /* now compare msg-txt without SEEN-BY, PATH and program-infos */
  if(messagetxt1_up_to_"\n---" != messagetxt2_up_to_"\n---") return FALSE;
  /* else ... */
  return TRUE;
}

kill(message){
  set flag 'DELETED' in header;
}

scan_area(area,length){
  if(length_of_headerfile!=length or option -a or -e set){
    calc_hashcodes_of_all_msgheaders;
    /* look for double msgs */
    for(all pairs message1,message2 in this area,
        excluded the ones containing a deleted msg)
      if(hashcode[message1]==hashcode[message2])
        if(is_double(message1,message2))
          kill(message2);
            /* ^^^^^^^^ the newer one */
  }
}

dbl_kill(){
  path=get_environment_variable(MAILER);
  dirfile=path+"areas.bbs";
  lenfile=path+"#areas.bbs";
  cfgfile=path+"tb.cfg";
  open_logfile(value of entry statuslog in cfgfile);
  if(option -a set)
    for(all areas in echolist file)
      scan_area(area,dontcare)
  else
    for(all areas in dirfile){
      length=get_length_from_lenfile;
      scan_area(area,length)
    }
  close_logfile();
}

main(){
  if(option -s is set) shorten_logfile();
  dbl_kill();
}

7. Versions of the program
--------------------------
V1.00  First released version

V1.01  Forget this (trash) version, delete it !!!

V1.02  Bug (2 bombs) in dynamic memory allocation of V1.00 fixed
       (this bug appeared only if you had a lot of msgs with identical
       headers [more than free memory], because the memory malloc'd for
       these msgs wasn't free'd after use)

V1.10  -a option for ACS-System compatibility - uses ECHOLIST file
          instead of #areas.bbs
       -e option for scan of Every area

V1.20  much faster now, uses internal header/msg-cache and modified
       algorithm
       -d option added for usage on nodes:
          in the netmail area of nodes there are many double msgs - the
          msgs for the points ! They only differ in the destination
          address and should *not* be deleted. The -d option includes a
          test for different dest. addresses in the double-test-routine
          of dbl_kill.

       IF YOU ARE A NODE, PLEASE INCLUDE THE MAIL-AREA IN YOUR AREAS.BBS
       NOW AND USE THE -d OPTION, IF POSSIBLE. THIS PREVENTS DOUBLE MSGS
       FROM BEING SENT TO YOUR POINTS !!!

8. Sorry for my English !
-------------------------