July 31, 1991

By Mike Bilow, N1BEE
AX.25: N1BEE @ KA1RCI.RI.USA.NA
Internet: mikebw@idsvax.ids.com
Amprnet: n1bee@n1bee.ampr.org [44.104.0.20]
Standard post: Forty Plantations, Cranston, RI 02920-5554, USA

For GRINOS/GRINOS-S KA9Q 910618/PA0GRI 910714v1.7e/N1BEE 910731v0.60

All users of GRINOS/GRINOS-S are invited to join the grinos@n1bee mailing list. This is what I use to distribute advisories and updates. There is no requirement that you actually use GRINOS in order to join the mailing list. Just send a message to me and you will be added. Users accessible to me via the Amprnet will receive their updates that way, and others via the conventional packet BBS forwarding system.

I. KA9Q/PA0GRI changes

Aside from a minor bug fix in IP routing (which no one even noticed), there are no changes from PA0GRI 910625v1.7d to 910714v1.7e, other than an attempt to make the FTP client a little more helpful by warning if no user name or password is supplied, rather than simply disconnecting you.

The watchdog timer fix that has been in the N1BEE release for several versions has now been cycled back into the PA0GRI release, so the "official" code is now used for initializing the watchdog timer.

The ongoing Internet discussion about the N1BEE fix for the temporary file problem in Borland C has resulted in the PA0GRI version of this fix being considered non-functional, and Gerard himself has advised that his code not be used. Therefore, the original N1BEE code (the same as was in N1BEE 910713v0.50) has been substituted, which is the current recommended consensus from TCP-Group.

II. N1BEE changes

A. Bug fixes and default modifications

There are major changes in this release from N1BEE, most relating to the internals of the memory allocator, stacks, and multitasker.

1. Stack sizing

Substantial reliability improvements should be seen in this version, particularly in hosts which handle large numbers of disk accesses, such as mail servers.
The timer daemon stack has been increased from 1024 bytes to 2048 bytes, which should fix the main culprit responsible for system crashes. The network daemon stack has been increased from 1536 bytes to 2048 bytes, and the remote server stack has been increased from 768 bytes to 1024 bytes.

2. Minimum heap kludge

A problem that has plagued GRINOS from the beginning, crashes caused by a shell making more core memory inaccessible, has a new but grossly inefficient kludge. The problem is not really with GRINOS, but with the MS-DOS memory allocation scheme.

There is a "breakpoint" which divides the GRINOS heap (memory allocated by MS-DOS to GRINOS, whether used in GRINOS at the moment or not) and the MS-DOS core (memory unallocated by MS-DOS). If GRINOS needs memory, it first tries to satisfy that request out of its own available heap space. If that fails, GRINOS then makes a request to MS-DOS for more core, thereby enlarging its heap space. When a shell is executing (because of the GRINOS "shell"/"!" or "mail" commands), MS-DOS will deny any requests for more core, since memory after the GRINOS heap is owned by the shell.

The fix for this -- and I don't defend it -- is that GRINOS will now allocate and immediately deallocate some memory (default 12K) before shelling out, guaranteeing a minimum of available heap space while the shell is executing. The disadvantage to this, and it is serious, is that GRINOS never returns its heap to core. If the GRINOS heap grows sufficiently fragmented, it is conceivable that successive shell actions will grab increasing amounts of core, reaching a point where no space is left to run anything (such as the mailer) when shelled out.

Because of the two-edged nature of this situation, I have created a new command, "mem minheap", which sets the amount of memory that will be requested and immediately freed before shelling out. I have found 12K to be fairly conservative in preventing crashes, and this is the default.
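
The allocate-then-free step described above can be sketched in C roughly as follows. This is only an illustration of the order of operations, not the actual GRINOS source: the names minheap, set_minheap, reserve_minheap, and shell_out are hypothetical, and the sketch assumes, as the text says of GRINOS under MS-DOS, that freed memory stays in the program's own heap rather than being returned to the operating system. A modern C library makes no such promise.

```c
#include <stdlib.h>

/* Hypothetical default, mirroring the 12K figure given in the text. */
#define MINHEAP_DEFAULT (12u * 1024u)

/* Hypothetical variable standing in for the "mem minheap" setting. */
static size_t minheap = MINHEAP_DEFAULT;

/* Hypothetical setter: change the size requested before shelling out. */
void set_minheap(size_t bytes)
{
    minheap = bytes;
}

/*
 * Request minheap bytes and free them at once.  If the allocator has to
 * grow the heap to satisfy the request, the heap stays that large after
 * the free (under the assumption stated above), so that much space is
 * available while the shell owns the memory beyond the heap.
 * Returns 1 if the block could be obtained, 0 if it could not.
 */
int reserve_minheap(void)
{
    void *p;

    if (minheap == 0)
        return 1;               /* nothing requested, nothing to do */

    p = malloc(minheap);
    if (p == NULL)
        return 0;               /* not even the minimum is available */

    free(p);
    return 1;
}

/* Shell out only after the reservation step has been attempted. */
int shell_out(const char *command)
{
    if (!reserve_minheap())
        return -1;              /* refuse to shell rather than risk a crash */

    return system(command);
}
```

Refusing to shell when the reservation fails is a design choice made for this sketch; the text does not say what GRINOS does in that case.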
I would suggest dropping this to 8K in systems which are very pressed for memory. However, this parameter has no effect until an actual shell command is issued, and most systems, to be stable, should have at least 12K of available heap when running, anyway.

3. Memory debugging

By default, complete memory statistics are now dumped into the log on any memory allocator failure, such as "invalid free" or "invalid pointer". This should assist me in collecting reports of memory allocation problems. This feature requires that ordinary logging be enabled; otherwise it is disabled. Dumping of memory statistics to the log can be independently disabled even if the ordinary log is enabled (although I have no idea why anyone would do this) with the new "mem debug" command.

B. RSPF support

Motivated mostly by significant interest from KC4TWU and the North Carolina group, I have been trying to get the bugs out of RSPF. The Rhode Island group did some experiments several months ago using another version of NOS, before GRINOS was available, and grew fairly disgusted.

RSPF is a relatively complex protocol, and, while I am willing to accept that the protocol itself is sound, there are a lot of odd implementation details. For example, we found in testing that it is necessary to have what would otherwise be redundant manual routes. Here is my old routing table for n1bee.ampr.org:

    route add [44.104]/16 dr0a [44.104.0.2]
    route add default dr0a [44.56.0.64]

To run RSPF with the standard hop quality of 8, I decided to set these routes to have a metric of 12. This works fine:

    route add [44.104]/16 dr0a [44.104.0.2] 12
    route add default dr0a [44.56.0.64] 12

Unfortunately, kz1f.ampr.org, who was also running RSPF, has a route in his table for [44.56]/16 with a quality of 8. After the added hop, his route showed up in my table with a quality of 16.
Since his route was a 16-bit matched route, while mine was a default (0-bit matched) route, my station would send [44.56.x.x] frames to him instead of to the switch, which I can reach directly. I had to add the following to prevent this:

    route add [44.56]/16 dr0a [44.56.0.64] 12

Presumably, if the switches were also running RSPF, then this problem would have been fixed automatically, since I would have received a higher quality route from the switch RSPF broadcasts.

Here is the RSPF installation code in my AUTOEXEC.NOS file (most users will want to change "dr0a" to "ax0"):

    arp add [44.255.255.255] ax25 QST-0
    route add default dr0a [44.56.0.64] 12
    route add [44.56]/16 dr0a [44.56.0.64] 12
    route add [44.104]/16 dr0a [44.104.0.2] 12
    ifconfig dr0a broadcast [44.255.255.255]
    rspf maxping 3
    rspf interface dr0a 8 1
    rspf rrhtimer 900
    rspf timer 900
    rspf suspecttimer 2000
    rspf message "N1BEE testing RSPF"

A switch or router, instead of an end user, should probably use a much larger horizon: "rspf interface dr0a 8 16", for example. End users should stick with a horizon of 1, according to the protocol.

There is a change in the route command as of N1BEE 910731v0.60 for RSPF support at switches. Previously, there was no way to add a direct route with a metric, other than a 32-bit match, since the metric follows the gateway field, and the gateway field is blank. This became apparent when I was trying to write an AUTOEXEC.NOS file for switch.w1cg-9 to use RSPF. For a 32-bit matched route, this is not a problem, since the gateway field can be filled with the host's own address, using that host as a gateway (like "route add [44.104.0.23]/32 ax0 [44.104.0.23] 12"). To fix this, I have created a new keyword, "direct", which may be used as an explicit placeholder for the gateway field:

    route add [44.104]/16 ax0 direct 12

The "direct" keyword has a few complications. First, it is case sensitive, as are most commands in NOS, and it cannot be abbreviated.
The reason for enforcing this exactness is to prevent compatibility problems if someone defined a valid hostname similar to the keyword, like "di.princess.palace.uk" or "direction.future.plato.edu". Second, in the unlikely event that there is an exact match for the keyword in the DOMAIN.TXT file, that match will OVERRIDE the use of the keyword, and be interpreted as a gateway address, again for compatibility reasons. (Note that "direct.ampr.org" would be an exact match if domain suffix is set to "ampr.org.")

C. RIP support

Since no one seems to be using RIP, support for it has been removed from the small version.

III. Future plans

Although I have promised a lot of things, particularly the option to use reverse video instead of high intensity in ttylink screens for the benefit of LCD users such as Justin, KA1ULT, this just has not been as high a priority as getting the memory allocator working and the stacks tuned.

Barring disaster, this should be the last GRINOS release for quite some time, perhaps as long as a month. I think that GRINOS has now achieved a level of stability that makes its use very reasonable in all but the simplest systems.

The ever-growing memory size problem is becoming a concern, especially for those who want to run GRINOS in a DesqView window. I am actively pursuing two approaches over the long term. The first is to examine what would be involved in a DesqView-specific version of GRINOS, as suggested by K5ZC; I don't hold out very high hopes for this. The second is to implement code overlays in NOS, which could reduce its memory requirements by half. Overlaying code in a multitasking program is very difficult, but it has been done; Roy Engehausen, AA4RE, sent me the details of how he managed to do it in Turbo Pascal for his soon-to-be-released BB program.

I have also had some inquiries over time, the most recent from Matt, KA1THM, about accessing the second port on a dual-port Kantronics TNC.
If anyone has this running under existing versions of NOS, I would like to know about it. If you can't get it to work, but have a need for it, I would like to know about that, too.

EOF