Harry O's HTML Suite for Pilot

I've been working on an HTML parser and browser for the Pilot for about two months now, and have had a number of people beta testing it for about five weeks. This is the first public release of the product, which includes the following ...

You can download everything as a single ZIP file, or grab the HTML parser, browser,  PointCast parser and this document separately.

Note: the parser executables that come with the standard HTML suite are 32-bit applications. They will not run under Win 3.1, nor under DOS 6.x or earlier.

Here are 16-bit versions of the parser, zipped and unzipped, and 16-bit versions of the PointCast parser, zipped and unzipped.

To pull down the unzipped versions you may need to hold down the shift key when you click on the link.

Unfortunately, the 16-bit versions are limited to normal 8.3 file names. There is nothing I can do about that, I'm afraid.

Release Notes

Here are some notes on recent changes.

A Preliminary Macintosh Version of the Parser

Here's an initial shot at a Macintosh version of the parser.

If you have problems because your browser tries to read these .sit files as text, just hold down the shift key when you click on the links ... at least that's what works on PCs :-).

To run the program you need to put a file named "parser.ini" in the same directory as the executable. That file should have the same form as an indirect file as described below.

I know this isn't particularly "Mac-ish", but at least it gives Macintosh users a chance to try the parser. I'll organise something better sometime soon. Rick Bram has already offered to give me a hand, so if I don't get a chance to learn enough about Mac programming to produce something, I'm sure he'll come to the rescue.

Current Version Numbers

As of 1415 January 12th, 1997 Australian Eastern Standard Time, the latest versions are: parseNN.exe (1.26), browser.prc (0.58) and pcpNN.exe (1.15).

Unfortunately, the Macintosh version of the parser is still 1.14, until I can get over to Matt's and compile it again. Sorry, guys, I'll try to work out a better way of keeping this up to date, but at the moment you're just going to be a couple of versions behind.

Example Documents

There are a number of example documents here, too, that you can download.  There are pre-parsed versions of: this document, my home page, and the help for CrossBow. NOTE: You may have to hold the shift key down when you click on these links.

Of course, the classic document, courtesy of Neil Weisenfeld, is the Pilot FAQ!

Here's a new document, too. It's an episode guide to the X Files. It suffers from the 500 line limit, but I'll fix both the limit and the format of the document very soon.

Finally, here's the US Constitution!

What's Supported?

The parser is not working quite 100% yet. However, it successfully parses most HTML documents. Some tags are unsupported at this point, most notably <PRE> and <DL>. I hope to remedy this situation very soon. 

Obviously, the fonts are limited to those available on the Pilot. In many cases, I've simply used emboldening or underlining to differentiate some of the fonts and heading levels. Later, I intend to add reverse video and allow the user to specify, when parsing, how each font size and header level should be displayed. 

As you can tell from the browser's opening screen, I'm working on getting categories going. 

What's Specifically Not Supported?

Due to limitations in the size, resolution and monochrome nature of the Pilot's display, I currently have no intention of supporting graphics, frames or tables. I may change my mind on some of these as time goes on, but that is unlikely to happen in the short term.

What Does It Cost?

The downloadable, demo version of the browser is freeware, but is time-limited, in that you can only use it for 15 minutes per day. Apart from that, all features that will be in the release version are available. The HTML and PointCast parsers have no limitations, other than those due to them not being finished yet. 

The idea is to let you see how well the browser will work with the documents you wish to display. Obviously, I'm interested in hearing about any problems you see with any of these tools. Do not hesitate to send me some mail.

I intend to differentiate the demo from the release in some respects. For example, the demo version will never have categories, and will probably always have the current 500 line document size and 200 anchor limits.

If you will be browsing for less than 15 minutes per day, I'm happy for you to use the demo version forever. However, to obtain a non-time-limited version, I'm afraid you'll have to pay some money.

I am asking $US20 per copy, which I think is reasonable, considering the amount of time and effort I've put into this project. Obviously, the market will determine the price in the longer term.

At present, the way to register is to send a cheque to a friend of mine if you are in the US, or directly to me if you are in Australia.  If you live somewhere else, you can choose whichever suits you best.  Eventually, I will probably register with Kagi, so people can pay by credit card and First Virtual accounts.

The addresses where you can send your cheque are as follows ...

Jeannine Hammersley
3852 Perie Lane
San Jose, CA, 95132
USA

OR

Harry Ohlsen
GPO Box 4281
Sydney, NSW, 2001
AUSTRALIA

If you're sending a cheque to Jeannine, make it out in her name, otherwise make it out to me.
 
Be sure to include an e-mail address, so I can mail you the .prc file.
 
Apart from a direct payment, if you're an author of some Pilot shareware, I'd be happy to do a registration swap with you. If you consider that the browser is worth as much as a copy of your program, or some set of your programs, drop me a line and I'm sure we can do a deal.  I'm very keen to ensure that high quality shareware keeps getting written for the Pilot!

How Do You Use the HTML Parser?

If you type "parser", with no other parameters, you will see a detailed usage message that explains how to invoke the parser correctly. Basically, the command line format is ...
where multiple documents can be specified in one command, including separate titles for each. The -c option turns on compression and the -o option specifies where the output should be written (the default is "html.pdb").

An example might be 'parser -c -t "Fred's Document" fred.html'.

 
If you have a large list of documents you parse often, or if the command will not fit into DOS's line length limitations, you can specify a file instead, using the following syntax ...
The indirect file can contain lines of the following four forms ...

Note that if you have directories or file names with embedded spaces, you will need to quote them. For example, ...

where OUTPUT represents the path to the file that should be created, FILE is the path to the HTML file, and TITLE is the equivalent of the -t command-line option. Any lines beginning with '#' are considered comments and ignored.

I intend to produce a detailed document explaining the usage of the various components sometime in the future. However, an example of an indirect file might be ...

output docs.pdb
compress on
document c:/html/fred/index.html "Fred's Document"
compress off
document c:/html/mary/index.html "Mary's Document"

in which case you would end up with a database file called "docs.pdb", containing two documents entitled "Fred's Document" and "Mary's Document".
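
If you have many documents, it can be easier to generate the indirect file than to type it by hand. The following is a minimal Python sketch that does this. It is not part of the suite. The function names are hypothetical, it only uses the three line types visible in the example above (output, compress and document), and it assumes that quoting a whole path is acceptable when the path contains spaces.

```python
def quote_if_needed(text):
    """Wrap text in double quotes when it contains whitespace."""
    if '"' in text:
        # Assumption: the parser's handling of embedded quotes is unknown,
        # so this sketch refuses rather than guessing an escape syntax.
        raise ValueError("embedded double quotes are not handled here")
    return f'"{text}"' if any(ch.isspace() for ch in text) else text


def build_indirect_file(output, documents):
    """Return indirect-file text.

    documents is a list of (path, title, compress) tuples, where
    compress is True or False for that document.
    """
    lines = [f"output {quote_if_needed(output)}"]
    current = None  # no compress line has been written yet
    for path, title, compress in documents:
        if compress != current:
            # Only emit a compress line when the setting changes.
            lines.append("compress on" if compress else "compress off")
            current = compress
        if '"' in title:
            raise ValueError("embedded double quotes are not handled here")
        # Titles are always quoted, as in the example above.
        lines.append(f'document {quote_if_needed(path)} "{title}"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = build_indirect_file(
        "docs.pdb",
        [
            ("c:/html/fred/index.html", "Fred's Document", True),
            ("c:/html/mary/index.html", "Mary's Document", False),
        ],
    )
    print(text, end="")
```

Run as a script, this prints an indirect file with the same five lines as the example above.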

How Do You Get HTML Into Your Pilot?

The parser generates a file called "html.pdb", which you can install in the same way you install new programs, such as the browser. Eventually, I want to change this to be a lot smarter, but at present that's the best I can offer.

What Does the PointCast Parser Do?

PointCast is a news delivery service. It downloads current news in various user-selected categories as HTML files.

The purpose of the PointCast parser is to read the HTML files downloaded by PointCast, generate an index file and run it through the HTML parser. This creates a PDB file containing all of the news, which can be downloaded to your Pilot.

In order for PCP to run the HTML parser, it must be somewhere on your path.

Obviously, the news downloaded by PointCast can become quite large. You will probably want to use compression on it.

To use PCP, you provide a file called "pcp.ini" that contains the following five possible types of line ...

where DIRECTORY represents a directory that contains PointCast HTML files and CATEGORY represents the name you want the document to have when it is downloaded to your Pilot.

DATABASEFILE is where the news will end up, and HISTORY is the number of days back PCP will look for news files. The default is to only grab news that was downloaded to your PC today.

For example ...

compress on
output news.pdb
days 2

c:/"program files"/pointcast/news "General News"
c:/"program files"/pointcast/sport "Sports News"

would generate a database called news.pdb that contains two documents, one called "General News", containing all the news from c:/program files/pointcast/news that was downloaded to your PC during the past two days, and the other called "Sports News", containing the last two days' worth of news from c:/program files/pointcast/sport.
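
To make the meaning of each line concrete, here is a minimal Python sketch of how a pcp.ini file of this shape could be read. It is an illustration only, not PCP's actual code. The function name and the dictionary keys are hypothetical, only the four line types that appear in the example are handled, and skipping '#' comment lines is an assumption carried over from the HTML parser's indirect files.

```python
import shlex


def parse_pcp_ini(text):
    """Parse pcp.ini-style text into a dictionary (illustrative only)."""
    config = {
        "compress": None,  # None means the file did not say
        "output": None,    # DATABASEFILE
        "days": None,      # HISTORY; None means the default (today only)
        "sources": [],     # list of (DIRECTORY, CATEGORY) pairs
    }
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or comment (assumed syntax)
        # shlex.split understands partial quoting such as
        # c:/"program files"/news. Note that it treats a backslash as
        # an escape character, so this sketch expects forward slashes.
        fields = shlex.split(line)
        keyword = fields[0].lower()
        if len(fields) != 2:
            raise ValueError(f"unrecognised line: {raw!r}")
        value = fields[1]
        if keyword == "compress" and value.lower() in ("on", "off"):
            config["compress"] = value.lower() == "on"
        elif keyword == "output":
            config["output"] = value
        elif keyword == "days" and value.isdigit():
            config["days"] = int(value)
        else:
            # Anything else is taken to be a DIRECTORY CATEGORY pair.
            config["sources"].append((fields[0], value))
    return config


if __name__ == "__main__":
    sample = 'days 2\nc:/"program files"/pointcast/sport "Sports News"\n'
    print(parse_pcp_ini(sample))
```

The quoting rule used here is the one described for directories with embedded spaces: the quotes group the words into one field and are then discarded.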

Note, also, that pcp32 expects the parser to be called parse32.exe, and PCP16 expects it to be called parse16.exe.

You can provide a file called pcp.flt in any or all of your news directories that specifies key words for PCP to look for in the titles of news items. If you have such a file then only items whose titles contain one of the phrases will be included.
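
The filtering rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not PCP's code. It assumes that pcp.flt holds one phrase per line and that matching ignores case. Neither detail is specified here, and the function names are hypothetical.

```python
def load_phrases(filter_text):
    """Split pcp.flt-style text into phrases, one per non-blank line."""
    return [line.strip() for line in filter_text.splitlines() if line.strip()]


def select_titles(titles, phrases=None):
    """Return the titles that pass the filter.

    phrases=None models a directory with no pcp.flt file, in which
    case every item is kept.
    """
    if phrases is None:
        return list(titles)
    lowered = [phrase.lower() for phrase in phrases]  # assumed case-insensitive
    return [
        title
        for title in titles
        if any(phrase in title.lower() for phrase in lowered)
    ]


if __name__ == "__main__":
    phrases = load_phrases("pilot\nx files\n")
    titles = ["New Pilot Software Released", "Cricket Scores", "X Files Renewed"]
    for title in select_titles(titles, phrases):
        print(title)
```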

Placing a "-q" on the command line will cause PCP to display the title of each item that is being considered for inclusion, followed by a question mark ('?'). If you type "y" at this prompt, the item will be included, otherwise it won't.

I intend to write a much nicer front end for PCP at some later stage.  At the moment, though, I think the command-line and indirect file based system is quite adequate for most purposes.  The only thing I really want to do is provide a facility for specifying more precisely what news items the user wants downloaded, using such things as the timestamp on each item.

How Can Pilot Software Authors Use the Browser?

If you want to write documentation for your Pilot application in HTML then I can provide you with a subroutine that will allow you to link to my browser and then have the browser return control to you once the user has finished reading it.

This solves a couple of problems that I've seen people on the mailing list complain about.  Firstly, the documentation can be deleted once the user is familiar with the application.  Secondly, you can provide nice hypertext-rich help, rather than very simple text boxes.

Note that your application must be such that you remember the state it was in at shutdown and return to that state the next time it is started up.

The reason for this is that I have had no success allowing the browser to be called as a subroutine; it's just too large!  Most programs I've seen tend to keep this kind of state information, so I don't think this should be a great problem to anyone.

If you're interested in doing this, just drop me a line.

Known Bugs

The HTML parser has a number of known bugs that I'm working on sorting out. In particular ...

The browser also has a number of known bugs that I'm working on ...

Acknowledgements

Pat Beirne (MakeDoc and GCM) and Rick Bram (Doc and ZIP) kindly provided me with source code that allowed me to use the same compression as Doc. Thanks guys.

A number of people have been beta testing the browser for the last month or so.  In particular, Adam Deaves has done a sterling job of pointing out bugs in the parser.  Thanks a lot Adam!  Andy Tane was another particularly enthusiastic tester. 

Obviously, Pat and Rick have been  beta testers, along with Matt Peterson (Agenda, PAL, CalcHack and a host of games), Rick Flower, Greg Hewgill (Co-Pilot and Jump), David Gerdes (XWord), Nathan Howell, Dan Hartman and Joe McDonald.  I'm indebted to all of you! 

Matthew Robertson was very helpful in getting the Macintosh version of the parser running, especially since I don't have a Mac compiler!

If I've missed anyone out here, I apologise sincerely.  Send me some mail and you'll end up in the acknowledgements immediately! 

As promised, all of my beta testers will receive a free copy of the non-time-limited browser. 

Now ... The Legal Stuff

The entire Pilot HTML Browser Suite is Copyright 1996/97 by Industrial Software Engineering. 

The demonstration version is provided as shareware, with the provision that no attempt is made to by-pass the time limitations. 

The non-time-limited version of the HTML browser is to be considered commercial software and cannot be distributed without the express permission, in writing, of Industrial Software Engineering (contact harryo@ise.com.au). 

You may not upload the non-time-limited version of the browser to any on-line service, commercial or private, or post it to any archive, web or FTP site without the express written permission of Industrial Software Engineering. 

You may not distribute the non-time-limited version on any CD-ROM collection, floppy or other electronic media without the express written permission of Industrial Software Engineering.

The non-time-limited version may only be used on one Pilot at a time. Just to make that clear: if you would like each member of your sales team to have a copy then you have to buy one copy for each of them.

Unauthorised use or duplication of the non-time-limited version is strictly prohibited. 

Industrial Software Engineering and its employees cannot be held responsible for any damages incurred, directly or indirectly, as a result of the use of this product. 

This includes, but is not limited to, hardware damage, software loss, data loss, loss of time or income. We make no warranties, express or implied, regarding this product or any other. 

The use of the Pilot HTML Suite is at your own risk. 

Doesn't that make you feel so much more comfortable :-)?  It does me!