
surf14.zip      Surf v1.4: Free Win95 web browse/analyze robot

        README   TXT         7,324  05-16-98  2:46p README.TXT
        SURF     EXE       194,560  04-12-98 10:40p SURF.EXE
        SURF     CPP       176,376  04-12-98 10:40p SURF.CPP

  ==============================
  1. The purpose of the program.
  ==============================

Surf v1.4 is a text-oriented web browser robot driven by URL lists.

A URL list can be started by parsing a link page or a bookmark file,
or perhaps by parsing all HTML files in your Netscape cache directory.
Surf can also prepare a query to send to a battery of search engines.

Surf uses HTTP to fetch the text/html resources named in the URL list,
creating a directory full of web pages. Each page is parsed and its
text is word-wrapped within 60 columns. Most HTML tags are removed,
except for good anchors (links). An HTML base tag is added so that
relative links can later be resolved against the original web site.
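
The clean-up described above can be pictured with a short Python
sketch. It is not taken from SURF.CPP: the names clean_page,
AnchorKeepingStripper and BREAKING_TAGS are made up for illustration,
and the sketch assumes that a "good anchor" is simply any <a> tag that
carries an href attribute.

```python
from html.parser import HTMLParser
import textwrap

# Tags treated as word breaks when removed (an assumed, partial list).
BREAKING_TAGS = {"p", "br", "div", "li", "tr", "td", "title",
                 "h1", "h2", "h3", "h4", "h5", "h6"}


class AnchorKeepingStripper(HTMLParser):
    """Collect page text, dropping every tag except <a href=...>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.in_link = False

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "a" and href:
            # Keep the link; all other attributes are discarded.
            self.parts.append('<a href="%s">' % href)
            self.in_link = True
        elif tag in BREAKING_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag == "a" and self.in_link:
            self.parts.append("</a>")
            self.in_link = False
        elif tag in BREAKING_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def clean_page(html, base_url, width=60):
    """Return text wrapped to `width` columns, led by a base tag."""
    parser = AnchorKeepingStripper()
    parser.feed(html)
    parser.close()
    # Collapse all runs of whitespace before wrapping.
    text = " ".join("".join(parser.parts).split())
    # Words longer than `width` (long URLs) are left unbroken.
    wrapped = textwrap.fill(text, width=width,
                            break_long_words=False,
                            break_on_hyphens=False)
    return '<base href="%s">\n%s\n' % (base_url, wrapped)
```

Calling clean_page(page_text, "http://hughes.net/~scheper/") would give
a result that starts with a base tag line, followed by the wrapped text
with only the links left in it.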

Surf will only fetch those URLs that you have selected. You select URLs
by editing the list file, and removing an asterisk character from those
URLs that sound interesting. Surf annotates the unfetched URLs, keeping
the longest cleaned-up anchor text that was associated with each URL.
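
As an illustration of keeping the longest anchor text per URL, here is
a minimal Python sketch. The function names are hypothetical, and the
clean-up step (collapsing whitespace) is only an assumed stand-in for
whatever cleaning Surf really applies.

```python
def clean_anchor_text(text):
    """Collapse runs of whitespace (assumed stand-in for Surf's clean-up)."""
    return " ".join(text.split())


def longest_anchor_text(links):
    """Map each URL to the longest cleaned anchor text seen for it.

    `links` is an iterable of (url, anchor_text) pairs, in page order.
    """
    best = {}
    for url, text in links:
        cleaned = clean_anchor_text(text)
        if url not in best or len(cleaned) > len(best[url]):
            best[url] = cleaned
    return best
```

A URL that is linked once as "here" and once as "The full title" would
thus be annotated with "The full title".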

Already fetched URLs are kept in the URL list annotated by three quality
digits (log of sentence, word and link counts) and sorted by net quality.
Each entry also shows a local path\filename and 100 bytes of its
most-used uncommon words.
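
The following Python sketch shows one way such annotations could be
computed. Several details are assumptions, not facts about Surf: the
logarithm is taken to base 2 and capped at 9, the list of common words
is a tiny made-up sample, and words of one or two letters are ignored.

```python
import re
from collections import Counter

# A tiny, assumed sample of "common" words; Surf's real list is unknown.
COMMON_WORDS = {"the", "and", "that", "for", "with", "this", "from",
                "are", "was", "you", "not", "but", "have", "has"}


def quality_digit(count):
    """One digit: floor(log2(count)), capped at 9 (base 2 is assumed)."""
    if count < 1:
        return 0
    # For a positive integer, bit_length() - 1 equals floor(log2(count)).
    return min(9, count.bit_length() - 1)


def quality_digits(sentences, words, links):
    """Three digits for the sentence, word and link counts, in order."""
    return "".join(str(quality_digit(n)) for n in (sentences, words, links))


def top_uncommon_words(text, limit=100):
    """Most-used uncommon words, most frequent first, within `limit` bytes."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words
                     if len(w) > 2 and w not in COMMON_WORDS)
    picked = []
    used = 0
    for word, _ in counts.most_common():
        extra = len(word) + (1 if picked else 0)  # +1 for the separator
        if used + extra > limit:
            break
        picked.append(word)
        used += extra
    return " ".join(picked)
```

With these assumptions, a page with 40 sentences, 900 words and 12
links would get the digits 593.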

Surf can output a clickable HTML file of all the fetched URLs showing
the most-used word lists. See the author's home page URL for examples.

Additional usage comments are in the source code.

Special requirements: Windows 95.

You must be logged in to your Internet Service Provider, just as you
must be for Netscape or Internet Explorer, whenever SURF fetches pages.

  ===========================================================
  2. If installation is required, how to install the program.
  ===========================================================

Execute the Windows 95 MS-DOS Prompt to get a command line.

Place the SURF.EXE file somewhere in your execution "path"
(as shown by "SET"), or place SURF.EXE in the current directory.

Type SURF to see its usage message.

Type SURF with some arguments to use it.


  =====================================================================
  3. The status of the program (Public Domain, Freeware, or Shareware).
  =====================================================================

 * Copyright (C) 1998 Glenn Scheper. This program is free software;
 * you can redistribute it and/or modify it under the terms of the GNU
 * General Public License as published by the Free Software Foundation;
 * either version 2 of the License, or any later version.
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details. You should have
 * received a copy of the GNU General Public License along with
 * this program; if not, write to the Free Software Foundation,
 * Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 *
 * GNU General Public License: <URL:http://www.fsf.org/copyleft/gpl.html>.

  ============================================================
  4. The distribution status of the program.  Example: freely
     distributable or limitations on distribution.
  ============================================================

This program is freely distributable.


  ===================================================================
  5. How to contact the author in the event of questions or problems.
  ===================================================================

 * Contact SURF's author Glenn Scheper at <URL:mailto:scheper@hughes.net>.
 * Download SURF.CPP and SURF.EXE at <URL:http://hughes.net/~scheper/>.

 A more timely response can be had at e-mail address: gscheper@gentech.com


  =============================
  6. Here are some usage ideas:
  =============================

  Work at the Windows 95 MS-DOS prompt.
  Make a clean directory for each subject, and work in that directory.

  I like to organize all the directories under this top directory:

        md \surf


  1) To fetch the SURF author's home page, say:
        
        cd \surf
        md glenn
        cd glenn
        surf list -f http://hughes.net/~scheper/

  That will create a single file in the current directory ending in .HTM
  which contains text for you to read with an editor, and the good links.

  It will also create a file LIST containing all the URLs of the fetched
  pages, and of the unfetched good links listed in those pages.

  Use any text editor to remove each asterisk (*) character that appears
  on the anchor text line after any links that sound interesting to you.

  Then you can fetch all of those pages, just by saying:

        (Did you edit list, removing some or all asterisks?)
        surf list

  You can stop the process at any time with any keystroke.  Meanwhile,
  you may open another MS-DOS window to read the files that SURF has
  fetched.

  When there are no more URLs to fetch, SURF will tell you and stop.
  Then, you can repeatedly go deeper into the web, as follows:

        <edit list, removing some or all asterisks>
        surf list


  2) To start research from someone's big bookmark page that you find:

  Suppose you ran across a bookmark page in Netscape, like mine,
  "The 2000 highest scoring web pages that I've surfed (500 Kb)",
  which is located at:

        http://hughes.net/~scheper/bestlink.htm

  Copy and paste that URL from Netscape's location field into Notepad.
  (Note that the URL must start at the beginning of a line; otherwise
  SURF would interpret that line as annotation instead of a URL.)
  Save that as a new file, perhaps as the file named "\surf\later".

  Go to the MS-DOS prompt, go into a new directory, and surf it:

        cd \surf
        md best
        copy later best\list
        del later
        cd best
        surf list
        <edit list, removing some or all asterisks>
        surf list
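
  The rule that a URL must start at the beginning of a line, together
  with the asterisk convention, can be sketched in Python as below.
  The layout of LIST is assumed here, not copied from Surf: each URL
  line starts in column one, its annotation lines are indented, and an
  asterisk on an annotation line means "do not fetch". The function
  name selected_urls is made up, and the sketch makes no attempt to
  recognize entries that were already fetched.

```python
def selected_urls(list_text):
    """Return the URLs whose annotation lines carry no asterisk."""
    entries = []  # each entry is [url, has_asterisk]
    for line in list_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if not line[0].isspace():
            # Text starting in column one is taken to be a URL.
            entries.append([line.strip(), False])
        elif entries and "*" in line:
            # Indented text annotates the URL above it.
            entries[-1][1] = True
    return [url for url, starred in entries if not starred]
```

  Under these assumptions a bare URL pasted from Notepad, with no
  annotation at all, counts as selected, which matches the way the
  first "surf list" above fetches the bookmark page.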


  3) To start research from your own Netscape bookmark page:

        cd \surf
        md mine
        cd mine
        surf list -a "\Program Files\Netscape\Navigator\bookmark.htm"
        <edit list, removing some or all asterisks>
        surf list
        

  4) To take another look at all the anchor text and the links that
  you hadn't noticed (you'll be surprised!) in all the web pages
  that you have recently browsed and are cached in Netscape:

        cd \surf
        md cache
        cd cache
        surf list -a "\Program Files\Netscape\Navigator\Cache\*.htm"
        <edit list, removing some or all asterisks>
        surf list

  5) To start research by means of a query to several search engines:

        cd \surf
        md alien
        cd alien
        surf list -q ufo alien
        surf list
        <edit list, removing some or all asterisks>
        surf list

  ========
  The end.
  ========
