This README file has three sections.  The first describes in more detail what
the programs "collect" and "generator" do.  The second describes how to
install the collection package.  The third walks through a sample collection
run.

In the release note, we discussed the importance of estimating the loss-rate
in trace collection.  If you do not have the equipment necessary for
estimating the loss-rate, you can ignore all mentions of the "generator"
program in the rest of this file; we still encourage you to collect data,
which we will use to support conclusions drawn from the primary traces.

Jeff Mogul and Vern Paxson notified us that tcpdump can log dropped packets
on Ultrix and BSD+BPF systems.  On those systems, you can turn tcpdump
logging on and ignore all mentions of "generator" in the rest of this file.

1. DESCRIPTION
==============

"Collect" is a shell script which invokes the tcpdump program to collect 
network packet headers.  We are interested only in the IP and TCP headers of
packets that have the TCP SYN, FIN, or RST flag set.  We use these packets to
mark the beginning and end of a TCP "conversation."  To collect packets,
tcpdump puts the ethernet interface in promiscuous mode.  It passes the
packets through a user-supplied filter and keeps only those that match.
(We'd like to thank Vern Paxson at LBL for his tcpdump help.)
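
As an illustration of the kind of filter involved, the sketch below builds a
tcpdump expression that matches TCP packets with the SYN, FIN, or RST flag
set.  It is not necessarily the expression "collect" uses (read the script
for that).  The function name flag_filter is hypothetical, and the sketch
assumes a tcpdump that accepts byte-offset expressions such as tcp[13].

```shell
#!/bin/sh
# Hypothetical helper, not part of the collection package.
# Byte 13 of the TCP header holds the flag bits:
#   FIN = 1, SYN = 2, RST = 4
flag_filter() {
    mask=$(( 1 | 2 | 4 ))
    echo "tcp[13] & $mask != 0"
}

# Print the expression.  It would be handed to tcpdump as a single quoted
# argument, for example:  tcpdump "$(flag_filter)"
flag_filter
```

A packet passes this filter if any one of the three flag bits is set, so the
SYN that opens a conversation and the FIN or RST that closes it are all kept.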

The collection routine must run on a machine connected to the same ethernet 
segment as your site's Internet gateway, so that all Internet packets can be 
observed.

Since tcpdump is not loss-less, we have written the "generator" program to
estimate the loss-rate experienced during collection.  Generator sends a data
stream between two Internet hosts by "pinging" a pre-specified port of a
remote host; the remote host discards these packets.  The interarrival times
of the ping packets are exponentially distributed.  The collection routine
samples this in-band data stream to estimate the loss-rate.  The path from
the machine running "generator" to the target machine must traverse the
ethernet segment where "collect" is running, and the source and target must
both be on the same subnet (see Figure 1).
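
The two ideas in this paragraph can be sketched in a few lines of shell and
awk.  This is an illustration only, not the code in generator.c: the function
names exp_gaps and loss_rate, the seed argument, and the sample counts are
all made up for the example.  The sketch assumes the loss-rate is estimated
as the fraction of generator packets that do not show up in the trace.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the collection package.

# Draw N exponentially distributed gaps (in seconds) with the given mean,
# using inverse-transform sampling: gap = -mean * ln(1 - U), U in [0,1).
exp_gaps() {
    # $1 = mean gap in seconds, $2 = number of gaps, $3 = random seed
    awk -v mean="$1" -v n="$2" -v seed="$3" 'BEGIN {
        srand(seed)
        for (i = 0; i < n; i++)
            printf "%.3f\n", -mean * log(1 - rand())
    }'
}

# Estimate the loss-rate from the number of probe packets sent and the
# number of those packets that appear in the collected trace.
loss_rate() {
    # $1 = probe packets sent, $2 = probe packets captured
    awk -v sent="$1" -v seen="$2" 'BEGIN {
        if (sent <= 0) { print "n/a"; exit }
        printf "%.4f\n", (sent - seen) / sent
    }'
}

exp_gaps 2 5 1          # five gaps with a mean of 2 seconds
loss_rate 10000 9850    # prints 0.0150
```

In the second call, 150 of 10000 probe packets are missing from the trace,
which gives an estimated loss-rate of 1.5%.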

Loss-rate is highly dependent on the collection machine's CPU utilization, so
we suggest that the collection be run on a dedicated workstation.  Furthermore,
since SunOS's nit interface cannot collect packets generated by the local 
machine, the generator program has to be run on a machine separate from the one 
running collect (see Figure 1).

             ======================================
                               ||
                               ||
                             +----+
                             | GW |
                             +----+
                                |
                                |
             -------------------------------------------------
                 |                       |                 |
                 |                       |                 |
            +---------+              +-------+         +---------+
            |generator|              |collect|         |generator|
            | source: |              | host: |         | target: |
            |   kos   |              |caldera|         | xanadu  |
            +---------+              +-------+         +---------+

                Fig. 1: Example collection run setup.


2. INSTALLATION
===============
In the same directory as this README file, you should see the following files:

     178 Makefile
    2570 collect*
    2171 collect.1
    1399 generator.1
    9883 generator.c

The files "collect.1" and "generator.1" are the manual pages for "collect" and
"generator," respectively.  

"Makefile" is the makefile for "generator."  You need to run make(1) to create 
the "generator" program.

"Collect" is the shell-script that calls tcpdump.  If you have tcpdump-2.0
installed, then your installation is complete.  If you don't have tcpdump
installed, you need to ftp it from the same place you got this collection
package.  Once you have ftp-ed it, run make(1) in the tcpdump directory to
build it.
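
Before starting a run, it can help to confirm that the tools this section
relies on are actually in your search path.  The sketch below is one way to
check; the function name have_cmd is hypothetical, and the check says nothing
about which tcpdump version is installed.

```shell
#!/bin/sh
# Hypothetical helper, not part of the collection package.
# Succeeds if the named command can be found in the search path.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

for tool in tcpdump make; do
    if have_cmd "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found in PATH"
    fi
done
```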

3. COLLECTION
=============

There are two steps to a collection run:

1. If you are running on SunOS and want to do loss-rate estimation, run the 
   "generator" program on the "gen_src" machine (Ultrix and BSD+BPF systems
   allow tcpdump to log dropped packets; on those systems, make sure you turn
   logging on):

       gen_src% nohup generator -h gen_target &

   Example:

       kos.usc.edu% nohup generator -h xanadu.usc.edu &

   Note:
   * The "generator" program should be run on a machine separate from the data 
     collection machine (nit interface cannot collect packets generated by the 
     local machine).
   * The source/target pair should be selected so that both are on the same
     subnet and the packets generated traverse the ethernet segment on which
     the collection program is running.
   * This program takes very little CPU time; it should not impact the system
     load of the machine it is run on.
   * If you want to collect traces for more than the default 24 hours, use the 
     -d option to specify the corresponding running time for "generator."
   * If you have to kill "generator" before its duration runs out, use 
     "kill -1" to give it time to clean up and write out its statistics.
    
    
2. Run the "collect" program on the "collect" machine, which should be
   different from the "gen_src" machine:
  
       collect# nohup collect -p gen_src gen_target localnet_number &

   Example:

       caldera.usc.edu# nohup collect -p kos.usc.edu xanadu.usc.edu 128.125.51 &

   Note:
   * The tcpdump executable (version 2.0) must be in your search path.
   * Tcpdump dumps the headers of packets on a network interface that
     match the boolean expression.  Under SunOS, you must be root to invoke
     tcpdump, or it must be installed setuid to root.  Under Ultrix, any user
     can invoke tcpdump once the super-user has enabled promiscuous-mode
     operation using pfconfig(8).  Under BSD, access is controlled by the
     permissions on /dev/bpf0, etc.
   * The localnet_number is used to specify the filter for determining
     internetwork traffic: packets between hosts that both have this number
     as the network part of their IP address are considered local area
     network traffic and are not captured.  For a class A net only the first
     octet need be specified (e.g. 26), for a class B net two octets
     (e.g. 128.125), and for a class C net three octets (e.g. 192.12.100).
   * You need to specify the "-p gen_src gen_target" option only if you
     are doing loss-rate estimation under SunOS with the nit interface.
   * "Collect" is set up to run for 24 hours.  We would appreciate it if
     you could run it for a longer duration; if you do, don't forget to run
     "generator" for the corresponding period.
   * Depending on the load of your network, disk-space requirement for a 
     24-hour run should not be more than 20MB.
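
To make the localnet_number note above concrete, the sketch below derives the
classful network number from an IP address and builds a tcpdump expression
that excludes traffic whose source and destination are both on that network.
The function names classful_net and local_filter are hypothetical, and the
expression is an assumption about how such a filter could be written, not a
copy of what "collect" passes to tcpdump.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the collection package.

# Print the classful network number of a dotted-quad IP address:
# class A (first octet below 128) keeps one octet, class B (128-191) keeps
# two, and class C (192-223) keeps three.
classful_net() {
    echo "$1" | awk -F. '{
        if ($1 < 128)      print $1
        else if ($1 < 192) print $1 "." $2
        else if ($1 < 224) print $1 "." $2 "." $3
        else               print "not class A, B, or C"
    }'
}

# Build an expression that drops packets exchanged between two hosts on
# the local network.
local_filter() {
    # $1 = localnet_number, e.g. 128.125.51
    echo "not (src net $1 and dst net $1)"
}

classful_net 26.0.0.73        # prints 26
classful_net 128.125.51.7     # prints 128.125
classful_net 192.12.100.5     # prints 192.12.100
local_filter 128.125.51
```

The example run in step 2 passes 128.125.51, which is longer than the
classful number 128.125, so a subnet number can evidently be given as well.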

The collected packet headers are saved in the file <hostname-date> on the 
collection machine.  The statistics from "generator" are saved in the file 
<hostname> on the generator machine.  Please contact us to arrange for 
dropping off the output files.  Please also let us know if you have problems 
with the collection package.  We can be contacted at traffic@excalibur.usc.edu.
