BREGX is a demonstration program using AWK to process 
*binary* files.  The source code is specific to the Thompson 
AWK compiler and Instant AWK versions, but may contain ideas 
useful in other implementations of the language.  Since all 
files are fundamentally binary, the program can be used on 
ANY type of file.

Limits:
        The largest file that can be processed is the 
smallest of: the amount of disk space available, 2^32 bytes, 
or the largest file size that the operating system supports.

        The largest replacement text is the smaller of the 
maximum length you can get on the command line or 128 
characters.

        The pattern has the same limits as the replacement 
string.

        The smallest file that can be processed is 1 byte; 
the program will not attempt to read empty files.


SYNTAX EXAMPLES:

 BREGX {pattern} {string} {source} {target} [offset] [count]

 BREGX {pattern} {source} /f [offset] [count]
 
where {pattern} is the regular expression to search for; 
{string} is the string literal to substitute for the 
pattern; {source} is the source file; {target} is the name 
of the file to write the processed data to; offset is the 
number of bytes to skip before processing begins (a HEX 
number given as the switch /0xnnn); count is the maximum 
number of replacements to make (given as the switch /cnn).

SWITCHES:
        /0xnn   sets the number of bytes to skip over before 
                any processing begins.

        /a      causes the program to write the offsets of
                matches/substitutions to STDOUT.

        /cnn    sets the upper limit on the number of 
                matches/substitutions allowed.

        /f      puts the program in find-only mode.  There 
                is no output file, but the offsets of the 
                matches are written to STDOUT.  This switch 
                forces the /a switch also.
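
The switch rules above can be sketched in a few lines of 
Python.  This is only an illustration, not the author's AWK 
code: the function name parse_switches and the option names 
are made up for the example, and it ASSUMES that switches 
are case-insensitive and that the nn in /cnn is decimal 
(the text only says the offset is hex).

```python
def parse_switches(args):
    """Separate positional arguments from BREGX-style switches."""
    opts = {"offset": 0, "count": None,
            "list_offsets": False, "find_only": False}
    positional = []
    for arg in args:
        low = arg.lower()                      # assumption: case is ignored
        if low.startswith("/0x"):
            opts["offset"] = int(low[3:], 16)  # hex byte count to skip
        elif low.startswith("/c"):
            opts["count"] = int(low[2:])       # assumption: decimal limit
        elif low == "/a":
            opts["list_offsets"] = True
        elif low == "/f":
            opts["find_only"] = True
            opts["list_offsets"] = True        # /f forces /a
        elif low.startswith("/"):
            raise ValueError("unrecognized switch: " + arg)
        else:
            positional.append(arg)
    return positional, opts


positional, opts = parse_switches(
    ["\\xcc", "\\xdd", "icon1", "icon2", "/0x80"])
# opts["offset"] is 128, because /0x80 is read as hexadecimal
```

A malformed number makes int() raise ValueError here; the 
real program reports its own ERRORLEVELs instead.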

ERRORLEVELS:
        The program returns an ERRORLEVEL to the calling 
batch file or program:

Termination -
    0   -   normal termination

Abend -
    1   -   no arguments
    2   -   unrecognized switch
    3   -   user aborted at "Replace target?" prompt
    4   -   excessive number of arguments 
    5   -   replace mode chosen, but either replacement 
            string or target file argument (or both) missing
    6   -   source file or pattern argument missing (or both)
    7   -   source file does not exist or is unreadable
    8   -   source file is empty
    9   -   target file cannot be opened for write
   10   -   defective /cnn argument
   11   -   the pattern matches the replacement string

This program is presented primarily as example code, with an 
executable version for those without access to the required 
compiler.  However, there is no reason why it can't be used 
as a general purpose find/replace tool.

The program has not been extensively tested, so some bugs 
are to be expected, as is non-optimum coding of some 
routines.


OK, so I've got the program.  Now, what use is it?

        BREGX \xcc \xdd {icon1} {icon2} /0x80

copies an icon file while changing blue to magenta.

        BREGX {pattern} {patch} {exe1} {exe2} /c1 /0x400

applies a patch to an executable at the first match after
the 1024th byte.

And so forth.
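
For readers without the program, the effect of the two 
commands above can be approximated in Python.  This is a 
sketch under stated assumptions, not a description of how 
BREGX is implemented: the function name replace_after_offset 
is hypothetical, the whole file is held in memory, and it 
ASSUMES that the skipped bytes are copied to the target 
unchanged.

```python
import re


def replace_after_offset(data, pattern, replacement, offset=0, count=0):
    """Replace regex matches in data, leaving the first `offset` bytes alone.

    count=0 means "no limit", following re.subn's convention.
    Returns (new_data, number_of_replacements).
    """
    head, body = data[:offset], data[offset:]
    # A function is used as the replacement so the bytes are inserted
    # literally, with no backslash processing by the re module.
    new_body, made = re.subn(pattern, lambda m: replacement, body, count=count)
    return head + new_body, made


# Like: BREGX \xcc \xdd {icon1} {icon2} /0x80
icon = b"\xcc" * 0x80 + b"\xcc\x11\xcc"
patched, made = replace_after_offset(icon, rb"\xcc", b"\xdd", offset=0x80)
# The first 0x80 (128) bytes are untouched; 2 replacements are made.

# Like: ... /c1 (stop after the first replacement)
once, made_once = replace_after_offset(icon, rb"\xcc", b"\xdd",
                                       offset=0x80, count=1)
```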

BINARY REGULAR EXPRESSIONS:
        Regular expressions are much more powerful than wild 
cards - they allow quite sophisticated matches.  In binary 
regexes, it is necessary to express non-typable characters 
with escape strings of the form \xnn, where nn is the value 
of the byte in hex.  In this particular program, the 
beginning of string and end of string meta characters (^ 
and $) are of little use.  For files of less than 4096 
bytes, they match the beginning and end of the file, but for 
longer files, they match the beginning and end of each file 
read cycle, in a fairly complex way due to an overlap of 128 
bytes in the reads.
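
The read cycle described above can be illustrated with a 
short Python sketch.  Only the two numbers (4096 and 128) 
come from the text; the buffering scheme, the function name 
find_offsets and the find-only behavior are assumptions made 
for the example, and the author's AWK code may well differ.  
The sketch also ASSUMES that no match is longer than the 
overlap.

```python
import io
import re

CHUNK = 4096     # read size mentioned in the text
OVERLAP = 128    # overlap mentioned in the text


def find_offsets(stream, pattern, chunk=CHUNK, overlap=OVERLAP):
    """Return file offsets of matches, reading the stream in cycles."""
    regex = re.compile(pattern)
    buf = b""
    base = 0      # file offset of buf[0]
    resume = 0    # file offset where the next search may start
    offsets = []
    while True:
        block = stream.read(chunk)
        at_eof = len(block) == 0
        buf += block
        # Matches starting in the last `overlap` bytes are left for the
        # next cycle, when the bytes that follow them are available.
        limit = len(buf) if at_eof else max(len(buf) - overlap, 0)
        pos = max(resume - base, 0)
        while pos < limit:
            m = regex.search(buf, pos)
            if m is None or m.start() >= limit:
                break
            if m.end() > m.start():
                offsets.append(base + m.start())
                pos = m.end()
            else:
                pos = m.start() + 1     # step past an empty match
        resume = base + pos
        if at_eof:
            return offsets
        buf = buf[limit:]               # carry the overlap forward
        base += limit


# The first NEEDLE straddles the 4096-byte read boundary.
data = b"A" * 4093 + b"NEEDLE" + b"B" * 5000 + b"NEEDLE"
found = find_offsets(io.BytesIO(data), rb"NEEDLE")
# found == [4093, 9099]
```

Because each search sees only the current buffer, ^ and $ in 
this sketch anchor to the buffer rather than to the file, 
which is the same kind of limitation the text describes.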

In general:

        *       matches zero or more occurrences of the 
                preceding character

        ?       matches zero or one occurrence of the
                preceding character

        +       matches one or more occurrences of the
                preceding character

        .       matches any single character

        []      enclose lists or ranges of permissible
                matches ([aA] matches either 'a' or 'A';
                [0-9] matches any digit).

        ^       as the first character inside [], matches 
                everything *except* the listed characters

        |       is the OR operator.

        ()      are used to group sub patterns.

        \xnn    is a byte value in hex

        \\      is the backslash character

        \+      is the plus sign character
                etc.
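
Several of these operators can be tried out on raw bytes 
with Python's re module.  Python's regex dialect is NOT the 
Thompson AWK dialect, so this is only a rough illustration 
of the operators listed above; the sample data is invented 
for the example.

```python
import re

# 9 bytes: 00 CC 'A' 'b' CC CC '9' '\' '+'
data = b"\x00\xccAb\xcc\xcc9\\+"

# \xnn names a byte by its hex value; + means one or more of it.
runs = [m.span() for m in re.finditer(rb"\xcc+", data)]
# runs == [(1, 2), (4, 6)]

# [] lists or ranges of permissible bytes.
pairs = re.findall(rb"[aA][0-9b]", data)
# pairs == [b"Ab"]

# ^ as the first character inside [] negates the list.
others = re.findall(rb"[^\x00\xcc]+", data)
# others == [b"Ab", b"9\\+"]

# | is OR, () groups (here a non-capturing group).
either = re.findall(rb"(?:A|9)", data)
# either == [b"A", b"9"]

# \\ is a literal backslash and \+ a literal plus sign.
escaped_at = re.search(rb"\\\+", data).start()
# escaped_at == 7
```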

There is no intrinsic relationship between the pattern and 
replacement string.  They do not have to be the same length, 
or have anything in common.  "" is the null replacement; quote 
marks and several other characters must either be escaped 
with a preceding backslash or given as byte values or they 
will be interpreted by the command processor before they are 
passed to the program.


How it works:

Review the source code for this information.
        BREGX.AWK
        OVERHEAD.FCT
        PARSER.FCT
        ENGINES.FCT
        ERRHNDL.FCT

MAKE.BAT will compile the program, provided that you have 
the Thompson AWK compiler and that %AWK% points to it.

BREGX.EXE is the stand-alone executable compiled from the 
supplied source code using version 2.03a of the AWK compiler 
supplied by
        Thompson Automation Software
        (800) 944-0139

(The compiler is excellent, but a bit pricey.)

The program writes most of its messages to STDERR, except 
for the list of offsets and the closing summary.  You might 
want to change some or all of that.


LICENSE and LEGAL stuff:

This program has been placed in the PUBLIC DOMAIN by the 
author, on 6 March 1994.  The author advises you that the 
program is provided AS-IS, complete with probable bugs.  
Using this program signifies YOUR acceptance of all 
responsibility for its behavior.

Ted Davis
[73500,2314]
6 March 1994
