




                            LIT TEXT UTILITY MANUAL

                             Version 2.0, 11/19/86
                      Copyright (C) 1986  Donald J. Irving

  Lit  is  a  command  line invoked text utility which filters a text file to
  stdout   printing  printable  characters  as  they  are,  and  showing  all
  non-printable  characters  in  any  one  or  more  of  three representation
  formats.   The  only  character interpreted (acted upon) by lit is the line
  feed  character which causes lit to issue a line feed.  The inspiration for
  lit  came from the "l" command in many of the UNIX line editors. Lit is not
  quite  the  same  as  any  of  these, however. For one thing, lit output is
  never ambiguous.  

  Here is an example of what lit does: 

      Say the file 'myfile' consists of the following ascii characters: 

              HT, HT, h, e, l, l, o, space, w, o, r, l, d, BEL, LF

      Saying 'lit myfile' would produce the following output:

              \t\thello world\007\n

      And saying 'lit myfile [various options]' might produce any of:

              \t\thello world^G\n                 (with -a)
              ^I^Ihello world^G^J                 (with -c)
              \011\011hello world\007\012         (with -n)
              \09\09hello world\07\0A             (with -nh)
              \009\009hello world\007\010         (with -nd)

  You control the output with optional command line arguments which provide: 

      1. The name of the file to read as input.
      2. What subset of the file lines to print.
      3. In which format(s) to represent non-printable characters.
      4. Which number base to use for numeric representations.

  If you do not supply these, they default (in the original version) to:

      1. Stdin.
      2. The whole file.
      3. Backslash constructs if possible, else numeric representations.
      4. Octal.

  Here  is  the  command line template. The arguments may be specified in any
  order.  The  -bcanohd  options may be stacked after one minus sign, or they
  may appear as separate arguments.  

      lit [<filename>] [-s<linenum>] [-p<numlines>] [-[bcan][ohd]] 









                          THE NAME OF THE INPUT FILE

  The  first  command  line argument  encountered which does not start with a
  minus  sign   is  considered  to  be  the  input  file name. Any subsequent
  command  line argument which does not start with a minus sign is considered
  to  be an error.  If no command line argument is found which does not start
  with a minus sign, lit uses <stdin> for input.

                    PRINTING A SUBSET OF LINES OF THE FILE

  Lit  prints the whole file by default. You can tell it on which line in the
  file  to  start printing and/or how many lines to print by supplying either
  or both of these command line arguments:

      -s<linenum>         lit will start printing at line <linenum>
      -p<numlines>        lit will print <numlines> lines

  There  is  no  space  between  the  's'  or 'p' and the number. There is no
  validity checking on the number values.  
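
  As an illustration only, here is a minimal Python sketch of that line
  selection. The function name select_lines is made up for this example,
  and it assumes that <linenum> counts from 1, which the manual does not
  state. It is not lit's actual C code.

```python
from itertools import islice

def select_lines(lines, start=1, count=None):
    """Return `count` lines beginning at line `start` (None = to the end)."""
    # Assumption: -s counts lines from 1; the manual does not say.
    first = max(start - 1, 0)
    stop = None if count is None else first + count
    return islice(lines, first, stop)

sample = ["one\n", "two\n", "three\n", "four\n"]
print(list(select_lines(sample, start=2, count=2)))  # like: lit -s2 -p2
```

  With the four sample lines, this prints the list holding "two" and
  "three". Leaving out both arguments passes every line through.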

               FORMATS FOR REPRESENTING NON-PRINTABLE CHARACTERS

  There   are   three  formats  in  which  non-printable  characters  may  be
  represented:  C  Language  style  backslash  representations  such  as  \n,
  control   character   representations   such   as  ^J,  and  numeric  value
  representations such as \012.  

  C Language Backslash Representations 

  The  form  is a backslash followed by a lower case letter. Here is the list
  of the applicable characters: 

      line feed           \n
      horizontal tab      \t
      backspace           \b
      carriage return     \r
      form feed           \f

  The  ascii  NUL character representation \0 is omitted.  NUL is represented
  by its control character representation or as a numeric value.  
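
  The table above amounts to a small lookup. Here is a hedged Python
  sketch of it; the names BACKSLASH_FORMS and backslash_form are
  illustrative and do not come from lit's source.

```python
# Byte value -> C-style backslash form, taken from the table above.
BACKSLASH_FORMS = {
    0x0A: r"\n",  # line feed
    0x09: r"\t",  # horizontal tab
    0x08: r"\b",  # backspace
    0x0D: r"\r",  # carriage return
    0x0C: r"\f",  # form feed
}

def backslash_form(byte):
    """Return the backslash form for `byte`, or None if it has none."""
    return BACKSLASH_FORMS.get(byte)

print(backslash_form(0x09))  # \t
print(backslash_form(0x00))  # None: NUL is deliberately left out
```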

  Control Character Representations 

  The  form is a caret followed by another symbol, where the second symbol is
  the  keyboard  control  character  of the character to be represented.  For
  example,  the  ascii  line  feed character is represented as ^J.  The ascii
  character DEL has an arbitrarily assigned representation of ^?.  
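
  A possible way to compute the control character form, sketched in
  Python. The function name control_form is hypothetical; the rule it
  applies (the keyboard symbol sits 64 places above the control
  character in the ascii table) is the one this manual describes.

```python
def control_form(byte):
    """Return the caret form for a control byte, or None if it has none."""
    if 0 <= byte < 0x20:
        # The keyboard control character sits 64 places higher in ASCII.
        return "^" + chr(byte + 64)
    if byte == 0x7F:
        return "^?"  # DEL: arbitrarily assigned, per the manual
    return None

print(control_form(0x0A))  # ^J
print(control_form(0x7F))  # ^?
print(control_form(0x00))  # ^@
```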













  ASCII Numeric Value Representations 

  The  representation  is  in  the  form  \num  where  num is the character's
  numeric  value  (the unsigned integer value of its eight bits) displayed in
  any  of  the  three number bases octal, decimal, or hexadecimal.  For octal
  representations,   num   is   exactly   three   octal   digits;   for   hex
  representations,  num  is  exactly two hexadecimal digits; and for  decimal
  representations,  num  is  exactly three decimal digits. Num is zero-padded
  on  the  left  if  necessary to make up the required number of digits.  For
  example,  the  ESC  char  is  represented  as  \033, \027, or \1B in octal,
  decimal,  and  hex  respectively.  NUL  would  be \000, \000, or \00.  This
  format  is  not  limited  to  ascii  characters;  any  eight  bits  can  be
  represented.  Numbers  of  \200  (octal),  \128  (decimal),  \80  (hex), or
  greater  are  byte  values beyond the upper end of the ascii character set.
  The  largest  byte value (all bits on) is  \377 (octal), \255 (decimal), or
  \FF (hex).  
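
  The fixed-width rule can be sketched with ordinary printf-style
  formatting. This Python fragment is an illustration, not lit's code;
  the function name numeric_form is made up, and the base letters simply
  mirror the -o, -d, and -h options.

```python
def numeric_form(byte, base="o"):
    """Return the backslash-number form of `byte` in base 'o', 'd' or 'h'."""
    if base == "o":
        return "\\%03o" % byte   # exactly three octal digits
    if base == "d":
        return "\\%03d" % byte   # exactly three decimal digits
    if base == "h":
        return "\\%02X" % byte   # exactly two hexadecimal digits
    raise ValueError("unknown base: %r" % base)

for value in (0x1B, 0x00, 0x80, 0xFF):
    print(numeric_form(value, "o"), numeric_form(value, "d"),
          numeric_form(value, "h"))
```

  The four lines printed are the ESC, NUL, first non-ascii, and all-bits-on
  values quoted above: \033 \027 \1B, then \000 \000 \00, then
  \200 \128 \80, then \377 \255 \FF.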

          COMMAND LINE ARGUMENTS FOR SELECTING REPRESENTATION FORMATS

  You  tell  lit which representation format or combination of formats to use
  for   non-printable  characters  by  supplying  one  of  the  command  line
  arguments  -b,  -c,  -a,  or  -n.  If  you supply none of these, then -b is
  selected  by  default.   If  you  supply  more  than  one,  then the latter
  supersedes the former.  

      -b      use backslash representations such as \n
              if possible, else use numeric representations.

      -c      use control char representations such as ^J
              if possible, else use numeric representations.

      -a      all; use backslash reps if possible, else use control
              char reps if possible, else use numeric representations.

      -n      use numeric representations only.
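
  The four choices differ only in which forms they try before falling
  back to a number. A minimal Python sketch of that fallback order
  follows. All names in it are invented for the example, it always uses
  octal, and it leaves out both the escaping of \ and ^ and the real
  line feed that lit issues after each line.

```python
BACKSLASH = {0x0A: r"\n", 0x09: r"\t", 0x08: r"\b", 0x0D: r"\r", 0x0C: r"\f"}

def caret(byte):
    """Caret form for control bytes and DEL, else None."""
    if byte < 0x20:
        return "^" + chr(byte + 64)
    return "^?" if byte == 0x7F else None

def numeric(byte):
    return "\\%03o" % byte  # octal, the default base

def represent(byte, mode="b"):
    """Represent one non-printable byte under mode 'b', 'c', 'a' or 'n'."""
    if mode in ("b", "a") and byte in BACKSLASH:
        return BACKSLASH[byte]
    if mode in ("c", "a"):
        form = caret(byte)
        if form is not None:
            return form
    return numeric(byte)

def show(data, mode="b"):
    # Printable ASCII passes through; escaping of \ and ^ is omitted here.
    return "".join(chr(b) if 0x20 <= b < 0x7F else represent(b, mode)
                   for b in data)

sample = b"\t\thello world\x07\n"
for mode in "bcan":
    print("-" + mode, show(sample, mode))
```

  Run on the 'myfile' bytes from the introduction, this prints
  \t\thello world\007\n for -b, ^I^Ihello world^G^J for -c,
  \t\thello world^G\n for -a, and \011\011hello world\007\012 for -n.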


  You  tell  lit  which  number  base  to  use for numeric representations by
  providing  one  of  the command line arguments -o, -h, or -d. If you supply
  none  of  these,  then  -o is selected by default.  If you supply more than
  one, then the latter supersedes the former.  

      -o      octal
      -h      hexadecimal
      -d      decimal
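
  To make the argument rules concrete, here is a hedged Python sketch of
  a parser that follows them: any order, stacked letters, later letters
  superseding earlier ones, and at most one file name. The function
  parse_args and its dictionary keys are hypothetical. Unlike lit, this
  sketch rejects non-numeric -s and -p values (Python's int() does that),
  and its handling of unknown letters is a guess.

```python
def parse_args(argv):
    """Parse lit-style arguments into a settings dict (a sketch only)."""
    opts = {"file": None, "start": None, "count": None,
            "format": "b", "base": "o"}
    for arg in argv:
        if not arg.startswith("-"):
            if opts["file"] is not None:
                raise ValueError("more than one file name: " + arg)
            opts["file"] = arg            # None later means "read stdin"
        elif arg[1:2] == "s":
            opts["start"] = int(arg[2:])  # no space between 's' and number
        elif arg[1:2] == "p":
            opts["count"] = int(arg[2:])
        else:
            for letter in arg[1:]:        # stacked letters, e.g. -ch
                if letter in "bcan":
                    opts["format"] = letter   # a later letter supersedes
                elif letter in "ohd":
                    opts["base"] = letter
                else:
                    # Assumption: the manual does not say what lit does here.
                    raise ValueError("unknown option letter: " + letter)
    return opts

print(parse_args(["myfile", "-s10", "-p5", "-ch", "-n"]))
```

  In the sample call, -n comes after -c and so wins, while the hex base
  from -ch is kept.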















                            EXCEPTIONAL CHARACTERS

  Two  characters have special meaning in lit output. The backslash character
  \  always  has  special  meaning. The caret character ^ has special meaning
  whenever control character representations are enabled.  

  The Backslash Character \ 

  As  already  described, the \ character in lit output signals the beginning
  of  either  a  special  letter  representation  such  as  \n  or  a numeric
  representation  such  as \012. The \ is also used to relieve a subsequent \
  or  ^  of  its  special meaning.  \\ represents the actual character \, and
  (when  control  character  representations  are  enabled) \^ represents the
  actual character ^.  

  The Caret Character ^ 

  When  control  character  representations  are  enabled,  a  ^  signals the
  beginning  of  a  control  character  representation  such  as ^J. Note the
  implication  therefore that ^^ means Control caret (ascii RS), and ^\ means
  Control  backslash  (ascii FS). In both of these cases the second character
  is  relieved  of  its  special  meaning  because  it is part of the control
  character  representation.   If  control  character representations are not
  enabled, then ^ is just another printable character.  
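
  The escaping rule for these two printable characters is small enough
  to sketch in a few lines of Python. The function name escape_printable
  and its caret_enabled flag are made up for the example; the flag
  stands for "control character representations are enabled".

```python
def escape_printable(ch, caret_enabled):
    """Escape a printable character that would otherwise look special."""
    if ch == "\\":
        return "\\\\"   # a literal backslash is always doubled
    if ch == "^" and caret_enabled:
        return "\\^"    # a literal caret is escaped only when needed
    return ch

text = "a\\b^c"  # the five characters: a, backslash, b, caret, c
print("".join(escape_printable(c, caret_enabled=False) for c in text))
print("".join(escape_printable(c, caret_enabled=True) for c in text))
```

  The first line printed is a\\b^c and the second is a\\b\^c.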

                                  CONCLUSION

  Lit  fills  the  gap  between  text editors which usually interpret special
  characters  in  special  ways,  and  hex dump utilities which make terrible
  reading  for  text  files.   One  of  lit's  greatest  strengths is that it
  interprets  nothing  but  the  linefeed  character; everything else is just
  represented to the output stream.  

  Although  lit  provides  a  variety  of  output  formats,  perhaps its main
  usefulness  is in quickly locating U.F.O.s (Unidentified File Objects) that
  have  gotten  into  your text files (like that ESC char that's weirding out
  your  printer).  For this purpose, the default options are adequate and, for
  C programmers at least, already familiar.



  Donald J. Irving
  9812 Gardenwood Way
  Sacramento, CA 95827
  (916) 366-3225

  CIS:    73547,1335
  PLINK:  ops158












  Postscripts:

  ** 

  One  convenient way of getting to know lit is to use the default input file
  stdin.  Just  say  'lit [-options]' with no file name. Now you  can type in
  lines  one  at  a  time  and  have lit filter them back to you.  Try typing
  control  characters  to  see how they come back.  Keep in mind that in this
  configuration,  the  CLI  is  still trapping and interpreting (acting upon)
  what  you  type,  so  screen  control  characters such as form feed and tab
  actually  cause  form  feeds  and  tabs  to  occur  on  the  screen  before
  lit  has  a chance to send you its output.  This may make the screen look a
  little  messy,  but  at  least if the CLI is interpreting everything it can
  tell when you type Control C to break out.  

  ** 

  Want  to  have  lit  give  you a Usage statement? Say 'lit lskdmlsdm' where
  lskdmlsdm  is  any  string of garbage which doesn't add up to the name of a
  real file.  

  ** 

  Why  not  use  \0  to  represent  NUL?  Consider  the  following  character
  sequence: 

              BEL, space, NUL, 0, 7 

  Using  \0  for  NUL  would yield the output '\007 \007', which is also what
  the  sequence  BEL,  space,  BEL  produces.  To avoid this ambiguity, the \0
  construct is not included in the backslash representations.
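
  The collision can be checked mechanically. The Python sketch below is
  purely a demonstration: encode is a made-up helper that uses octal
  numbers for every non-printable byte and lets you choose how NUL is
  written.

```python
def encode(data, nul_form):
    """Encode bytes with octal escapes, using `nul_form` for NUL."""
    out = []
    for b in data:
        if b == 0:
            out.append(nul_form)
        elif 0x20 <= b < 0x7F:
            out.append(chr(b))
        else:
            out.append("\\%03o" % b)
    return "".join(out)

with_nul = bytes([0x07, 0x20, 0x00, 0x30, 0x37])   # BEL, space, NUL, 0, 7
two_bels = bytes([0x07, 0x20, 0x07])               # BEL, space, BEL

# Hypothetical \0 shorthand: both inputs print identically.
print(encode(with_nul, "\\0") == encode(two_bels, "\\0"))    # True
# Three-digit form, as lit uses: the outputs differ.
print(encode(with_nul, "\\000"), "vs", encode(two_bels, "\\000"))
```

  With the three-digit form the first sequence comes out as \007 \00007,
  which cannot be mistaken for \007 \007.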

  ** 

  Why  use  ^?  for  DEL?  Keyboard  control  characters are always 64 places
  higher   in   the  ascii  table  than  the  non-printable  characters  they
  represent.   DEL is at the high end of the ascii character set, however, so
  there's  no  keyboard  character  to  represent it.  We need to arbitrarily
  choose  some  character.   The  ?  seems  to  make at least some sense as a
  choice;  it  is  64  places less than DEL, and that kind of satisfies one's
  desire  for  symmetry  in the world. (Besides, some of the UNIX world tools
  already do it that way.) 
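
  One way to see the symmetry, offered as an observation and not as a
  description of how lit is written: flipping the single bit worth 64
  moves the control characters up 64 places and moves DEL down 64
  places. The function name caret_symbol is invented for this sketch.

```python
def caret_symbol(byte):
    """Flip bit 6 (value 64) to get the symbol shown after the caret."""
    return chr(byte ^ 0x40)

print(caret_symbol(0x0A))  # J  (10 + 64 = 74)
print(caret_symbol(0x7F))  # ?  (127 - 64 = 63)
```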

  **

  If  you  don't  like  the  default option settings, they are very simple to
  change  in  the  C  source.  If you don't have a C compiler, and can't live
  with  the  settings,  I  will  be willing to recompile it with your desired
  option  settings.  Send me a disk in a protective mailer and include return
  postage. I will return your disk in the same mailer.  






