    Title: STANDARD USAGE OF C LANGUAGE RECIO LIBRARY
Copyright: (C) 1994 William Pierpoint
  Version: 1.10
     Date: March 28, 1994



1.0 INTRODUCTION

The implementation described by this standard usage is a superset of the
recio specification.  Enhancements are noted in the text.


1.1 Mnemonics

The recio functions have been given a consistent mnemonic naming
convention.  All recio functions are in lower case and start with
the letter r.  Function names are analogous to <stdio.h> functions.
Mnemonics are as follows:

Single letter (field functions)               Multi-letter
----------------------------------------      -----------------
b - base (prefix)                             beg - beginning
c - column (prefix), character (suffix)       ch  - character
d - double (suffix)                           col - column         
f - float (suffix)                            cxt - context        
i - integer (suffix)                          eof - end of file
l - long (suffix)                             err - error          
n - number                                    fld - field buffer
r - record pointer (first letter)             fn  - function
s - string pointer (suffix)                   no  - number         
u - unsigned (suffix)                         rec - record buffer
                                              siz - size of buffer
                                              str - string
                                              txt - text


1.2 Order

The order in which the prefix mnemonics appear indicates the order in which
the arguments appear in the function.  The suffix mnemonics tell you what
the function returns.  For example,

rbgetui():
    arguments:  r - record pointer
                b - base (radix) of input
      returns: ui - unsigned integer

Note: c is used in the prefix of a function's name only once even if there
      are two column arguments.  If the function returns a character, there
      is only one column argument; otherwise there are two.



2.0 ERROR CHECKING

The functions declared in the header <recio.h> make use of the errno macro
defined in section 4.1.3 of ANSI X3.159-1989.  This mechanism was chosen
because (1) the <stdlib.h> conversion functions (strtod(), strtol(), etc.)
make use of this error reporting mechanism and (2) the <recio.h> functions
make use of the <stdlib.h> conversion functions.
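The errno convention used by the <stdlib.h> conversion functions can be
sketched in standard C as shown here.  This is only an illustration of the
mechanism, not recio code; the helper name parse_long_checked is hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (not part of recio): convert text to long and
   report whether strtol() flagged a range error through errno. */
int parse_long_checked(const char *text, long *out)
{
    char *end;

    errno = 0;                    /* clear errno before the call */
    *out = strtol(text, &end, 10);
    if (errno == ERANGE) {
        return 0;                 /* value out of range for long */
    }
    return end != text;           /* 0 if no digits were converted */
}
```

Because strtol() never resets errno to zero on success, the caller clears it
first and tests it immediately after the call.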

In this implementation, errno can return the following macro constants:

         0 - No error.
    EACCES - permission denied.
    EINVAL - invalid argument (usually null record pointer).
    EMFILE - too many open files.
    ENOENT - no such file or directory.
    ENOMEM - out of memory.

Beginning with version 1.1, recio functions set errno when the record 
pointer is invalid and set an internal error number when the record pointer 
is valid.  The recio error number is accessed through the rerror function.

The rerror function can return the following macro constants:

         0 - No error.
  R_EINVAL - invalid argument (not the record pointer).
 R_EINVDAT - invalid data.
 R_EMISDAT - missing data.
  R_ENOMEM - out of memory.
  R_ENOREG - unable to register exit function with atexit().
  R_ERANGE - data out of range.



2.1 Define Callback Error Function

First define a callback error function to be used by the recio functions.
You may give the function any name you wish.  In the sample function below,
the name rerrfn is used.  The function takes one argument, a record pointer
(REC *).  It returns nothing (void).  The function must first check for a
valid record pointer using the risvalid function.  Other than that, you can
customize it to do whatever you want.

The recio functions use a callback error function in order to give the
most flexibility in handling errors.  This rerrfn function just sends
information to stderr.  You may wish to send information to a printer,
a file, a window, or a dialog box.  You might even want to give users
the ability to examine errors and enter corrections.  If the error is
corrected, you will want to call the rclearerr function before your
callback error function returns.

When your callback error function is invoked, check rerror() or errno
to determine the cause of the error.

Symbolic errno constants:

* EACCES means that you don't have permission to access this file.  All
  MS-DOS files have read permission.

* EINVAL indicates an invalid argument to a function, usually a NULL record
  pointer.  This resulted from a programming error.

* EMFILE means the program tried to open more files than the maximum allotted
  by ROPEN_MAX or FOPEN_MAX.  If your program is interactive, the user can
  close one or more open record streams.  Or you might decide that ROPEN_MAX
  or FOPEN_MAX needs to be a larger value.

* ENOENT says that ropen() could not find the requested file to open.
  Perhaps the name of the file was misspelled, or your program looked in the
  wrong directory.  If your program was trying to read a configuration file,
  it could use internal default values when the configuration file does
  not exist.

* ENOMEM indicates that the program ran out of heap space.  You may be able
  to correct this if you are able to deallocate memory you no longer need.
  For example, you could reduce the size of buffers when the size only
  affects speed.  Such buffers need to be flushed first.  Buffers used by
  the recio library do not fit this criterion.

Symbolic rerror() constants:

* R_ENOREG means the program was unable to register the internal recio exit
  function with the ANSI atexit() function.  The internal recio exit
  function ensures that all open record streams are closed and all dynamic
  memory allocated by the recio library is deallocated.  This error is not
  fatal.  R_ENOREG only has this meaning within the context of the recio
  library.

* R_EINVDAT says the data is invalid.  Invalid data is caused by an 
  unrecognized character in the field.  For example, rgetui() doesn't 
  expect to see a negative sign, so a negative number will be flagged as 
  invalid data.

* R_EMISDAT says the data is missing.  Missing data means the field is empty.  
  If you expect a number, you could substitute either zero or some unique 
  number to indicate an empty field.

* R_ENOMEM indicates that the program ran out of heap space.  You may be able
  to correct this if you are able to deallocate memory you no longer need.
  For example, you could reduce the size of buffers when the size only
  affects speed.  Such buffers need to be flushed first.  Buffers used by
  the recio library do not fit this criterion.

* R_ERANGE tells you that the data is outside the range of the function.
  For instance, suppose you used rgeti() to get an integer and the data
  value is 32768.  If a 16-bit integer has an upper limit of 32767, the
  value is too large.  If the data is wrong, you can have the error
  function correct it.  If the data is right, you have to correct the
  program.
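The three data errors described above can be told apart with nothing more
than strtol().  The sketch below is an illustration under stated assumptions:
the FIELD_* codes are hypothetical stand-ins that only mirror the meaning of
R_EMISDAT, R_EINVDAT, and R_ERANGE, and the 16-bit limits are hard-coded to
match the 32768 example.  The checks recio actually performs may differ.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical result codes; they only mirror the meaning of the
   R_EMISDAT, R_EINVDAT, and R_ERANGE constants. */
enum field_status { FIELD_OK, FIELD_MISSING, FIELD_INVALID, FIELD_RANGE };

/* Classify a field that should hold a 16-bit signed integer. */
enum field_status classify_int16(const char *field, int *out)
{
    char *end;
    long value;

    while (isspace((unsigned char)*field)) field++;
    if (*field == '\0') return FIELD_MISSING;      /* empty field */

    errno = 0;
    value = strtol(field, &end, 10);
    if (end == field) return FIELD_INVALID;        /* no digits at all */
    while (isspace((unsigned char)*end)) end++;
    if (*end != '\0') return FIELD_INVALID;        /* unrecognized character */
    if (errno == ERANGE || value < -32768L || value > 32767L)
        return FIELD_RANGE;                        /* e.g. 32768 */

    *out = (int)value;
    return FIELD_OK;
}
```

A callback error function could use a classification like this to decide
whether to substitute a default value, ask the user for a correction, or
stop.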

The main purpose of this sample callback error function is to show some of
the kinds of things you can do in a callback error function.  Note that when
an error occurs, the column number indicator rcolno() has moved just beyond
the error.  To make it clearer to the user where the error occurred, rerrfn()
displays rcolno()-1, but not less than the column number for the first column 
of the record.  For a more detailed callback error function, see the source 
code for one of the test programs.

/* define callback error function */
void rerrfn(REC *rp)
{
    int errnum; /* error number */

    /* if rp is a valid record pointer */
    if (risvalid(rp)) {

      /* reof flag set */
      if (reof(rp)) {
          fprintf(stderr, "ERROR reading %s: "
           "tried to read past end of file\n\n", rnames(rp));

      /* rerror flag set */
      } else {

          /* determine cause of error */
          errnum = rerror(rp);
          switch (errnum) {

          /* data errors */
          case R_ERANGE:
          case R_EINVDAT:
          case R_EMISDAT:

              /* print location of error; max() is assumed to be a
                 two-argument macro supplied by your compiler or program */
              fprintf(stderr, "DATA ERROR in FILE %s at LINE %ld,"
               " FIELD %u, COLUMN %u\n", rnames(rp), rrecno(rp),
               rfldno(rp), max(rcolno(rp)-1, rbegcolno(rp)));
              break;

          /* warnings: non-fatal errors */
          case R_ENOREG:
              fprintf(stderr, "WARNING: could not register exit function\n");
              rclearerr(rp);
              break;

          /* fatal errors (R_EINVAL, R_ENOMEM) */
          case R_EINVAL:
              fprintf(stderr, "FATAL ERROR reading FILE %s: invalid argument\n",
               rnames(rp));
              abort();
              break;
          case R_ENOMEM:
              fprintf(stderr, "FATAL ERROR reading FILE %s: out of memory\n",
               rnames(rp));
              abort();
              break;
          default:
              fprintf(stderr, "FATAL ERROR reading FILE %s: unknown error\n",
               rnames(rp));
              abort();
              break;
          }
      }

    /* else invalid record pointer */
    } else {
        switch (errno) {

        /* non-fatal errors */
        case EACCES:
        case EMFILE:
          fprintf(stderr, "WARNING: %s\n", strerror(errno));
          break;

        /* fatal errors (EINVAL, ENOMEM) */
        default:
          fprintf(stderr, "FATAL ERROR: %s\n", strerror(errno));
          abort();
          break;
        }
    }
}


2.2 Register Callback Error Function

Once you have written your callback error function, you must let the other 
recio functions know that it exists.  You use the rseterrfn function to 
register your callback error function.

    /* register rerrfn() as callback error function for recio */
    rseterrfn(rerrfn);
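Registering a callback usually comes down to storing a function pointer that
the library calls when an error is raised.  The sketch below shows that
general pattern in standard C.  It does not describe recio internals; the
names stream, set_error_handler, report_error, and clearing_handler are all
hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for a record stream; not the recio REC type. */
struct stream {
    int error;                      /* last error number, 0 if none */
};

typedef void (*error_handler)(struct stream *sp);

static error_handler current_handler = NULL;

/* Register a handler; passing NULL removes it. */
void set_error_handler(error_handler fn)
{
    current_handler = fn;
}

/* Record an error and notify the registered handler, if any. */
void report_error(struct stream *sp, int errnum)
{
    sp->error = errnum;
    if (current_handler != NULL) {
        current_handler(sp);
    }
}

/* Example handler: acknowledge the error by clearing it. */
void clearing_handler(struct stream *sp)
{
    sp->error = 0;
}
```

With no handler registered the error number simply stays set on the stream;
once clearing_handler is registered, report_error() leaves the stream clear.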



3.0 OPEN FILE


3.1 Open File and Get Record Pointer

Use the ropen function to open the file you want to read.  Store the record
pointer returned by the ropen function.  To read from standard input, do 
not try to open recin.  It is always open, so it does not need to be opened 
or closed.

    REC *rp = ropen("FILENAME.DAT", "r");


3.2 Check Record Pointer

Following the ropen function, you need to check to see if the file was
opened correctly.  If ropen returned a NULL pointer, then the file was not
opened.

Errors other than ENOENT are reported to your callback error function.
ENOENT is not reported since you may want to use default values if the
data file is not available.

    /* if ropen() failed */
    if (!rp) {
        /* if it failed because file does not exist */
        if (errno==ENOENT) {
            /* action to take when file does not exist */
            ...
        }
    /* else ropen() succeeded */
    } else {
        /* set up for read (see sections 3.3 through 3.6) */
        ...
        /* read through file (see sections 4 and 5) */
        ...
        /* close file (see section 6) */
        rclose(rp);
    }


3.3 Set Field and Text Delimiters

The space character is the default value for both the field and text 
delimiters.  If you need to use something else, you need to explicitly
set the values.  Application maintenance may be easier if you always 
set the values.

    rsetfldch(rp, ',');  /* set field delimiter character */
    rsettxtch(rp, '"');  /* set text delimiter character */
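To see how the two delimiters work together, consider splitting a record such
as "Smith, John",42 where the comma inside the quotes must not end the field.
The sketch below is a simplified illustration in standard C; next_field is a
hypothetical helper, and the exact rules recio applies (for example, to
repeated spaces) may differ.

```c
#include <stddef.h>

/* Copy the next field from *cursor into out (at most outsiz-1 chars)
   and advance *cursor past the field delimiter.  Returns 0 when no
   input remains, 1 otherwise.  Hypothetical helper, not recio code. */
int next_field(const char **cursor, char fldch, char txtch,
               char *out, size_t outsiz)
{
    const char *p = *cursor;
    size_t n = 0;

    if (p == NULL || outsiz == 0) return 0;

    if (*p == txtch) {                       /* delimited text */
        p++;
        while (*p != '\0' && *p != txtch) {
            if (n + 1 < outsiz) out[n++] = *p;
            p++;
        }
        if (*p == txtch) p++;                /* skip closing delimiter */
    }
    while (*p != '\0' && *p != fldch) {      /* plain field text */
        if (n + 1 < outsiz) out[n++] = *p;
        p++;
    }
    out[n] = '\0';

    /* NULL marks the end of the record after the last field */
    *cursor = (*p == fldch) ? p + 1 : NULL;
    return 1;
}
```

Called repeatedly with ',' and '"', this yields the fields one at a time; a
trailing field delimiter produces one final empty field.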


3.4 Set Field and Record Buffer Sizes

Setting the field and record buffer sizes is optional.  Buffers will be
automatically reallocated as necessary.  However if you set the field and 
record sizes in advance to the maximum value needed, you can reduce memory 
fragmentation.

    rsetfldsiz(rp, 41);  /* set size of field buffer */
    rsetrecsiz(rp, 133); /* set size of record buffer */
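The effect of sizing a buffer in advance can be illustrated with a small
growable buffer.  This is a sketch under stated assumptions, not the recio
implementation: struct buffer, buffer_reserve, and buffer_store are
hypothetical names, and the grows counter exists only to make the
reallocations visible.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; recio's own buffers are not shown here. */
struct buffer {
    char  *data;
    size_t size;        /* bytes allocated */
    int    grows;       /* reallocations after the first allocation */
};

/* Make sure the buffer holds at least `need` bytes.  Returns 1 on
   success, 0 if memory could not be allocated. */
int buffer_reserve(struct buffer *b, size_t need)
{
    int had_data = (b->data != NULL);
    char *p;

    if (need <= b->size) return 1;      /* already large enough */
    p = realloc(b->data, need);
    if (p == NULL) return 0;            /* old block is still valid */
    if (had_data) b->grows++;
    b->data = p;
    b->size = need;
    return 1;
}

/* Store a string, growing the buffer when necessary. */
int buffer_store(struct buffer *b, const char *text)
{
    size_t need = strlen(text) + 1;     /* include terminating null */

    if (!buffer_reserve(b, need)) return 0;
    memcpy(b->data, text, need);
    return 1;
}
```

A buffer reserved once at its maximum size never grows again, while a buffer
left at its default grows each time a longer field arrives.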


3.5 Set Context Number

If your application opens record streams with more than one data format, you 
will want to set a context number.  You use the context number so that your 
callback error function can determine (using the rcxtno function) which data 
format it is dealing with.  Each context number must be a positive integer; 
zero and negative numbers are reserved.

#define SOILS_DB      1
#define BUILDINGS_DB  2

     rsetcxtno(rp, SOILS_DB); /* set context number */


3.6 Set Beginning Column Number

The first column number in the record buffer defaults to zero.  If you prefer 
column numbering to start at one, use the rsetbegcolno function.  It is mainly 
useful with column delimited data.  If a number takes up the first ten 
columns of the record, the columns are numbered 0 to 9 when the beginning 
column number is set to 0, or 1 to 10 when it is set to 1.

     rsetbegcolno(rp, 1); /* number first column as one */
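The relationship between column numbers and the beginning column number can
be sketched as a simple offset calculation.  The helper copy_columns below is
hypothetical and written in standard C; it is not how recio is implemented,
only a way to check the arithmetic.

```c
#include <stddef.h>
#include <string.h>

/* Copy the characters between two column numbers (inclusive) into out.
   begcolno is the number given to the first column of the record.
   Returns the number of characters copied.  Hypothetical helper. */
size_t copy_columns(const char *rec, int begcolno, int begcol, int endcol,
                    char *out, size_t outsiz)
{
    size_t len = strlen(rec);
    size_t first, last, n = 0;

    if (outsiz == 0) return 0;
    out[0] = '\0';
    if (begcol < begcolno || endcol < begcol) return 0;

    first = (size_t)(begcol - begcolno);    /* zero-based offsets */
    last  = (size_t)(endcol - begcolno);
    while (first <= last && first < len && n + 1 < outsiz) {
        out[n++] = rec[first++];
    }
    out[n] = '\0';
    return n;
}
```

Columns 0 to 9 with a beginning column number of 0 select the same ten
characters as columns 1 to 10 with a beginning column number of 1.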



4.0 READ ALL RECORDS IN FILE

4.1 The rgetrec Function

If all the records in a data file have the same format, you will want to 
loop through all the records until the end of file is reached.  If each
record has a different format, you must call the rgetrec function each
time you want to get the next record.  Calling rgetrec() is optional for
the first record.

    /* loop through all records in file */
    while (rgetrec(rp)) {
        /* Section 5 field functions go here ... */
    }


4.2 The rrecs Macro

To get a pointer to the start of the record buffer, use the rrecs macro.

    /* echo record contents to stdout */
    printf("%s\n", rrecs(rp));


4.3 The rrecno Macro

To get the record number, use the rrecno macro.

    /* echo record number and record contents to stdout */
    printf("%ld: %s\n", rrecno(rp), rrecs(rp));



5.0 GET FIELD DATA FOR EACH RECORD

The recio functions can handle records with two types of fields:
(1) character delimited and (2) column delimited.


5.1 Character delimited fields

5.1.1 Character fields

5.1.1.1 The rgetc Function

Use the rgetc function to get a field consisting of a single non-whitespace 
character.  Any whitespace in the field is skipped.

    /* get one non-whitespace character */
    int ch = rgetc(rp);


5.1.2 String fields

String field functions return a pointer to the string buffer.  The string 
buffer is overwritten each time a new string field is read.  To save the 
string for later use, copy the string to a character array with sufficient
space to hold the string (including the terminating null).
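Copying the string before the next read can be sketched with malloc() and
memcpy().  The helper copy_string below is hypothetical; it is a portable
stand-in for strdup(), which ANSI X3.159-1989 does not define.

```c
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of text, or NULL if memory runs out.  The caller
   frees the copy.  Hypothetical helper, not part of recio. */
char *copy_string(const char *text)
{
    size_t need = strlen(text) + 1;     /* include terminating null */
    char *copy = malloc(need);

    if (copy != NULL) {
        memcpy(copy, text, need);
    }
    return copy;
}
```

The copy stays valid after the shared buffer is overwritten, so check the
result for NULL and free it when you are done with the string.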


5.1.2.1 The rgets Function

Use the rgets function to get a field consisting of a string.

    /* duplicate string in string buffer */
    char *str = strdup(rgets(rp));
    ...
    /* free string memory space when done with string */
    free (str);


5.1.3 Floating point fields

5.1.3.1 The rgetd Function

Use the rgetd function to get a field consisting of a double precision 
floating point number.

    /* get a double */
    double result = rgetd(rp);


5.1.3.2 The rgetf Function

Use the rgetf function to get a field consisting of a single precision
floating point number.

    /* get a float */
    float result = rgetf(rp);


5.1.4 Integer fields

5.1.4.1 Base 10 integer fields

5.1.4.1.1 The rgeti Macro

Use the rgeti macro to get a field consisting of a decimal integer.

    /* get a decimal integer */
    int result = rgeti(rp);


5.1.4.1.2 The rgetl Macro

Use the rgetl macro to get a field consisting of a decimal long.

    /* get a decimal long */
    long result = rgetl(rp);


5.1.4.1.3 The rgetui Macro

Use the rgetui macro to get a field consisting of an unsigned decimal 
integer.

    /* get an unsigned decimal integer */
    unsigned int result = rgetui(rp);


5.1.4.1.4 The rgetul Macro

Use the rgetul macro to get a field consisting of an unsigned long decimal 
integer.

    /* get an unsigned decimal long */
    unsigned long result = rgetul(rp);


5.1.4.2 Explicit base integer fields

5.1.4.2.1 The rbgeti Function

Use the rbgeti function to get a field consisting of an integer in a 
specified radix.

    /* get a hexadecimal integer */
    int result = rbgeti(rp, 16);


5.1.4.2.2 The rbgetl Function

Use the rbgetl function to get a field consisting of a long integer in a 
specified radix.

    /* get a hexadecimal long integer */
    long result = rbgetl(rp, 16);


5.1.4.2.3 The rbgetui Function

Use the rbgetui function to get a field consisting of an unsigned integer in 
a specified radix.

    /* get a hexadecimal unsigned integer */
    unsigned int result = rbgetui(rp, 16);


5.1.4.2.4 The rbgetul Function

Use the rbgetul function to get a field consisting of an unsigned long 
integer in a specified radix.

    /* get a hexadecimal unsigned long integer */
    unsigned long result = rbgetul(rp, 16);
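Reading an unsigned value in an explicit radix can be sketched with
strtoul().  One detail deserves attention: strtoul() itself accepts a leading
minus sign, so a reader that treats a negative sign as invalid data has to
reject it separately.  The helper parse_ulong_radix below is hypothetical,
and recio's own validation rules may differ.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Convert text in the given radix to unsigned long.  Returns 1 on
   success, 0 on invalid or out-of-range input.  Hypothetical helper;
   recio's own validation rules may differ. */
int parse_ulong_radix(const char *text, int radix, unsigned long *out)
{
    char *end;

    while (isspace((unsigned char)*text)) text++;
    if (*text == '-') return 0;          /* strtoul() would accept this */

    errno = 0;
    *out = strtoul(text, &end, radix);
    if (end == text || errno == ERANGE) return 0;
    while (isspace((unsigned char)*end)) end++;
    return *end == '\0';                 /* reject trailing characters */
}
```

With a radix of 16, strtoul() also accepts an optional 0x prefix, so both ff
and 0x1F convert successfully.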


5.1.5 Other Functions

5.1.5.1 The rskipfld Macro

If your application does not need the data in a field, you can skip over the
field by using the rskipfld macro.

    /* skip over a field */
    if (rskipfld(rp) != 1) printf("Unable to skip field.\n");


5.1.5.2 The rskipnfld Function

If your application does not need the data in several adjacent fields, you 
can skip over the fields by using the rskipnfld function.

    /* skip over three fields */
    if (rskipnfld(rp, 3) != 3) printf("Unable to skip 3 fields.\n");


5.2 Column delimited fields
    
5.2.1 Character fields

5.2.1.1 The rcgetc Function

Use the rcgetc function to get a character from a specific column.

    /* get character from column number 12 */
    int ch = rcgetc(rp, 12);


5.2.2 String fields

String field functions return a pointer to a static string.  This static
string is overwritten each time a new string field is read.  To save the 
string for later use, copy the string to a character array with sufficient
space to hold the string (including the terminating null).


5.2.2.1 The rcgets Function

Use the rcgets function to get a string between two column locations.

    /* duplicate string in string buffer */
    char *str = strdup(rcgets(rp, 0, 9));
    ...
    /* free string space when done with string */
    free (str);


5.2.3 Floating point fields

5.2.3.1 The rcgetd Function

Use the rcgetd function to get a double between two column locations.

    /* get a double between columns 0 and 9 */
    double result = rcgetd(rp, 0, 9);


5.2.3.2 The rcgetf Function

Use the rcgetf function to get a float between two column locations.

    /* get a float between columns 0 and 9 */
    float result = rcgetf(rp, 0, 9);


5.2.4 Integer fields

5.2.4.1 Base 10 integer fields

5.2.4.1.1 The rcgeti Macro

Use the rcgeti macro to get a decimal integer between two column locations.

    /* get a decimal integer between columns 0 and 9 */
    int result = rcgeti(rp, 0, 9);


5.2.4.1.2 The rcgetl Macro

Use the rcgetl macro to get a decimal long integer between two column 
locations.

    /* get a decimal long between columns 0 and 9 */
    long result = rcgetl(rp, 0, 9);


5.2.4.1.3 The rcgetui Macro

Use the rcgetui macro to get a decimal unsigned integer between two column 
locations.

    /* get a decimal unsigned integer between columns 0 and 9 */
    unsigned int result = rcgetui(rp, 0, 9);


5.2.4.1.4 The rcgetul Macro 

Use the rcgetul macro to get a decimal unsigned long between two column 
locations.

    /* get a decimal unsigned long between columns 0 and 9 */
    unsigned long result = rcgetul(rp, 0, 9);


5.2.4.2 Explicit base integer fields

5.2.4.2.1 The rcbgeti Function

Use the rcbgeti function to get an integer in a specified radix from between
two column locations.

    /* get a hexadecimal integer between columns 0 and 9 */
    int result = rcbgeti(rp, 0, 9, 16);


5.2.4.2.2 The rcbgetl Function

Use the rcbgetl function to get a long in a specified radix from between two 
column locations.

    /* get a hexadecimal long between columns 0 and 9 */
    long result = rcbgetl(rp, 0, 9, 16);


5.2.4.2.3 The rcbgetui Function

Use the rcbgetui function to get an unsigned integer in a specified radix from 
between two column locations.

    /* get a hexadecimal unsigned integer between columns 0 and 9 */
    unsigned int result = rcbgetui(rp, 0, 9, 16);


5.2.4.2.4 The rcbgetul Function

Use the rcbgetul function to get an unsigned long in a specified radix from 
between two column locations.

    /* get a hexadecimal unsigned long between columns 0 and 9 */
    unsigned long result = rcbgetul(rp, 0, 9, 16);


5.3 Other Functions

5.3.1 The reof Macro

Use the reof macro to determine when the record stream has reached the 
end of file.

    /* if error or end of file reached (rgetrec() is tested for a
       false result, as in the loop in section 4.1) */
    if (!rgetrec(rp)) {
    
        /* if end of file */
        if (reof(rp)) {
           ...
        /* else error */
        } else {
           ...
        }
    }
    

5.3.2 The rerror Macro

Use the rerror macro to determine if an error has occurred on a record 
stream.  The rerror macro returns the error number.  It is a good practice 
to check for any errors just prior to closing a record stream.  If the 
error indicator is clear, you have additional confidence that the stream 
was read correctly.  

    if (rerror(rp)) printf("File %s not read correctly.\n", rnames(rp));
    rclose(rp);


5.3.3 The rseterr Function

If you write wrapper functions or other functions that interact with 
recio functions, your code will need to handle errors.  You can use 
the rseterr function to set the error number and to call the record 
stream callback error function.

/* get integer and validate range */
int rrgeti(REC *rp, int min, int max) {
    int result;
    
    result = rgeti(rp);
    if (result < min || result > max) {
        rseterr(rp, R_ERANGE);
    }
    return result;
}


6.0 CLOSE FILE

6.1 Close File

When finished reading a data file, close it.  Do not attempt to close recin 
as it is always open.

    /* close record file */
    rclose(rp);


6.2 Close All Files

Rather than closing record files one at a time, one can close all open 
record files at once using the rcloseall function.

    /* all done */
    rcloseall();



7.0 INDEX

errno macro ............ 2.0, 2.1, 3.2
rbegcolno macro ........ 2.1
rbgeti function ........ 5.1.4.2.1
rbgetl function ........ 5.1.4.2.2
rbgetui function ....... 5.1.4.2.3
rbgetul function ....... 5.1.4.2.4
rcbgeti function ....... 5.2.4.2.1
rcbgetl function ....... 5.2.4.2.2
rcbgetui function ...... 5.2.4.2.3
rcbgetul function ...... 5.2.4.2.4
rcgetc function ........ 5.2.1.1
rcgetd function ........ 5.2.3.1
rcgetf function ........ 5.2.3.2
rcgeti macro ........... 5.2.4.1.1
rcgetl macro ........... 5.2.4.1.2
rcgets function ........ 5.2.2.1
rcgetui macro .......... 5.2.4.1.3
rcgetul macro .......... 5.2.4.1.4
rclearerr macro ........ 2.1
rclose function ........ 6.1
rcloseall function ..... 6.2
rcolno macro ........... 2.1
rcxtno macro ........... 3.5
recin expression ....... 3.1, 6.1
reof macro ............. 2.1, 5.3.1
rerror macro ........... 2.1, 5.3.2
rflds macro ............ 2.1
rfldno macro ........... 2.1
rgetc function ......... 5.1.1.1
rgetd function ......... 5.1.3.1
rgetf function ......... 5.1.3.2
rgeti macro ............ 5.1.4.1.1
rgetl macro ............ 5.1.4.1.2
rgetrec function ....... 4.1
rgets function ......... 5.1.2.1
rgetui macro ........... 5.1.4.1.3
rgetul macro ........... 5.1.4.1.4
risvalid function ...... 2.1
rnames macro ........... 2.1
ropen function ......... 3.1
rrecs macro ............ 2.1, 4.2
rrecno macro ........... 2.1, 4.3
rsetbegcolno function .. 3.6
rsetcxtno function ..... 3.5
rseterr function ....... 5.3.3
rseterrfn function ..... 2.2
rsetfldch function ..... 3.3
rsetfldsiz function .... 3.4
rsetfldstr function .... 2.1
rsetrecsiz function .... 3.4
rsettxtch function ..... 3.3
rskipfld macro ......... 5.1.5.1
rskipnfld function ..... 5.1.5.2
