    Title: RECIO TIPS
Copyright: (C) 1994 William Pierpoint
  Version: 2.04
     Date: October 10, 1994



1.0 COMMON PROGRAMMING ERRORS

1.1 Inadvertently typing a stdio function instead of a recio function

Many functions in recio and stdio have similar names and similar uses.  
This makes the recio functions easier to learn, but it also makes it 
easier to type a stdio function by mistake when you want a recio 
function.  Compile with all warnings turned on.  A mistyped function 
gives a "suspicious pointer conversion" warning; a mistyped macro gives 
an "undefined symbol" error.  Exceptions: the compiler gives no error or 
warning if you use the fcloseall function instead of the rcloseall 
function, or strerror instead of rstrerror.  Hint: in most cases you can 
use the rerrstr function instead of rstrerror.

1.2 Inadvertently typing the wrong symbolic name for an error constant

The symbolic names of the recio error constants are similar to those 
supplied by the compiler.  You may have typed ENOMEM when you meant 
R_ENOMEM.  Check that symbolic error constants used with valid record 
pointers start with "R_", and that those used with invalid record 
pointers start with "E".



2.0 ERROR HANDLING

2.1 Callback Error Function

The first recio function your application calls should be rseterrfn(), 
which registers your callback error function.


2.2 Explicit Conditions Not Reported to Callback Error Function

There are two conditions that your code must handle explicitly, because 
they are not reported as errors to the callback error function:

1)  Test ropen() for NULL return and, if true, test errno for ENOENT.  
This indicates that the file could not be opened since it does not exist.  
Any other errors are reported to the callback error function (if registered); 
your code can handle them there.

    REC *rp = ropen("file", "r");
    if (!rp) {
        if (errno == ENOENT) {
            /* file does not exist */
            ...
        }
    }

2)  Test the return value from rgetrec().  A NULL return means that either 
end-of-file was reached or an error occurred.  You need to follow up on 
the NULL return value to determine which one happened, using either the 
reof or the rerror function.  Any error will already have been reported to 
the callback error function, so be careful not to report the same error 
twice.

    /* loop through all records in file */
    while (rgetrec(rp)) {
        ...
    }
    if (!reof(rp)) {
        /* error occurred before all records read */
        ...
    }


2.3 Make an Error Check Just Prior to the rclose Function

Check for errors just before closing any record stream.  This is a good 
safety check since (1) you might have forgotten to install your callback 
error function or (2) your callback error function failed to catch and 
correct all the errors.

    if (rerror(rp)) {
        /* file not completely read in */
        ...
    }
    rclose(rp);

If you use recin in your program, check for errors after your last use of 
recin or just before you exit your program.

    if (rerror(recin)) {
        /* error occurred on recin stream */
        ...
        exit(EXIT_FAILURE);
    }
    exit(EXIT_SUCCESS);


2.4 The rsetfldstr Function Clears Error and End-of-File Indicators

The rsetfldstr function has a side effect in that it internally calls 
the rclearerr function, which clears the error and end-of-file indicators.  
The rationale for this is as follows:  

    1. The rsetfldstr function is used in the callback error function to 
       correct data errors.  Even if used elsewhere, its purpose is to 
       force-feed a data value to the program.
       
    2. When the callback error function returns, the recio library 
       functions will only read the replacement value if the error 
       and end-of-file indicators are clear.
    

2.5 The rsetrecstr Function Clears Error and End-of-File Indicators

The rationale is the same as for the rsetfldstr function; see section 2.4.



3.0 FIELDS

3.1 Field and Text Delimiters

Field separator and text delimiter characters must be ASCII characters 
other than the null character, which the C language uses to mark the end 
of a string.

If a delimiter is set to the space character, it is taken to mean any
white space: space, tab, etc.  See the documentation that comes with your 
compiler for the isspace() function.
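
The white-space rule can be pictured with a small helper.  This is only a 
sketch of the matching rule described above, not recio's own code; the 
name is_separator is hypothetical.

```c
#include <ctype.h>

/* Hypothetical helper (not part of recio): returns nonzero if ch
   counts as a delimiter when the delimiter is set to delim.
   A space delimiter matches any white space accepted by isspace(). */
int is_separator(int ch, int delim)
{
    if (delim == ' ') {
        /* cast avoids undefined behavior for negative char values */
        return isspace((unsigned char)ch) != 0;
    }
    return ch == delim;
}
```

With this rule a tab matches a space delimiter, but a space does not 
match a comma delimiter.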

Delimiters around text are optional; no error is generated if they are 
missing.  However, text delimiters are needed if the string contains a 
field separator character.  Also, no harm is done if text delimiters 
are put around non-text fields.

Field and text delimiters apply only to character delimited fields.


3.2 String Fields

Empty strings are legal; no error is generated if there is nothing in a 
string field.  All other types of fields must have something in them or 
a missing data error is generated.

If you do this:

    /* usually bad */
    char *strptr = rgets(rp);

strptr points to the field buffer, whose contents change every time a new 
field is read.  Instead, copy the data into your own string.  Note that a 
fixed-size copy such as the one below could truncate your data if the 
field buffer has expanded.

    char str[FLDBUFSIZ+1];
    
    /* could lose data if field buffer has expanded */
    strncpy(str, rgets(rp), FLDBUFSIZ);
    str[FLDBUFSIZ] = '\0';

To avoid truncation, allocate memory for your strings dynamically.  The 
macros scpys and scats allow you to dynamically copy and concatenate 
strings.  To use the scpys and scats macros, you will need to (1) set all 
string pointers to NULL when declaring them, and (2) free your strings when 
finished with them.

    char *str=NULL;
    ...
    scpys(str, rgets(rp));
    ...
    free(str);
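
To see why the pointer must start out as NULL, consider how a dynamic 
copy can be written in plain C.  The function below is a hypothetical 
stand-in for the idea, not the definition of scpys; it assumes only the 
standard realloc behavior, where a NULL pointer makes realloc act like 
malloc and an uninitialized pointer is undefined behavior.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in (not recio's scpys): copy src into *dst,
   resizing *dst to fit.  *dst must be NULL or a pointer previously
   returned by malloc/realloc.  Returns the new string on success,
   or NULL on failure, in which case *dst is left unchanged. */
char *copy_string(char **dst, const char *src)
{
    size_t len = strlen(src) + 1;     /* include the terminating null */
    char *tmp = realloc(*dst, len);   /* realloc(NULL, n) acts like malloc(n) */

    if (tmp == NULL) {
        return NULL;
    }
    memcpy(tmp, src, len);
    *dst = tmp;
    return tmp;
}
```

The same pointer can be reused for fields of any length, and a single 
free() releases it when you are done.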



4.0 FINE TUNING

4.1 Better Use of Heap Space

If you are tight on memory, you can fine tune recio for your application 
by doing the following:

1. Set ROPEN_MAX to the minimum number of files you need open simultaneously.
   Note that recin is always open and must be included in the count.

2. Use the rsetfldsiz() and rsetrecsiz() functions to set the maximum 
   field and record sizes needed for that record stream.  Call these 
   functions before the first field or record is read from the file.  
   To determine the maximum size of the record buffer, count the 
   characters in the longest line of the file to be read, including 
   the newline.  To determine the maximum size of the field buffer, 
   count the characters in the longest field in the record.  If the 
   longest field contains text delimiters, a trailing field delimiter, 
   or white space between the trailing text delimiter and the trailing 
   field delimiter, include these as part of the size.
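
If you do not know the longest line in advance, you can measure it with 
stdio before opening the file with recio.  The helper below is a minimal 
sketch; longest_line is a hypothetical name and is not part of recio.

```c
#include <stdio.h>

/* Hypothetical helper (not part of recio): returns the number of
   characters in the longest line of fp, counting the newline.
   Reads from the current file position to end-of-file. */
size_t longest_line(FILE *fp)
{
    size_t longest = 0;
    size_t current = 0;
    int ch;

    while ((ch = getc(fp)) != EOF) {
        current++;
        if (ch == '\n') {
            if (current > longest) {
                longest = current;
            }
            current = 0;
        }
    }
    /* the last line may lack a trailing newline */
    if (current > longest) {
        longest = current;
    }
    return longest;
}
```

The result is a candidate value for rsetrecsiz(); check recio.h for the 
exact argument types that function expects.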



5.0 IDEAS FOR EXPANDED CAPABILITIES


5.1 Additional Types of Input Functions

The macros rget_fn, rcget_fn, etc. are used to define functions that get 
numerical input.  By developing the appropriate conversion functions, 
one could expand recio to get other types of data, such as times and 
dates.
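
As an illustration of what such a conversion function might look like, 
the sketch below converts a field string of the form YYYY-MM-DD.  The 
type simple_date and the function str_to_date are hypothetical, and the 
signature that the rget_fn macro actually requires is not shown here; 
consult recio.h before adapting it.

```c
#include <stdio.h>

/* Hypothetical date type (not part of recio). */
struct simple_date {
    int year;
    int month;   /* 1-12 */
    int day;     /* 1-31 */
};

/* Hypothetical conversion function: parse "YYYY-MM-DD" from str.
   Returns 0 on success, -1 if the text is not a plausible date.
   Only coarse range checks are made; days per month are not verified. */
int str_to_date(const char *str, struct simple_date *out)
{
    int y, m, d;
    char extra;

    /* the trailing %c detects unwanted characters after the day */
    if (sscanf(str, "%d-%d-%d%c", &y, &m, &d, &extra) != 3) {
        return -1;
    }
    if (m < 1 || m > 12 || d < 1 || d > 31) {
        return -1;
    }
    out->year = y;
    out->month = m;
    out->day = d;
    return 0;
}
```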


5.2 Wrapper Functions and Macros

If you define wrapper functions or macros that supply a default value 
when the record pointer is NULL, then the same section of code can either 
read the file or supply a set of default values.

REC *rp = ropen("file", "r");
/* proceed if the file opened, or if it simply does not exist */
if (rp || errno == ENOENT) {
    /* read data using wrapper functions with default value */
    ...
    if (rp) rclose(rp);
}


5.2.1 Default Value

If your application cannot find a data file, you may want to use a set of
built-in default values.  This could be a good strategy if your application
uses configuration files.  Note that wrappers will be needed on almost every 
recio function that occurs after the ropen function, including rclose, 
rsetrecsiz, rsetfldsiz, rsetfldch, and rsettxtch, to prevent reporting a
null record pointer to your callback error function (or you will first have 
to test for a null record pointer, such as "if (rp) rclose(rp)").

Example Function

The rdgeti function gets an integer from an opened data file or gets the
default value if the data file has not been opened (NULL pointer).

/* if file open, read value from file; else use default value */
int rdgeti(REC *rp, int defval) {
    return (rp ? rgeti(rp) : defval);
}

Example Macro

You can easily rewrite the rdgeti function as a macro.

#define rdgeti(rp, defval) ((rp) ? rgeti(rp) : (defval))


5.2.2 Validated Range

In order to validate data, you need to make certain that the value read
from the file is within established limits.  You may want to add functions 
that post the range values to an internal data clipboard which your callback 
error function can access.  If you are letting users correct data on the 
fly, convert the minimum and maximum values to strings, and post pointers 
to the strings.
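
One possible shape for such a clipboard is sketched below.  All of the 
names are hypothetical and none of them belong to recio; the sketch keeps 
string copies of the limits so that a callback error function can show 
them to the user.

```c
#include <stdio.h>

/* Hypothetical clipboard (not part of recio): the limits of the value
   currently being validated, stored as strings.  32 characters is
   ample for the decimal text of an int. */
static char range_min[32];
static char range_max[32];

/* Post integer limits to the clipboard before reading a value. */
void post_int_range(int min, int max)
{
    sprintf(range_min, "%d", min);
    sprintf(range_max, "%d", max);
}

/* Accessors intended for use inside a callback error function. */
const char *posted_min(void) { return range_min; }
const char *posted_max(void) { return range_max; }
```

A range-checking wrapper could call post_int_range(min, max) just before 
reading the value, so that the limits are available if an error is 
reported.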

Example Function

The rrgeti function gets an integer and validates that the integer is within
the established range.

int rrgeti(REC *rp, int min, int max) {
    int result;
    
    result = rgeti(rp);
    if (result < min || result > max) {
        rseterr(rp, R_ERANGE);
    }
    return result;
}


5.2.3 Default and Range

You may want to combine default value and range validation into one function.

Example Function

The rdrgeti function gets an integer from an opened data file or gets the
default value if the data file has not been opened.  The function validates 
that the integer is within the established range.

int rdrgeti(REC *rp, int defval, int min, int max) {
    int result;

    if (!rp) {
        /* file not open: use the default value */
        result = defval;
    } else {
        result = rgeti(rp);
    }
    /* validate the value, whichever source it came from */
    if (result < min || result > max) {
        rseterr(rp, R_ERANGE);
    }
    return result;
}
