This article is reprinted from the May 1990 edition of TechNotes/dBASE
IV.  Due to the limitations of this media, certain graphic elements
such as screen shots, illustrations and some tables have been
omitted.  Where possible, reference to such items has been deleted. 
As a result, continuity may be compromised.  

TechNotes is a monthly publication from the Ashton-Tate Software
Support Center.  For subscription information, call 800-545-9364.

The Myth of Multi-File-Append
Joe Stolz

There has always been much discussion and confusion in the relational
database world about multi-file append: the capability of appending to
several open files simultaneously.  If you are at odds about the
issue, the next several pages will be enlightening.

Related files  Why Bother?

Files can become related when two or more data files contain some
common field that allows the information in one to be "linked"
logically to the data in one or more other files.  A common example of
this is an employee file of names and addresses which relates to a
time card file of worked hours for every employee.  There is a unique
employee number for each employee which is duplicated in both files. 
Each of the daily records of an employee's start and end times also
contain the employee's number.  This setup, where one record in one
file points to several in a second file, is called a one-to-many
relation.  

Why do we want a one-to-many relationship and why do we want to relate
or link the two files together?  The answer is simple.  Imagine that
if we only had one single file, we'd have to type in the name,
address, phone number, etc. for each employee every single day.  That
certainly would be tedious not to mention inefficient.  A relational
setup, on the other hand, allows us to enter the static personal data
only once in one file.  At any given time we can view the data from
both files as if they were one big joined file.  

Further Compelling Evidence

So, you may ask, why not make one big joined file and find some way to
"ditto" or copy the name and address from the previous record to each
new record?  This would take the tedium out of data entry and avoid
the relation.  However, this method would make changing personal data
much more complex if the data exists in dozens, maybe hundreds of
records  you would have to update every record and there is still 
room for error in data entry.   If all the personal data is in one
record, it means only a single record has to be changed.  If the
employee and time card files are related, the changes appear as an
instantaneous update of the employee's personal data for every one of
the corresponding daily time records.

If that isn't a compelling enough argument, consider the fact that
copying the employee's personal data along with the daily time records
will take up much more disk space than using the relational method.  I
don't know anyone today who has enough hard disk space, especially for
unnecessarily duplicated data.  
For a detailed discussion of how and when to split files to optimize
disk usage and to minimize updating, see Programming with dBASE IV,
Chapter 4 and Advanced Topics, Chapter 2.  The technique involved is
the science of data normalization.

Why is it Read-Only?
Now that we are convinced that relating files is the way to go, how do
we get data into both the files simultaneously?  A new user will soon
discover  that, although there is a query facility in the Control
Center to relate files, you cannot make these linked views EDIT-able,
much less  APPEND-able!  What I mean is that when you link two files
in a  query, the resultant view is flagged as read only; the data can
be seen, but not altered.  Why does dBASE IV enforce this
limitation?  
Let me amplify this allegation for those unfamiliar with the way
queries function.  You can create a query that contains two file
skeletons by placing a linking variable under the column for the
common field in both files which causes the files to be related.  When
you view the data, the files appear as one big, joined file.  dBASE IV
automatically makes the parent file's data appear as if it were
dittoed!  This means that the Query generator is automatically doing
the hard work for you, duplicating the repeated data record by record,
yet giving you the advantages of separate files in terms of those
issues described previously.  It's a simply wonderful feature. 
However, dBASE IV marks the related, displayed data as "read only"
making appending or editing impossible.  Why dBASE IV causes this to
occur is the subject of computer science courses and books (ref.
C.J.Date's book A Guide to DB2 (Addison-Wesley, 1984 pp. 129-134).
Explaining the Dilemma
A simple example will highlight the dilemma of modifying the data of
just such a view.  Let's use the above example and visualize a single
table that has employee names in column one of the table, the address
in column two, other personal data in subsequent columns, and start
and end times in the last columns.  
As I mentioned, dBASE IV is copying the personal data over and over
from the single employee record in the employee file at the same time
that it is combining it with each unique day's record in the time card
file.  The result is what appears as duplicated personal data.  What
if I were able to change the employee's name or address in even one of
this employee's personal data records?  Since only one record has been
changed, the implication is that only that specific record is
inaccurate or inconsistent with other relative records in that file. 
But in the underlying files, a single record in the personnel file is
potentially responsible for other records of personal data.  Does it
make sense to change one record out of many, and does that single
change mean that you want all applicable records to be changed?  Part
of the data in each row of a view comes from one file, part from the
other.  What does it really mean to change data that is joined in a
view? 
This thorny problem is handled in a safe fashion.  dBASE IV marks all
data in the underlying files as read only to prevent accidental or
ambiguous changes to the files.  In fact, this is the recommended and
accepted way that related files have their data protected in both PC
and mainframe database software products.  It is important to
understand the basic concepts involved in queries, and why it is
logical to be unable to add data to related files that are set up as
linked views.
What Are the Alternatives?
So, how are we going to get the underlying data into each file of the
view?  There's got to be an accepted method, and in fact, there is. 
We do this by adding data to each file separately.  

To further clarify the problem, let's use the previous example.  I
expect to be able to locate an employee by name, show this data on the
screen, and also to be able to add the daily time card information to
the daily record file.  However, the view I would set up, and the data
entry form that displays data from both files will not allow me to
enter data to the two files.  As discussed already with regard to
editing data, it's impossible to add data to two files
simultaneously.  If we were able to add new records to the view, would
we be adding new records to both files simultaneously?   Actually,
no.  Even if appending were possible, only the currently selected file
would be appended to.  Furthermore, the same thorny problem still
applies:  how do you add one record in one file and many records in
the second file?  You may have a format file that shows fields in
alternate database filess but since those files are not the currently
selected database, they cannot be extended or appended to.
Knowing When to Add Records 
In the case of already existing employees, the personal data of the
employee already exists and need not be duplicated, so data needs to
be added to only one of the files.  However, in the case of new
employees, we need to add records to both files initially.  This
ambiguity, to sometimes add data to one file, sometimes to both,
cannot be intuitively managed by a computer program that needs to do
the same actions mechanically, over and over.  You could have two
appending views, one for adding records in the personnel file and one
for time card records but that is the same thing as adding data to the
two files separately and "un-relatedly"!  

This clearly illustrates the dilemma of multi-file append routines. 
Everyone wants them.  Everyone demands them and some may cite dBASE IV
for not providing a facility to automatically produce them.  But
modern computer science on this level is not yet accommodating the
protean characteristics of this problem.  
Let's examine a system of separate data entry routines.  When a new
employee arrives, we open the Employee file and add all the pertinent
personal information.  When this employee's personal data changes, we
have only one single record to change.  Day by day, as time card
information is added, we simply enter a unique employee number and
their in and out times.  There is no need to search for an employee in
the employee file and no need to place time card data into any
particular order in the time card file either.  
Indexing the time card file on the employee number will group all
records for a particular employee together.  In addition, using QBE we
can make a view that displays all the information from the two files
in a very organized fashion.  We can index the Employee file and then
relate this file to the time card file using the one-to-many concept. 
Then, a search for a particular employee is done to see all the data
in one group.  Better still, we can create reports that can calculate
the total number of hours worked using calculateed fields that
sub-total by group (or employee).  

By using a calculated field, this newly derived total hours figure can
be multiplied by the worker's hourly rate kept with other personal
data.  When you see what you can do with the system, you will find
obvious advantages to keeping your data in the manner presented here. 

There should be no complications using this method.  All you have to
do is open the file that needs to be updated and begin appending new
data to that file.  The action of relating files can be performed when
it comes time to generate reports.  This is where  a relation really
becomes effective. 
What if the data to be entered is organized in such a fashion that
there are several child records for each family?   Further, this stack
of data contains both new parent file and child file data.  We would
need to enter data into both files with this scenario. Going through
the input data successive times would be a waste of time. A little
automation is in order in the form of a program that is intelligent
enough to help us through our multi-file problems.  
Back to School
I'd now like to discuss a simple program that aids in adding new
records to two files at a time.  The example will be a new one: adding
students names to a school list.  A family may have more than one
child in the school.  We will be adding a new family's information
only once, and specific information about the individual siblings in
the child file.  
The program example that is included in this article enters data into
one file at a time.  Although multi-file appends are impossible, you
can write a program whose results appear as the technical equivalent
of what you might expect to get from a multi-file append.  Let's look
now at a programming example that automates the process of shuttling
between two files.

We have two files in this system: Parents and Children.  We only need
to add family information once.  Each child will have his own record
in the Children file.
Our goal is to simplify our data entry procedures.  We will be
receiving a stack of papers, one per child, sorted by family.  We want
to go through the stack only once.  Only a program can help us in this
case.  Without a program, we'd have to go through the stack twice,
once to enter each child in the Children file, and once to enter a
single entry per family.  This is a viable way to do data entry, but
it's not very efficient.

About the Program

The program is set up so that it contains the minimum number of
commands to make the system work.  There are two data entry screens in
the system, one for each file (see Figures 1 and 2).  The program
switches files without you noticing, and at the same time, switches
the associated format file.  The one trick I have used in the format
file is to include a memory variable in the format screen for the
Children file to prompt you if you want to add another record/sibling
in the same family.  

Note the absence of a SET RELATION command in the program.  It's
totally unnecessary in this system because we are adding data to each
file separately.  I save the key field (the one on which the
relationship is linked) information in a memory variable and replace
this variable immediately in the Children file.  I do this because the
linking of the two files is crucial.  Bypassing the user to edit/enter
this field themselves and possibly mistyping the key data insures file
relation integrity. 

The way that you place the memory variable "more" into the Children
format file is simple but not obvious.  Format files offer the
inclusion of fields, calculated fields and memory variables.  Pull
down the Fields:Add field menu option and type the word MORE as the
name of the memory variable.  You then can place the memory variable
at the appropriate location on the screen preceded by a prompting
question.  

Later on in the program, we initialize or create this variable.  We
will use it to test  whether the data entry person desires to add
another child/sibling record to be linked to the current family
(Parents) record.  If he enters "Y" he can enter another child.  If
"N" is entered, the program loops and another parent record can be
added.  

Finally, I've added a way to drop out of the system once all data
records have been added.  I am assigning a special keystroke to set a
flag that drops us out of the system.  Once out of the append mode,
the program then relates the files, allowing you to see the resultant
related view.

Wrap-up

It's that simple.  You can use this example to automate your data
entry system.  It's even easier to create a program that works
similarly but adds only one record to each file at a time.  

You can easily modify the program to work with data that is not
indexed or in a sorted order.  In that scenario, your program must
first issue a SEEK to find the parent's name.  If it's there, you
proceed to add the child's name.  If it's not, you add the parent's
name, then move on to the child's.  It's only slightly more
complicated than the system I have shown.

Try out the example.  You can gain a greater understanding of the
concept of relational databases in general and the ease of use of such
a powerful system.      

Listing 1: 1ToMany.PRG

* Program ...:  1ToMany.PRG
* Author ....:  Joe Stolz
* Date ......:  November 21, 1989
* Notes......:  This routine puts one blank record in the file, 
*               then allows edit of that parent record in the parent
file
*               It then switches to the children file and allows
multiple children to be added.

USE Parents in 1 
USE Children IN 2 ORDER File no
SET FORMAT TO Parents

DO WHILE .T.                &&  Until <escape> signals the end of the
session.
        SELECT 1
        APPEND BLANK
        EDIT NEXT 1 NOAPPEND
        
        IF LASTKEY()=27     && the escape key
          DELETE NEXT 1
          EXIT
        ENDIF
        
        mfile no = file no  && Memory variable containing the key expression.
        more=.T.
        DO WHILE more
                SELECT 2
                APPEND BLANK
                REPLACE Children->File no WITH mfile no
                SET FORMAT TO Children
                EDIT NEXT 1 NOAPPEND
        ENDDO
        
ENDDO

* this just displays all the data in a related table

SELECT 1
SET RELATION TO File no INTO Children
SET SKIP TO B
SET FIELD TO Lname,Address,Children->Fname,B->Grade,B->Teacher
LIST
*  EOP: 1ToMany.PRG

