Clipper Support Bulletin #9
Clipper supported file structures

Copyright (c) 1991, 1992 Nantucket Corporation.  All rights reserved.


Version:  Clipper 5.0, version 5.01
Date:     20th February, 1992
Revised:  22nd May, 1992
Status:   Active

================================================================================

This Support Bulletin covers the following topics:

   1. DBFNTX/DBFNDX database (.dbf) file format
       1.1. File description record
       1.2. Field descriptor table
       1.3. Character fields
       1.4. Numeric fields
       1.5. Logical fields
       1.6. Date fields
       1.7. Memo fields
   2. DBFNTX/DBFNDX memo (.dbt) file format
   3. DBFNTX index (.ntx) file format
   4. Memory (.mem) file format

================================================================================
1. DBFNTX/DBFNDX database (.dbf) file format
   
   Database files supported by the Clipper 5.0 DBFNTX and DBFNDX
   database drivers are standard (.dbf) files supported by all
   previous versions of Clipper as well as dBASE III and dBASE III
   PLUS.
   
   DBFNTX/DBFNDX database (.dbf) files include a header describing the
   file structure and field specifications, data records, and an end
   of file mark (1Ah).  The header consists of three sections: a file
   description record, one or more field descriptor records, and an
   end of header mark (0Dh).
   
   Note: All number values in this Support Bulletin are expressed in
   decimal unless otherwise noted.
   
   ----------------------------------------------------------------------------
   1.1. File description record
   
   The first record in a database (.dbf) header is 32 bytes in length
   and contains information describing the file as follows:
   
   Table: File description record
   ------------------------------------------------------------------
   Offset      Format         Contents
   ------------------------------------------------------------------
   0           03h or 83h     Signature Byte:
                              03h - (.dbf) with no memo (.dbt) file
                              83h - (.dbf) with memo (.dbt) file
   1           Year           Last update year without century
   2           01 to 12       Month of last update
   3           01 to 31       Day of last update
   4-7         Long           Number of records
   8-9         Word           Location in file where data begins
                              (START)
   10-11       Word           Record length (field sizes plus 1)
   12-31       N/A            Reserved
   ------------------------------------------------------------------

   Note: When a database file is created, bytes 12 through 31 are NUL
   filled.  Once a database (.dbf) file exists, care must be taken
   NOT to change reserved values, since dBASE III PLUS uses these
   values even though the Clipper DBFNTX/DBFNDX database drivers do
   not.
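
   The table above maps directly onto a fixed 32-byte structure.  The
   following Python sketch is one way to decode it.  It assumes
   little-endian (Intel) byte order, which this bulletin does not
   state explicitly, and the function and key names are illustrative
   only.

```python
import struct

# 32-byte file description record, little-endian (assumed):
# signature, year, month, day, record count (Long), START (Word),
# record length (Word), then 20 reserved bytes that are skipped.
DBF_HEADER = struct.Struct("<4BLHH20x")

def parse_dbf_header(raw):
    """Decode the first 32 bytes of a (.dbf) file into a dict."""
    sig, yy, mm, dd, count, start, reclen = DBF_HEADER.unpack(raw[:32])
    if sig not in (0x03, 0x83):
        raise ValueError("unexpected signature byte: %02Xh" % sig)
    return {
        "has_memo": sig == 0x83,       # 83h means a (.dbt) file exists
        "year": yy,                    # stored without century
        "month": mm,
        "day": dd,
        "records": count,
        "start": start,                # where record data begins
        "record_length": reclen,       # field sizes plus 1
    }
```

   The reserved bytes are deliberately skipped rather than decoded,
   in keeping with the note above about leaving them unchanged.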
   
   ----------------------------------------------------------------------------
   1.2. Field descriptor table
   
   Following the file description record, there is a table of field
   descriptor records beginning at byte 32.  Each field descriptor
   record is 32 bytes in length and defines the attributes of a
   database field: name, data type, length, and decimals for numeric
   fields.
   
   Table: Field descriptor definition
   ------------------------------------------------------------------
   Offset      Format         Contents
   ------------------------------------------------------------------
   0-10        Character      Field Name (printable string; no spaces;
                              NUL terminated; NUL padded)
   11          Character      Field Type (C=character, L=logical,
                              M=memo, N=numeric, D=date)
   12-15       N/A            Reserved
   16          Unsigned byte  Field length, including decimal for
                              numerics (referred to in text as LENGTH)
                              See offset 17 for information on
                              character fields exceeding 255 bytes
   17          Unsigned byte  Numeric fields--number of decimal places
                              Character fields--most significant byte
                              (MSB) of LENGTH for fields whose lengths
                              exceed 255 bytes
   18-31       N/A            Reserved
   ------------------------------------------------------------------

   The last field structure is followed by a constant 13 (0Dh) and a
   constant 0 (00h) indicating the end of the field structures.  This
   differs from dBASE III PLUS, which does not write the 0 (00h)
   byte.
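
   As an illustration of the field descriptor table, the Python
   sketch below walks the 32-byte descriptors until it meets the end
   of header mark (0Dh).  The function name and the tuple layout it
   returns are assumptions made for this example.

```python
def parse_field_descriptors(header_bytes):
    """Return (name, type, length, decimals) tuples from a (.dbf) header.

    header_bytes holds the file contents from offset 0 up to START.
    """
    fields = []
    pos = 32                                   # table begins at byte 32
    while pos < len(header_bytes) and header_bytes[pos] != 0x0D:
        rec = header_bytes[pos:pos + 32]
        if len(rec) < 32:
            raise ValueError("truncated field descriptor")
        # Bytes 0-10: NUL-terminated, NUL-padded field name.
        name = rec[0:11].split(b"\x00", 1)[0].decode("ascii")
        ftype = chr(rec[11])                   # C, L, M, N or D
        length, decimals = rec[16], rec[17]
        if ftype == "C":
            # For character fields, byte 17 is the MSB of LENGTH.
            length, decimals = length + 256 * decimals, 0
        fields.append((name, ftype, length, decimals))
        pos += 32
    return fields
```

   Stopping at the 0Dh byte means the sketch reads files written with
   or without the trailing 00h byte.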
   
   At START (defined above), the record data begins.  Each field is
   stored sequentially according to the order in the header.  Before
   each record is a deleted flag which is either a space or an
   asterisk ("*").  If the deleted flag is an asterisk, the record is
   assumed to be deleted.  The record length specified in the header
   includes the deleted flag.  Below is a brief definition of each
   field type, and the method of storage employed.
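
   The record layout described above can be sketched in Python as
   follows.  The helper names are hypothetical, and the offset
   calculation assumes that records are numbered from 1 and stored
   contiguously from START, which follows from the fixed record
   length but is not spelled out in this bulletin.

```python
def record_offset(start, record_length, recno):
    """Byte offset in the (.dbf) file of record number recno (1-based)."""
    return start + (recno - 1) * record_length

def split_record(record, fields):
    """Split one raw record into (deleted, {field name: raw bytes}).

    fields is a list of (name, type, length, decimals) tuples in
    header order.
    """
    deleted = record[0:1] == b"*"      # first byte is the deleted flag
    values = {}
    pos = 1                            # field data follows the flag
    for name, _ftype, length, _decimals in fields:
        values[name] = record[pos:pos + length]
        pos += length
    return deleted, values
```

   The field values are returned as raw bytes because their
   interpretation depends on the field type, as described in the
   sections that follow.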
   
   ----------------------------------------------------------------------------
   1.3. Character fields
   
   Character fields may contain any ASCII character from 0 to 255,
   and are always of a static length (defined by LENGTH in the field
   structure definition).  Note that the string is not NUL
   terminated.  An empty character field contains all spaces (32,
   20h).
   
   ----------------------------------------------------------------------------
   1.4. Numeric fields
   
   Numerics are stored as character equivalents with the decimal
   included.  There is no decimal character if the number of decimal
   places is zero.  Empty numerics are padded with leading spaces,
   have a zero before the decimal point, and zero padding after the
   decimal point to the end of the field.  An empty numeric of length
   9 with 2 decimals would look like this: "     0.00".
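
   A short Python sketch of this storage scheme is shown below.  The
   function names are illustrative, and returning a float for values
   with a decimal point is a choice made for this example rather than
   anything the format requires.

```python
def encode_numeric(value, length, decimals):
    """Format value as a right-justified numeric field of LENGTH bytes."""
    # "%*.*f" pads with leading spaces and omits the decimal point
    # entirely when decimals is zero.
    text = "%*.*f" % (length, decimals, value)
    if len(text) > length:
        raise ValueError("value does not fit in %d characters" % length)
    return text.encode("ascii")

def decode_numeric(raw):
    """Convert the stored characters back to an int or a float."""
    text = raw.decode("ascii").strip()
    return float(text) if "." in text else int(text)
```

   For example, encode_numeric(0, 9, 2) produces the empty numeric
   shown above, five spaces followed by 0.00.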
   
   ----------------------------------------------------------------------------
   1.5. Logical fields
   
   Logical fields are stored as a single character.  "T" is stored
   for true.  All other characters are assumed to equate to a false
   value (though "F" is most likely to be used).  An empty logical
   contains an "F" character.
   
   ----------------------------------------------------------------------------
   1.6. Date fields
   
   Date fields are exactly eight characters in length.  A date field
   is stored in the format YYYYMMDD where YYYY = Year with century,
   MM = Month, and DD = Day.  10/20/82 would be stored as "19821020".
   
   ----------------------------------------------------------------------------
   1.7. Memo fields
   
   Memo fields are always ten bytes in length.  The ten bytes hold a
   pointer to the first 512-byte block in a (.dbt) file that contains
   the memo text.  The pointer is stored as ASCII digits; a field of
   all spaces indicates that there is no memo text for that field.
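
   As a sketch of how the pointer might be interpreted, the Python
   below converts the ten-byte field to a block number and then to a
   byte offset in the (.dbt) file.  The function names are
   illustrative only.

```python
def memo_block(raw):
    """Return the block number held in a memo field, or None if empty."""
    text = raw.decode("ascii").strip()
    return int(text) if text else None   # all spaces: no memo text

def memo_offset(block):
    """Byte offset in the (.dbt) file of the given 512-byte block."""
    return block * 512
```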
   
================================================================================
2. DBFNTX/DBFNDX memo (.dbt) file format
   
   If a database (.dbf) file is defined containing a memo field, it
   has an accompanying memo file with the same name and a (.dbt)
   extension.  The memo (.dbt) file contains the actual
   variable-length memo field data, whereas each memo field in the
   database file contains the memo file record number where that
   memo field value begins in the memo (.dbt) file.
   
   DBFNTX/DBFNDX memo fields can be up to 64K in length and are stored
   identically to dBASE III and dBASE III PLUS.  Note that dBASE III
   and dBASE III PLUS memo field values can be up to 512K in length.
   
   DBFNTX/DBFNDX memo files consist of a series of 512-byte records:
   a header record followed by one or more data records.  The header
   record has the following format:
   
   Table: Memo header record
   ------------------------------------------------------------------
   Offset      Format    Contents
   ------------------------------------------------------------------
   0-3         Long      Number of 512-byte blocks in the file,
                         including the header (also the next available
                         record)
   4-511       Unused    Reserved
   ------------------------------------------------------------------

   Each memo field value is stored as a series of memo file records
   terminated with a Ctrl-Z (1Ah) character.  If a memo field value
   is not an exact multiple of 512 bytes in length, the unused
   remainder of its last record is padded to 512 bytes with spaces.
   Note that the last record in the memo file is not padded and may
   be less than 512 bytes in length.
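
   Reading a memo value therefore amounts to seeking to its first
   block and scanning for the terminator.  The Python sketch below
   works on the whole (.dbt) file held in memory.  The names are
   hypothetical, and little-endian byte order is assumed for the
   header value.

```python
import struct

BLOCK_SIZE = 512

def next_available_block(dbt):
    """Return the block count stored in bytes 0-3 of the header record."""
    return struct.unpack_from("<L", dbt, 0)[0]

def read_memo(dbt, block):
    """Return the memo text that starts at the given block number."""
    start = block * BLOCK_SIZE
    end = dbt.find(b"\x1a", start)     # Ctrl-Z terminates the value
    if end == -1:
        raise ValueError("memo value is not terminated by 1Ah")
    return dbt[start:end]
```

   Scanning for the terminator, rather than reading whole blocks,
   also copes with an unpadded final record.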
   
================================================================================
3. DBFNTX index (.ntx) file format
   
   The Clipper DBFNTX database driver uses a modified B+ tree style
   index structure.  Each index (.ntx) file consists of pages that
   are 1024 bytes long.  The first page is a header with the
   following structure:
   
   Table: First (.ntx) page
   ------------------------------------------------------------------
   Offset      Format    Contents
   ------------------------------------------------------------------
   0-1         Word      Signature Byte: 03 = Index file
   2-3         Word      Clipper indexing version number
   4-7         Long      Offset in file for first index page
   8-11        Long      Offset to an unused key page
   12-13       Word      Key size + 8 bytes (distance between key
                         pages)
   14-15       Word      Key size
   16-17       Word      Number of decimals in key (if numeric)
   18-19       Word      Maximum entries per page
   20-21       Word      Minimum entries per page or half page (The
                         first, or root page of an index has a minimum
                         of 1 entry regardless of this value)
   22-277      256 bytes Key expression, followed by CHR(0) bytes
   278         Byte      1 if index is unique, 0 if not
   279-1023    745 bytes Filler (pads the page to 1024 bytes)
   ------------------------------------------------------------------
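
   The header page can be decoded with a single fixed layout, as in
   the Python sketch below.  Little-endian byte order is assumed, and
   the function and key names are chosen for this example only.

```python
import struct

# Bytes 0-278 of the first (.ntx) page, little-endian (assumed):
# five leading values (Word, Word, Long, Long, Word), four further
# Words, the 256-byte key expression, and the unique flag.
NTX_HEADER = struct.Struct("<HHLLHHHHH256sB")

def parse_ntx_header(page):
    """Decode the first 1024-byte page of an (.ntx) file into a dict."""
    (signature, version, first_page, unused_page, item_size, key_size,
     decimals, max_entries, min_entries, expr, unique) = \
        NTX_HEADER.unpack_from(page, 0)
    return {
        "signature": signature,
        "version": version,
        "first_page": first_page,      # file offset of first index page
        "unused_page": unused_page,    # file offset of an unused page
        "item_size": item_size,        # key size + 8
        "key_size": key_size,
        "decimals": decimals,
        "max_entries": max_entries,
        "min_entries": min_entries,
        # The expression is followed by CHR(0) bytes; keep the text only.
        "key_expression": expr.split(b"\x00", 1)[0].decode("ascii"),
        "unique": unique == 1,
    }
```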

   Subsequent index key pages consist of the following structure:
   
   Table: Subsequent index pages
   ------------------------------------------------------------------
   Offset      Format    Contents
   ------------------------------------------------------------------
   0-1         Word      Number of used entries on this page. This
                         number will be between the minimum and
                         maximum defined in the header, unless it is
                         the root page.
   2           Unsigned  An array of unsigned words (two bytes each)
               ptrs      begins here. The array length is equal to the
                         maximum number of key entries per page + 1.
                         Each element holds the offset into the page
                         where a key value (ITEM) is located.
   Remainder of page     ITEM entries (described below).
   ------------------------------------------------------------------

   Following the array of unsigned pointers to offsets in the page
   are the key value entries.  These so-called ITEM entries describe
   a key value and its record's position in the database.  The key
   value is always stored as a character string, regardless of its
   type.  The structure of an ITEM entry is as follows:
   
   Table: ITEM entry
   ------------------------------------------------------------------
   Offset      Format    Contents
   ------------------------------------------------------------------
   0-3         Long      Pointer to a page in the index file,
                         containing keys that are prior to this key
   4-7         Long      Record number in controlling database file
   8           Character Key value.  This field begins at offset 8
                         and continues for the length of the key.
                         Numerics are padded with leading zeros
   ------------------------------------------------------------------
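
   A single ITEM entry can be decoded as in the Python sketch below.
   It assumes that each Long occupies four bytes, so that the key
   value starts eight bytes into the entry; this agrees with the
   "Key size + 8 bytes" value stored in the header page.
   Little-endian byte order is also assumed, and the function name is
   hypothetical.

```python
import struct

def parse_item(page, offset, key_size):
    """Decode the ITEM entry found at the given offset within a page.

    Returns (child_page, record_number, key), where child_page is the
    file offset of the page holding keys prior to this one.
    """
    child_page, record_number = struct.unpack_from("<LL", page, offset)
    key_start = offset + 8             # two four-byte Longs precede the key
    key = page[key_start:key_start + key_size]
    return child_page, record_number, key
```

   The key is returned as raw bytes because it is always stored as a
   character string, whatever the type of the key expression.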

   For more information about traversing Clipper (.ntx) files,
   consult the following books:
   
   Spence, Rick.  Clipper Programming Guide, Second Edition.
   (Microtrend Books, Slawson Communications, Inc.;
   ISBN 0-915391-41-4)
   
   Tenenbaum, A.M. et al., Data Structures Using C. (Prentice-Hall).
   
================================================================================
4. Memory (.mem) file format
   
   The value of each memory variable in a (.mem) file is preceded by
   a 32-byte identifier that has the following structure:
   
   Table: Memory variable identifier
   ------------------------------------------------------------------
   Bytes       Contents
   ------------------------------------------------------------------
   0-10        A null-terminated string containing the variable name
   11          Variable type
   12-15       Reserved
   16          Numeric--Number of whole digits
               Character--Low-order byte of variable length
   17          Numeric--Number of decimal digits
               Character--High-order byte of variable length
   18-31       Reserved
   ------------------------------------------------------------------
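
   The identifier can be picked apart as in the Python sketch below.
   Because this bulletin does not list the values used for the
   variable type byte, the sketch returns that byte undecoded and
   offers both readings of bytes 16 and 17.  All names are
   illustrative.

```python
def parse_mem_identifier(raw):
    """Decode a 32-byte (.mem) variable identifier into a dict."""
    if len(raw) < 32:
        raise ValueError("identifier must be 32 bytes long")
    return {
        # Bytes 0-10: null-terminated variable name.
        "name": raw[0:11].split(b"\x00", 1)[0].decode("ascii"),
        # Byte 11: variable type, left undecoded (values not listed here).
        "type_byte": raw[11],
        # Numeric reading: (whole digits, decimal digits).
        "numeric_digits": (raw[16], raw[17]),
        # Character reading: low-order byte plus high-order byte.
        "character_length": raw[16] + 256 * raw[17],
    }
```

   Which of the two readings applies depends on the variable type.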

   Additional information about (.mem) files, including the C source
   code to dump a (.mem) file, can be found on page 575 of Rick
   Spence's book "Clipper Programming Guide - 2nd Edition."
   

                              *  *  *
