
                            GEDCOM Hash Package

                     Initial Interface Design - DRAFT

                             26 December 1990


Purpose of Package

The purpose of the hash package, called gedhash, is to provide a general-purpose hash-based
table lookup facility that can store a value to be later retrieved by key.  Both key and value are
variable length and may contain binary (non-ASCII) information.
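
As an illustration of why a length accompanies every key, the sketch below hashes a key by byte count rather than by null terminator, so embedded zero bytes take part in the result.  The name demo_hash and the multiply-by-31 scheme are assumptions made for this example; this document does not state which hash function gedhash uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reduce keyLen bytes of key to a bucket index.
 * Bytes are read as unsigned so binary data hashes the same everywhere. */
unsigned long demo_hash(const char *key, size_t keyLen, unsigned long buckets)
{
    unsigned long h = 0;
    size_t i;

    assert(buckets != 0);
    for (i = 0; i < keyLen; i++)
        h = h * 31UL + (unsigned char)key[i];
    return h % buckets;
}
```

Because the loop runs to keyLen, a key such as the three bytes 'a', 0, 'b' hashes differently from the one-byte key 'a'.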

A principal use of the package is to support pointer / cross reference resolution in processing a
GEDCOM transmission.  
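
The two-pass idea behind cross reference resolution can be sketched as follows: a first pass records where each labelled record starts, and a second pass turns each pointer into that position.  This is only a sketch.  It assumes identifiers of the form @I1@, and a small fixed-size linear-probing table stands in for gedhash; the names xref_define and xref_resolve are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define XREF_SLOTS 64   /* illustrative fixed capacity */
#define XREF_MAX   24   /* longest identifier stored, including the null */

struct xref_slot { char id[XREF_MAX]; long offset; int used; };
static struct xref_slot xref_tab[XREF_SLOTS];

static unsigned xref_hash(const char *id)
{
    unsigned h = 0;

    while (*id)
        h = h * 31u + (unsigned char)*id++;
    return h % XREF_SLOTS;
}

/* Pass 1: remember where the record labelled id starts.  Returns 1 on success. */
int xref_define(const char *id, long offset)
{
    unsigned i, n;

    assert(id != NULL);
    if (strlen(id) >= XREF_MAX)
        return 0;
    i = xref_hash(id);
    for (n = 0; n < XREF_SLOTS; n++, i = (i + 1) % XREF_SLOTS) {
        if (!xref_tab[i].used || strcmp(xref_tab[i].id, id) == 0) {
            strcpy(xref_tab[i].id, id);
            xref_tab[i].offset = offset;
            xref_tab[i].used = 1;
            return 1;
        }
    }
    return 0;   /* table full */
}

/* Pass 2: resolve a pointer to a record offset, or -1 if it was never defined. */
long xref_resolve(const char *id)
{
    unsigned i, n;

    assert(id != NULL);
    i = xref_hash(id);
    for (n = 0; n < XREF_SLOTS; n++, i = (i + 1) % XREF_SLOTS) {
        if (!xref_tab[i].used)
            return -1L;
        if (strcmp(xref_tab[i].id, id) == 0)
            return xref_tab[i].offset;
    }
    return -1L;
}
```

A caller would invoke xref_define() for each record label met while reading the transmission, then xref_resolve() for each pointer once reading is complete.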

This package is designed for broad usability and for ease of understanding and maintenance.
Performance and space efficiency are good, but not as high as highly specialized or more
complicated approaches could provide.

Memory Management

Each table creates and maintains its own POOL structure from the GEDCOM library, from
which at least enough memory is allocated for the hash_tab struct that controls the table.

A memory-resident hash table expands by allocating from its own pool until a maximum
memory usage threshold is reached, at which time it is copied to a disk-resident hash table and
all memory is released except for the hash_tab struct.
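
The threshold rule can be sketched as a bookkeeping check made before each allocation.  The struct demo_tab and the function demo_reserve are hypothetical and do not reflect the real hash_tab layout; the actual copy to disk is represented only by setting a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one table; not the real hash_tab layout. */
struct demo_tab {
    size_t mem_used;    /* bytes currently allocated from the table's pool */
    size_t max_memory;  /* threshold given when the table was created */
    int    on_disk;     /* non-zero once the table has been moved to disk */
};

/* Account for an entry needing keyLen + valLen bytes.
 * Returns 1 if this call triggered the move to disk, 0 otherwise. */
int demo_reserve(struct demo_tab *t, size_t keyLen, size_t valLen)
{
    size_t need = keyLen + valLen;

    assert(t != NULL);
    assert(t->mem_used <= t->max_memory);
    if (t->on_disk)
        return 0;                       /* disk tables do not grow in memory */
    if (need > t->max_memory - t->mem_used) {
        /* A real implementation would copy every entry to a disk file here
         * and release the pool memory, keeping only the control struct. */
        t->on_disk = 1;
        t->mem_used = 0;
        return 1;
    }
    t->mem_used += need;
    return 0;
}
```

The comparison is written as need > max_memory - mem_used so that the check itself cannot overflow when the counters are near the top of their range.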

The user-controlled GEDCOM library current pool maintained via ged_set_pool() is saved by
these routines and restored before returning.
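
The save-and-restore pattern can be sketched as below.  The signature of ged_set_pool() is not given in this document, so the sketch uses stand-ins: DEMO_POOL, demo_set_pool() and demo_get_pool() are hypothetical, and the assumption that the setter returns the previous pool is made only for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct demo_pool { int id; } DEMO_POOL;  /* stand-in for POOL */

static DEMO_POOL *demo_current;  /* stand-in for the library's current pool */

/* Stand-in for ged_set_pool(): installs p and returns the previous pool. */
DEMO_POOL *demo_set_pool(DEMO_POOL *p)
{
    DEMO_POOL *old = demo_current;

    demo_current = p;
    return old;
}

DEMO_POOL *demo_get_pool(void)
{
    return demo_current;
}

/* Shape of a table routine: switch to the table's pool, work, switch back. */
int demo_table_op(DEMO_POOL *table_pool)
{
    DEMO_POOL *saved;
    int ok;

    assert(table_pool != NULL);
    saved = demo_set_pool(table_pool);

    /* Allocations made here would come from table_pool. */
    ok = (demo_get_pool() == table_pool);

    demo_set_pool(saved);   /* restore the caller's pool before returning */
    return ok;
}
```

The caller's current pool is therefore the same before and after the call, whatever the table routine allocated in between.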

The pool of a disk-resident hash table is only large enough for the hash_tab struct.

Using the gedhash Functions

You must include two header files in your source code--one for the GEDCOM library and one
for gedhash:

     #include "gedcom.h"  /* Must be included first */
     #include "gedhash.h"

The basic command-line compiler/linker command to build your file together with gedhash and
the other GEDCOM library functions is as follows:

     Microsoft C:
          
          cl /AL /Gcs myfile.c gedhash.obj gedcom.lib

Portability

This code is intended to be portable.  Testing has so far been performed on an IBM PC
compatible, compiled with Microsoft C 6.0 and with Turbo C 2.0.


                          The gedhash Functions

gedhash_create_table                                            - gedhash.c

                                     
Summary

#include "gedcom.h"
#include "gedhash.h"

HASH_TAB *gedhash_create_table(char *perm_file, LEN hash_size, LEN max_memory)


Description

If perm_file is an empty string (called with "" instead of a name), a memory-based table is
created.  If needed, it will be automatically copied into a larger temporary disk-based hash table
when allocated memory would exceed max_memory.

If perm_file references a name string, an existing hash table file of that name will be opened
and used, or a new one will be created if none exists.  The file will be permanently saved if
gedhash_close() is called.

Subsequent user operations on the hash table remain the same, whether the table is in memory
or on disk.
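
The choice between the two kinds of table can be sketched as a test on the first character of perm_file.  The helper demo_table_kind and its enum are hypothetical; the sketch handles only the empty-string case described above and makes no claim about how gedhash treats a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

enum demo_kind { DEMO_MEMORY_TABLE, DEMO_FILE_TABLE };

/* Hypothetical helper: classify a perm_file argument as the description
 * reads.  An empty string selects a memory-based table. */
enum demo_kind demo_table_kind(const char *perm_file)
{
    assert(perm_file != NULL);   /* only "" is described in this document */
    return perm_file[0] == '\0' ? DEMO_MEMORY_TABLE : DEMO_FILE_TABLE;
}
```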


Return Value

Returns a pointer to the created hash table structure, or NULL on failure.

Example

See the appendix for an example.gedhash_close(hash_tab)        - gedhash.c

                                     
Summary

#include "gedcom.h"
#include "gedhash.h"

int gedhash_close(HASH_TAB *htab)


Description

Closes a hash table file created with a permanent name, saving in the file all information
required to use the table next time.

Return Value

Returns zero if successful, non-zero otherwise.

Example

See the appendix for an example.gedhash_destroy_table          - gedhash.c

                                     
Summary

#include "gedcom.h"
#include "gedhash.h"

int gedhash_destroy_table(HASH_TAB *htab)

Description

Destroys the table, releases all memory used by it, and deletes the hash file on disk, if any.

Return Value

Returns nothing.

Example

See the appendix for an example.gedhash_insert                 - gedhash.c

                                     
Summary

#include "gedcom.h"
#include "gedhash.h"

int gedhash_insert(HASH_TAB *htab, char *key, LEN keyLen, char *val, LEN valLen)

Description

Adds a new key/value pair to the hash table.  If a pair with the same key already exists, it is
overwritten.

Return Value

Returns one if successful, zero otherwise.  Currently always returns a one.

Example

See the appendix for an example.gedhash_lookup                 - gedhash.c

                                     
Summary

#include "gedcom.h"
#include "gedhash.h"

int gedhash_lookup(HASH_TAB *htab, char *key, LEN keyLen, char *val, LEN *valLen)

Description

Finds the value in the table that was last associated with key by gedhash_insert(), and returns
the value (and its length) via the val and valLen reference parameters.

Return Value

Returns zero if successful, non-zero otherwise.
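
The calling conventions of insert and lookup can be illustrated with a small in-memory stand-in.  This is a sketch under stated assumptions, not the gedhash implementation: demo_insert and demo_lookup are hypothetical names, a linear search replaces hashing to keep the example short, and size_t is used where the interface uses LEN.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_ENTRIES 16
#define DEMO_MAX_BYTES   64

struct demo_entry {
    char key[DEMO_MAX_BYTES]; size_t keyLen;
    char val[DEMO_MAX_BYTES]; size_t valLen;
};
static struct demo_entry demo_entries[DEMO_MAX_ENTRIES];
static int demo_count;

/* Keys are compared by length and bytes, so they may contain zero bytes. */
static struct demo_entry *demo_find(const char *key, size_t keyLen)
{
    int i;

    for (i = 0; i < demo_count; i++)
        if (demo_entries[i].keyLen == keyLen &&
            memcmp(demo_entries[i].key, key, keyLen) == 0)
            return &demo_entries[i];
    return NULL;
}

/* Returns one if the pair was stored, zero otherwise (insert convention). */
int demo_insert(const char *key, size_t keyLen, const char *val, size_t valLen)
{
    struct demo_entry *e;

    assert(key != NULL && val != NULL);
    if (keyLen > DEMO_MAX_BYTES || valLen > DEMO_MAX_BYTES)
        return 0;
    e = demo_find(key, keyLen);
    if (e == NULL) {
        if (demo_count == DEMO_MAX_ENTRIES)
            return 0;
        e = &demo_entries[demo_count++];
        memcpy(e->key, key, keyLen);
        e->keyLen = keyLen;
    }
    memcpy(e->val, val, valLen);   /* replaces any earlier value for this key */
    e->valLen = valLen;
    return 1;
}

/* Returns zero if the key was found, non-zero otherwise (lookup convention).
 * On success the value is copied to val and its length stored in *valLen. */
int demo_lookup(const char *key, size_t keyLen, char *val, size_t *valLen)
{
    struct demo_entry *e;

    assert(key != NULL && val != NULL && valLen != NULL);
    e = demo_find(key, keyLen);
    if (e == NULL)
        return 1;
    memcpy(val, e->val, e->valLen);
    *valLen = e->valLen;
    return 0;
}
```

In this sketch the caller must supply a buffer at least as large as the stored value, and the returned bytes are not null-terminated unless the terminator was stored as part of the value.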

Example

See the appendix for an example.Appendix A

Sample Program Using gedhash

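#include <stdio.h>
#include <string.h>
#include "gedcom.h"  /* Must be included first */
#include "gedhash.h"

void cdecl main(void)
{
   HASH_TAB *htab;
   char key[400] = "", val[400] = "";
   LEN valLen;

   if((htab = gedhash_create_table("test.fil",(LEN)7,(LEN)10000))==(HASH_TAB *)NULL) {
      ged_error("Could not create hash table\n",NULL,0);
      return;
      }
   printf("Enter a string for a key: ");
   fgets(key, sizeof(key), stdin);
   key[strcspn(key, "\n")] = '\0';      /* strip the trailing newline */
   printf("Enter a value to insert into the hash table: ");
   fgets(val, sizeof(val), stdin);
   val[strcspn(val, "\n")] = '\0';
   /* Store the terminating null with the value so it can be printed as a string */
   gedhash_insert(htab, key, (LEN)strlen(key), val, (LEN)(strlen(val)+1));
   printf("\nEnter a key to search the hash table: ");
   fgets(key, sizeof(key), stdin);
   key[strcspn(key, "\n")] = '\0';
   /* gedhash_lookup() returns zero when the key is found */
   if(gedhash_lookup(htab, key, (LEN)strlen(key), val, &valLen)==0)
      printf("The value is: %s\n",val);
   else
      printf("The value could not be found.\n");
   gedhash_close(htab);
   }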

/**************************************************************************
 * NAME: ged_error
 *
 * DESCRIPTION:   Prints out an error message. Standard gedcom library error
 *                message function.  Must be present if using gedproc or gedhash.
 *************************************************************************/
int ged_error(msg, dataCtxt, status)
char *msg;
NODE *dataCtxt;
unsigned short status;
{
  printf("%s", msg);
  return(0);
}

                                Appendix B

                  Future functions in the gedhash library


int gedhash_delete(hash_tab, char * key, size_t key_len)

Finds the key in the hash table and deletes its cell.

Returns zero if successful, non-zero otherwise.


int gedhash_count(hash_tab)

Returns count of the number of keys currently held in the table.

int gedhash_copy(hash_tab from_table, hash_tab to_table)

Copies the contents of from_table to to_table.  to_table does not need to be empty.  Both must
have been previously created using gedhash_create_table().  Neither their respective sizes nor
their being on disk or in memory affects the operation.

Returns zero if successful, non-zero otherwise.


int gedhash_iterate(hash_tab, char ** key, size_t * key_len, char ** value, size_t * value_len)

Successively returns every key/value pair in the hash table.  The order of returning pairs is
undefined.  gedhash_iterate_reset() must be called before calling this function.  The function is
called repeatedly in a loop until it returns zero, indicating that all pairs have been returned.

Returns non-zero as long as key/value pairs remain that have not yet been returned.


gedhash_iterate_reset(hash_tab)

Resets internal hash_tab state so that successive calls to gedhash_iterate() will return every
key/value pair in the table.

Returns nothing.
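
The reset-then-loop protocol described for gedhash_iterate() and gedhash_iterate_reset() can be sketched as follows.  Since these functions are not yet part of the library, the sketch uses hypothetical stand-ins (demo_iterate, demo_iterate_reset) over a fixed array of pairs, and it omits the table argument and the length parameters for brevity.

```c
#include <assert.h>
#include <stddef.h>

struct demo_pair { const char *key; const char *value; };

static struct demo_pair demo_pairs[] = {
    { "@I1@", "100" }, { "@I2@", "250" }, { "@F1@", "400" }
};
static size_t demo_cursor;   /* internal iteration state */

/* Rewind so that the next call to demo_iterate() returns the first pair. */
void demo_iterate_reset(void)
{
    demo_cursor = 0;
}

/* Returns non-zero and fills *key and *value while pairs remain;
 * returns zero once every pair has been returned. */
int demo_iterate(const char **key, const char **value)
{
    assert(key != NULL && value != NULL);
    if (demo_cursor >= sizeof demo_pairs / sizeof demo_pairs[0])
        return 0;
    *key = demo_pairs[demo_cursor].key;
    *value = demo_pairs[demo_cursor].value;
    demo_cursor++;
    return 1;
}

/* Typical caller: reset first, then loop until zero comes back. */
size_t demo_count_pairs(void)
{
    const char *k, *v;
    size_t n = 0;

    demo_iterate_reset();
    while (demo_iterate(&k, &v))
        n++;
    return n;
}
```

Calling the reset function before each loop is what allows the same table to be traversed more than once.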

 