==================================================================
README                                         R.Valdes    3/16/93
==================================================================



This is the README file for the code listings that accompany the
article "TEXT EDITOR ARCHITECTURES AND ALGORITHMS", Dr. Dobb's Journal,
April 1993, by Ray Valdes.

The TXT_ED.ARC file contains the following files:

    READ_ME          7294   3-16-93  10:12p  <--The file you are reading
    EDITLINE C       2540   3-16-93  10:22p  <--Simplest line editing routine
    EDITLIN2 C       3214   3-16-93  10:21p  <--Better line editing routine
    BUF_GAP  C      11111   3-16-93   6:32p  <--Buffer gap management
    USENET   TXT    47136   3-16-93   8:23p  <--How-to discussion
    TEXT_ED  C     133792   3-16-93  10:12p  <--My text editor kernel code

These files are not standalone programs but serve to illustrate
the points raised in the DDJ article.  The most substantial
file is TEXT_ED.C, which is the kernel to my Windows-hosted
text editor, and is discussed further below.


Notes On TEXT_ED.C Text Editor Kernel Code
------------------------------------------

   This is *not* a complete working program. The program is still
   under (re)construction. I thought I would have it complete
   by the time the article was printed, but it has since grown to
   *double* the original size (from 4000 to 8000 lines of code
   for the total program, while the kernel has grown from 2000 to
   about 5000 lines of code).

   I've been adding various optimizations, MDI support, an icon toolbar,
   updates for Windows 3.1, and also trying to make the coding style 
   more consistent (following my idiosyncratic coding convention).
   However, a lot of what Microsoft calls the "Windows grunge" is not yet
   working.  Even so, few of these changes are relevant to the subject 
   of my article, which was on general algorithms and architectures for
   text editors. In my article, I promised to provide "a concrete
   illustration" of the discussion on architectures and algorithms.

   So the 5000 lines of TEXT_ED.C provide an example of how to implement
   many key routines within an industrial-strength text editor engine.
   Unfortunately, the code is not standalone, and will produce compilation
   errors in its present form.
   
>> So it's best if you view this code as an extremely detailed pseudocode <<<
>> representation of a text editor, rather than as an actual              <<<
>> implementation.                                                        <<<

   I'll make a working implementation available at a future date, as
   time permits.  Sorry for the change! But the code here is a much
   greater amount of text editor source than originally promised.

   (BTW, if you do want proven editor code, it is not hard to come by.
   There are a number of complete, tested, working implementations of text
   editors available on CompuServe and on the Internet. On CompuServe,
   the MSWIN32 forum has source code for MicroEmacs for Windows/NT.
   The IBMPRO forum has code for DOS MicroEmacs, FreeMacs, and
   a bunch of other editors.  For a small, clean, fast implementation
   in C, check out F.Eng's CHI editor in Borland's C/C++ DOS Forum.
   Austin CodeWorks in Austin TX sells, on floppy disk, the source
   code to 20 different public-domain or publicly available editors,
   such as MicroEmacs, Jove, Elvis, Grief, etc.)
   
   My code illustrates the following aspects of implementing a
   text editor:

   * maintaining a document/view architecture (although this implementation
     only allows one view per document, it can be extended to multiple
     views per document without too much additional work).

   * how to maintain two parallel streams, one of text characters and
     the other of text attributes.  The attribute stream is a sparse
     representation using an array of attribute records. An attribute
     record is allocated and used to point into the text stream whenever
     there is any change in font/size/face combination.
     
   * how to maintain a simple text-to-screen map of line breaks. This
     is roughly similar to the attribute stream in that a separate
     array points into the stream of text characters.

   * providing a small public interface (API) to clients of this engine.
     All public routines are prefixed with "ed_", as in 
     ed_CreateNewDocument() or ed_UpdateView().  The rest of the 
     implementation is encapsulated in private (i.e., static) routines
     that are prefixed with "priv_" as in priv_ReformatEntireDocument().
     
   * isolating the graphics primitives from the engine (all environment-
     dependent routines are prefixed with "gp_" as in "gp_TextOut()").
     I've tried to minimize dependencies on Windows-specific constructs
     such as HWNDs and memory handles.  The memory subsystem, in
     particular, is encapsulated by an interface layer whose functions
     are prefixed with "mem_" as in mem_AllocHandle().  It should
     be portable to other platforms such as Macintosh and Unix/Motif
     without too much pain.  There is still some Windows-specific
     code, which you can recognize by #ifdef WINDOWS_SPECIFIC_CODE,
     as well as by use of functions prefixed with "win_", such as
     win_GetDC(), which are basically macro substitutions for the
     equivalent functions in the native Windows API.

   * how to process keystrokes, which may be either commands or text.
     As an illustration, a very small subset of the Emacs command
     set is implemented; see the priv_DoEmacsCommand() function.
     
   * how to handle mouse-down, mouse-move and mouse-up events and highlight
     a region of text on the screen. See ed_OnMouseDownMsg() and
     related functions.
     
   * scrolling the view. See the function ed_ScrollView().
   
   * maintaining a mapping between text stream positions and screen
     (x,y) locations. See priv_MapTextToScreen().

   * various straightforward strategies for incremental reformatting
     and redisplay.
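
   As a rough illustration of the sparse attribute stream described
   above, here is a minimal sketch in C.  It is not taken from
   TEXT_ED.C: the AttrRun record and the attr_find() and attr_shift()
   helpers are hypothetical names, and the sketch assumes the records
   are kept sorted by their offset into the text stream.  The same
   shape (a separate sorted array of offsets into the text) would
   also serve for the line-break map.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute record: one per change of font/size/face. */
typedef struct {
    size_t start;  /* offset in the text stream where this run begins */
    int    font;   /* font id (meaning is up to the caller) */
    int    size;   /* point size */
    int    face;   /* face flags such as bold or italic */
} AttrRun;

/* Return the index of the run that covers text position pos.
   Assumes runs[] is sorted by start, n > 0, and runs[0].start == 0. */
size_t attr_find(const AttrRun *runs, size_t n, size_t pos)
{
    size_t lo = 0, hi = n;
    assert(n > 0 && runs[0].start == 0);
    /* Binary search for the last run whose start is <= pos. */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (runs[mid].start <= pos)
            lo = mid;
        else
            hi = mid;
    }
    return lo;
}

/* After count characters are inserted at pos, shift every run that
   begins after pos.  Text inserted exactly at a run boundary joins
   the run that starts there. */
void attr_shift(AttrRun *runs, size_t n, size_t pos, size_t count)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (runs[i].start > pos)
            runs[i].start += count;
}
```

   Because a record exists only where the attributes change, a
   document in a single font needs just one record no matter how
   long the text is.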

   There are some important items missing from my example.  The most
   glaring omission is any kind of sophisticated buffer management.

   Regarding buffer management, in my article I discuss 3 strategies
   for managing the text stream:
       
   1. Dumb and simple, using memmove() to shift bytes in memory
      one keystroke at a time.

   2. Better, but limited in storage capacity, using a buffer-gap
      approach on a single large block of RAM to minimize per-keystroke
      processing.

   3. More sophisticated but requiring more implementation effort,
      using a "virtual memory" scheme implemented in software, to
      eliminate RAM-size constraints.
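
   To make strategy 1 concrete, here is a minimal sketch in C.  The
   FlatBuf structure and the flat_insert() and flat_delete() routines
   are hypothetical names for illustration, not the routines in
   TEXT_ED.C, and the sketch assumes a fixed-capacity block supplied
   by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flat buffer: text[0..len) holds the whole document. */
typedef struct {
    char  *text;
    size_t len;  /* bytes in use */
    size_t cap;  /* bytes allocated */
} FlatBuf;

/* Insert one character at pos.  Returns 0 on success, -1 if full. */
int flat_insert(FlatBuf *b, size_t pos, char c)
{
    assert(pos <= b->len);
    if (b->len == b->cap)
        return -1;
    /* Shift the tail up by one byte; memmove() is safe for the
       overlapping source and destination. */
    memmove(b->text + pos + 1, b->text + pos, b->len - pos);
    b->text[pos] = c;
    b->len++;
    return 0;
}

/* Delete the character at pos by shifting the tail down one byte. */
void flat_delete(FlatBuf *b, size_t pos)
{
    assert(pos < b->len);
    memmove(b->text + pos, b->text + pos + 1, b->len - pos - 1);
    b->len--;
}
```

   Every keystroke moves all the bytes after the insertion point,
   so the cost grows with the distance from the cursor to the end
   of the text.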

   In my article, I present a buffer-gap module I wrote that is 
   derived from code by Joe Allen.  That code is included in this
   ARC file, but has not been integrated with this text editing
   engine.  This engine uses the dumb approach in item #1 above.
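
   For comparison, the buffer-gap idea in item #2 can be sketched
   as follows.  This is a simplified illustration, not the BUF_GAP.C
   module and not Joe Allen's code; the GapBuf structure and the
   gap_* function names are hypothetical, and the sketch assumes a
   single fixed-size block that is never grown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical gap buffer: text lives in buf[0..gap_start) and
   buf[gap_end..cap); the bytes in between are free space. */
typedef struct {
    char  *buf;
    size_t cap;        /* total bytes in buf */
    size_t gap_start;  /* first byte of the gap */
    size_t gap_end;    /* one past the last byte of the gap */
} GapBuf;

/* Number of text characters currently stored. */
size_t gap_length(const GapBuf *g)
{
    return g->cap - (g->gap_end - g->gap_start);
}

/* Move the gap so that it begins at text position pos. */
void gap_move(GapBuf *g, size_t pos)
{
    assert(pos <= gap_length(g));
    if (pos < g->gap_start) {
        /* Slide the text between pos and the gap up to the gap's end. */
        size_t n = g->gap_start - pos;
        memmove(g->buf + g->gap_end - n, g->buf + pos, n);
        g->gap_start -= n;
        g->gap_end   -= n;
    } else if (pos > g->gap_start) {
        /* Slide the text just after the gap down to the gap's start. */
        size_t n = pos - g->gap_start;
        memmove(g->buf + g->gap_start, g->buf + g->gap_end, n);
        g->gap_start += n;
        g->gap_end   += n;
    }
}

/* Insert one character at pos.  Returns 0 on success, -1 if full. */
int gap_insert(GapBuf *g, size_t pos, char c)
{
    if (g->gap_start == g->gap_end)
        return -1;
    gap_move(g, pos);
    g->buf[g->gap_start++] = c;
    return 0;
}

/* Read the character at text position pos, skipping over the gap. */
char gap_char_at(const GapBuf *g, size_t pos)
{
    assert(pos < gap_length(g));
    if (pos < g->gap_start)
        return g->buf[pos];
    return g->buf[pos + (g->gap_end - g->gap_start)];
}
```

   While the user types at one spot, the gap is already in place
   and each keystroke stores a single byte.  Bytes are moved only
   when the insertion point changes, and then only the bytes between
   the old and new positions.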
   
   The design goals of this engine are not for handling large
   amounts of text, but for illustrating mouse-event handling
   and multi-font processing, and also as a laboratory for
   testing various optimization strategies for incremental
   reformatting and redisplay.
   
   There are 3 optimization strategies being tested here,
   which are enabled by compile-time switches (preprocessor
   #defines), as follows:
   
   1. CHAR_WIDTH_CACHE -- code that is delimited by this #ifdef
      directive implements a cache of character widths, one per
      font/size/face combination.  This helps avoid having
      to make calls to the Windows API.
      
   2. XCOORD_CACHE -- this conditional code implements a cache
      for horizontal positions (x-coordinates) on a given line.
      So when a mouse-down event occurs, we can determine very
      quickly which character corresponds to the mouse down. 
      This structure is also useful in centering lines of text,
      or in setting text flush right.
   
   3. OUTPUT_BITMAP_CACHE -- this optimization may actually be
      of questionable value on most machines, but seemed at one
      time to be necessary for slow machines running Windows (e.g.,
      286). This code maintains a cached image of the current 
      line of text, including various font and pointsize changes.
      So when the user types a keystroke, the program uses this
      cached bitmap to minimize rendering the rest of the line,
      and instead uses BitBlt to output to screen.
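
   The first two caches can be sketched together.  The code below
   is an illustration only, not the code inside the CHAR_WIDTH_CACHE
   or XCOORD_CACHE sections of TEXT_ED.C.  The xcache_build() and
   xcache_hit() names are hypothetical, and the sketch assumes a
   single font per line and a 256-entry width table that has already
   been filled from the environment's font metrics.

```c
#include <assert.h>
#include <stddef.h>

/* Fill xs[0..len] with the left-edge x-coordinate of each character
   on the line; xs[len] is the right edge of the last character.
   widths[] stands in for a per-font character width cache. */
void xcache_build(const int widths[256], const char *line,
                  size_t len, int *xs)
{
    size_t i;
    xs[0] = 0;
    for (i = 0; i < len; i++)
        xs[i + 1] = xs[i] + widths[(unsigned char)line[i]];
}

/* Return the index of the character whose cell contains x, or len
   if x lies at or beyond the right edge of the line. */
size_t xcache_hit(const int *xs, size_t len, int x)
{
    size_t lo = 0, hi = len;
    if (x < 0)
        return 0;
    if (x >= xs[len])
        return len;
    /* Binary search; invariant: xs[lo] <= x < xs[hi]. */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (xs[mid] <= x)
            lo = mid;
        else
            hi = mid;
    }
    return lo;
}
```

   With the x-coordinates cached, a mouse-down needs only a binary
   search over the line, and xs[len] gives the line width needed
   for centering or flush-right placement.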

I'll try to make a working version of the editor available in the
next month.  (I get paid to edit articles and write an occasional
piece, not to spend my time writing and debugging code.)  So chances
are the version I'll make available won't have the bells and whistles
of this version, but it will have the buffer gap manager integrated
with a lot of the code in TEXT_ED.C.

--Ray Valdes
  Dr. Dobb's Journal
