PCPCDOC.TXT    Copyright (c) 1991  by Georg Post

Documentation on PCPC:    Post's Conservative Pascal-C Converter

Please read the licence information in file PCPCINFO.TXT first.

Refer to FILES.TXT for the list of supplied files. All of them consist of
printable 7-bit ASCII characters; there is not a single bit of binary code.

The file PCPCMESS.TXT lists the error and warning messages issued by PCPC.

To learn more about the internals of PCPC and the related utilities, have a
look at the file PCPCCODE.TXT, a supplement to this one.


i   . Introduction
ii  . User's Guide
iii . Translation Strategy



I. INTRODUCTION
---------------

About the design of PCPC:

   The Pascal-C converter will accept "portable" algorithms coded in
Pascal's most widespread dialect which is Turbo Pascal (Trademark of Borland
International). Hardware-related features such as the Unit Dos, the register-,
memory- and port-relative data access, and machine code inclusion, will NOT be
supported. Anyway, it's easy to code system program fragments directly in C,
the language that is meant for bit-twiddling.

   As can be tolerated for a Programmer's utility only, the converter takes
command line parameters and lacks a slick, colorful User Interface. On the
other hand, being of interest to all folks who have a working knowledge of
Turbo Pascal, the software should be - and is - available in source code.
I must apologize for my bad English in comments and document files; a
special-purpose program for an international audience just has to get along
with English explanations.

   Protagonists of Pascal-C translation may want to gently push you away from
the "student language" into that industrial world of Real (i.e. C-)
Programming. Not so with PCPC: Keep your Turbo Pascal, fasten the safety belts
of strong type and index bound checking (array index bugs must be a nightmare
in C), and use low-level C as the portable assembler which allows you to run
programs anywhere you like. This utility is biased towards Pascal programmers
who don't want to fiddle with C code much more than with assembler: the less,
the better.

   The output will conform to the 1978 Kernighan-Ritchie definition of C and
import the BARE MINIMUM of standard C libraries which are supposed to exist
everywhere: stdio.h, stdlib.h and math.h . On classical Unixes, there is no
"stdlib", but I need only 5 symbols therein that will link automatically:
  exit  free  malloc  rand  srand .
One "modern" feature has been permitted, for minimal type safety: the function
declarations are preferentially written as ANSI prototypes. A postprocessor
option may still produce the obsolete header syntax of the historical C.
The postprocessor does not kill THREE extensions to K&R'78 which I use, but
which ought to be available on many pre-ANSI C compilers:
  void              as a result type of functions
  enum              data types, assignment with integers is tolerated.
  unsigned short    data type, equivalent to Pascal's Word.
(The enum types existed by November 1978: "Recent Changes to C"; the void
results and unsigned shorts before 1982: S.R. Bourne, "The UNIX System".)
PCPC never produces the following C keywords:
    auto  const  continue  float  int  register  signed  volatile
and just  *hates*  MSDOS keywords like:  asm  cdecl  far  huge  near ...
Requiring a wide portability of the output code calls for old-fashioned habits.
Along these lines, Structures don't appear as function results or parameters,
only pointers to them will pass as arguments or return values.
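
   As a rough illustration of these conventions (a sketch only, not actual
PCPC output; the names Point and MovePoint are invented for this example):
a record travels by pointer, the function carries an ANSI prototype, and the
comment shows the kind of old-style header that the postprocessor option is
meant to produce instead.

```c
/* Hypothetical record type; the names in real PCPC output will differ. */
struct Point { long x, y; };

/* ANSI prototype: the one "modern" feature kept for type safety. */
void MovePoint(struct Point *p, long dx, long dy);

/* Old-style K&R'78 form of the same header, shown for comparison only:
 *
 *   void MovePoint(p, dx, dy)
 *   struct Point *p; long dx, dy;
 *   { ... }
 */
void MovePoint(struct Point *p, long dx, long dy)
{
    /* The structure is reached through a pointer, never copied. */
    p->x = p->x + dx;
    p->y = p->y + dy;
}
```

A caller would write MovePoint(&pt, 10, 20) rather than pass pt by value.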

   Since my idea of a general-purpose program embraces this software itself,
I gave the highest priority to the correct translation of Record, String
and Set data and operations, and of nested procedures with all kinds of
recursions. Somewhat less important are the variant records, the typed
constants, and handling of other-than-Text files. Turbo extensions like all
those facilities to work around strong data typing, are considered luxury
goods with a low priority for implementation.

   In summary, if a Pascal program is written with portability in mind,
refraining from IBM-PC and MS-DOS (trademarks of the respective companies)
specific details, non-text files, unreadable nested type declarations, chances
are good that the automatic conversion to standard C code will succeed.
The translator will evolve from this initial release (1.0) to include
"esoteric" Turbo Pascal features: in decreasing order of usefulness from
the author's point of view - or better - that of the users who care to give
me some feedback.

    As a mandatory test, PCPC is able to translate ITSELF and the accompanying
utilities into plain vanilla K&R-1978 C with (removable) ANSI function
prototypes.
Under C, the translated code should compile flawlessly and run perfectly to
redo its own translation, reproducing the same result. That allows anyone
to bootstrap a reasonable subset of Turbo Pascal on most systems, IBM-PC
compatible or not, given the ubiquity of C compilers.
  For my C tests, I used Turbo C++ 1.0 with ANSI prototypes, and the C compiler
of the Ridge Operating System 3.3 (1985 Unix derivative), after filtering the
code through my ANSI killer. My Turbo C 1.0 (1987) failed but I don't know why.

   No effort has been spent on transporting Comments from Pascal to C: the best
comment on the C output is the original Pascal program. Anyway, C code appears
all too unreadable to me: Has the portable assembler called C  been hyped
into the True Professional's tool, just because its use is pretty dangerous?
Both the dictatorship of Microsoft over the PC world, and the reign of old
UNIX on the workstation universe, have enforced that system language as the one
and only programming hook to modern graphical user interfaces. Which isn't
to say that Turbo Pascal, with all those extensions borrowed from C, qualifies
as this planet's most neat and orderly computer tongue!

   PCPC is a two-pass translator: the program PCPC proper converts a file
xyz.PAS into the intermediate text xyz.PC1; the postprocessor REORD2 does
some macro expansion and rearrangement to obtain the final xyz.C.
The optional ANSI destroyer in REORD2 also comes as a stand-alone utility,
ANSIKILL.
PCPC requires that the Units used by xyz.PAS be in the same directory, with
file names uuu.PAS, and that each Unit identifier equal its file name.
The main file should start with the PROGRAM keyword (though TP4 allows you to
drop that one).
  I recommend submitting only texts that the fast TP4 compiler did accept,
in order not to bother the lengthy PCPC translator with trivial syntax errors.


II. USER'S GUIDE
----------------

Remark: in the following text, there are some pieces of MSDOS code for you to
block-write into batch files. I did not want to clutter up the distribution
disk with those tiny command scripts.


2.1  Installation of PCPC, and Part 1 of self-test :

These are suggestions only, not a unique recipe. Dear reader, you are a Pascal
programmer and so you'll easily manage things intelligently your own way.

Let's assume the following:
*  You have Turbo Pascal Version 4.0
*  You have 580 K bytes of free DOS memory (no TSRs, Devices, ... )
*  there is a hard disk volume C: under MS-DOS Version 3.3.
*  The DOS prompt says:    C:\>

- Create some directory , for example, type:     md \pastoc
- Move into the directory:                       cd \pastoc
       make document subdirectory:               md doc
       make examples subdirectory:               md ex
- Copy the whole distribution diskette:
     copy a:*.*                 ( use xcopy if you didn't forget the syntax..)
     copy a:\doc\*.* doc
     copy a:\ex\*.*  ex
- Load the Turbo Pascal 4 integrated version:    \turbo4\turbo
  ...and step through the following compilations & executions:
        File/ Load/ pcpc.pas      F10/ Compile/ Destination=Disk  Make
   F10/ File/ Load/ chekgram.pas  F10/ Compile/ Make
   F10/ File/ Load/ reord2.pas    F10/ Compile/ Make
   F10/ File/ Load/ ansikill.pas  F10/ Compile/ Make
  (By now, any Unit has its TPU counterpart, the compiler saw every *.PAS)
-  F10/ Options/ Parameters      -b pcpc
  (Pcpc and Reord2 take identical command line input: -Build option, File pcpc)
   F10/ File/ Pick/ pcpc.pas    F10/ Run
   F10/ File/ Pick/ reord2.pas  F10/ Run
  (self translation of PCPC and its units. It is a slow 2-pass process ).
-  F10/ Options/ Parameters  -b reord2
   F10/ File/ Pick/ pcpc.pas    F10/ Run
   F10/ File/ Pick/ reord2.pas  F10/ Run
-  F10/ Options/ Parameters  gramtool chekgram ansikill
   F10/ File/ Pick/ pcpc.pas    F10/ Run
   F10/ File/ Pick/ reord2.pas  F10/ Run
  (self translation of Reord2, Chekgram and Ansikill)
-  F10/ File/ Quit
  By now, you have PCPC.EXE, REORD2.EXE, CHEKGRAM.EXE, ANSIKILL.EXE,
  and a lot of .C files, one for each of the parameter list entries and its
  subordinate units.
  The Pascal part of the test is over, I'll discuss the C part soon.


2.2  Full syntax of the Pcpc and Reord2 command lines ( [] = optional part):

Command:   Pcpc(Reord2)  Options file1 [ file2 file3 ... file9]
Options:   [/Ppath] [/Dpath] [/A] [/B] [/T] [/U]

The Options may appear in arbitrary order but must precede the (list of) file
name(s). The leading '/' is interchangeable with '-' .
  The file names must NOT have a directory prefix (use the /P option instead)
nor a suffix (automatically ".pas" or ".pc1").

/P specifies the "path" where to find the .PAS files to be translated. Any Unit
   used by a file must exist as a .PAS file in the same directory, except for
   Dos, Crt and Graph. By default, "path" is the current directory.
   PCPC output files .PC1 are stored in the "path" directory, and Reord2 stores
   the .C result files there as well.

/D is the path where Pcpc will find its auxiliary files grammar5.txt, crt.pas,
   dos.pas and graph.pas. By default, this is the current directory.
   Reord2 ignores the option.

/A tells Reord2 to kill the Ansi function headers in order to obtain old-style
   K&R'78 C output. Pcpc ignores this option.

/B is the Build option. If it is present, the program translates only the first
   file of the list. However, every Unit on which it depends will be translated
   automatically. If /B is missing, Pcpc and Reord2 convert the list of files
   and check only the Interface part of any imported Unit.
   When the /B option is given,
      Reord2 -b file1
   creates two auxiliary files to aid with compiling and linking under C:
      Cfile1.bat  and file1.lnk.
   They are useless outside the MSDOS environment.
   See the example for MicroCalc below.

/T is a Trace option which sends debugging output to the screen.
   Currently undocumented. Ignored by Reord2.

/U is undocumented.
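
The command line rules of this section can be sketched in C as follows. This
is only an illustration written for this document, not the parser used by
Pcpc or Reord2: the struct Options and the function ParseOptions are
hypothetical names, and accepting the option letters in either case is an
assumption based on the examples (-b, /P) in this guide.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical container for the options of section 2.2. */
struct Options {
    char path[80];       /* /Ppath : where the .PAS files live       */
    char dataPath[80];   /* /Dpath : where grammar5.txt etc. live    */
    long ansiKill;       /* /A */
    long build;          /* /B */
    long trace;          /* /T */
    long undoc;          /* /U */
    long firstFile;      /* index of the first file name in argv     */
};

/* Scan the leading options; '/' and '-' are interchangeable.
 * Returns 0 on success, -1 on an unknown option letter or when the
 * number of file names is not in the range 1..9. */
long ParseOptions(int argc, char *argv[], struct Options *o)
{
    int i = 1;

    memset(o, 0, sizeof *o);
    while (i < argc && (argv[i][0] == '/' || argv[i][0] == '-')) {
        switch (toupper((unsigned char)argv[i][1])) {
        case 'P': strncpy(o->path, argv[i] + 2, sizeof o->path - 1); break;
        case 'D': strncpy(o->dataPath, argv[i] + 2,
                          sizeof o->dataPath - 1); break;
        case 'A': o->ansiKill = 1; break;
        case 'B': o->build = 1; break;
        case 'T': o->trace = 1; break;
        case 'U': o->undoc = 1; break;
        default:  return -1;
        }
        i++;
    }
    o->firstFile = i;            /* everything from here on is a file name */
    if (argc - i < 1 || argc - i > 9)
        return -1;
    return 0;
}
```

Because the options must precede the file list, the loop simply stops at the
first argument that does not start with '/' or '-'.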


2.3  About the use of PCPC :

The files PCPC.EXE, REORD2.EXE, GRAMMAR5.TXT, CONVPAC.H and CONVPAC.C make up
the entire Pascal-C converter. That's NOT YET an independent public-domain
program since GRAMMAR5 is precious source text - I don't care for the EXEs.
CHEKGRAM.EXE is essentially a debugging tool for the maintenance of GRAMMAR5.
ANSIKILL.EXE is nothing but the part of REORD2 triggered by the -a option.
Please give GRAMMAR5 to anyone BUNDLED only with the COMPLETE source code and
documentation, which is copyrighted shareware. If you scramble GRAMMAR5 into
its binary memory image format, you may work out a faster PCPC.EXE which would
then be freeware, but bare of any documentation...

The size of buffers and data structures has been adjusted such that the two
parts of the converter work well from inside the integrated Turbo Pascal 4
environment: at least on the largest of its own files, on 640 K PC's
without bulky TSR programs.

When running, both PCPC and REORD2 put a fair amount of noise on the screen.
Every now and then, one more dot appears so that the user can be sure that the
program is still alive and she can stop it in an orderly way with Ctrl-Break.
Things in the source code which don't appeal to PCPC trigger a two-line
message; a copy goes as a comment into the C output file.

Outside Turbo Pascal, you can start the converter by typing an old-fashioned
command line. For example, if you want to convert two files One.pas and
Two.pas  which are in the path \foo\bar , enter:

pcpc    /P\foo\bar One Two          ( the first pass makes *.pc1 )
reord2  /P\foo\bar One Two          ( the second pass makes One.c Two.c )
The  /P path specifier defaults to the current directory if omitted.

Here, I assumed that  pcpc.exe, together with the auxiliary files grammar5.txt,
crt.pas, dos.pas, graph.pas, was in the current directory. If pcpc is started
from elsewhere, supply an optional "codepath" specifier /C because Pcpc
isn't smart enough to know where it comes from.
   Example:
d:\util\pastoc\pcpc /Cd:\util\pastoc /P\foo\bar One Two Three

   It is said (I was too conservative to check the details) that above
certain version numbers of Dos and/or Turbo Pascal, the ParamStr(0) call
returns a program's full path name...

Any units on which the argument files depend must exist as source files *.PAS
in the "/P" directory. In fact, the first thing PCPC will do is to read the
interface source code of all Units which the file relies on, in a direct or
some indirect way.

In the package there are "system" files Dos.pas,
Crt.pas and Graph.pas which are quite different from the Dos.doc, Crt.doc
and Graph.doc header files (Copyright by Borland): they're uncommented and
written with my private extensions called "Sloppy Pascal". If your program uses
Printer, copy Borland's Printer.doc as Printer.Pas to your "/P"  directory.
Important: for subtle reasons the method doesn't work  for Dos,Crt,Graph.
Never did I try anything with the backward compatible Turbo3 & Graph3 units.
No equivalent C library for Dos/Crt/Graph exists yet in version 1.0 of the
software. Programs using Dos/Crt/Graph may be converted to C, they will
compile but won't link. Programs headed for export from the PC platform - the
initial motivation of this package - will avoid Dos/Crt/Graph, of course.
  The unit ex\crtdos.pas is my C-convertible substitute for Dos and Crt which
supports only the most popular features  (Read the comments in Crtdos).
  I would welcome the help of C aficionados in writing the missing
libraries: as general as possible, perhaps even running on UNIX.


2.4  Anti-bug Checklist, before an attempt at Pascal-C conversion:

- The software is a set of (20 or fewer) *.PAS files, each smaller than 55K.
- The Unit identifiers  are identical to the file names ($U not supported).
- The Include directive: not recommended (partial support for short Includes)
- Units Turbo3, Graph3 and Printer are unwanted.
- For export from MS-DOS, units Dos,Crt and Graph to be avoided.
- All identifiers are unique in the first 15 characters.
- Expressions are not too long ( string concatenations may saturate buffers).
- Type identifiers are longer than one character (... the ^T bug)
- There are no: Inline External Absolute Mem Port PrefixSeg Ptr @ ...  hacks.
- File I/O only by sequential, ASCII text Read/Write.
- The shortcut anyString[0] for Length is never used.
- Identifiers from Interfaces of used Units are mutually distinct.
  (qualified identifiers: not supported; identical re-definition: tolerated)


2.5   Compiling and linking the C output:

  After running Pcpc and Reord2, you get as many *.C files as you had *.PAS
files to start with.  Delete the intermediate *.PC1 files by hand.
If you're confident, you'll invent a simple batch file to chain PCPC and
REORD2 (with identical arg lists) and DEL *.PC1 , like this:

c:\pastoc\pcpc   -Dc:\pastoc  %1 %2 %3 %4 %5 %6 %7 %8 %9
c:\pastoc\reord2  %1 %2 %3 %4 %5 %6 %7 %8 %9
del *.pc1

   Your C compiler needs:
stdio.h, math.h  : very standard headers, always included via convpac.h.
stdlib.h         : included by convpac.c (but superfluous on old Unix?).
convpac.h        : any C output file starts with #include "convpac.h"
convpac.c        : the PCPC "system library"
ex\crtdos.c      : if your Pascal code uses Crt and/or Dos.
                   Crtdos is incomplete and non-portable outside MSDOS!
<product>.c      : all the output from PCPC + REORD2

    Compile all the *.c files into *.obj,  with the LARGE MEMORY MODEL.
Compiler errors reflect bugs or size limits of PCPC. In these cases, adapt the
Pascal sources to the constraints of PCPC (or repair PCPC and show me
your work) and restart.
Then, link together all those *.obj files (crtdos.obj only when needed), using
the standard libraries of your compiler. For Turbo C, these are:
  - the object module c0L.obj
  - the 3 libraries   emu.lib  mathL.lib   cL.lib


2.6   Self test of PCPC, Part 2  (suggested as an example) :

Move into the directory where you did the Pascal part of the self-test.
Let me suppose that you have Turbo C++ 1.0 on disk D in subdirectory tcpp.

* Make 3 batch files modeled after these:

--------------
rem cc.bat
d:\tcpp\bin\tcc -ml -N -c -Id:\tcpp\include %1 %2 %3 %4 %5 %6 %7 %8
--------------
rem bcc.bat
d:\tcpp\bin\tcc -ml -N -c -Id:\tcpp\include %args%
--------------
rem tlink.bat
d:\tcpp\bin\tlink /c d:\tcpp\lib\c0L @%1.lnk,,d:\tcpp\lib\emu d:\tcpp\lib\mathL d:\tcpp\lib\cL
-------------

The first one: CC invokes the C compiler with the correct options (large memory
model, check stack overflow, don't link) and takes up to 8 file name
arguments.

The second one: BCC runs the C compiler just like CC, but it takes its argument
list from a DOS environment variable (SET ARGS=.....). This is used in batch
processing: by one of MSDOS' peculiarities, a batch file which starts another
one by name, really does a GOTO; but a batch cannot pass parameters if it uses
the CALL command instead. We must use Call after Set, to get arguments across.

The third one: TLINK launches the Turbo linker with the case-sensitivity
option and the standard Turbo C startup code and libraries. It takes one
argument which is an auxiliary file (*.lnk) with the list of linkable OBJ's.

* Make sure that the first part of the self test made two auxiliary linker
data files (else "Ctrl-K W" them from here):

----------------
|  pcpc.lnk    |  3-line text file:  Obj file list , Exe file name
----------------
convpac pcpcdata pascannr semanti6 cdeclara+
cnesting cbulk getunits pcpcpars pcpc+
,pcpc

----------------
|  reord2.lnk  |  2-line text file
----------------
convpac getunits killansi reord2+
,reord2


* Make 2 more batch files (just block-write the following with the TP editor):

rem ctod.bat  :  rename all .c to .d
ren pcpcdata.c pcpcdata.d
ren pascannr.c pascannr.d
ren semanti6.c semanti6.d
ren cdeclara.c cdeclara.d
ren cnesting.c cnesting.d
ren cbulk.c cbulk.d
ren getunits.c getunits.d
ren pcpcpars.c pcpcpars.d
ren pcpc.c pcpc.d
ren killansi.c killansi.d
ren reord2.c reord2.d

rem fccd.bat   :  file compare all .c to .d
fc pcpcdata.c pcpcdata.d
fc pascannr.c pascannr.d
fc semanti6.c semanti6.d
fc cdeclara.c cdeclara.d
fc cnesting.c cnesting.d
fc cbulk.c cbulk.d
fc getunits.c getunits.d
fc pcpcpars.c pcpcpars.d
fc pcpc.c pcpc.d
fc killansi.c killansi.d
fc reord2.c reord2.d

* Now execute the following commands under DOS:

----  compile  the PCPC "system library"
cc convpac
cc ex\crtdos
----  compile the pcpc C code
cc pcpcdata pascannr semanti6 cdeclara cnesting cbulk
cc getunits pcpcpars pcpc killansi reord2
---- linking
tlink pcpc
tlink reord2
---- now you should have C-derived versions of pcpc.exe and reord2.exe
---- ( you could use the automatic way instead of cc and tlink:
----   Part 1 created  batches cpcpc.bat and  creord2.bat for you)
----  rename all .C files to .D for later comparison
ctod
---- use the C versions to redo the self-translation:
pcpc   -b pcpc
pcpc   -b reord2
reord2 -b pcpc
reord2 -b reord2
del *.pc1
---- compare the new .C files with the old ones (renamed .D)
fccd
---- you should find only minor differences (trailing blank lines ?)


* This brings the test of PCPC to an end and hopefully demonstrates that the
Pascal and the C version work alike for self-translation.
In your spare time, you might try the C version of CHEKGRAM and its three
parsers, too. The Turbo C versions execute appreciably faster than the
Turbo(?)Pascal 4.0 originals. Nothing astonishing about that: C never checks
any array indexes, whereas Pascal's $R+ option takes you on a safer route.


2.7   A second test:

You may check that the Pascal and C versions of pcpc + reord2  work
identically on Borland's MicroCalc as an input file set.

--- copy MC files from your TP4 distribution diskette to some directory "test"
copy a:mc*.pas test
--- run the converter, C version
pcpc -ptest -b mcalc
reord2 -ptest -b mcalc
---- save all the C files mc*.c
---- rerun the conversion from within Turbo Pascal 4, Pascal version of PCPC
---- compare the *.C files

Note that the C version of MicroCalc is useless! More about that later.
Nevertheless, let's look at the two auxiliary texts mcalc.lnk and Cmcalc.bat,
written by Reord2:

-----  mcalc.lnk ------
convpac crtdos mcvars mcutil mcdisply mcparser+
mclib mcinput mcommand mcalc+
,mcalc
-------Cmcalc.bat -----
rem Cmcalc.bat
set args=test\mcvars test\mcutil test\mcdisply test\mcparser
call bcc
set args=test\mclib test\mcinput test\mcommand test\mcalc
call bcc
set args=
tlink mcalc

The batch file would start the C compiler several times, using the Set/Call
parameter passing scheme and the BCC batch explained above.
The last line would start the linking process (see the TLINK.BAT example) with
the help of mcalc.lnk.
To translate, compile and link MicroCalc, you would type these 3 lines only:

pcpc -ptest -b mcalc
reord2 -ptest -b mcalc
cmcalc


2.8   A third test

This one demonstrates the limited Crt and Dos capabilities of PCPC.
Make sure that you have the files ex\crtdos.c, ex\toyedit.pas, ex\toymenu.pas
and ex\toytest.pas  with respect to your working directory.
   Warnings: If you remake ex\crtdos.c from the supplied ex\crtdos.pas, don't
forget the special -u option and patches mentioned in crtdos.pas. The macro
that invokes Turbo C's "intr" function may need some tweaking for other
brands of compilers.
   Now, play around under Turbo Pascal with "toytest" which is a trivial
split-screen text editor with some menus and a file pick list.
Then, translate it to C: (I use my "standard" batch files):

pcpc   -pex -b toytest
reord2 -pex -b toytest
Ctoytest

You are ready to play around with "toytest.exe" which is a trivial text editor
handling console I/O with slow BIOS calls.
   Despite an impressive number of warning messages from Pcpc on unsupported
Crt and Dos calls: the result works with Turbo C++ . Please let me hear of
bugs with other C compilers.


2.9   Exporting the PCPC software from MS-DOS  to the world

Many non-DOS systems have C compilers which do not understand ANSI prototypes.
To transport the PCPC programs to a Ridge Computer, I did the following:

- Run "pcpc" and "reord2" on themselves with the -A -B flags.
- shake the modernisms off  convpac.h and convpac.c:
  ansikill convpac.h convpac.c
  (makes  convpac.krh and convpac.krc, to be renamed properly)
  The "toexport.bat" command sequence (below)  automates these steps.
- Patch the "convpac.h" file to undefine the msDOS symbol,
  i.e. comment out the #define msDOS line
- Transfer all the resulting .c files, including convpac.h, convpac.c,
  grammar5.txt and "pcpc.bat"(see below) via serial link with Kermit programs.
- Compile and link everything on ROS using the following batch file (oh sorry,
  "shell script" out there).
- Transfer all the *.pas programs of this project
- Check that the self-translation works properly on the non-DOS computer.
  (self translation = lines 2 thru 7 of "toexport.bat", runs on both systems! )
- Now I can program in Turbo Pascal under ROS 3.3 !

----------------------------------
rem  toexport.bat  : prepare all PCPC software for export, under DOS
pcpc -b pcpc
pcpc -b reord2
pcpc ansikill gramtool chekgram
reord2 -a -b pcpc
reord2 -a -b reord2
reord2 -a ansikill gramtool chekgram
del *.pc1
rem  now kill the Ansi features in "convpac" files.
ansikill convpac.c convpac.h
ren convpac.c convpac.ac
ren convpac.h convpac.ah
ren convpac.krc convpac.c
ren convpac.krh convpac.h
echo  Now export all *.c files , convpac.h  and  grammar5.txt !
echo  Do not forget to mask the MsDOS flag in convpac.h !
rem  end of DOS batch  toexport.bat

----------------------------------
# pcpc.bat  :  compile and link the pcpc software on  ROS 3.3
# type this command line :
#  sh -v pcpc.bat
cc -c convpac.c
cc -c pcpc.c
cc -c pcpcdata.c
cc -c pcpcpars.c
cc -c semanti6.c
cc -c pascannr.c
cc -c getunits.c
cc -c cdeclara.c
cc -c cnesting.c
cc -c cbulk.c
#  link pcpc
cc pcpc.o pcpcdata.o pcpcpars.o semanti6.o \
   pascannr.o getunits.o cdeclara.o cnesting.o cbulk.o \
   convpac.o -lm -o pcpc
cc -c killansi.c
cc -c reord2.c
#  link reord2
cc reord2.o killansi.o getunits.o convpac.o -lm -o reord2
cc -c ansikill.c
#  link ansikill
cc ansikill.o killansi.o convpac.o -lm -o ansikill
cc -c gramtool.c
cc -c chekgram.c
#  link chekgram
cc chekgram.o gramtool.o getunits.o semanti6.o pascannr.o pcpcdata.o \
   convpac.o -lm -o chekgram
# end of batch
----------------------------------

Bugs: Away from DOS, the -p and -c options of PCPC and REORD2 won't work.
Directories and file names are glued  with "/" there, whereas this software
follows the nasty "\" convention of DOS.  It would be nice, too, to generate
standard "makefiles" for non-DOS systems.

I quickly tried out my code on some Hewlett Packard workstation (don't ask
me what model). It seemed to work, after substitution of any "cc -c" by
this:  "cc -c +Np1000"  (don't ask me what it means).


III. TRANSLATION STRATEGY
-------------------------

  This part explains the rules, exceptions and known bugs in PCPC for the
translation of Turbo Pascal features: Identifiers, Units, Directives,
Constants, Types, Variables, Expressions, Nested Procedures, Sets, Strings,
Records.
   Examples are taken from PCPC self-translates  and from translation runs on
Borland's MicroCalc which is part of the Turbo Pascal 4 package. MicroCalc is
heavily hardware-oriented and uses string index 0 trickery and the like, and so
the C outputs will never-never link together to anything useful. Yet, it's a
good example to demonstrate the limits of PCPC which I do not want to conceal.


IDENTIFIERS: Uniqueness and Scopes

Identifiers are arbitrarily cut to 15 characters.
PCPC identifier Scope Rules differ slightly from Turbo Pascal's (sorry):
The re-declaration of Interface identifiers of some "used" unit in the
Interface part of another unit gives a Warning. Things work well ONLY if the
second declaration is identical to the first one (Level 1). However, in the
Implementation part or in the global scope of the main Program file, imported
identifiers may be redeclared without damage (Level 2). As a related bug,
the "qualified identifier" syntax is not (yet) fully supported.

The Borland System library units (System,Dos,Crt,Graph) have a special
status: their symbols may be redeclared everywhere. I do NOT support the
Turbo3 and Printer units which were introduced for backward compatibility
and are not recommended for new Turbo Pascal projects.

-  The  System Dos Crt Graph  identifiers have scope level Zero (privileged)
   Interface parts of other imported Units: scope level 1
   Globals in Implementation or Main Program: scope level 2
   Locals of a global procedure: scope 3
   Locals of local procedures:  4,5,6,.... a maximum of 10 is allowed.

 - Pascal programs may legally re-use scope 0 (and alien unit scope 1)
   identifiers in their global scopes (1,2), using re-definition.

 - C programs MUST create new distinct identifiers: no inter-unit recycling.
   PCPC flags such identifiers with a counter ReUse>0. After translation they
   pick up an underbar prefix, e.g. _New (if ReUse=1) or _2New (if ReUse=2)
   [Cdeclara.auxSuffix].
   (No risk of collision since TP4 identifiers should begin with a letter)

-  Prefixing of ALL redeclared identifiers from the System unit, at any scope,
   allows PCPC to #define freely any System features as macros.

 - The scanner protects the code converter against Pascal user symbols which
   happen to clash with C reserved words or library items: those are all
   lower_case (rare: lowercase only with underbar prefix) or all UPPER_CASE,
   in the few header files I need. [Pascannr.AnnexId] always sets the
   first letter of Pascal or source file symbols to upper, and the last
   one to lower case. This screening trick may give ugly results. For the
   letters in between, the identifier tables retain the user symbols the way
   they appear for the FIRST time in the Pascal source code, and the code
   generator always writes them like that. Always compile the produced C
   programs leaving case sensitivity ON, the standard setting!

 - Use of the underbar:
   In Pascal, an identifier officially starts with a letter. While the TP4
   compiler tolerates leading underbars, PCPC decides to reject them
   (though an "undocumented" flag in PCPC will let them slip through ).
   As C identifiers may legally start with an underbar, PCPC has a device
   to create names that do not collide with any symbols of the Pascal input
   program or with the basic C keywords and libraries (stdio stdlib math).
   In fact, PCPC makes identifiers that have the format:

   uppercaseLetter (alphanumerics)
   underbar uppercaseLetter (alphanumerics)
   underbar numerics (alphanumerics)
   underbar lowercaseLetter numerics
   underbar lowercaseLetter uppercaseLetter (alphanumerics)
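
   The case-screening rule described above for [Pascannr.AnnexId] can be
   sketched as follows. AnnexCase is a name made up for this example, not
   the real routine. Cutting to 15 characters BEFORE the case adjustment,
   and the treatment of one-letter names, are assumptions.

```c
#include <ctype.h>
#include <string.h>

#define ID_MAX 15   /* identifiers are cut to 15 characters */

/* Copy src into dst (which must hold at least ID_MAX+1 bytes), cut to
 * ID_MAX characters, with the first letter forced to upper case and
 * the last one to lower case. Letters in between are left untouched.
 * A one-letter name just becomes upper case (an assumption). */
void AnnexCase(const char *src, char *dst)
{
    size_t n = strlen(src);

    if (n > ID_MAX)
        n = ID_MAX;
    memcpy(dst, src, n);
    dst[n] = '\0';
    if (n == 0)
        return;
    dst[0] = (char)toupper((unsigned char)dst[0]);
    if (n > 1)
        dst[n - 1] = (char)tolower((unsigned char)dst[n - 1]);
}
```

   The resulting mixed-case names cannot match symbols that are written
   entirely in lower case or entirely in upper case.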

"Non-conflicting" aux. identifiers invented by PCPC:
  _s1 _sX ...  auxiliary String variables and operations
  _e1 _eX      Set variables, operations
  _mX _cX      memory moves and compares
  _rX _wX      read and write operations
  _0 _1 ....   internal unions and structs of variant records
  _l99...      numeric labels
  _nIdent      enums
  _Ident       reused Pascal "system" identifiers, for ex. _Pi
  _gIdent      auxiliary global Id to some local "Ident"
  _fIdent      nested function made global
  _pIdent      aux. parameter for nested functions
  _tIdent      globalised local type Id
  _1Ident      multiply defined global Ident
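
The underbar prefix for re-used identifiers (_New for ReUse=1, _2New for
ReUse=2) might be generated along these lines. This is a guess at the rule
stated above, not the code of [Cdeclara.auxSuffix]; ReusedName is a
hypothetical helper.

```c
#include <stdio.h>
#include <string.h>

/* Build the C name for a Pascal identifier whose ReUse counter is
 * `reuse`: 0 keeps the name, 1 gives _Name, 2 or more gives _<n>Name.
 * `out` must be large enough for the identifier plus the prefix. */
void ReusedName(const char *ident, long reuse, char *out)
{
    if (reuse <= 0)
        strcpy(out, ident);
    else if (reuse == 1)
        sprintf(out, "_%s", ident);
    else
        sprintf(out, "_%ld%s", reuse, ident);
}
```

Since a Pascal identifier accepted by PCPC begins with a letter, none of the
generated names can coincide with a user symbol.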


UNITS:

  Upon translating a Unit, PCPC does not create separate header and code files.
Instead, the imported declarations from used Units are echoed in any output
file (with "extern" prefixes for data) and the own interface declarations
are written without storage class (implicit "global"). Global data and
functions appearing in the Implementation part pick up the "static" prefix to
prevent linkers from knowing about them.
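
  The layout of such an output file might look like the following sketch.
The unit "Stats" and all of its identifiers are invented for illustration;
real PCPC output also starts with #include "convpac.h" and applies the
identifier case rules explained earlier.

```c
/* Sketch of the C file for a hypothetical unit "Stats". */

/* Declarations echoed from a used unit would appear first, e.g.
 *     extern long OtherUnitCounter;
 * (data imported from another unit carry the "extern" prefix). */

/* Interface part: no storage class, hence visible to the linker. */
long CallCount = 0;
long SumOfSquares(long a, long b);

/* Implementation part: "static" hides this helper from other files. */
static long Square(long x)
{
    return x * x;
}

long SumOfSquares(long a, long b)
{
    CallCount = CallCount + 1;
    return Square(a) + Square(b);
}
```

Another file that uses the unit would repeat the interface lines, with
"extern" in front of CallCount, and could never refer to Square.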

COMPILER DIRECTIVES:

  These comments {$...} or (*$...*) are not treated with mathematical rigour.
  Compiler switches are initialized (one boolean per uppercase letter) with
  the TP4 defaults. Their state is monitored but has no immediate effect on
  the code generator, only an indirect one with IFOPT.
  The one-letter directives  L M U are completely ignored.

  Source file inclusion {$I...} is supported with severe size restrictions:
  PCPC physically inserts included files into the 50K source code buffer,
  hence the main file PLUS all included files must be smaller than the buffer!
  The Included file must not have its own directory path and must be in the
  same directory as the main file. Aborted inclusions will upset the parser
  as it encounters the first undefined-but-used item.

  To support the DEFINE and UNDEF directives, a buffer of 20 strings keeps
  track of defined symbols. Predefined: VER40, MSDOS, CPU86 but not CPU87.
  PCPC makes no effort to detect a coprocessor. A fourth predefined symbol
  (PCPC) allows for translator-specific code (debugging...).
  Conditional compilation directives IFDEF IFNDEF IFOPT ELSE ENDIF are
  understood up to a nesting level of 16. PCPC behaves like a compiler
  and handles only the unmasked code fragments. This means that no compiler
  directives are generated in the C output. To translate the conditionals would
  require the management of multiple threads of symbol tables...


CONSTANTS:

 Real, String and Structured Pascal Constants are translated in C as  Static
 Initialized variables, in order to respect scopes: [Cdeclara.ConstDeclare].
 Big integer-type constants translate to #define directives, because
 of their predominant use as array bounds. The resulting scope bugs are not
 yet tackled in this version. Small integers and char constants are coded
 with funny "enum" declarations to circumvent the scope problem.
 The ANSI-C "const" keyword goes beyond K&R, so I could not use it.

Examples: Pascal integer,real,string,set and char constants --> no #define !

const
  EXPLIMIT = 88;
  SQRLIMIT = 1E18;
  MSGHEADER = 'MICROCALC - A Turbo Pascal Demonstration Program';
  DEFAULTFORMAT = $42;
  COMMAS = $20;
  DOLLAR = $10;
  LETTERS : set of Char = ['A'..'Z', 'a'..'z'];
  NULL = #0;
  BS = #8;
---------------------------------------------PCPC output:
typedef enum {EXPLIMIt=88} _nEXPLIMIt;
Real SQRLIMIt=1E18;
char *MSGHEADEr="MICROCALC - A Turbo Pascal Demonstration Program"
;
typedef enum {DEFAULTFORMAt=66} _nDEFAULTFORMAt;
typedef enum {COMMAs=32} _nCOMMAs;
typedef enum {DOLLAr=16} _nDOLLAr;
Set LETTERs=
{0x0000,0x0000,0x0000,0x0000,0xfffe,0x07ff,0xfffe,0x07ff,0x0000
,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000};
typedef enum {NULl='\000'} _nNULl;
typedef enum {Bs='\010'} _nBs;

Constants in CASE labels may be integers, enumerates or characters. CASE
intervals are only supported up to a length of 15: the C output is an ugly
list of individual "case" lines, one for each interval member.
Pascal const declarations for Integers and chars become "enum" typedefs
in C (scope awareness). Whenever such a constant identifier is used in
expressions, it is typecast to (long). This is because some C compilers do
not allow comparison operators between enums and integer types.
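
A small sketch of the three points made above: an enum constant that respects
scope, the (long) typecast on use, and a CASE interval expanded into single
labels. The identifiers LIMIT, LocalLimit, BelowLimit and IsSmallDigit are
hypothetical, and ANSI prototypes are used so that current compilers accept
the fragment.

```c
/* File-scope constant, in the style of the PCPC output shown above. */
typedef enum {LIMIT = 88} _nLIMIT;

/* A local constant of the same name hides the outer one inside this
   function only; a #define could not be confined to the function body. */
static long LocalLimit(void)
{
    typedef enum {LIMIT = 5} _nLIMIT;
    return (long)LIMIT;
}

/* Use in an expression: the constant is cast to (long) before comparing. */
static int BelowLimit(long x)
{
    return x < (long)LIMIT;
}

/* Pascal:  case c of '0'..'3': ...   one "case" line per interval member */
static int IsSmallDigit(int c)
{
    switch (c) {
    case '0':
    case '1':
    case '2':
    case '3':
        return 1;
    default:
        return 0;
    }
}
```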

TYPES:

Sets and Strings are reproduced as arrays in C. The elegant Pascal
operators + - * IN, comparisons,... applied to these objects, translate into
awkward function and macro calls. Quite a few intermediate local variables
must be created for the C version of the program.
Bytes become wasteful "unsigned short" in C. Signed "char" could have been used
instead, with nasty sign-extension clipping (x & 255) in all expressions?

Examples for enumeration,string,pointer,record:
type
    idClass=
  (constId, typeId, fieldId, varId,  functId  );
  str8=string[8];
  ptpel=^tpel;
  tpel= record
      cl: char;
      l: longint;
      m: longInt;
      hook,
      p,q: ptpel;
      tName: pide;
      ixName:pide;
  end;
---------- PCPC output:

typedef enum {ConstId,TypeId,FieldId,VarId,FunctId} IdClass;
typedef Char Str8[9];
typedef struct _gTpel *Ptpel;
typedef struct _gTpel {
    Char Cl;
    Longint L;
    Longint M;
    Ptpel Hook;
    Ptpel P;
    Ptpel Q;
    Pide TName;
    Pide IxName;
  } Tpel;

EXPRESSIONS:

The operator precedence rules for Pascal and C differ quite a bit.
I gave tightest binding operators the lowest numbers. PCPC keeps track of the
binding force of the expressions on stack, to decide when to add parentheses.

Operator precedence for Pascal:

 1:  NOT  @
 2:  * / div mod and shl shr
 3:  + - or xor
 4:  = <> < > <= >= in

For C:

 0:  () [] -> .  identifiers
 1:  Unary   ! ~ -- ++ - * &  f(x) (typecast)
 2:  * / %
 3:  + -
 4:  << >>
 5:  < <= > >=
 6:  == !=
 7:  &
 8:  ^
 9:  |
 10: &&
 11: ||
 12: =

If the C precedence differs from the Pascal order, PCPC puts parentheses
around the loose-binding part ( higher C precedence number but lower Pascal
class number).

While PCPC is in the statementPart, all expression strings are accumulated on
a stack ( [Cdeclara.expStack]), and retrieved when the operator comes in (the
parser elaborates kind of a postfix order for expression trees).
Parentheses are added as follows: If a newcoming operator (unary or binary)
has a lower level (= higher binding force) than the last one used, say,
in the left operand, parentheses go around that operand.
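
The parenthesis rule can be sketched like this. It is a simplified model, not
the code in Cdeclara: the Expr record and the functions Leaf, Operand and
Binary are invented here, and the sketch only covers operands that bind
strictly looser than the incoming operator (it does not handle equal-level
right operands such as a-(b-c)).

```c
#include <stdio.h>
#include <string.h>

/* One entry of a hypothetical expression stack: the C text built so far
   and the C precedence number of the last operator applied in it. */
typedef struct {
    char text[200];
    int  level;                  /* 0 = tightest binding */
} Expr;

/* An identifier or literal: level 0. */
static Expr Leaf(const char *name)
{
    Expr e;
    strncpy(e.text, name, sizeof e.text - 1);
    e.text[sizeof e.text - 1] = '\0';
    e.level = 0;
    return e;
}

/* Copy operand e into out, wrapped in parentheses if it binds more
   loosely (higher number) than the incoming operator. */
static void Operand(char *out, size_t size, const Expr *e, int opLevel)
{
    if (e->level > opLevel)
        snprintf(out, size, "(%s)", e->text);
    else
        snprintf(out, size, "%s", e->text);
}

/* Combine two stacked operands with a binary operator of level opLevel. */
static Expr Binary(const Expr *l, const char *op, int opLevel, const Expr *r)
{
    Expr res;
    char a[204], b[204];
    Operand(a, sizeof a, l, opLevel);
    Operand(b, sizeof b, r, opLevel);
    snprintf(res.text, sizeof res.text, "%s%s%s", a, op, b);
    res.level = opLevel;
    return res;
}
```

With these definitions, the Pascal expression  a + b shl 2  (shl binds tighter
than + in Pascal, but << binds looser than + in C) is built as "b<<2" at
level 4 and then combined with + at level 3, which puts parentheses around
the shift.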

Pascal string concatenations +..+..+ and set constructors
[...] are expressions of the "list" type: I do not know how many items I get.
They cannot generate code for a single C function call: restricted ANSI C
functions, the only ones I admit, have a fixed number of parameters. There is
no mechanism to generalise the flexible parameter scheme of library functions
such as "printf": the ellipsis (...) prototyping is unheard of in 1978
Kernighan-Ritchie, so I never use it. The comma operator provides the way out.
For example, the set constructor yields a comma expression ( , , , ) of calls
that add the elements to a temporary set variable, one by one. The last sub-
expression returns the pointer to that set (= an array of Words), the final
result:  PCPC produces slow, dirty code for set arithmetic.
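
A sketch of the comma-expression technique for a set constructor. The helper
names SetClear, SetAdd, SetHas and InSmallPrimes are hypothetical and are not
the convpac.c routines; the sketch assumes that "unsigned short" has 16 bits
and uses the bit layout visible in the LETTERs initializer shown earlier
(element x lives in word x/16, bit x mod 16).

```c
typedef unsigned short Word;
typedef Word Set[16];            /* 16 words of 16 bits = 256 members */

/* s := []; returns s so that the call can sit inside an expression. */
static Word *SetClear(Word *s)
{
    int i;
    for (i = 0; i < 16; i++)
        s[i] = 0;
    return s;
}

/* s := s + [x]; returns s, the pointer to the set. */
static Word *SetAdd(Word *s, int x)
{
    s[x >> 4] |= (Word)(1u << (x & 15));
    return s;
}

/* x in s */
static int SetHas(const Word *s, int x)
{
    return (s[x >> 4] >> (x & 15)) & 1;
}

/* Pascal:  x in [2, 3, 5]   written as one C expression. The comma
   operator evaluates the calls left to right; the value of the whole
   parenthesised list is the result of the last call. */
static int InSmallPrimes(int x)
{
    Set tmp;
    return SetHas((SetClear(tmp), SetAdd(tmp, 2), SetAdd(tmp, 3),
                   SetAdd(tmp, 5)), x);
}
```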

ASSIGNMENTS:

Simple variable assignments can be handled with the = expression in C.
Array,set,record,string ... assignments translate into appropriate macro calls
which know about the size of the objects to be copied. There is one quirk in
C which I discovered rather late (Turbo C++ Programmer's Guide, p. 82): When
an array A appears as a parameter of a function, sizeof(A) is the size of a
POINTER, even if the function prototype declares some array type!  PCPC catches
this situation and outputs sizeof(the_type) instead of sizeof(the_variable)
in those cases. Perhaps some bugs still lurk on "semilocal" arrays (see below)
which are artificially converted into parameters...
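
The sizeof quirk is easy to reproduce. In this sketch the type Name and the
two functions are made up for the demonstration; current compilers usually
print a warning for the first function, which is exactly the situation
described above.

```c
#include <stddef.h>

typedef char Name[40];

/* The prototype declares an array type, but inside the function the
   parameter is a pointer: sizeof(a) is the size of a char pointer. */
static size_t SizeOfVariable(Name a)
{
    return sizeof(a);
}

/* Using the type instead of the variable gives the full array size. */
static size_t SizeOfType(Name a)
{
    (void)a;
    return sizeof(Name);         /* 40 */
}
```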


PROCEDURES & FUNCTIONS:

In ancient C, unlike Pascal, arrays and records may not be passed as value
parameters or appear as function results. The translator handles those
parameters as pointers and explicitly creates local copies for structured
types passed as value parameters. However,
inspection of Pascal source texts shows that in many instances the "logical"
local copies are never used as alterable storage and only waste space and time.
The current version of PCPC does not detect this "not threatened" condition,
missing a chance of optimization.
Translation rules for array parameters (arrays, strings, sets) account for C's
pointer-array equivalence.
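
A sketch of the local copy made for a structured value parameter. The type
Vec and the function FirstAfterBump are hypothetical, and memcpy stands in
for whatever copy macro the translator really emits.

```c
#include <string.h>

typedef long Vec[4];

/* Pascal:  function FirstAfterBump(v: Vec): longint;
            begin v[1] := v[1] + 100; FirstAfterBump := v[1] end;
   The C function receives a pointer and works on its own copy, so the
   caller's array keeps its value as in Pascal. */
static long FirstAfterBump(const long *vArg)
{
    Vec v;
    memcpy(v, vArg, sizeof(Vec));    /* explicit local copy */
    v[0] += 100;                     /* changes the copy only */
    return v[0];
}
```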

To support String-valued functions, the translator inserts a pointer to the
result string, as a first argument. The caller provides the necessary
auxiliary string storage on its local stack.
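
The result-pointer convention for String-valued functions might look like
this. Str20, Greet and GreetLength are invented names, and the standard
library calls replace the convpac string primitives, whose exact behaviour is
not assumed here.

```c
#include <string.h>

typedef char Str20[21];          /* string[20] plus the terminating '\0' */

/* Pascal:  function Greet(name: Str20): Str20;
   In C the result buffer is passed first and is also returned. */
static char *Greet(char *result, const char *name)
{
    strcpy(result, "Hi ");
    strncat(result, name, 20 - strlen(result));   /* cut at 20 chars */
    return result;
}

/* The caller provides the auxiliary storage on its own stack. */
static size_t GreetLength(const char *name)
{
    Str20 tmp;
    return strlen(Greet(tmp, name));
}
```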

Slight type corrections in call-by-value situations are made: a char argument
gets "stringized" when the function asks for a formal string parameter; a short
integer argument undergoes a typecast to (long) if that's the formal parameter
type. An ANSI C compiler would handle the latter situation automatically, but
in K&R C, a forgotten typecast is disastrous.

LIBRARY PROCEDURES:

In my convpac.c library, a Turbo Pascal standard procedure with an optional
parameter has two C versions: the one without the parameter has its last
character converted to upper case.
For example, PCPC produces the code:
  InC(X);   Inc(X,3);
  ReseT(F); Reset(F,128);

System procedures with weakly defined numeric arguments produce different C
function calls, according to the actual parameter type:
  Abs(x)      Val(s,x,err)            if x is a (short) Integer
  ABs(x)      VAl(s,x,err)            if x is a long integer
  AbS(x)      VaL(s,x,err)            if x is real

The DOS linker  must be case-sensitive, of course.

The following assembler-like primitives are declared in Convpac.h and used
in addition to the Turbo Pascal system procedures. I introduced the comparison
opcodes  h:=   G L g l E U  for:  >  <  >=  <=  ==  != .

_mY(a,b)        array a:=b  (sizeof(a) known)          MEMORY MOVES
_mR(a,b)        record a:=b
_mA(a,b,n)      array a:=b, copy n bytes
_mV(a,b, n)     string a:=b, n bytes
_mF(a, n,c)     fill string a with char c, n bytes
_mC(h, a,b, n)  compare objects a,b: n bytes, operation code = char h = EU.

_cA(h,a,b)      compare arrays a,b with opcode h= EU    MEMORY COMPARES
_cR(h,a,b)      compare records a,b with       h= EU
_cF(h,a,b,n)    compare n bytes of a,b with    h= EU
_cS(h,a,b)      lexical compare null-terminated strings a,b, h= GLglEU
_cL(h,a,b,n)    lexical compare n bytes of char-arrays a,b,  h= GLglEU
_cE(h,a,b)      bitset compare Sets a,b, opcode              h= glEU

_eE(s,x)        add simple object x to Set s           SET OPERATIONS
_eR(s,x,y)      add interval x..y to Set s
_eE_(s, n)      add short int n to Set s
_eR_(s, x,y)    add short int range x..y to Set s
_eIn(x,s)       test if x is in Set s
_eU(s, a,b)     s:= Union of sets a + b
_eI(s, a,b)     s:= Intersection of sets a * b
_eD(s, a,b)     s:= set difference a - b
_eV(s)          s:= empty set []
_eC(s,n)        -- reserved --

_sI(s,t)        string s:=t                            STRING OPERATIONS
_sS(s,t)        string s:=s+t
_sM(s,t, n)     string s:=s+t,  length(s) cut to n bytes
_sK(s, k)       s:= one-char-string with char k
_sL(k, s)       _sK with inverted args
_sC(s, k)       add char k to string s
_sY(a, b, n)    add array b to string a,  n characters
_sA(a, b, n)

_wN()           write newline                          BIOS CONSOLE OUTPUT
_wC(c)          write char c
_wK(c, n)       write char c in field of size n
_wS(s, n)       write string s in field of size n
_wI(i, n)       write long int i in field of size n
_wF(f, n,m)     write float  f with format n:m

_rN()           read   newline                         BIOS CONSOLE INPUT
_rC(c)          readln char c
_rS(s)          readln string s
_rI(i)          readln integer i
_rF(f)          readln float f

NESTED PROCEDURES:

 Untangling of nested procedures  is achieved by adding parameters.
A schoolbook case may be the translation of mutually recursive routines, one
of them defined inside the scope of the other. This is common in recursive
descent parsers like the one used in PCPC.

The following 2 principles underlie my design:
Never create auxiliary variables on the heap, to avoid complicated
  memory initializations and managements!
Never create them as global variables, in order to keep re-entrant Pascal code
  that way!

As a consequence, auxiliary variables are either "automatic", local to the
procedure that uses them or appear as additional parameters in the procedure
declaration (= the ANSI prototype).

Let me define a
   "SEMILOCAL" symbol
as an identifier that is declared neither in the global scope nor in the local
one of a nested procedure/function, but at an intermediate scope level.
The interesting case is the use of such a symbol at an innermost level. All
these "semilocal" data references give rise to additional pointer-type
parameters in an extended procedure header; only then the nested procedure may
be lifted to the global context, as required by C.

PCPC creates C function parameters for all SEMILOCAL data symbols.
I end up with 6 different contexts for such an object:

1.  Origin,      some outer procedure declares a VAR or a var/val parameter
2.  Use,         in the procedure where it originates
3.  Use,         in the body of an INNER procedure
4.  Declaration, in the header of the inner procedure to be made global
5.  Call,        parameter passed from the procedure declaring the symbol
6.  Call,        by another procedure inside the scope of symbol's origin

A Pascal program has the symbol in places 1 to 3 only; the translator
must add it in places 4 to 6. The C version will pass such symbols as pointers
(addresses) between the procedures involved. When PCPC first encounters a
semilocal symbol, the C code generator adds it to a list for the current
procedure. In a later propagation phase, second and higher-order semilocals
are filled in: such symbols aren't needed immediately by the nested procedure
but by another non-global procedure which it calls, directly or not.
A translation table for the 6 contexts appears in pcpccode.txt, as well as an
example of code.
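
For a first impression, here is a hand-made sketch of one semilocal variable
going through contexts 1 to 5. It is not PCPC output: the routine names Total
and Add are invented, and plain identifiers are used instead of the generated
_f and _p prefixes.

```c
/* Pascal (for comparison):
     function Total(n: integer): longint;
     var sum: longint; i: integer;      { context 1: origin }
       procedure Add(k: integer);
       begin sum := sum + k end;        { context 3: use in inner proc }
     begin
       sum := 0;                        { context 2: use at the origin }
       for i := 1 to n do Add(i);
       Total := sum
     end;
*/

/* Inner procedure lifted to global scope; "sum" arrives as a pointer. */
static void Add(long *sum, int k)       /* context 4: extended header */
{
    *sum = *sum + k;                    /* context 3 */
}

long Total(int n)
{
    long sum;                           /* context 1 */
    int i;
    sum = 0;                            /* context 2 */
    for (i = 1; i <= n; i++)
        Add(&sum, i);                   /* context 5: address passed */
    return sum;
}
```

No global or heap variable is involved, so Total stays re-entrant.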

Semilocal Const and Type declarations are promoted to the global level.
The corresponding identifiers go into some special symbol list since they may
get lost otherwise. During the propagation phase for indirect auxiliary
parameters, the scope context for these identifiers and their
type qualifiers may no longer exist.

Currently, the unnesting machinery has a bug: It loses "anonymous semilocal"
types. That happens when you declare a semilocal variable with an explicit
structured type that has no name of its own (typical Pascal construct:
  var X: record .....  end; ). The C code generator still is unable to insert
the correct type info into the expanded ANSI prototypes of globalised
functions that use such a variable. PCPC does some guesswork on arrays and
pointers, but a no-name semilocal record type definitely gets lost.
To circumvent this bug, make explicit TYPE declarations in your Pascal source.
Future versions of PCPC will create adequate global typedefs.


POINTERS:

PCPC thinks of Pointers as 32-bit entities with an offset and a segment part.
(Do not use the Ofs, Seg and Ptr functions, however, if you want portable
programs.) The MS-DOS based C compilers should work in the LARGE memory model.
The generic Pointer data type of Turbo Pascal is implemented as (char *):
unfortunately, the (void *) is not portable to classical C compilers.
Bug 1: Pascal may declare a pointer type to some base type BEFORE the
base type declaration; PCPC supports this order ONLY IF the base type
is a RECORD. In fact, that's the most reasonable use of "forward" pointers,
in list and tree structure declarations.
Bug 2: Taking the address of a procedure/function does not work at all.

SETS:

In C, a Set is rendered as an array of 16 2-byte words and occupies 32 bytes.
Set constants, in typed const declarations, generate an initializer code for
such an array with hexadecimal elements. Pascal's readability is gone...
Example for set expressions:

procedure testLL1( var conti:contiTp);
var  t,common:termSet;
begin
    calcFirst(prodList[k], t);
    if not (0 in t) then conti[k]:=t
    else conti[k]:=(t - [0]) + follow[prodList[k,0]-256];

--------------------------- PCPC output:
void TestLL1 (Conti)
ContiTp Conti;
{
  TermSet T,Common;
  Set _e1, _e2, _e3;
    CalcFirst(ProdList[K-1], T);
    if(!(In(0,T))) _mY(Conti[K-1],T);
    else _mY(Conti[K-1],_eU(_e3,(_eD(_e2,T,(_eV(_e1),_eE(_e1,0
    )))),Follow[(ProdList[K-1][0]-256)]));


STRINGS & CHARACTERS:

 The string constant scanner does recognize the ^C - kind embedded control
  characters even if the hat symbol is the reserved Pascal token for
  pointers. My ad-hoc strategy: When '^' is followed by one uppercase letter
  AND one non-alphanumeric, the scanner thinks there's a Control character.
  This gives rise to a  parsing bug: If a Pascal program declares a Type
  identifier T which is 1 character long (perfectly legal), the
  corresponding  pointer type (syntax ^T) is mistaken as a control char.
  Rules:
    1. Never declare one-letter type names!
    2. Recode the ^C  characters by using the # symbol.
  As an  excuse, I might mention that the ^-character syntax has become an
  "undocumented feature" of Turbo Pascal Version 4, intended to be
  dropped in modern programs.

In string and char constants, the special IBM-PC characters above 128 are
written in the \ddd octal format, so that the output C file contains only
standard ASCII; I avoided even the Tab character.
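
A sketch of such an output filter. EmitChar is a hypothetical helper, not a
routine of PCPC; it writes every character outside the printable 7-bit range
as a three-digit octal escape, which matches the '\000' and '\010' forms in
the constant examples.

```c
#include <stdio.h>

/* Write source character c into out as printable 7-bit C text.
   Returns the number of characters written (not counting the '\0'). */
static int EmitChar(char *out, unsigned char c)
{
    if (c == '\\' || c == '"' || c == '\'')
        return sprintf(out, "\\%c", c);           /* escape C's own specials */
    if (c < 32 || c > 126)
        return sprintf(out, "\\%03o", (unsigned)c);
    return sprintf(out, "%c", c);
}
```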

Pascal Strings are mapped into C's null-terminated arrays of char.
Two possible bugs derive from this transformation:
1.) The widespread Pascal trick of direct access to the length at
index zero (or worse, using an Absolute variable overlay) is not portable
with this translator. Only disciplined string constructions like concatenation
and library functions, and the use of the Length function, give good results.
2.) Pascal strings which explicitly contain the #0 character are ill-behaved
in C: my substitute for the Length function thinks that the object
stops when it runs into the first '\0' !
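
Both pitfalls show up in a few lines of C. The functions Length and CharAt
are written for this illustration only; they are assumed stand-ins, not the
library substitutes themselves.

```c
#include <string.h>

typedef char Str8[9];            /* Pascal string[8] */

/* The length is found by scanning for '\0'; nothing is stored at
   index zero, so ord(s[0]) tricks have no counterpart. */
static int Length(const char *s)
{
    return (int)strlen(s);
}

/* Pascal s[i], with i counted from 1 */
static char CharAt(const char *s, int i)
{
    return s[i - 1];
}
```

A buffer holding 'AB'#0'CD' has 5 characters in Pascal, but Length reports 2
for its C image, although the bytes 'C' and 'D' are still in the array.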

I had a portability problem in some Pascal code passing strings to MSDOS
interrupts: The pointer I made (= stringAddress+1) got the first character
right in Pascal but skipped it in C! My new rule was: Never use Pascal strings
to interface with the System, make arrays-of-char instead.

The C code is clumsy due to a lot of intermediate concatenation variables.
Example:
  Cfile:=dataDir + sourceName[iRun] + '.pc1';
Translation:
  _sM(Cfile,(_sI(_s1,DataDir),_sS(_s1,SourceName[IRun-1]),_sS(_s1,".pc1")),40);

Note that the character s[1] of Pascal string s becomes  s[0] in the C code,
and that PCPC doesn't support Pascal's  length(s)=ord(s[0]) .

C chars are signed (C compilers with unsigned chars are too sophisticated
for my purpose), so any comparison operation involving them has macro calls
which chop off the unwanted char->int sign extension:
 c > d  becomes Lo(c) > Lo(d)  ,  where Lo(x)  means  (x) & 0xff .
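
The effect of the mask, as a sketch. Lo is defined as stated above; the two
comparison functions are added here for contrast and are not part of the
library.

```c
#define Lo(x) ((x) & 0xff)

/* Pascal:  c > d   for two Char values; the mask removes the sign
   extension, so characters above 127 compare as 128..255. */
static int CharGreater(char c, char d)
{
    return Lo(c) > Lo(d);
}

/* The same comparison on signed chars without the mask. */
static int NaiveGreater(signed char c, signed char d)
{
    return c > d;
}
```

For the IBM-PC character 132 and the letter 'A' (65), CharGreater gives true
as in Pascal, while the unmasked comparison sees -124 > 65 and gives false.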


VARIANT RECORDS:

Records which have no variant part are mapped to C's  struct declarations.
Records with no fixed part (and an anonymous case selector!) are mapped to C
union code. General variant records yield a struct with a union inside.
If there is a union, each Pascal case clause becomes another struct inside
it, except when there's only one data item in that case clause.
The internal structs have dummy identifiers _1 , _2 , _3 etc.
The internal union has the dummy identifier _0 in C.
Thus, a Pascal record.field  variable R.F  may translate to one of these
horrible things: R.F  or  R._0.F  or  R._3.F  or R._0._3.F .
The case label's value has no equivalent at all in the C output.

Example: pascal record  --->  a nested struct-union-struct:
type
  IString = String[MAXINPUT];
  CellRec = record
    Error : Boolean;
    case Attrib : Byte of
      TXT : (T : IString);
      VALUE : (Value : Real);
      FORMULA : (Fvalue : Real;
                 Formula : IString);
  end;
  CellPtr = ^CellRec;
--------------------------------PCPC output:
typedef Char IString[80];
typedef struct _gCellRec {
    Boolean Error;
    Byte Attrib;
    union {
      IString T;
      Real Value;
      struct {
        Real Fvalue;
        IString Formula;
      } _3;
    } _0;
  } CellRec;
typedef CellRec *CellPtr;
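
To show what field access looks like afterwards, the declaration is repeated
here in compilable form. The typedefs for Char, Boolean, Byte and Real are
assumptions made for this sketch (the real ones come with the library
headers), and the three accessor functions are invented.

```c
typedef char           Char;      /* assumed */
typedef char           Boolean;   /* assumed */
typedef unsigned short Byte;      /* "unsigned short", as stated under TYPES */
typedef double         Real;      /* assumed */

typedef Char IString[80];
typedef struct _gCellRec {
    Boolean Error;
    Byte Attrib;
    union {
      IString T;
      Real Value;
      struct {
        Real Fvalue;
        IString Formula;
      } _3;
    } _0;
  } CellRec;

/* Pascal:  c.Value := v */
static void SetValue(CellRec *c, Real v)        { c->_0.Value = v; }

/* Pascal:  c.Fvalue := v */
static void SetFormulaValue(CellRec *c, Real v) { c->_0._3.Fvalue = v; }

/* Pascal:  c.Value */
static Real GetValue(const CellRec *c)          { return c->_0.Value; }
```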

   It is legal in Pascal to write nested Case lists inside a Case clause of
a variant record. Such things, like other complicated type nestings, are NOT
supported by this translator and result in an awful lot of nonsense in the C
output, without a warning from the program! That's a bug of this software
I might cynically apologize for, by calling it a punishment for obscure coding
practices. If anyone can give me valid arguments in favour of case-inside-case
records, I'll consider implementing them in an upcoming version, even if that
will gratify us with R._0._3._100._103.F in C.
  Using the same field tags _0 _1 etc. for all variant records may cause
trouble with old C compilers: Their field identifiers, even for quite unrelated
records, are drawn from a common pool and must be essentially unique.

FILES:

The only practically portable thing is a Pascal TEXT file. And only if you try
to ignore the precise anatomy of end-of-line markers: use Eoln and Readln to
get across lines, do not assume there's a CR-LF pair!
BUG: Under MS-DOS, the Pascal and C versions will react in a different way to
CR without a subsequent LF in a file:  Pascal will flag Eoln(), the PCPC
translation will NOT!

------------------------------------

Size restrictions on the Pascal source code:

- No source file may be longer than 55 k bytes, including {$include ...}s.
- A global procedure or function text, including all its locally nested
  functions,  must be smaller than 60 k bytes in the intermediate .PC1 file.
- Any identifier is truncated to 15 characters.
- There may be no more than 10 active records inside WITH constructs.
- The scope nesting of procedures/functions is restricted to 15 levels.
- Nesting of function calls, parentheses, operators and other
  EXPRESSION-generating features is limited to a total of 10.
- There can be at most 20 arguments in a Read/Write list.
- The number of auxiliary parameters, to be added to a local procedure
  which is lifted to global scope, cannot exceed 40.
- The longest Strings have 255 characters.


   Error handling:

Suppose that you feed PCPC with a Program (or Unit) file that compiles
well under Turbo Pascal 4.0.
PCPC may still get lost on a more or less catastrophic path :

* A severe bug: PCPC goes into an infinite loop or hangs otherwise;
    stack or heap overflow;  pointer, index or variable out of range,....
    In spite of thorough testing, I cannot guarantee that there is no such bug.
    Please send the fatal Pascal code to the author. The worst known problem
    is the handling of undefined or redeclared identifiers (PCPC can be wrong
    on identifiers, being blind beyond 15 characters or ignoring Include
    directives). The scanning may go on but some semantic actions (heap
    activity!) are suppressed, leaving inconsistent pointers.
    A radical solution would declare identifier errors as Fatal and halt the
    program; I felt that would be an overkill, in this version.
* A bug in the grammar specification Grammar5.txt:
    This brings the parser to a halt with a message of the type "This symbol
    not allowed here" or "Another symbol expected" or "Undefined identifier".
    I tried to get very close to the Turbo Pascal syntax diagrams,
    but there still are some deviations. In particular, Constant Expressions
    which may appear instead of constants in TP versions higher than 4, are not
    supported; minor uses of sizeof and typecasts aren't recognised, either.
    If you discover a serious parser bug, please send a sample of the
    offending Pascal code to the author.
* Pascal constructs like "Inline", "Mem", "Port", "External", "Interrupt",
    which are not yet supported (or will never be):
    PCPC prints a message about the unsupported feature and goes on.
    Unsupported items have a _forbidden marker in the grammar rules.
* A procedure or function is called which is declared in the "System" unit
    but there is no proven C library code for it in the "convpac.c" file
    (for example "randomize" has C code, but I have not tested it yet):
    PCPC prints a message "This function has no C code" and goes on.
    C compilers will produce OBJ files but linking might fail.

Whenever PCPC detects a problem, a message goes to the screen AND into the
output file *.PC1, as a comment.
        There are three levels of messages:
- Warnings have a question mark at the end?
    Formal translation is still possible.
    example: unsupported library function
- Errors end with a period.
    The C output code may be erroneous or incomplete.
    example: unsupported feature like Inline
- Fatal errors have the exclamation mark!
    Translation must be stopped as soon as possible.
    example: non-declared identifier.



