
				GenProto v1.2

		Copyright 11 November 1996 by Nicolas Pomarede
			pomarede@isty-info.uvsq.fr



Introduction
------------

This little tool was developed to scan C/C++ source files and print
the names of all the function definitions it finds.
Definitions can then be sorted according to various criteria, and the results
are printed to stdout or to any other file.

The purpose of this tool is to automatically generate an index of all
the functions defined in a project, which is very useful when you have to manage
more than 10 or 20 files.
You then have a kind of "table of contents" which helps you browse
your sources more quickly when looking for a specific function.



Distribution and Disclaimer
---------------------------

This program is free: you can distribute it to anyone, as long as it
remains free; that is, you shouldn't charge more than the cost of the
media used to distribute it.
Also, you're only allowed to distribute the whole package, without
removing any of the original files.
Since the sources are included, you can modify them. However, I would prefer
that you send me any modifications before releasing the new sources.

Although this program has been carefully checked, I take no responsibility
for any damage or loss of data this program could cause.
USE IT AT YOUR OWN RISK, and don't forget to read the "Limitations" section.

The following files are included:

README				This file
genproto			Binary exe for 68000 Amiga
genproto.readme			Small readme
lex.yy.c			The file generated by lex (if you don't have lex)
main.c				The main program
makefile			Makefile for UNIX
proto.h				Header for main.c / proto.l
proto.l				The Lex definitions file
smakefile			Makefile for SAS C on Amiga.


The latest version of GenProto should always be available from the Aminet sites,
whose main site is ftp.wustl.edu, in /pub/aminet/dev/c (packed with Lha).
Look in /pub/aminet for the list of all the mirrors worldwide.



Usage
-----

GenProto is a little tool that extracts all the function definitions contained
in several C/C++ source files. For each definition, GenProto records the line
number and the file name. You can then print the result with a customized
printf-like format string, which allows normal escape sequences as well as
%-commands to print the special fields generated by genproto.
The resulting output is very useful for improving the documentation of your program.


The syntax of genproto is the following:

genproto [-h] [-d] [-b] [-f<Format String>] [-s[CNFL]] [-o<Output File>] <Source Files>

  -h prints a little help
  -d turns debug mode ON and prints all tokens
  -b turns banner off (the little copyright text)
  -f specifies the string used to print the functions. The default is:
	"%R\t%C %S %N\t%P\tline %L, file %F\n"
     Here's the meaning of the %-commands:
        %R : prints the return type of the function
        %C : prints the ClassName of the function (if it exists)
        %S : prints "::" if the ClassName exists
        %N : prints the name of the function
        %P : prints the input parameters of the function
        %L : prints the line number of the function
        %F : prints the file where the function is defined
     Any other %-command will cause the program to stop.
     All the \x escape sequences are recognized (\n, \t, ...) except the octal
     and hexadecimal conversions (\ooo and \xhh), which must be preprocessed by your shell.
  -s specifies the sorting criteria. The default is "CNFL", where C is the ClassName,
     N is the FunctionName, F is the FileName and L is the LineNumber.
     Functions are sorted using the Sort String before being printed
     (you can specify up to 4 criteria).
  -o prints the results to Output File; the default is stdout.

All other parameters are interpreted as files to be scanned (you can scan as many files
as you want in one run).
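
To make the %-commands more concrete, here is a minimal sketch of how a
format string could be expanded for one function. It is NOT the code from
main.c: the names proto_fields, put_str and expand_format are hypothetical,
escape sequences such as \n and \t are assumed to be converted already, and
an unknown %-command is reported with a return value instead of stopping
the program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record holding the fields behind each %-command. */
struct proto_fields {
    const char *ret;         /* %R */
    const char *class_name;  /* %C, empty string when there is no class */
    const char *name;        /* %N */
    const char *params;      /* %P */
    int         line;        /* %L */
    const char *file;        /* %F */
};

/* Appends src at out[*pos], truncating so that a final '\0' always fits. */
static void put_str(char *out, size_t cap, size_t *pos, const char *src)
{
    size_t room = cap - 1 - *pos;
    size_t n = strlen(src);

    if (n > room)
        n = room;
    memcpy(out + *pos, src, n);
    *pos += n;
}

/* Expands fmt into out (cap bytes). Returns 0, or -1 on an unknown %-command. */
int expand_format(const char *fmt, const struct proto_fields *f,
                  char *out, size_t cap)
{
    size_t pos = 0;
    char num[32];

    assert(fmt != NULL && f != NULL && out != NULL && cap > 0);

    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') {               /* ordinary character: copy it */
            if (pos + 1 < cap)
                out[pos++] = *fmt;
            continue;
        }
        fmt++;                           /* look at the command letter */
        switch (*fmt) {
        case 'R': put_str(out, cap, &pos, f->ret); break;
        case 'C': put_str(out, cap, &pos, f->class_name); break;
        case 'S': if (f->class_name[0] != '\0')
                      put_str(out, cap, &pos, "::");
                  break;
        case 'N': put_str(out, cap, &pos, f->name); break;
        case 'P': put_str(out, cap, &pos, f->params); break;
        case 'L': snprintf(num, sizeof num, "%d", f->line);
                  put_str(out, cap, &pos, num);
                  break;
        case 'F': put_str(out, cap, &pos, f->file); break;
        default:  out[pos] = '\0';       /* unknown command or lone '%' */
                  return -1;
        }
    }
    out[pos] = '\0';
    return 0;
}
```

With this sketch, the format "%C%S%N" gives "Stack::Pop" for a member
function and just "Usage" for a plain C function, since %C and %S both
expand to nothing when there is no class name.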


For example, if you type "genproto main.c", you will get:

GenProto v1.2 (11/11/96) by Nicolas Pomarde

void      Usage ( char *name )  line 133, file main.c
char *    CharCopy      ( char *buf , int len ) line 174, file main.c
char *    AddTokens     ( char **TokenList , int FirstToken , int LastToken )   line 191, file main.c
void      AddPrototype  ( char **TokenList , int ClassNamePos , int FunctionNamePos , int ParamPos )  line 236, file main.c
void      ScanOneFile   ( void )        line 300, file main.c
void      SortPrototypes        ( struct function **T , int Gauche , int Droite )    line 403, file main.c
void      Swap  ( struct function **T , int i , int j ) line 427, file main.c
int       CompareFunctions      ( struct function *pF1 , struct function *pF2 ) line 452, file main.c
void      CopyListToTable       ( void )        line 496, file main.c
void      PrintOnePrototype     ( struct function *pF ) line 545, file main.c
void      PrintPrototypes       ( void )        line 593, file main.c
void      DeletePrototypes      ( void )        line 614, file main.c
void      MyExit        ( void )        line 649, file main.c
void *    MyAlloc       ( size_t size ) line 663, file main.c
void      main  ( int argc , char **argv )      line 687, file main.c


GenProto can also cope with C++ files and recognizes the "::", "~" and ":" in
function definitions. Due to the more complex grammar of C++ and to the
simple implementation I use, this might not work properly in all cases
(although I tested it on many C++ files without any errors).


Internals / Technical information
---------------------------------

The lowest level of this program is a LEX scanner that reads "words"
from the source files. These words are called "tokens" and have different
values, depending on their role in a function's definition.

There are 6 kinds of tokens:
  RESET		This value is returned if the associated token can't appear
		in a function definition (e.g. for, if, while, auto, ...).
		The token stack is then emptied (see below).
  KEYWORD	This value is returned if the associated token is a C/C++
		keyword that can appear in a definition.
  ID		This value is returned for any token made of the characters
		'~', '_', 'A'..'Z', 'a'..'z' and '0'..'9', but not
		belonging to the C/C++ list of reserved words.
  PARAMS	This value is returned when a block delimited by an opening
		'(' and a closing ')' has been found. Such a block may
		contain nested pairs of '(' and ')'.
  BLOC		This value is returned when a block delimited by '{' and '}'
		is found. In such a case, all the data inside this block
		is ignored (since such a block can't contain a function
		definition).
  DEUX_POINTS	Returned if the string "::" is encountered; this only
		happens in C++ files and is used to separate the class
		name and the function name.
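
The PARAMS and BLOC tokens both rely on matching an opening character with
its closing partner while counting nested pairs. The sketch below shows the
idea in plain C; the real scanner is written in LEX (proto.l) and may work
differently. The name match_group is hypothetical, and this simplified
version does not skip string literals, character constants or comments.

```c
#include <assert.h>
#include <stddef.h>

/*
 * s must start with the 'open' character. Returns the length of the
 * balanced group (index just past the matching 'close'), or 0 if s does
 * not start with 'open' or the group is never closed.
 */
size_t match_group(const char *s, char open, char close)
{
    size_t i;
    int depth = 0;

    assert(s != NULL);

    if (s[0] != open)
        return 0;

    for (i = 0; s[i] != '\0'; i++) {
        if (s[i] == open) {
            depth++;                    /* one more nested pair */
        } else if (s[i] == close) {
            depth--;
            if (depth == 0)
                return i + 1;           /* outermost pair is closed */
        }
    }
    return 0;                           /* ran out of input: unbalanced */
}
```

Calling match_group(text, '(', ')') delimits a PARAMS token, and
match_group(text, '{', '}') delimits a BLOC token whose contents can then
be skipped.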

Each time LEX returns a token, it is stored in a stack, depending on its
value and on the previous state of the stack.
If a token doesn't appear in the right order to define a function, or
if the RESET value is returned, the stack is emptied and everything restarts
from the current location.

Tokens are stacked as long as a valid token combination has not been found
(that is, as long as we haven't reached the end of the function definition).

A valid stack could be created by:

int	main (int argc , char **argv)
{
 ... some code ...
}

resulting in:
	KEYWORD		int
	ID		main
	PARAMS		(int argc , char **argv)
	BLOC		{ ... }

After stacking a BLOC over a PARAMS, we know we have encountered a function
definition (use the "-d" option if you want to see the stack).
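
The stacking rule can be sketched as a small state machine. This is only an
illustration under simplifying assumptions, not the logic of main.c: it
accepts a definition only when BLOC arrives directly over "ID PARAMS", so it
ignores C++ cases such as the ":" of constructor initializers. The names
token_stack and push_token are hypothetical; MAX_TOKENS reflects the
50-token limit described under "Limitations".

```c
#include <assert.h>
#include <stddef.h>

enum token_kind { RESET, KEYWORD, ID, PARAMS, BLOC, DEUX_POINTS };

#define MAX_TOKENS 50

struct token_stack {
    enum token_kind kind[MAX_TOKENS];
    int depth;                      /* number of stacked tokens */
};

/* Feeds one token. Returns 1 when it completes a function definition. */
int push_token(struct token_stack *st, enum token_kind k)
{
    assert(st != NULL);

    /* RESET, or too many tokens: forget everything and start again. */
    if (k == RESET || st->depth == MAX_TOKENS) {
        st->depth = 0;
        return 0;
    }

    /* A BLOC ends the attempt: success only over "... ID PARAMS". */
    if (k == BLOC) {
        int found = st->depth >= 2
                 && st->kind[st->depth - 1] == PARAMS
                 && st->kind[st->depth - 2] == ID;
        st->depth = 0;
        return found;
    }

    /* PARAMS is only in the right order directly after an ID. */
    if (k == PARAMS && (st->depth == 0 || st->kind[st->depth - 1] != ID)) {
        st->depth = 0;
        return 0;
    }

    st->kind[st->depth++] = k;
    return 0;
}
```

Feeding KEYWORD, ID, PARAMS, BLOC (the "int main (...) { ... }" example
above) makes the last call return 1, while RESET, PARAMS, BLOC (as in
"if (x) { ... }") never does.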

The tokens of the stack are then stored in a 'struct function' with the
following members:

  char			*ReturnParam;	->	"int"
  char			*ClassName;	->	EmptyString
  char			*FunctionName;	->	"main"
  char			*Param;		->	"(int argc , char **argv)"
  int			Line;		->	nnn
  char			*File;		->	"somefile.c"


All these 'struct function' records are then stored in a linked list, which
is sorted using the specified criteria in their order of
appearance (we use a "quick sort" algorithm). To apply the quick sort,
the list is first converted into a single array.

The resulting (sorted) array is then printed according to the Format String.
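
As an illustration of sorting by several criteria in order, here is a small
sketch that uses the standard C qsort() instead of the program's own quick
sort. The struct members are the ones listed above; the names sort_keys,
compare_functions and sort_prototypes are hypothetical, and the array of
pointers is assumed to have been built from the linked list already.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct function {
    char *ReturnParam;
    char *ClassName;
    char *FunctionName;
    char *Param;
    int   Line;
    char *File;
};

/* Sort String in use, e.g. "CNFL". qsort() offers no context argument,
   so this sketch keeps it in a file-level variable. */
static const char *sort_keys = "CNFL";

/* Compares two table entries, trying each criterion in turn. */
static int compare_functions(const void *a, const void *b)
{
    const struct function *f1 = *(const struct function *const *)a;
    const struct function *f2 = *(const struct function *const *)b;
    const char *k;

    for (k = sort_keys; *k != '\0'; k++) {
        int r = 0;

        switch (*k) {
        case 'C': r = strcmp(f1->ClassName, f2->ClassName); break;
        case 'N': r = strcmp(f1->FunctionName, f2->FunctionName); break;
        case 'F': r = strcmp(f1->File, f2->File); break;
        case 'L': r = (f1->Line > f2->Line) - (f1->Line < f2->Line); break;
        default:  break;            /* unknown letters are skipped here */
        }
        if (r != 0)
            return r;               /* first criterion that differs decides */
    }
    return 0;
}

/* Sorts an array of 'count' pointers according to the Sort String 'keys'. */
void sort_prototypes(struct function **table, size_t count, const char *keys)
{
    assert(table != NULL && keys != NULL);

    sort_keys = keys;
    qsort(table, count, sizeof table[0], compare_functions);
}
```

With the default "CNFL", functions without a class (empty ClassName) come
first, ordered by name; with "L" alone, entries are ordered by line number
only.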



Limitations
-----------

This program was written mainly for C source files; I then added
support for C++, but due to the more complex grammar, I'm not sure
it always works right. Please report any bugs.

Some of the variables have a length fixed at compilation time. Although
I chose rather large numbers, you should note the following:

 - There are 4 sorting criteria, which means the parameter of the "-s"
   option should never need to exceed 4 characters. Just in case, the
   internal buffer is 10 bytes long.

 - When using the "-f" option, the format string is copied into
   a 1000-byte buffer. Don't use a format string longer than
   1000 bytes (this should be more than enough in all cases); any
   excess characters will be ignored.

 - The parameters of a function definition shouldn't exceed 2000 bytes;
   any excess characters will be dropped (2000 bytes is more than
   20 lines; who uses more than 20 full lines of parameters?
   Nobody, I hope :) ).

 - A function definition should not include more than 50 tokens
   (all the parameters of the function are considered as ONE token).
   If this limit is exceeded, the function is ignored.


Also, as a last remark, I recommend running genproto only on C/C++ files
that compile without errors.



Future
------

This tool was written in a few days, only to quickly create an index
of all the functions I used in a rather big C++ project.
If you find any bugs or have ideas for improvements, don't hesitate to contact me.
I will try to deal with them as soon as possible, but I'm doing my military service :(



Contact Address:
----------------

Nicolas Pomarede
13 allee des Vignes
78120 Rambouillet
France

e-mail: pomarede@isty-info.uvsq.fr
