
                               Bryan Kelly
                        Suggested Software Standards
                              28 June 1995
                               73507,3012

While I take full responsibility for any errors or omissions, the following 
have explicitly contributed to this list:
Darius Cooper
<your name?>

My opinion is that high quality software will be developed only by people 
with the right attitude.  But one cannot just say "My/our attitude will 
be ...", and expect it to be so.  Excepting traumatic events, attitudes are 
developed over time from our everyday events.  I believe that if one per-
forms little actions that would normally follow from the desired attitude, 
then that attitude will develop.

This list of standards is not intended to be a slavishly followed list of 
exacting requirements.  Rather it is hoped to be a set of guidelines, which 
when used, will foster the attitudes necessary for the creation of predict-
able, reliable software that meets the needs of the user.

If you have thoughts or comments concerning what I have written here, please 
start or join a thread, or send me some mail.

Finally, these are my opinions.  I do not pretend to dictate these standards 
to anyone.  Feel free to use or ignore any or all.  If you do use a large 
part, I ask, but do not demand, that credit be given.  The ulterior purpose 
is to promote better software.

Please read a few definitions that are included at the end to clarify some
terms.

                  Standards of Software Development


1)  Global variables are to be avoided wherever possible (see end notes 1 
through 5).  When unavoidable, the module header must list the global 
variable and state the file in which it is declared.  Additions to the list 
of global variables must be approved by formal review.

2)  Global variables that are used must have a single clear purpose.  They 
may have a single writer and multiple readers, or a single reader and multi-
ple writers, but not multiple writers and multiple readers.  If this is 
required, declare and use two or more variables.  

The single exception to this is something on the order of performance moni-
toring.  Assume global variable "module_a_entry_count" is incremented in 
module "a" on each entry and read by the realtime monitoring program. It can 
be cleared by the monitor program to reset monitoring parameters.  This 
would be allowed because module "a" does not use the variable in any of its 
operations.  It just increments it and doesn't care what the value is.  This 
does not violate the spirit of the rule.
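
As a minimal sketch of this exception in C, assuming a single-threaded 
program: the counter name comes from the text above, while module_a() and 
monitor_sample_and_reset() are hypothetical names used only for 
illustration.

```c
/* Hypothetical global with a single purpose: count entries into module "a". */
unsigned long module_a_entry_count = 0;

/* Module "a" only increments the counter; no decision depends on its value. */
int module_a(int input_value)
{
   module_a_entry_count++;
   return input_value + 1;   /* stand-in for the module's real work */
}

/* Monitor program: reads the counter, then clears it for the next period. */
unsigned long monitor_sample_and_reset(void)
{
   unsigned long entries_seen = module_a_entry_count;

   module_a_entry_count = 0;
   return entries_seen;
}
```

Module "a" behaves the same whether or not the monitor has cleared the 
counter, which is why the second writer is tolerable here.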

3) Variables used to directly control the state of a module will never be 
global or common.  When other modules must know the state of a variable, it 
will be copied to a common variable (This protects the variable from unwant-
ed or unexpected modification).  When a module must change its state or must 
change its operation depending on a common variable,  the common variable 
will be read, a decision will be based on reading that variable, then the 
action will be taken.  The decision must be made in the module taking the 
required action.  (Decisions that are remote in space {another module} and 
time are contrary to good programming practice.  See end notes 1, 2, 3, et 
al.)  

It should be noted that if other modules need to know the internal state of 
a module, there is probably a design error.  Check for this before proceed-
ing.
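
One possible way to arrange this in C is sketched below.  The pump module, 
its states, and all variable names are hypothetical.  The controlling state 
is a file-scope static, other modules see only a published copy, and the 
decision to change state is made inside the module that acts on it.

```c
typedef enum { c_pump_idle = 0, c_pump_running = 1 } pump_state;

/* Controlling state: private to this file, so no other module can alter it. */
static pump_state pump_state_internal = c_pump_idle;

/* Common copy for readers; writing to it has no effect on the pump module. */
pump_state pump_state_published = c_pump_idle;

/* Common request variable, written by some other module. */
int pump_run_requested = 0;

void pump_module_step(void)
{
   /* Read the common variable, decide here, then act here. */
   if (pump_run_requested)
   {
      pump_state_internal = c_pump_running;
   }
   else
   {
      pump_state_internal = c_pump_idle;
   }

   /* Publish a copy of the state for any module that needs to know it. */
   pump_state_published = pump_state_internal;
}
```

A stray write to pump_state_published is repaired on the next step, because 
the module never reads the copy back.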

4)  All returned status values from procedures will be checked for errors 
before the code continues.  Unexpected conditions will be logged.  I suggest 
that one or more classes of return errors be generated and used.  For exam-
ple, in C++ one might have:

   typedef enum
      {
      c_success                = 0,
      c_argument_too_large     = 1,
      c_argument_too_small     = 2,
      c_file_not_found         = 3
      /* ... */
      } return_code;

In large projects, this can become awkward.  One file may be used for the 
return codes shared at a high level, and local files used for return codes 
that are unique to smaller sections.  Care must be taken to prevent 
duplication.  
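
The checking itself might look like the following sketch.  The return code 
names follow the list above.  The routines set_percent() and 
configure_valve(), and the limits, are hypothetical.

```c
#include <stdio.h>

typedef enum
{
   c_success            = 0,
   c_argument_too_large = 1,
   c_argument_too_small = 2
} return_code;

#define C_MIN_PERCENT 0
#define C_MAX_PERCENT 100

/* Hypothetical procedure: stores a percentage after validating its range. */
return_code set_percent(int requested_percent, int *stored_percent)
{
   if (requested_percent > C_MAX_PERCENT) return c_argument_too_large;
   if (requested_percent < C_MIN_PERCENT) return c_argument_too_small;

   *stored_percent = requested_percent;
   return c_success;
}

/* Caller: checks the status before continuing and logs the unexpected. */
return_code configure_valve(int requested_percent, int *valve_percent)
{
   return_code status = set_percent(requested_percent, valve_percent);

   if (status != c_success)
   {
      fprintf(stderr, "configure_valve: set_percent returned %d\n",
              (int)status);
      return status;
   }
   return c_success;
}
```

When the status is not c_success, the stored value is left untouched and 
the code is passed up to the next caller.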

5)  Unexpected conditions will cause an error status to be returned to the 
caller.  Program exit will not be called from subordinate modules unless 
there is literally no choice.  Calling EXIT or STOP is not an approved 
method of handling an error condition.  It is a clear indication of poor 
design and implementation.
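
A short sketch in C of a subordinate routine that reports failure instead 
of exiting.  The routine count_lines() is hypothetical, and the return code 
names follow item 4.

```c
#include <stdio.h>

typedef enum { c_success = 0, c_file_not_found = 3 } return_code;

/* Subordinate routine: on failure it returns a status; it never exits. */
return_code count_lines(const char *file_name, long *line_count)
{
   FILE *input_file = fopen(file_name, "r");
   int   next_char;

   if (input_file == NULL) return c_file_not_found;

   *line_count = 0;
   while ((next_char = fgetc(input_file)) != EOF)
   {
      if (next_char == '\n') (*line_count)++;
   }
   fclose(input_file);
   return c_success;
}
```

The caller, which knows the larger context, decides whether a missing file 
is fatal, can be retried, or can be ignored.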

6)  Error logging and notification routines should be invoked as close as 
possible to a) the cause of the error, or b) the detection of the error.  
When possible, the calling tree should be output.  This will clarify the 
problem when function x() generates/detects an error but is called from 
several locations.  Within a single function, in the event x() is called 
several times, the particular call to x() could be identified.
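
Standard C has no portable way to print the full calling tree, so the 
sketch below records only the immediate caller, using the predefined 
__FILE__ and __LINE__ macros.  The names log_error_at(), x_at() and 
CALL_X are hypothetical.

```c
#include <stdio.h>

/* Kept only so the sketch can be checked; a real logger would not need it. */
static int last_logged_line = 0;

/* Hypothetical logging routine: one fixed format for every entry. */
void log_error_at(const char *file_name, int line_number, const char *message)
{
   last_logged_line = line_number;
   fprintf(stderr, "%s:%d: %s\n", file_name, line_number, message);
}

/* x() detects the error and logs it at once, naming the call that led here. */
int x_at(int value, const char *caller_file, int caller_line)
{
   if (value < 0)
   {
      log_error_at(caller_file, caller_line, "x: negative value rejected");
      return 1;
   }
   return 0;
}

/* Each use of CALL_X records its own file and line as the caller. */
#define CALL_X(value) x_at((value), __FILE__, __LINE__)
```

Two calls to CALL_X on different lines of the same function produce 
different log entries, so the particular call in error can be identified.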

7)  All return status values will be parameters from the designated include 
file. The names are to be clear.

8)  The return status value shall be the first item in a list of arguments.  
In this manner, it is always argument number 1 and never changes position.  
Functions may return the status directly.  When elaborate error tracking is 
needed, this argument may be a structure elaborating the conditions of the 
error.  This can be carried to an unnecessary extreme.
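
As an illustration only, the status-first convention with a small status 
structure might be sketched as follows.  The structure status_detail, the 
routine select_channel(), and the channel limit are hypothetical.

```c
typedef enum { c_success = 0, c_argument_too_large = 1 } return_code;

/* Hypothetical status structure elaborating the condition of the error. */
typedef struct
{
   return_code code;
   int         offending_argument;   /* argument position, 0 if none */
} status_detail;

#define C_MAX_CHANNEL 15

/* The status is always argument number 1, whatever else follows it. */
void select_channel(status_detail *status,
                    int            channel_number,
                    int           *selected_channel)
{
   status->code               = c_success;
   status->offending_argument = 0;

   if (channel_number > C_MAX_CHANNEL)
   {
      status->code               = c_argument_too_large;
      status->offending_argument = 2;   /* channel_number is argument 2 */
      return;
   }
   *selected_channel = channel_number;
}
```

A plain return_code in the first position is usually enough; the structure 
is worth its cost only when the caller really uses the extra detail.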

9)  The "natural" or "default" size for variables will always be used unless 
there is a compelling reason to do otherwise. Using a byte when a standard 
integer will do saves nothing.  Any deviation from this must be clearly 
documented in the code.  

10)  All variable, structure, and record names will be descriptive of their 
purpose.  It is suggested that the entire project, or even the entire 
company, adopt some type of naming convention.  The exact convention used is
probably not important.  This single topic merits, and has received, full
attention unto itself.  Suggestions and references are solicited.

11)  Incrementing and counting variables will be descriptive.  "I", "J" and 
their ilk will not be used.  If nothing else, ARRAY_INDEX is better.  (Try a 
text search on "i", or a global replace.)
For example:

DO I = 1, 2000
   PID( I ) = 0
END DO

should be replaced with:

DO PID_INDEX = 1, MAX_PID
   PID( PID_INDEX ) = P_INVALID
END DO

12) Multiple levels of indexing will not be used in a single statement.
Replace:
POINT_NAME = PID( RAC( POINT_INDEX ) )
with:
PID_INDEX = RAC( POINT_INDEX )
POINT_NAME = PID( PID_INDEX )

When a "subscript out of range" error is detected in the first case, the 
cause cannot be determined.  Was POINT_INDEX out of range, or was the value 
retrieved from the RAC array out of range?  In the second case, the variable 
in error will be apparent.

NOTE: The FORTRAN compiler and run time package for VAX/VMS can detect sub-
script out of bounds errors.  It will terminate the program and report the 
offending line number.  If there is only one level of subscripting per line, 
the index in error is obvious.
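
C does not check subscripts at run time, so the sketch below tests each 
index explicitly, one level at a time.  The array names follow the FORTRAN 
example.  The sizes, contents, and status names are hypothetical.

```c
#define C_MAX_POINT        4
#define C_MAX_PID          3

#define C_INDEX_OK         0
#define C_POINT_INDEX_BAD  1
#define C_PID_INDEX_BAD    2

/* Sample data; the 7 is deliberately out of range for the pid array. */
static const int rac[C_MAX_POINT] = { 2, 0, 7, 1 };
static const int pid[C_MAX_PID]   = { 100, 200, 300 };

/* One level of indexing per statement, with a separate check for each. */
int get_point_name(int point_index, int *point_name)
{
   int pid_index;

   if ((point_index < 0) || (point_index >= C_MAX_POINT))
   {
      return C_POINT_INDEX_BAD;
   }
   pid_index = rac[point_index];

   if ((pid_index < 0) || (pid_index >= C_MAX_PID))
   {
      return C_PID_INDEX_BAD;
   }
   *point_name = pid[pid_index];
   return C_INDEX_OK;
}
```

Because each subscript has its own statement and its own check, the status 
says which of the two indexes was at fault.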

13) Writes to the error log will use FORMAT statements.  Use of the default 
format statement ( i.e. the asterisk) causes inconsistencies in the format 
of the error log.  Since FORMAT statements are required in a large number of 
places, consistency is greatly improved by using them in all error log 
output.

NOTE:  My current project uses a mailbox process at VERY high priority for 
error logging.  All processes can send error data to a single location for
automatic logging.  The output of this program can be sent to a CRT for 
realtime viewing.

14) The liberal use of spaces and blank lines is encouraged to increase 
readability.  Two to four blank lines are recommended to separate sections 
of code.  Do not cram code into as small a space as possible.  This makes it 
difficult to read.

15) Paragraphs of pseudo code or comments followed by paragraphs of code are 
highly discouraged.  The code tends to be changed over time, while the 
pseudo code and comments are ignored.  The end result is comments that are 
incorrect and misleading.

16) In general, code comments will be indented one level further than the 
actual code that it describes.
Reasons:
a. Maintaining a consistent indentation scheme throughout the file improves 
its aesthetic appeal.
b. It gives prominence to the code instead of the comments.
c. Extracted pseudo-code will follow the general indentation pattern of the 
actual code.
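
One reading of this layout, sketched in C with three positions per level of 
indentation.  The function sum_valid_readings() and its data are 
hypothetical.

```c
int sum_valid_readings(const int *readings, int reading_count)
{
   int reading_index;
   int reading_sum = 0;

      /* add up every reading that is usable */
   for (reading_index = 0; reading_index < reading_count; reading_index++)
   {
         /* a negative reading marks a failed sensor, so skip it */
      if (readings[reading_index] >= 0)
      {
         reading_sum += readings[reading_index];
      }
   }
   return reading_sum;
}
```

Reading down the left edge shows only code; the comments sit one level to 
the right of the statements they describe.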

17) Three character positions per level of indentation are recommended.  Two 
just doesn't seem to be enough.  Any more tends to become excessive.

18) The first line of executable code (i.e. the first statement that is 
actually performed at run time) should be preceded with a standard comment 
line such as:

in FORTRAN
  C EXECUTABLE CODE BEGINS HERE
or in C
  /* executable code begins here */

In this manner, the programmer can open the source or listing file and 
immediately move to the first line of code by searching on "C EXE" or "/* 
exe".  This is particularly useful when the module headers get rather long.

19) Complex IF statements should have a separate line for each 
condition, with the conjunctive separators (i.e. the ".AND." and the ".OR.") 
on their own line.  The liberal use of parentheses to denote grouping is 
strongly encouraged.  When the parenthetical nesting becomes greater than 
two, the programmer should consider breaking the statement into two or more
separate statements.  An example in C source follows.  The statement:

 if(((x==(4+(i*6))) && (y<=((x/6)-j))) || (z!=(6*k)))

is far more understandable and easier to modify in the following format:

boolean1 = (x_position == color_4 + c_red_index * c_weight_factor);
boolean2 = (y_position <= x_position/c_weight_factor - c_blue_index);
boolean3 = (z_position != c_weight_factor * c_green_index);
if((boolean1 && boolean2) || boolean3)  doSomething();

20)  As illustrated in the previous item, avoid the use of "magic numbers."  
What is the significance of the 4 and 6 in the first equation?  Create a 
named constant that indicates the purpose of the value and use that instead 
of the number itself.  Some of the many advantages are: 1) It is easier to 
read.  2) If the value must be changed, it need be changed in only one 
place.  3) Try searching on "4" and then on "color_4".  4)  When you search 
on 4 and find it, does this particular "4" mean "color_4", or is it the 4 
that means "max_error_count"?  

The definition:

   #define NUMBER_ZERO 0

is still a magic number; it is just spelled out.  If there is an actual need 
to use the number 0, for example, and it has no ulterior meaning other than 
the value of zero, then it is not a magic number.  For example:

   for( index = 0; index < array_size; index++ ) ...;

We may start with zero simply because this is the first element of the 
array.  However, the situation should always be examined with a very 
critical eye.  If something you thought to be an unchangeable fact turns out 
to be wrong, fixing a bug of this type may be very expensive.
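
A small sketch of the point about two different meanings of the same 
number.  The constant and function names are hypothetical; both constants 
happen to equal 4.

```c
/* Hypothetical named constants; both happen to equal 4 today. */
#define C_COLOR_4          4
#define C_MAX_ERROR_COUNT  4

/* Returns 1 when the error count has reached the allowed maximum. */
int error_limit_reached(int error_count)
{
   return (error_count >= C_MAX_ERROR_COUNT);
}

/* Returns 1 when the given color code is color 4. */
int is_color_4(int color_code)
{
   return (color_code == C_COLOR_4);
}
```

Raising the error limit now means changing one definition, with no risk of 
also changing the color code that shares its value.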


                           DEFINITIONS:

Module: a compilable entity such as a procedure or function.  Multiple 
modules may coexist in a single file.

Global variable: Data that is declared such that it can be visible to multi-
ple programs, or to any module within a single program or project. Emphasis
on "can" be visible. For example, in the VAX VMS environment, data can be 
loaded into memory and made visible to multiple programs.  Any such data is 
global data.  This definition is derived from the dictionary entry for glo-
bal, "Of, pertaining to, or involving the entire earth : WORLDWIDE"  

Common variable: Data that is declared such that it can be visible to a 
restricted set of modules within a single program.  Data shared by multiple 
functions, in one or more files, is common data.

Local data: Data that is not visible outside of a single module.

                              END NOTES

(1) Maguire, Steve. Writing Solid Code. Microsoft Press, 1993.  Page 155: 
"The rule of thumb is Never pass data in global buffers unless you 
absolutely have to."  Page 156, highlighted section summary: "Don't pass 
data in static (or global) memory."

(2) Page-Jones, Meilir. The Practical Guide to Structured Systems Design. 
Yourdon Press Computing Series, 1988.  Pages 73-77.  The author devotes an 
entire section of about five pages, entitled COMMON (ALIAS GLOBAL) COUPLING, 
to enumerating and describing seven reasons why global memory should not be 
used.

(3) Myers, Glenford J. Reliable Software Through Composite Design. Van 
Nostrand Reinhold Company, 1975.  Page 37.  The author devotes a three page 
section titled COMMON COUPLING to the hazards and problems caused by 
common coupling and specifically references the COMMON statement in FORTRAN.
 

(4) The concept of Object Oriented Programming (OOP), et al.  The concepts 
of Encapsulation and Data Hiding are fundamental to OOP.  In simple words, 
these mean to bury the data in the code and to forbid external reference to 
the data itself.  In other words, no common data allowed.  While technically 
speaking we cannot implement OOP using FORTRAN under VMS, we can implement 
the concept and style, and therefore reap many of its benefits.  There is 
no excuse for ignoring the lessons of the past that have led to OOP.

(5) Stroustrup, Bjarne. The C++ Programming Language. Addison-Wesley 
Publishing Co., 1991.  Chapter 0, page 10: "When you define a class... [a] 
Don't use global data.  [b] Don't use global (nonmember) functions. [c] 
Don't use public data members."
