GNU Fortran currently generates code that is object-compatible with
the f2c converter.
Also, it avoids limitations in the current GBE, such as the
inability to generate a procedure with
multiple entry points, by generating code that is structured
differently (in terms of procedure names, scopes, arguments, and
so on) than might be expected.
As a result, writing code in other languages that calls on, is
called by, or shares in-memory data with g77-compiled code generally
requires some understanding of the way g77 compiles code for
various constructs.
Similarly, using a debugger to debug g77-compiled
code, even if that debugger supports native Fortran debugging, generally
requires this sort of information.
This section describes some of the basic information on how
g77
compiles code for constructs involving interfaces to other
languages and to debuggers.
Caution: Much or all of this information pertains to only the current
release of g77, sometimes even to using certain compiler options
with g77 (such as `-fno-f2c').
Do not write code that depends on this
information without clearly marking said code as nonportable and
subject to review for every new release of g77.
This information
is provided primarily to make debugging of code generated by this
particular release of g77 easier for the user, and partly to make
writing (generally nonportable) interface code easier.
Both of these
activities require tracking changes in new versions of g77 as they
are installed, because new versions can change the behaviors
described in this section.
The topics covered include how g77 compiles a main program unit,
constructs parameter lists for procedures, handles alternate returns,
implements alternate `ENTRY', and handles `ASSIGN'.
Fortran permits each implementation to decide how names are represented in other contexts, such as in debuggers and when interfacing to other languages, and especially how their case is handled.
External names--names of entities that are public, or "accessible",
to all modules in a program--normally have an underscore (`_')
appended by g77, to generate code that is compatible with f2c.
External names include names of Fortran things like common blocks,
external procedures (subroutines and functions, but not including
statement functions, which are internal procedures), and entry point
names.
However, use of the `-fno-underscoring' option disables this kind of transformation of external names (though inhibiting the transformation certainly improves the chances of colliding with incompatible externals written in other languages--but that might be intentional).
When `-funderscoring' is in force, any name (external or local)
that already has at least one underscore in it is
implemented by g77
by appending two underscores.
(This second underscore can be disabled via the
`-fno-second-underscore' option.)
External names are changed this way for f2c compatibility.
Local names are changed this way to avoid collisions with external names
that are different in the source code--f2c does the same thing, but
there's no compatibility issue there except for user expectations while
debugging.
For example:
Max_Cost = 0
Here, a user would, in the debugger, refer to this variable using the
name `max_cost__' (or `MAX_COST__' or `Max_Cost__',
as described below).
(We hope to improve g77
in this regard in the future--don't
write scripts depending on this behavior!
Also, consider experimenting with the `-fno-underscoring'
option to try out debugging without having to massage names by
hand like this.)
g77
provides a number of command-line options that allow the user
to control how case mapping is handled for source files.
The default is the traditional UNIX model for Fortran compilers--names
are mapped to lower case.
Other command-line options can be specified to map names to upper
case, or to leave them exactly as written in the source file.
For example:
Foo = 9.436
Here, it is normally the case that the variable assigned will be named `foo'. This would be the name to enter when using a debugger to access the variable.
However, depending on the command-line options specified, the
name implemented by g77
might instead be `FOO' or even
`Foo', thus affecting how debugging is done.
Also:
Call Foo
This would normally call a procedure that, if it were in a separate C program, would be defined starting with the line:
void foo_()
However, g77
command-line options could be used to change the casing
of names, resulting in the name `FOO_' or `Foo_' being given to the
procedure instead of `foo_', and the `-fno-underscoring' option
could be used to inhibit the appending of the underscore to the name.
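The default mapping described above can be sketched in C. This is only an illustration of the rules as stated in this section (lower-case mapping, `-funderscoring' in force, second underscore enabled); the helper name `g77_default_name' and its parameters are hypothetical and are not part of g77 or libf2c.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of g77): build the name a debugger or a
 * C caller would see under the default options described in the text.
 * An external name gets one trailing underscore; any name, external or
 * local, that already contains an underscore gets two.
 * Returns 0 on success, -1 if `out' is too small. */
int g77_default_name(const char *fortran_name, int is_external,
                     char *out, size_t out_size)
{
    size_t len, extra, i;
    int has_underscore;

    assert(fortran_name != NULL && out != NULL);

    len = strlen(fortran_name);
    has_underscore = strchr(fortran_name, '_') != NULL;
    extra = has_underscore ? 2 : (is_external ? 1 : 0);

    if (len + extra + 1 > out_size)
        return -1;

    /* Default case mapping: fold everything to lower case. */
    for (i = 0; i < len; i++)
        out[i] = (char)tolower((unsigned char)fortran_name[i]);
    /* Append the underscore(s). */
    for (i = 0; i < extra; i++)
        out[len + i] = '_';
    out[len + extra] = '\0';
    return 0;
}
```

With these assumptions, the local variable `Max_Cost' maps to `max_cost__', the external procedure `Foo' maps to `foo_', and a local variable `Foo' maps to `foo'.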
When g77
compiles a main program unit, it gives it the public
procedure name `MAIN__'.
The libf2c
library has the actual `main()' procedure
as is typical of C-based environments, and
it is this procedure that performs some initial start-up
activity and then calls `MAIN__'.
Generally, g77
and libf2c
are designed so that you need not
include a main program unit written in Fortran in your program--it
can be written in C or some other language.
Especially for I/O handling, this is the case, although g77-0.5.16
includes a bug fix for libf2c
that solved a problem with using the
`OPEN' statement as the first Fortran I/O activity in a program
without a Fortran main program unit.
However, if you don't intend to use g77 (or f2c) to compile
your main program unit--that is, if you intend to compile a `main()'
procedure using some other language--you should carefully
examine the code for `main()' in libf2c, found in the source
file `gcc/f/runtime/libF77/main.c', to see what kinds of things
might need to be done by your `main()' in order to provide the
Fortran environment your Fortran code is expecting.
For example, libf2c's `main()' sets up the information used by
the `IARGC' and `GETARG' intrinsics.
Bypassing libf2c's `main()'
without providing a substitute for this activity would mean
that invoking `IARGC' and `GETARG' would produce undefined
results.
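As a rough illustration of the division of labor just described, the following C sketch records the command line and then transfers control to `MAIN__'. It is not the code in libf2c: the names `startup_sketch', `iargc_sketch', and `getarg_sketch' are hypothetical, and `MAIN__' is defined here in C only so that the sketch is self-contained (in a real program, g77 emits that symbol).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the state a libf2c-style main() records so
 * that IARGC and GETARG can answer later. */
static int    saved_argc;
static char **saved_argv;
static int    main_unit_calls;

/* Stand-in for the compiled Fortran main program unit. */
int MAIN__(void)
{
    main_unit_calls++;
    return 0;
}

/* The minimum a replacement startup routine has to do: record the
 * command line, then call MAIN__. */
int startup_sketch(int argc, char **argv)
{
    assert(argc >= 1 && argv != NULL);
    saved_argc = argc;
    saved_argv = argv;
    return MAIN__();
}

/* IARGC-like query: number of arguments, not counting the program name. */
int iargc_sketch(void)
{
    return saved_argc - 1;
}

/* GETARG-like query: argument k (0 is the program name), or NULL. */
const char *getarg_sketch(int k)
{
    return (k >= 0 && k < saved_argc) ? saved_argv[k] : NULL;
}
```

If the recording step is skipped, the two query routines have nothing valid to return, which is the "undefined results" situation the text warns about.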
When debugging, one implication of the fact that `main()', which
is the place where the debugged program "starts" from the
debugger's point of view, is in libf2c
is that you won't be
starting your Fortran program at a point you recognize as your
Fortran code.
The standard way to get around this problem is to set a break point (a one-time, or temporary, break point will do) at the entrance to `MAIN__', and then run the program.
After doing this, the debugger will see the current execution point of the program as at the beginning of the main program unit of your program.
Of course, if you really want to set a break point at some other place in your program and just start the program running, without first breaking at `MAIN__', that should work fine.
Fortran uses "column-major ordering" in its arrays. This differs from other languages, such as C, which use "row-major ordering". The difference is that, with Fortran, array elements adjacent to each other in memory differ in the first subscript instead of the last; `A(5,10,20)' immediately follows `A(4,10,20)', whereas with row-major ordering it would follow `A(5,10,19)'.
This consideration affects not only interfacing with and debugging Fortran code; it can also greatly affect how code is designed and written, especially when code speed and size are a concern.
Fortran also differs from C, a popular language to interface with and one that debuggers often support directly, in the way arrays are treated. In C, arrays are single-dimensional and have interesting relationships to pointers, neither of which is true for Fortran. As a result, dealing with Fortran arrays from within an environment limited to C concepts can be challenging.
For example, accessing the array element `A(5,10,20)' is easy enough in Fortran (use `A(5,10,20)'), but in C some difficult machinations are needed. First, C would treat the A array as a single-dimension array. Second, C does not understand low bounds for arrays as does Fortran. Third, C assumes a low bound of zero (0), while Fortran defaults to a low bound of one (1) and can support an arbitrary low bound. Therefore, calculations must be done to determine what the C equivalent of `A(5,10,20)' would be, and these calculations require knowing the dimensions of `A'.
For `DIMENSION A(2:11,21,0:29)', the calculation of the offset of `A(5,10,20)' would be:
  (5-2)
+ (10-1)*(11-2+1)
+ (20-0)*(11-2+1)*(21-1+1)
= 4293
So the C equivalent in this case would be `a[4293]'.
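The same calculation can be written as a small C routine. This is a minimal sketch of ordinary column-major addressing with explicit lower bounds; the type `struct dim' and the function `fortran_offset' are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One dimension of a Fortran array: inclusive lower and upper bounds. */
struct dim { long lo, hi; };

/* Column-major offset, in elements (not bytes), of the element selected
 * by subs[0..rank-1] in an array declared with dims[0..rank-1]. The
 * first subscript varies fastest, as in Fortran. */
long fortran_offset(const struct dim *dims, const long *subs, size_t rank)
{
    long offset = 0;
    long stride = 1;
    size_t i;

    assert(dims != NULL && subs != NULL);

    for (i = 0; i < rank; i++) {
        assert(subs[i] >= dims[i].lo && subs[i] <= dims[i].hi);
        offset += (subs[i] - dims[i].lo) * stride;   /* remove the low bound */
        stride *= dims[i].hi - dims[i].lo + 1;       /* extent of dimension  */
    }
    return offset;
}
```

For `DIMENSION A(2:11,21,0:29)', calling `fortran_offset' with the bounds {2,11}, {1,21}, {0,29} and the subscripts 5, 10, 20 yields 4293.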
When using a debugger directly on Fortran code, the C equivalent
might not work, because some debuggers cannot understand the notion
of low bounds other than zero. However, unlike f2c, g77
does inform the GBE that a multi-dimensional array (like `A'
in the above example) is really multi-dimensional, rather than a
single-dimensional array, so at least the dimensionality of the array
is preserved.
Debuggers that understand Fortran should have no trouble with non-zero low bounds, but for non-Fortran debuggers, especially C debuggers, the above example might have a C equivalent of `a[4305]'. This calculation is arrived at by eliminating the subtraction of the lower bound in the first parenthesized expression on each line--that is, for `(5-2)' substitute `(5)', for `(10-1)' substitute `(10)', and for `(20-0)' substitute `(20)'. Actually, the implication of this can be that the expression `*(&a[2][1][0] + 4293)' works fine, but that `a[20][10][5]' produces the equivalent of `*(&a[0][0][0] + 4305)' because of the missing lower bounds.
Come to think of it, perhaps the behavior is due to the debugger internally compensating for the lower bounds by offsetting the base address of `a', leaving `&a' set lower, in this case, than `&a[2][1][0]' (the address of its first element as identified by subscripts equal to the corresponding lower bounds).
You know, maybe nobody really needs to use arrays.
Procedures that accept `CHARACTER' arguments are implemented by
g77
so that each `CHARACTER' argument has two actual arguments.
The first argument occupies the expected position in the argument list and has the user-specified name. This argument is a pointer to an array of characters, passed by the caller.
The second argument is appended to the end of the user-specified calling sequence and is named `__g77_length_x', where x is the user-specified name. This argument is of the C type `ftnlen' (see `gcc/f/runtime/f2c.h.in' for information on that type) and is the number of characters the caller has allocated in the array pointed to by the first argument.
A procedure will ignore the length argument if `X' is not declared `CHARACTER*(*)', because for other declarations, it knows the length. Not all callers necessarily "know" this, however, which is why they all pass the extra argument.
The contents of the `CHARACTER' argument are specified by the address passed in the first argument (named after it). The procedure can read or write these contents as appropriate.
When more than one `CHARACTER' argument is present in the argument
list, the length arguments are appended in the order
the original arguments appear.
So `CALL FOO('HI','THERE')' is implemented in
C as `foo("hi","there",2,5);', ignoring the fact that g77
does not provide the trailing null bytes on the constant
strings (f2c does provide them, but they are unnecessary in
a Fortran environment, and you should not expect them to be
there).
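Seen from C, the convention looks roughly like the following sketch. The `ftnlen' typedef is an assumption standing in for the type defined in `f2c.h'; the buffer `received' and the routine `call_foo' are hypothetical, and `foo_' assumes the default external-name mapping.

```c
#include <assert.h>
#include <string.h>

typedef long ftnlen;   /* assumption: stand-in for the type in f2c.h */

/* Buffer the sketch uses to show what the callee received. */
static char received[32];

/* C-side view of SUBROUTINE FOO(A, B) with CHARACTER*(*) A, B: the two
 * data pointers come first, and the two lengths are appended in the
 * same order as the original arguments. */
void foo_(const char *a, const char *b, ftnlen a_len, ftnlen b_len)
{
    assert(a != NULL && b != NULL);
    assert(a_len >= 0 && b_len >= 0);
    assert((size_t)(a_len + b_len) < sizeof received);

    /* No trailing null byte is guaranteed, so only the lengths are trusted. */
    memcpy(received, a, (size_t)a_len);
    memcpy(received + a_len, b, (size_t)b_len);
    received[a_len + b_len] = '\0';
}

/* C equivalent of CALL FOO('HI','THERE'). */
void call_foo(void)
{
    foo_("HI", "THERE", 2, 5);
}
```

The callee never calls `strlen' on its arguments; it relies only on the passed lengths, which is what a C routine called from g77-compiled code has to do.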
Note that the above information applies to `CHARACTER' variables and arrays only. It does not apply to external `CHARACTER' functions or to intrinsic `CHARACTER' functions. That is, no second length argument is passed to `FOO' in this case:
CHARACTER X
EXTERNAL X
CALL FOO(X)
Nor does `FOO' expect such an argument in this case:
SUBROUTINE FOO(X)
CHARACTER X
EXTERNAL X
Because of this implementation detail, if a program has a bug such that there is disagreement as to whether an argument is a procedure, and the type of the argument is `CHARACTER', subtle symptoms might appear.
Adjustable and automatic arrays in Fortran require the implementation
(in this
case, the g77
compiler) to "memorize" the expressions that
dimension the arrays each time the procedure is invoked.
This is so that subsequent changes to variables used in those
expressions, made during execution of the procedure, do not
have any effect on the dimensions of those arrays.
For example:
REAL ARRAY(5)
DATA ARRAY/5*2/
CALL X(ARRAY, 5)
END
SUBROUTINE X(A, N)
DIMENSION A(N)
N = 20
PRINT *, N, A
END
Here, the implementation should, when running the program, print something like:
20 2. 2. 2. 2. 2.
Note that this shows that while the value of `N' was successfully changed, the size of the `A' array remained at 5 elements.
To support this, g77
generates code that executes before any user
code (and before the internally generated computed `GOTO' to handle
alternate entry points, as described below) that evaluates each
(nonconstant) expression in the list of subscripts for an
array, and saves the result of each such evaluation to be used when
determining the size of the array (instead of re-evaluating the
expressions).
So, in the above example, when `X' is first invoked, code is executed that copies the value of `N' to a temporary. And that same temporary serves as the actual high bound for the single dimension of the `A' array (the low bound being the constant 1). Since the user program cannot (legitimately) change the value of the temporary during execution of the procedure, the size of the array remains constant during each invocation.
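In C terms, the effect is roughly as in the following sketch. It is not the code g77 emits; the function name `x_sketch', the temporary `a_high_bound', and the extra `sum' argument (added so the result can be observed) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of SUBROUTINE X(A, N) with DIMENSION A(N). Returns the number
 * of elements the procedure treats A as having. */
int x_sketch(const float *a, int *n, float *sum)
{
    int a_high_bound;   /* hypothetical compiler-generated temporary */
    int i;

    assert(a != NULL && n != NULL && sum != NULL);

    /* Executed before any user code: evaluate the bound expression once. */
    a_high_bound = *n;

    /* User code: N = 20. The saved bound is unaffected. */
    *n = 20;

    /* User code that walks the whole array, as PRINT *, N, A would. */
    *sum = 0.0f;
    for (i = 0; i < a_high_bound; i++)
        *sum += a[i];

    return a_high_bound;
}
```

Called with a five-element array and N equal to 5, the sketch leaves N at 20 but still visits exactly five elements.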
For alternate entry points, the code g77
generates takes into
account the possibility that a dummy adjustable array is not actually
passed to the actual entry point being invoked at that time.
In that case, the public procedure implementing the entry point
passes to the master private procedure implementing all the
code for the entry points a `NULL' pointer where a pointer to that
adjustable array would be expected.
The g77-generated code
doesn't attempt to evaluate any of the expressions in the subscripts
for an array if the pointer to that array is `NULL' at run time in
such cases.
(Don't depend on this particular implementation
by writing code that purposely passes `NULL' pointers where the
callee expects adjustable arrays, even if you know the callee
won't reference the arrays--nor should you pass `NULL' pointers
for any dummy arguments used in calculating the bounds of such
arrays or leave undefined any values used for that purpose in
COMMON--because the way g77
implements these things might
change in the future!)
Subroutines with alternate returns (e.g. `SUBROUTINE X(*)' and
`CALL X(*50)') are implemented by g77 as functions returning
the C `int' type.
The actual alternate-return arguments are omitted from the calling sequence.
Instead, the caller uses
the return value to do a rough equivalent of the Fortran
computed-`GOTO' statement, as in `GOTO (50), X()' in the
example above (where `X' is quietly declared as an `INTEGER'
function), and the callee just returns whatever integer
is specified in the `RETURN' statement for the subroutine.
(For example, `RETURN 1' is implemented as `X = 1' followed
by `RETURN' in C, and `RETURN' by itself is `X = 0' and `RETURN'.)
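A C rendering of this scheme might look like the sketch below. The argument `take_alternate' is a hypothetical ordinary argument added only so that both paths can be exercised, and `caller_sketch' is a hypothetical name; the alternate-return specifier itself does not appear in the C argument list.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of SUBROUTINE X(*): the alternate-return dummy is dropped and
 * the procedure becomes a function returning int. */
int x_(const int *take_alternate)
{
    assert(take_alternate != NULL);
    if (*take_alternate)
        return 1;   /* RETURN 1 */
    return 0;       /* plain RETURN */
}

/* Sketch of the caller of CALL X(*50). Returns 50 when control would
 * continue at label 50, and 0 when it falls through to the statement
 * after the CALL. */
int caller_sketch(int take_alternate)
{
    switch (x_(&take_alternate)) {   /* plays the role of GOTO (50), X() */
    case 1:
        return 50;
    default:
        return 0;
    }
}
```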
g77
handles in a special way functions that return the following
types:
For `CHARACTER', g77
implements a subroutine (a C function
returning `void')
with two arguments prepended: `__g77_result', which the caller passes
as a pointer to a `char' array expected to hold the return value,
and `__g77_length', which the caller passes as an `ftnlen' value
specifying the length of the return value as declared in the calling
program.
For `CHARACTER*(*)', the called function uses `__g77_length'
to determine the size of the array that `__g77_result' points to;
otherwise, it ignores that argument.
For `COMPLEX' and `DOUBLE COMPLEX', when `-ff2c' is in
force, g77
implements
a subroutine with one argument prepended: `__g77_result', which the
caller passes as a pointer to a variable of the type of the function.
The called function writes the return value into this variable instead
of returning it as a function value.
When `-fno-f2c' is in force,
g77 implements a `COMPLEX' function as gcc's
`__complex__ float' function,
returning the result of the function in the same way as gcc would,
and implements a `DOUBLE COMPLEX' function similarly.
For `REAL', when `-ff2c' is in force, g77
implements
a function that actually returns `DOUBLE PRECISION' (usually
C's `double' type).
When `-fno-f2c' is in force, `REAL' functions return `float'.
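The three `-ff2c' conventions can be sketched from the C side as follows. The typedefs are assumptions standing in for the types in `f2c.h', and the functions `greet_', `conj2_', and `half_' are hypothetical examples, written as if they were the compiled forms of Fortran functions `GREET', `CONJ2', and `HALF'.

```c
#include <assert.h>
#include <string.h>

typedef long ftnlen;                          /* assumption: stand-in for f2c.h */
typedef struct { float r, i; } f2c_complex;   /* assumption: stand-in for f2c.h */

/* CHARACTER*(*) FUNCTION GREET(): a result pointer and a length are
 * prepended, and the C function itself returns void. */
void greet_(char *result, ftnlen result_len)
{
    static const char text[] = "HELLO";
    ftnlen text_len = (ftnlen)(sizeof text - 1);
    ftnlen copy_len;

    assert(result != NULL && result_len >= 0);

    /* Fortran semantics: truncate or blank-pad to the caller's length. */
    copy_len = (result_len < text_len) ? result_len : text_len;
    memset(result, ' ', (size_t)result_len);
    memcpy(result, text, (size_t)copy_len);
}

/* COMPLEX FUNCTION CONJ2(Z) under -ff2c: a result pointer is prepended. */
void conj2_(f2c_complex *result, const f2c_complex *z)
{
    assert(result != NULL && z != NULL);
    result->r = z->r;
    result->i = -z->i;
}

/* REAL FUNCTION HALF(X) under -ff2c: the C-level return type is double. */
double half_(const float *x)
{
    assert(x != NULL);
    return (double)(*x / 2.0f);
}
```

A C caller that declares `half_' as returning `float' would misread the result under `-ff2c', which is one reason the option in force matters when interfacing.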
g77
names and lays out `COMMON' areas the same way f2c does,
for compatibility with f2c.
Currently, g77
does not emit "true" debugging information for
members of a `COMMON' area, due to an apparent bug in the GBE.
(As of Version 0.5.19, g77 emits debugging information for such
members in the form of a constant string specifying the base name of
the aggregate area and the offset of the member in bytes from the start
of the area.
Use the `-fdebug-kludge' option to enable this behavior.
In gdb, use `set language c' before printing the value
of the member, then `set language fortran' to restore the default
language, since gdb doesn't provide a way to print a readable
version of a character string in Fortran language mode.
This kludge will be removed in a future version of g77 that,
in conjunction with a contemporary version of gdb,
properly supports Fortran-language debugging, including access
to members of `COMMON' areas.)
See section Options for Code Generation Conventions, for information on the `-fdebug-kludge' option.
Moreover, g77
currently implements a `COMMON' area such that its
type is an array of the C `char' data type.
So, when debugging, you must know the offset into a `COMMON' area for a particular item in that area, and you have to take into account the appropriate multiplier for the respective sizes of the types (as declared in your code) for the items preceding the item in question as compared to the size of the `char' type.
For example, using default implicit typing, the statement
COMMON I(15), R(20), T
results in a public 144-byte `char' array named `_BLNK__' with `I' placed at `_BLNK__[0]', `R' at `_BLNK__[60]', and `T' at `_BLNK__[140]'. (This is assuming that the target machine for the compilation has 4-byte `INTEGER' and `REAL' types.)
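The offsets in this example can be reproduced in C as below. The sketch assumes, as the text does, 4-byte `INTEGER' and `REAL' types; the array `blank_common' is a hypothetical stand-in for the area g77 emits, and `read_r' and `write_r' are hypothetical accessors.

```c
#include <assert.h>
#include <string.h>

/* Assumption from the text: INTEGER and REAL each occupy 4 bytes. */
enum {
    UNIT_SIZE = 4,
    I_OFFSET  = 0,                          /* I(15) starts at byte 0   */
    R_OFFSET  = I_OFFSET + 15 * UNIT_SIZE,  /* R(20) starts at byte 60  */
    T_OFFSET  = R_OFFSET + 20 * UNIT_SIZE,  /* T starts at byte 140     */
    AREA_SIZE = T_OFFSET + UNIT_SIZE        /* 144 bytes in total       */
};

/* Stand-in for the blank COMMON area, typed as an array of char. */
static char blank_common[AREA_SIZE];

/* Store into R(index), 1-based. memcpy avoids alignment and aliasing
 * problems when treating bytes of a char array as a float. */
void write_r(int index, float value)
{
    assert(index >= 1 && index <= 20);
    assert((int)sizeof value == UNIT_SIZE);
    memcpy(blank_common + R_OFFSET + (index - 1) * UNIT_SIZE,
           &value, sizeof value);
}

/* Fetch R(index), 1-based. */
float read_r(int index)
{
    float value;

    assert(index >= 1 && index <= 20);
    assert((int)sizeof value == UNIT_SIZE);
    memcpy(&value,
           blank_common + R_OFFSET + (index - 1) * UNIT_SIZE,
           sizeof value);
    return value;
}
```

The multiplier mentioned above shows up as `UNIT_SIZE': element `R(3)' lives at byte 60 + 2*4 = 68 of the area.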
g77
treats storage-associated areas involving a `COMMON'
block as explained in the section on common blocks.
A local `EQUIVALENCE' area is a collection of variables and arrays connected to each other in any way via `EQUIVALENCE', none of which are listed in a `COMMON' statement.
Currently, g77
does not emit "true" debugging information for
members in a local `EQUIVALENCE' area, due to an apparent bug in the GBE.
(As of Version 0.5.19, g77 does emit debugging information for such
members in the form of a constant string specifying the base name of
the aggregate area and the offset of the member in bytes from the start
of the area.
Use the `-fdebug-kludge' option to enable this behavior.
In gdb, use `set language c' before printing the value
of the member, then `set language fortran' to restore the default
language, since gdb doesn't provide a way to print a readable
version of a character string in Fortran language mode.
This kludge will be removed in a future version of g77 that,
in conjunction with a contemporary version of gdb,
properly supports Fortran-language debugging, including access
to members of `EQUIVALENCE' areas.)
See section Options for Code Generation Conventions, for information on the `-fdebug-kludge' option.
Moreover, g77
implements a local `EQUIVALENCE' area such that its
type is an array of the C `char' data type.
The name g77
gives this array of `char' type is `__g77_equiv_x',
where x is the name of the item that is placed at the beginning (offset 0)
of this array.
If more than one such item is placed at the beginning, x is
the name that sorts to the top in an alphabetical sort of the list of
such items.
When debugging, you must therefore access members of `EQUIVALENCE' areas by specifying the appropriate `__g77_equiv_x' array section with the appropriate offset. See the explanation of debugging `COMMON' blocks for info applicable to debugging local `EQUIVALENCE' areas.
(Note: g77
version 0.5.18 and earlier chose the name
for x using a different method when more than one name was
in the list of names of entities placed at the beginning of the
array.
Though the documentation specified that the first name listed in
the `EQUIVALENCE' statements was chosen for x, g77
in fact chose the name using a method that was so complicated,
it seemed easier to change it to an alphabetical sort than to describe the
previous method in the documentation.)
The GBE does not understand the general concept of
alternate entry points as Fortran provides via the `ENTRY' statement.
g77 gets around this by using an approach to compiling procedures
having at least one `ENTRY' statement that is almost identical to the
approach used by f2c.
(An alternate approach could be used that
would probably generate faster, but larger, code that would also
be a bit easier to debug.)
Information on how g77
implements `ENTRY' is provided for those
trying to debug such code.
The choice of implementation seems
unlikely to affect code (compiled in other languages) that interfaces
to such code.
g77
compiles exactly one public procedure for the primary entry
point of a procedure plus each `ENTRY' point it specifies, as usual.
That is, in terms of the public interface, there is no difference
between
SUBROUTINE X
END
SUBROUTINE Y
END
and:
SUBROUTINE X
ENTRY Y
END
The difference between the above two cases lies in the code compiled for the `X' and `Y' procedures themselves, plus the fact that, for the second case, an extra internal procedure is compiled.
For every Fortran procedure with at least one `ENTRY'
statement, g77
compiles an extra procedure
named `__g77_masterfun_x', where x is
the name of the primary entry point (which, in the above case,
using the standard compiler options, would be `x_' in C).
This extra procedure is compiled as a private procedure--that is, a procedure not accessible by name to separately compiled modules. It contains all the code in the program unit, including the code for the primary entry point and for every `ENTRY' point. (The code for each public procedure is quite short, and explained later.)
The extra procedure has some other interesting characteristics.
The argument list for this procedure is invented by g77.
It contains
a single integer argument named `__g77_which_entrypoint',
passed by value (as in Fortran's `%VAL()' intrinsic), specifying the
entry point index--0 for the primary entry point, 1 for the
first entry point (the first `ENTRY' statement encountered), 2 for
the second entry point, and so on.
It also contains, for functions returning `CHARACTER' and
(when `-ff2c' is in effect) `COMPLEX' functions,
and for functions returning different types among the
`ENTRY' statements (e.g. `REAL FUNCTION R()'
containing `ENTRY I()'), an argument named `__g77_result' that
is expected at run time to contain a pointer to where to store
the result of the entry point.
For `CHARACTER' functions, this
storage area is an array of the appropriate number of characters;
for `COMPLEX' functions, it is the appropriate area for the return
type (currently either `COMPLEX' or `DOUBLE COMPLEX'); for
multiple-return-type functions, it is a union of all the supported return
types (which cannot include `CHARACTER', since combining `CHARACTER'
and non-`CHARACTER' return types via `ENTRY' in a single function
is not supported by g77).
For `CHARACTER' functions, the `__g77_result' argument is followed by yet another argument named `__g77_length' that, at run time, specifies the caller's expected length of the returned value. Note that only `CHARACTER*(*)' functions and entry points actually make use of this argument, even though it is always passed by all callers of public `CHARACTER' functions (since the caller does not generally know whether such a function is `CHARACTER*(*)' or whether there are any other callers that don't have that information).
The rest of the argument list is the union of all the arguments specified for all the entry points (in their usual forms, e.g. `CHARACTER' arguments have extra length arguments, all appended at the end of this list). This is considered the "master list" of arguments.
The code for this procedure has, before the code for the first executable statement, code much like that for the following Fortran statement:
GOTO (100000,100001,100002), __g77_which_entrypoint
100000 ...code for primary entry point...
100001 ...code immediately following first ENTRY statement...
100002 ...code immediately following second ENTRY statement...
(Note that invalid Fortran statement labels and variable names
are used in the above example to highlight the fact that it
represents code generated by the g77
internals, not code to be
written by the user.)
It is this code that, when the procedure is called, picks which entry point to start executing.
Getting back to the public procedures (`X' and `Y' in the original example), those procedures are fairly simple. Their interfaces are just like they would be if they were self-contained procedures (without `ENTRY'), of course, since that is what the callers expect. Their code consists of simply calling the private procedure, described above, with the appropriate extra arguments (the entry point index, and perhaps a pointer to a multiple-type-return variable, local to the public procedure, that contains all the supported returnable non-character types). For arguments that are not listed for a given entry point that are listed for other entry points, and therefore that are in the "master list" for the private procedure, null pointers (in C, the `NULL' macro) are passed. Also, for entry points that are part of a multiple-type-returning function, code is compiled after the call of the private procedure to extract from the multi-type union the appropriate result, depending on the type of the entry point in question, returning that result to the original caller.
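The overall shape of the master procedure and its public wrappers can be sketched in C. The sketch assumes a hypothetical program unit `SUBROUTINE X(A)' that sets `A = 1.0' and returns, followed by `ENTRY Y(B)' that sets `B = 2.0'. The name `masterfun_x' stands in for `__g77_masterfun_x', and a C `switch' stands in for the computed `GOTO'.

```c
#include <assert.h>
#include <stddef.h>

/* Private master procedure: the entry point index is passed by value,
 * followed by the "master list" (the union of all dummy arguments). */
static void masterfun_x(int which_entrypoint, float *a, float *b)
{
    switch (which_entrypoint) {
    case 0:                     /* primary entry point X */
        assert(a != NULL);
        *a = 1.0f;
        return;                 /* RETURN */
    case 1:                     /* code following ENTRY Y */
        assert(b != NULL);
        *b = 2.0f;
        return;
    default:
        assert(0 && "invalid entry point index");
    }
}

/* Public procedure for X(A): B is not a dummy of this entry point, so a
 * null pointer is passed in its place. */
void x_(float *a)
{
    masterfun_x(0, a, NULL);
}

/* Public procedure for ENTRY Y(B): A is passed as a null pointer. */
void y_(float *b)
{
    masterfun_x(1, NULL, b);
}
```

Callers see only `x_' and `y_', each with the interface it would have as a self-contained procedure; the private procedure is reachable only through them.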
When debugging a procedure containing alternate entry points, you can either set a break point on the public procedure itself (e.g. a break point on `X' or `Y') or on the private procedure that contains most of the pertinent code (e.g. `__g77_masterfun_x'). If you do the former, you should use the debugger's command to "step into" the called procedure to get to the actual code; with the latter approach, the break point leaves you right at the actual code, skipping over the public entry point and its call to the private procedure (unless you have set a break point there as well, of course).
Further, the list of dummy arguments that is visible when the private procedure is active is going to be the expanded version of the list for whichever particular entry point is active, as explained above, and the way in which return values are handled might well be different from how they would be handled for an equivalent single-entry function.
For portability to machines where a pointer (such as to a label,
which is how g77
implements `ASSIGN' and its cousin, the assigned
`GOTO') is wider (bitwise) than an `INTEGER', g77
does not
necessarily use
the same memory location to hold the `ASSIGN'ed value of a variable
as it does the numerical value in that variable, unless the
variable is wide enough (can hold enough bits).
In particular, while g77
implements
I = 10
as, in C notation, `i = 10;', it might implement
ASSIGN 10 TO I
as, in GNU's extended C notation (for the label syntax),
`__g77_ASSIGN_I = &&L10;' (where `L10' is just a massaging
of the Fortran label `10' to make the syntax C-like; g77
doesn't
actually generate the name `L10' or any other name like that,
since debuggers cannot access labels anyway).
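The separate-variable arrangement can be illustrated with GNU C's labels-as-values extension, so the sketch below needs a compiler that supports `&&label' and `goto *' (gcc does). It is an illustration only: the variable `assign_i' is a hypothetical stand-in for `__g77_ASSIGN_I', and the function models the sequence `I = 5', `ASSIGN 10 TO I', `GOTO I'.

```c
#include <assert.h>

/* Returns the numeric value of I as seen at label 10. */
int assign_sketch(void)
{
    int   i;          /* numeric storage for I */
    void *assign_i;   /* hypothetical stand-in for __g77_ASSIGN_I */

    i = 5;                 /* I = 5 */
    assign_i = &&L10;      /* ASSIGN 10 TO I: stored apart from i */
    goto *assign_i;        /* GOTO I */

L10:
    assert(i == 5);        /* the ASSIGN did not overwrite the number */
    return i;
}
```

Because the label address lives in its own variable, the numeric contents of `I' survive the `ASSIGN' in this sketch.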
While this currently means that an `ASSIGN' statement might not
overwrite the numeric contents of its target variable, do not
write any code depending on this feature.
g77
has already changed this implementation across
versions and might do so in the future.
This information is provided only to make debugging Fortran programs
compiled with the current version of g77
somewhat easier.
If there's no debugger-visible variable named `__g77_ASSIGN_I'
in a program unit that does `ASSIGN 10 TO I', that probably
means g77
has decided it can store the pointer to the label directly
into `I' itself.
(Currently, g77
always chooses to make the separate variable,
to improve the likelihood that `-O -Wuninitialized' will
diagnose failures to do things like `GOTO I' without
`ASSIGN 10 TO I' despite doing `I=5'.)