If you want to contribute to g77
by doing research,
design, specification, documentation, coding, or testing,
the following information should give you some ideas.
Don't bother doing any performance analysis until most of the
following items are taken care of, because there's no question
they represent serious space/time problems, although some of
them show up only given certain kinds of (popular) input.
-
Improve `malloc' package and its uses to specify more info about
memory pools and, where feasible, use obstacks to implement them.
-
Skip over uninitialized portions of aggregate areas (arrays,
`COMMON' areas, `EQUIVALENCE' areas) so zeros need not be output.
This would reduce memory usage for large initialized aggregate
areas, even ones with only one initialized element.
As of version 0.5.18, a portion of this item has already been
accomplished.
-
Prescan the statement (in `sta.c') so that the nature of the statement
is determined as much as possible by looking entirely at its form,
and not looking at any context (previous statements, including types
of symbols).
This would allow ripping out of the statement-confirmation,
symbol retraction/confirmation, and diagnostic inhibition
mechanisms.
Plus, it would result in much-improved diagnostics.
For example, `CALL some-intrinsic(...)', where the intrinsic
is not a subroutine intrinsic, would result in an actual error instead of the
unimplemented-statement catch-all.
-
Throughout g77, don't pass line/column pairs where
a simple `ffewhere' type, which points to the error as much as is
desired by the configuration, will do, and don't pass `ffelexToken' types
where a simple `ffewhere' type will do.
Then, allow new default
configuration of `ffewhere' such that the source line text is not
preserved, and leave it to things like Emacs' next-error function
to point to them (now that `next-error' supports column,
or, perhaps, character-offset, numbers).
The change in calling sequences should improve performance somewhat,
as should not having to save source lines.
(Whether this whole
item will improve performance is questionable, but it should
improve maintainability.)
-
Handle `DATA (A(I),I=1,1000000)/1000000*2/' more efficiently, especially
as regards the assembly output.
Some of this might require improving
the back end, but lots of improvement in space/time required in
g77 itself can be fairly easily obtained without touching the back end.
Maybe type-conversion, where necessary, can be sped up as well in
cases like the one shown (converting the `2' into `2.').
-
If analysis shows it to be worthwhile, optimize `lex.c'.
-
Consider redesigning `lex.c' to not need any feedback
during tokenization, by keeping track of enough parse state on its
own.
Much of this work should be put off until after g77 has
all the features necessary for its widespread acceptance as a
useful F77 compiler.
However, perhaps this work can be done in parallel during
the feature-adding work.
-
Get the back end to produce at least as good code involving array
references as does f2c plus gcc.
(Note: 0.5.18, with its improvements to the GBE for
versions 2.7.1 and 2.7.2 of gcc, should succeed at doing this.
Please submit any cases where g77 cannot be made to
generate as optimal code as f2c in combination with
the same version of gcc, but only for versions 2.7.1 and
greater of gcc.)
-
Do the equivalent of the trick of putting `extern inline' in front
of every function definition in libf2c and #include'ing the resulting
file in f2c+gcc---that is, inline all run-time-library functions
that are at all worth inlining.
(Some of this has already been done, such as for integral exponentiation.)
-
When doing `CHAR_VAR = CHAR_FUNC(...)',
and it's clear that types line up
and `CHAR_VAR' is addressable or not a `VAR_DECL',
make `CHAR_VAR', not a
temporary, be the receiver for `CHAR_FUNC'.
(This is now done for `COMPLEX' variables.)
-
Design and implement Fortran-specific optimizations that don't
really belong in the back end, or where the front end needs to
give the back end more info than it currently does.
-
Design and implement a new run-time library interface, with the
code going into libgcc so no special linking is required to
link Fortran programs using standard language features.
This library
would speed up lots of things, from I/O (using precompiled formats,
doing just one, or, at most, very few, calls for arrays or array sections,
and so on) to general computing (array/section implementations of
various intrinsics, implementation of commonly performed loops that
aren't likely to be optimally compiled otherwise, etc.).
Among
the important things the library would do are:
-
Be a one-stop-shop-type
library, hence shareable and usable by all, in that what are now
library-build-time options in libf2c would be moved at least to the
g77 compile phase, if not to finer grains (such as choosing how
list-directed I/O formatting is done by default at `OPEN' time, for
preconnected units via options or even statements in the main program
unit, maybe even on a per-I/O basis with appropriate pragma-like
devices).
-
Probably requiring the new library design, change interface to
normally have `COMPLEX' functions return their values in the way
gcc would if they were declared `__complex__ float',
rather than using
the mechanism currently used by `CHARACTER' functions (whereby the
functions are compiled as returning void and their first arg is
a pointer to where to store the result).
(Don't append underscores to
external names for `COMPLEX' functions in some cases once g77 uses
gcc rather than f2c calling conventions.)
-
Do something useful with `doiter' references where possible.
For example, `CALL FOO(I)' cannot modify `I' if within
a `DO' loop that uses `I' as the
iteration variable, and the back end might find that info useful
in determining whether it needs to read `I' back into a register after
the call.
(It normally has to do that, unless it knows `FOO' never
modifies its passed-by-reference argument, which is rarely the case
for Fortran-77 code.)
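To make the `COMPLEX' return-value item above more concrete, here is a
minimal C sketch of the two conventions it contrasts.
The names `complex_pair', `cfun_f2c_style', and `cfun_gcc_style' are
hypothetical, the struct layout is an assumption, and the standard C99
spelling `float _Complex' stands in for `__complex__ float'.
It illustrates the idea only and is not g77's actual generated code.

```c
#include <complex.h>

/* Assumed layout for a COMPLEX value stored through a pointer
   (hypothetical type, for illustration only).  */
typedef struct { float r, i; } complex_pair;

/* Convention the text attributes to CHARACTER functions: the function
   is compiled as returning void, and its first argument points to
   where the result is to be stored.  The argument is passed by
   reference, as Fortran arguments normally are.  */
void cfun_f2c_style(complex_pair *result, const float *x)
{
    result->r = *x;
    result->i = 2.0f * *x;
}

/* Proposed convention: the value comes back the way gcc returns a
   complex float, with no hidden result pointer.  */
float _Complex cfun_gcc_style(const float *x)
{
    return *x + 2.0f * *x * I;
}
```

Both functions compute the same value (x, 2x); only the way the result
reaches the caller differs, which is why switching conventions changes
the interface seen by every caller.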
Making g77 easier to configure, port, build, and install, either
as a single-system compiler or as a cross-compiler, would be
very useful.
-
A new library (replacing libf2c) should improve portability as well as
produce more optimal code.
Further, g77 and the new library should
conspire to simplify naming of externals, such as by removing unnecessarily
added underscores, and to reduce/eliminate the possibility of naming
conflicts, while making debugging more straightforward.
Also, it should
make multi-language applications more feasible, such as by providing
Fortran intrinsics that get Fortran unit numbers given C `FILE *'
descriptors.
-
Possibly related to a new library, g77 should produce the equivalent
of a gcc `main(argc, argv)' function when it compiles a
main program unit, instead of compiling something that must be
called by a library implementation of `main()'.
This would do many useful things such as
provide more flexibility in terms of setting up exception handling,
not requiring programmers to start their debugging sessions with
breakpoint MAIN__ followed by run, and so on.
-
The GBE needs to understand the difference between alignment
requirements and desires.
For example, on Intel x86 machines, g77 currently imposes
overly strict alignment requirements, due to the back end, but it
would be useful for Fortran and C programmers to be able to override
these recommendations as long as they don't violate the actual
processor requirements.
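The `main(argc, argv)' item above can be pictured with a small C sketch.
Only the name `MAIN__' comes from the text; `library_entry',
`direct_entry', and `body_runs' are hypothetical stand-ins, used because
a real program can define `main' only once.
This is a sketch of the two arrangements, not the actual library source.

```c
/* Counts how often the program body has run (illustration only).  */
int body_runs = 0;

/* Current arrangement: the main program unit is compiled as an
   ordinary function under a special name.  */
int MAIN__(void)
{
    body_runs++;
    return 0;
}

/* Stand-in for the library's `main': it owns argc/argv and any
   start-up work, then calls the compiled program unit.  A debugger
   that stops at `main' therefore stops in library code, which is why
   a breakpoint at MAIN__ is needed.  */
int library_entry(int argc, char **argv)
{
    (void) argc;
    (void) argv;
    return MAIN__();
}

/* Proposed arrangement: the compiled program unit is itself the
   entry point, so start-up work and the program body live together.  */
int direct_entry(int argc, char **argv)
{
    (void) argc;
    (void) argv;
    body_runs++;
    return 0;
}
```

In the second arrangement there is no intermediate call to step over,
and the compiler, not the library, decides what happens before the
first Fortran statement runs.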
These extensions are not the sort of things users ask for "by name",
but they might improve the usability of g77, and Fortran in
general, in the long run.
Some of these items really pertain to improving g77 internals
so that some popular extensions can be more easily supported.
-
Consider adding a `NUMERIC' type to designate typeless numeric constants,
named and unnamed.
The idea is to provide a forward-looking, effective
replacement for things like the old-style `PARAMETER' statement
when people
really need typelessness in a maintainable, portable, clearly documented
way.
Maybe `TYPELESS' would include `CHARACTER', `POINTER',
and whatever else might come along.
(This is not really a call for polymorphism per se, just
an ability to express limited, syntactic polymorphism.)
-
Support `OPEN(...,KEY=(...),...)'.
-
`OPEN(NOSPANBLOCKS,...)' is treated as
`OPEN(UNIT=NOSPANBLOCKS,...)', so a
later `UNIT=' in the first example is invalid.
Make sure this is what users of this feature would expect.
-
Currently g77 disallows `READ(1'10)' since
it is an obnoxious syntax, but
supporting it might be pretty easy if needed.
More details are needed, such
as whether general expressions separated by an apostrophe are supported,
or maybe the record number can be a general expression, and so on.
-
Support `STRUCTURE', `UNION', `MAP', and `RECORD'
fully.
Currently there is no support at all
for `%FILL' in `STRUCTURE' and related syntax,
whereas the rest of the
stuff has at least some parsing support.
This requires either major
changes to libf2c or its replacement.
-
F90 and g77 probably disagree about label scoping relative to
`INTERFACE' and `END INTERFACE', and their contained
procedure interface bodies (blocks?).
-
`ENTRY' doesn't support F90 `RESULT()' yet,
since that was added after S8.112.
-
Empty-statement handling (10 ;;CONTINUE;;) probably isn't consistent
with the final form of the standard (it was vague at S8.112).
-
It seems to be an "open" question whether a file, immediately after being
`OPEN'ed, is positioned at the beginning, the end, or wherever--it
might be nice to offer an option of opening to "undefined" status, requiring
an explicit absolute-positioning operation to be performed before any
other (besides `CLOSE') to assist in making applications port to systems
(some IBM?) that `OPEN' to the end of a file or some such thing.
These items pertain to generalizing g77's view of
the machine model to more fully accept whatever the GBE
provides it via its configuration.
-
Switch to using `REAL_VALUE_TYPE' to represent floating-point constants
exclusively so the target float format need not be required.
This means changing the way g77 handles initialization of aggregate areas
having more than one type, such as `REAL' and `INTEGER',
because currently
it initializes them as if they were arrays of `char' and uses the
bit patterns of the constants of the various types in them to determine
what to stuff in elements of the arrays.
-
Rely more and more on back-end info and capabilities, especially in the
area of constants (where having the g77 front-end's IL just store
the appropriate tree nodes containing constants might be best).
-
Suite of C and Fortran programs that a user/administrator can run on a
machine to help determine the configuration for g77 before building
and help determine if the compiler works (especially with whatever
libraries are installed) after building.
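The `REAL_VALUE_TYPE' item above describes mixed-type aggregate areas
being initialized as if they were arrays of `char' filled with the bit
patterns of the constants.
A minimal C sketch of that style of initialization follows; the function
name `init_area_as_bytes', the 8-byte layout (a `REAL' at offset 0, an
`INTEGER' at offset 4), and the assumption that `float' occupies 4 bytes
are all made up for illustration.

```c
#include <stdint.h>
#include <string.h>

#define AREA_SIZE 8

/* Fill a mixed-type area by copying each constant's bit pattern into
   a byte array.  The bytes written for the REAL are whatever the
   machine running this code uses for `float', so the approach only
   works when that format matches the target's.  */
void init_area_as_bytes(unsigned char area[AREA_SIZE],
                        float real_value, int32_t int_value)
{
    memset(area, 0, AREA_SIZE);
    memcpy(area + 0, &real_value, sizeof real_value);  /* REAL part */
    memcpy(area + 4, &int_value, sizeof int_value);    /* INTEGER part */
}
```

The dependence on the float format of the machine doing the copying is
the point of the sketch: representing floating-point constants in a
target-independent form would remove that dependence, at the cost of
reworking how such areas are filled.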
Better info on how g77 works and how to port it is needed.
Some more items that would make g77 more reliable
and easier to maintain:
-
Generally make expression handling focus
more on critical syntax stuff, leaving semantics to callers.
For example,
anything a caller can check, semantically, let it do so, rather
than having `expr.c' do it.
(Exceptions might include things like
diagnosing `FOO(I--K:)=BAR' where `FOO' is a `PARAMETER'---if
it seems
important to preserve the left-to-right-in-source order of production
of diagnostics.)
-
Come up with better naming conventions for `-D' to establish requirements
to achieve desired implementation dialect via `proj.h'.
-
Clean up used tokens and `ffewhere's in `ffeglobal_terminate_1'.
-
Replace `sta.c' `outpooldisp' mechanism with `malloc_pool_use'.
-
Check for `opANY' in more places in `com.c', `std.c',
and `ste.c', and get rid of the `opCONVERT(opANY)' kludge
(after determining if there is indeed no real need for it).
-
Utility to read and check `bad.def' messages and their references in the
code, to make sure calls are consistent with message templates.
-
Search and fix `&ffe...' and similar so that
`ffe...ptr...' macros are
available instead (a good argument for wishing all this stuff could have
been written in C++, perhaps).
On the other hand, it's questionable whether this sort of
improvement is really necessary, given the availability of
tools such as Emacs and perl, which make finding any
address-taking of structure members easy enough.
-
Some modules truly export the member names of their structures (and the
structures themselves); maybe fix this, and fix other modules that just
appear to as well (by appending `_', though it'd be ugly and probably
not worth the time).
-
Implement C macros `RETURNS(value)' and `SETS(something,value)'
in `proj.h'
and use them throughout g77 source code (especially in the definitions
of access macros in `.h' files) so they can be tailored
to catch code writing into a `RETURNS()' or reading from a `SETS()'.
-
Decorate throughout with `const' and other such stuff.
-
All F90 notational derivations in the source code are still based
on the S8.112 version of the draft standard.
Probably should update
to the official standard, or put documentation of the rules as used
in the code...uh...in the code.
-
Some `ffebld_new' calls (those outside of `ffeexpr.c' or
inside but invoked via paths not involving `ffeexpr_lhs' or
`ffeexpr_rhs') might be creating things
in improper pools, leading to such things staying around too long or
(doubtful, but possible and dangerous) not long enough.
-
Some `ffebld_list_new' (or whatever) calls might not be matched by
`ffebld_list_bottom' (or whatever) calls, which might someday matter.
(It definitely is not a problem just yet.)
-
Probably not doing clean things when we fail to `EQUIVALENCE' something
due to alignment/mismatch or other problems--they end up without
`ffestorag' objects, so maybe the backend (and other parts of the front
end) can notice that and handle like an `opANY' (do what it wants, just
don't complain or crash).
Most of this seems to have been addressed
by now, but a code review wouldn't hurt.
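One possible shape for the `RETURNS(value)' and `SETS(something,value)'
macros proposed above is sketched here.
The macro bodies, `struct symbol', and the access macros `symbol_kind'
and `symbol_set_kind' are hypothetical; the text names only the two
macros and their purpose, so this is one way to get the checking it
describes, not the intended implementation.

```c
/* RETURNS(value): yields the value through a comma expression, which
   is not an lvalue in C, so code that tries to assign to it fails to
   compile.  */
#define RETURNS(value) ((void) 0, (value))

/* SETS(something, value): performs the assignment and casts the
   result to void, so code that tries to read the result fails to
   compile.  */
#define SETS(something, value) ((void) ((something) = (value)))

/* Hypothetical structure and access macros built on the above.  */
struct symbol { int kind; };

#define symbol_kind(s)        RETURNS((s)->kind)
#define symbol_set_kind(s, k) SETS((s)->kind, (k))
```

With these definitions, `symbol_set_kind(&sym, 3)' followed by
`symbol_kind(&sym)' works as usual, while `symbol_kind(&sym) = 3' and
`x = symbol_set_kind(&sym, 3)' are both rejected by the compiler.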
These are things users might not ask about, or that need to
be looked into, before worrying about.
Also here are items that involve reducing unnecessary diagnostic
clutter.