GCC for the Atari ST Using the GNU C Compiler on the Atari ST by Frank Ridderbusch 1. Draft March 9, 1990 Copyright (C) 1988, 1989, 1990 Free Software Foundation, Inc. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided also that the section entitled GNU CC General Public License is included exactly as in the original, and provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that the section entitled GNU CC General Public License and this permission notice may be included in translations approved by the Free Software Foundation instead of in the original English. PUBLIC LICENSE The license agreements of most software companies keep you at the mercy of those companies. By contrast, our general public license is intended to give everyone the right to share GNU CC. To make sure that you get the rights we want you to have, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. Hence this license agreement. Specifically, we want to make sure that you have the right to give away copies of GNU CC, that you receive source code or else can get it if you want it, that you can change GNU CC or use pieces of it in new free programs, and that you know you can do these things. To make sure that everyone has such rights, we have to forbid you to deprive anyone else of these rights. For example, if you distribute copies of GNU CC, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code.
And you must tell them their rights. Also, for our own protection, we must make certain that everyone finds out that there is no warranty for GNU CC. If GNU CC is modified by someone else and passed on, we want its recipients to know that what they have is not what we distributed, so that any problems introduced by others will not reflect on our reputation. Therefore we (Richard Stallman and the Free Software Foundation, Inc.) make the following terms which say what you must do to be allowed to distribute or change GNU CC. COPYING POLICIES 1. You may copy and distribute verbatim copies of GNU CC source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy a valid copyright notice Copyright (C) 1988 Free Software Foundation, Inc. (or with whatever year is appropriate); keep intact the notices on all files that refer to this License Agreement and to the absence of any warranty; and give any other recipients of the GNU CC program a copy of this License Agreement along with the program. You may charge a distribution fee for the physical act of transferring a copy. 2. You may modify your copy or copies of GNU CC or any portion of it, and copy and distribute such modifications under the terms of Paragraph 1 above, provided that you also do the following: 1. cause the modified files to carry prominent notices stating that you changed the files and the date of any change; and 2. cause the whole of any work that you distribute or publish, that in whole or in part contains or is a derivative of GNU CC or any part thereof, to be licensed at no charge to all third parties on terms identical to those contained in this License Agreement (except that you may choose to grant more extensive warranty protection to some or all third parties, at your option). 3. You may charge a distribution fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.
Mere aggregation of another unrelated program with this program (or its derivative) on a volume of a storage or distribution medium does not bring the other program under the scope of these terms. 3. You may copy and distribute GNU CC (or a portion or derivative of it, under Paragraph 2) in object code or executable form under the terms of Paragraphs 1 and 2 above provided that you also do one of the following: 1. accompany it with the complete corresponding machine- readable source code, which must be distributed under the terms of Paragraphs 1 and 2 above; or, 2. accompany it with a written offer, valid for at least three years, to give any third party free (except for a nominal shipping charge) a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Paragraphs 1 and 2 above; or, 3. accompany it with the information you received as to where the corresponding source code may be obtained. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form alone.) For an executable file, complete source code means all the source code for all modules it contains; but, as a special exception, it need not include source code for modules which are standard libraries that accompany the operating system on which the executable file runs. 4. You may not copy, sublicense, distribute or transfer GNU CC except as expressly provided under this License Agreement. Any attempt otherwise to copy, sublicense, distribute or transfer GNU CC is void and your rights to use the program under this License agreement shall be automatically terminated. However, parties who have received computer software programs from you with this License Agreement will not have their licenses terminated so long as such parties remain in full compliance. 5. 
If you wish to incorporate parts of GNU CC into other free programs whose distribution conditions are different, write to the Free Software Foundation at 675 Mass Ave, Cambridge, MA 02139. We have not yet worked out a simple rule that can be stated here, but we will often permit this. We will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software. Your comments and suggestions about our licensing policies and our software are welcome! Please contact the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, or call (617) 876-3296. NO WARRANTY BECAUSE GNU CC IS LICENSED FREE OF CHARGE, WE PROVIDE ABSOLUTELY NO WARRANTY, TO THE EXTENT PERMITTED BY APPLICABLE STATE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING, FREE SOFTWARE FOUNDATION, INC, RICHARD M. STALLMAN AND/OR OTHER PARTIES PROVIDE GNU CC "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF GNU CC IS WITH YOU. SHOULD GNU CC PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW WILL RICHARD M. STALLMAN, THE FREE SOFTWARE FOUNDATION, INC., AND/OR ANY OTHER PARTY WHO MAY MODIFY AND REDISTRIBUTE GNU CC AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY LOST PROFITS, LOST MONIES, OR OTHER SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS) GNU CC, EVEN IF YOU HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES, OR FOR ANY CLAIM BY ANY OTHER PARTY. 
Contributors to GNU CC In addition to Richard Stallman, several people have written parts of GNU CC. The idea of using RTL and some of the optimization ideas came from the U. of Arizona Portable Optimizer, written by Jack Davidson and Christopher Fraser. See "Register Allocation and Exhaustive Peephole Optimization", Software Practice and Experience 14 (9), Sept. 1984, 857-866. Paul Rubin wrote most of the preprocessor. Leonard Tower wrote parts of the parser, RTL generator, RTL definitions, and of the Vax machine description. Ted Lemon wrote parts of the RTL reader and printer. Jim Wilson implemented loop strength reduction and some other loop optimizations. Nobuyuki Hikichi of Software Research Associates, Tokyo, contributed the support for the SONY NEWS machine. Charles LaBrec contributed the support for the Integrated Solutions 68020 system. Michael Tiemann of MCC wrote most of the description of the National Semiconductor 32000 series cpu. He also wrote the code for inline function integration and for the SPARC cpu and Motorola 88000 cpu and part of the Sun FPA support. Jan Stein of the Chalmers Computer Society provided support for Genix, as well as part of the 32000 machine description. Randy Smith finished the Sun FPA support. Robert Brown implemented the support for Encore 32000 systems. David Kashtan of SRI adapted GNU CC to the Vomit-Making System. Alex Crain provided changes for the 3b1. Greg Satz and Chris Hanson assisted in making GNU CC work on HP-UX for the 9000 series 300. William Schelter did most of the work on the Intel 80386 support. Christopher Smith did the port for Convex machines. Paul Petersen wrote the machine description for the Alliant FX/8. The following people contributed specially to the version for the Atari ST. John R. Dunning did the original port to the Atari ST. Jwahar R. Bammi improved the port. Jwahar Bammi and Eric Smith put together and maintain the current libraries. 
The Atari ST port has greatly benefited from contributions and ideas of Edgar Roeder, Dale Schumacher, Kai-Uwe Bloem, Allan Pratt, John Dunning, Henry Spencer and many enthusiastic users in the Atari community. Frank Ridderbusch compiled the manual for the Atari ST.

Introduction

This manual documents how to install and run the GNU C compiler on the Atari ST. It does not give an introduction to C or M68000 assembler; there is enough material available on both subjects. A user who is familiar with a C compiler that runs on a U**x system should have no trouble at all getting GNU C running on the Atari ST.

This manual was compiled from existing GNU manuals and various bits and pieces from John R. Dunning and Jwahar R. Bammi. The sections that describe the compiler driver and the preprocessor are nearly verbatim copies of sections in the respective manuals. The original manuals (Using and Porting GNU CC and The C Preprocessor) were written by Richard M. Stallman. I modified these sections by removing material which described features of GNU C for systems like Vaxen or Suns. To keep this manual reasonably compact, I extracted only the sections that describe the supported command options (and predefined macros in the case of the preprocessor). Users interested in the extensions and details implemented in GNU C should refer to the original manuals. Whether all described options are useful on the Atari has yet to be decided.

The facts which are presented in the assembler and utility sections are mostly derived from the sources of the respective programs (from a cross compiler kit by J. R. Bammi based on GNU C 1.31), which were available to me. Other facts were gathered by trial and error, so these sections may be a bit shaky.

Please send any comments, corrections, etc. concerning this manual to:

Snail: Frank Ridderbusch, Sander Str.
17, 4790 Paderborn, West Germany
Email: BIX: fridder
UUCP: !USA ...!unido!nixpbe!ridderbusch.pad
      !USA:...!uunet!philabs!linus!nixbur!ridderbusch.pad

1. Installing GCC

The compressed archive of the GNU C compiler binary distribution contains the common executables of the GNU compiler: the compiler driver (gcc.ttp), the preprocessor (gcc-cpp.ttp), the main body (gcc-cc1.ttp), the assembler (gcc-as.ttp) and the linker (gcc-ld.ttp). It also contains the following support programs and files:

1. gcc-ar.ttp is the object library maintainer.
2. gdb.ttp is the GNU debugger modified for the Atari ST. John Dunning did the port to the Atari.
3. sym-ld.ttp creates the symbol file needed with gdb.
4. gcc-nm.ttp prints the symbols of a GNU archive or object file.
5. cnm.ttp prints the symbol table of a GEMDOS executable.
6. fixstk.ttp & printstk.ttp are used to modify and print the current stack size of an executable.
7. toglclr.ttp allows TOS 1.4 users to toggle the "clear above BSS to end of TPA" flag for the GEMDOS loader.
8. gnu.g is a sample file for the GULAM PD shell.
9. COPYING explains your rights and responsibilities as a GNU user.
10. readme & readme.st are notes from Jwahar R. Bammi and John R. Dunning.

All the executables should go in \gnu\bin on your gnu disk, or wherever the environment variable GCCEXEC points. The executables assume that you're running an appropriate shell, one that passes command line arguments using either the Mark Williams or Atari conventions. Gulam and Master are two such shells; Gemini, a German alternative shareware desktop, also works. Other CLIs may also work. The compiler driver gcc.ttp should be in the search PATH of your shell.

The next step is to define GCCEXEC. gcc.ttp uses this variable to locate the preprocessor, compiler, assembler and the linker. GCCEXEC contains a device\dir\partial-pathname. Assuming you put the executables in the directory c:\gnu\bin, GCCEXEC would contain c:\gnu\bin\gcc-.
The value is the same as you would specify in the -B option to the compiler driver.

Then you should define a variable called TEMP. During compilation the output of the various stages is kept here. The variable must not contain a trailing backslash. If you have enough memory, TEMP should point to a ramdisk.

The next thing to do is to install the libraries. The distributed archive should contain the following files:

1. README, the obvious.
2. crt0.o is the startup module.
3. gcrt0.o is the startup module for profiling runs. It is selected automatically by the compiler driver when you specify the -pg option.
4. gnu.olb & gnu16.olb are the standard libraries, the usual libc on other systems.
5. curses.olb & curses16.olb are ports of the BSD curses.
6. gem.olb & gem16.olb contain the Atari ST Aes/Vdi bindings.
7. iio.olb & iio16.olb contain the integer only printf and scanf functions.
8. pml.olb & pml16.olb are the portable math libraries.
9. widget.olb & widget16.olb are a small widget set based on curses.

All these libraries go into a subdirectory described by the environment variable GNULIB. Again, this variable must not contain a trailing backslash. Staying with the above example, I've set the variable to c:\gnu\lib. The libraries that have a 16 in their names were compiled with the -mshort option. This makes integers the same size as shorts.

The last things to install are the header files. They are contained in an archive of their own. The preprocessor now knows about the variable GNUINC. Earlier versions had to use the -Iprefix option to get to the header files. According to the above examples, the files would be put in the directory c:\gnu\include. GNUINC has to be set accordingly.

Earlier versions of GNU CC allowed only one path in each of the variables GNULIB and GNUINC. GCC 1.37 allows you to put several paths into these variables, separated by either a "," or a ";". All the listed paths are searched in order to locate a specific file.
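The search rule for GNULIB and GNUINC can be sketched in C. The helper below is purely illustrative: the name nth_search_dir and its interface are assumptions made for this example and are not taken from the GCC sources. It only shows how a value such as c:\gnu\lib;d:\mylib splits into directories that can then be tried in order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: copy the n-th entry (counting
   from 0) of a search-path list, such as the value of GNULIB or GNUINC,
   into buf.  Entries are separated by ',' or ';'.
   Returns 1 if the entry exists and fits into buf, otherwise 0. */
int nth_search_dir(const char *list, int n, char *buf, size_t size)
{
    const char *start = list;
    size_t len;

    assert(list != NULL && buf != NULL && n >= 0);

    /* Skip the first n entries. */
    while (n > 0) {
        start += strcspn(start, ",;");
        if (*start == '\0')
            return 0;           /* the list has fewer than n+1 entries */
        start++;                /* step over the separator */
        n--;
    }

    len = strcspn(start, ",;");
    if (len == 0 || len >= size)
        return 0;               /* empty entry or buffer too small */

    memcpy(buf, start, len);
    buf[len] = '\0';
    return 1;
}
```

A driver following this scheme would call the helper with n = 0, 1, 2 and so on, join each directory to the wanted file name with a backslash, and stop at the first file that exists.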
However, the startup modules crt0.o or gcrt0.o are only looked for in the first directory specified in GNULIB. If the preprocessor can't find an include file in one of the directories specified by GNUINC, it will also search the paths listed in GNULIB.

The programs which come with the GCC distribution also understand filenames which use the slash (/) as a separator. If Gulam is your favorite CLI you will want to stick to backslashes, since you otherwise lose the feature of command line completion. If you are using Gulam, you can define aliases to reach the executables under more common names:

    alias cc c:\gnu\bin\gcc.ttp
    alias ar c:\gnu\bin\gcc-ar.ttp
    alias as c:\gnu\bin\gcc-as.ttp

Now you should be able to say

    cc foo.c -o foo.ttp

and the obvious thing should happen. If you still have trouble, compare your settings with the ones from the sample file gulam.g. That should give you the right idea.

One additional note on Gulam: crt0.o is currently set up to understand the MWC convention of passing long command lines (except that it doesn't look into the _io_vector part). Gulam users should set env_style mw if they want to give long command lines to gcc.ttp.

Memory Requirements

GCC loves memory. A lot. It loves to build structures. Lots of them. All versions of GCC run in 1 MB of memory, but to compile any real programs at least 2 MB are recommended. In versions before 1.37, gcc-cc1.ttp had a 1/2 MB stack, and needed it for compiling large files with optimization turned on. Happily, it doesn't need all that stack for smaller files, or even big files without the -O option, so it should be feasible to make a compiler with a smaller stack (with fixstk.ttp).

GCC version 1.37 uses another scheme for memory allocation. The programs gcc-cpp.ttp and gcc-cc1.ttp are set up for _stksize==1L.
This means that an executable will use all available memory, doing mallocs from an internal heap (as opposed to the system heap via Malloc), with SP initially set at the top, and the heap starting just above the BSS. So if the compiler runs out of memory, you probably need more memory (or get rid of accessories, TSRs, etc. and try again).

2. Controlling the Compiler Driver

The GNU C compiler uses a command syntax much like the U**x C compiler. The gcc.ttp program accepts options and file names as operands. Multiple single-letter options may not be grouped: -dr is very different from -d -r.

When you invoke GNU CC, it normally does preprocessing, compilation, assembly and linking. File names which end in .c are taken as C source to be preprocessed and compiled; file names ending in .i are taken as preprocessor output to be compiled; compiler output files plus any input files with names ending in .s are assembled; then the resulting object files, plus any other input files, are linked together to produce an executable.

Command options allow you to stop this process at an intermediate stage. For example, the -c option says not to run the linker. Then the output consists of object files output by the assembler.

Other command options are passed on to one stage of processing. Some options control the preprocessor and others the compiler itself. Yet other options control the assembler and linker; these are not documented here, but you rarely need to use any of them.

Here are the options to control the overall compilation process, including those that say whether to link, whether to assemble, and so on.

-o file
    Place output in file file. This applies regardless of what sort of output is being produced, whether it be an executable file, an object file, an assembler file or preprocessed C code. If -o is not specified, the default is to put an executable file in a.out, the object file for source.c in source.o, an assembler file in source.s, and preprocessed C on standard output.
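The default names listed under -o follow a simple suffix replacement, which the following C sketch illustrates. The function default_output_name is a hypothetical name chosen for this example, not something taken from the compiler driver, and it assumes lower-case .c and .s suffixes only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: derive a default output name by
   replacing a trailing ".c" or ".s" of input with new_suffix (for
   example ".o" or ".s").  Returns 1 on success, 0 if input has no such
   suffix or if buf is too small. */
int default_output_name(const char *input, const char *new_suffix,
                        char *buf, size_t size)
{
    size_t len, stem;

    assert(input != NULL && new_suffix != NULL && buf != NULL);

    len = strlen(input);
    if (len < 3 || input[len - 2] != '.'
        || (input[len - 1] != 'c' && input[len - 1] != 's'))
        return 0;               /* not a C or assembler source name */

    stem = len - 2;             /* length of the name without suffix */
    if (stem + strlen(new_suffix) + 1 > size)
        return 0;               /* result would not fit into buf */

    memcpy(buf, input, stem);
    strcpy(buf + stem, new_suffix);
    return 1;
}
```

Under these assumptions, source.c with the suffix .o gives source.o, and source.c with the suffix .s gives source.s, matching the defaults described above.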
-c
    Compile or assemble the source files, but do not link. Produce object files with names made by replacing .c or .s with .o at the end of the input file names. Do nothing at all for object files specified as input.

-S
    Compile into assembler code but do not assemble. The assembler output file name is made by replacing .c with .s at the end of the input file name. Do nothing at all for assembler source files or object files specified as input.

-E
    Run only the C preprocessor. Preprocess all the C source files specified and output the results to standard output.

-v
    Compiler driver program prints the commands it executes as it runs the preprocessor, compiler proper, assembler and linker. Some of these are directed to print their own version numbers.

-s
    The DRI-compatible symbol table is stripped from the executable. Certain symbolic debuggers like sid.prg work with this symbol table. Also, the programs printstk.ttp and fixstk.ttp (see chapter 5 [The Utilities], page 36, for more info) look up the symbol _stksize in this table.

-Bprefix
    Compiler driver program tries prefix as a prefix for each program it tries to run. These programs are gcc-cpp.ttp, gcc-cc1.ttp, gcc-as.ttp and gcc-ld.ttp.

    For each subprogram to be run, the compiler driver first tries the -B prefix, if any. If that name is not found, or if -B was not specified, the driver tries two standard prefixes, which are /usr/lib/gcc- and /usr/local/lib/gcc-. If neither of those results in a file name that is found, the unmodified program name is searched for using the directories specified in your PATH environment variable.

    The run-time support file gnu.olb is also searched for using the -B prefix, if needed. If it is not found there, the two standard prefixes above are tried, and that is all. The file is left out of the link if it is not found by those means. Most of the time, on most machines, you can do without it.

    You can get a similar result from the environment variable GCCEXEC.
    If it is defined, its value is used as a prefix in the same way. If both the "-B" option and the GCCEXEC variable are present, the -B option is used first and the environment variable value second.

These options control the details of C compilation itself.

-ansi
    Support all ANSI standard C programs. This turns off certain features of GNU C that are incompatible with ANSI C, such as the asm, inline and typeof keywords, and predefined macros such as unix and vax that identify the type of system you are using. It also enables the undesirable and rarely used ANSI trigraph feature.

    The -ansi option does not cause non-ANSI programs to be rejected gratuitously. For that, -pedantic is required in addition to -ansi.

    The macro __STRICT_ANSI__ is predefined when the -ansi option is used. Some header files may notice this macro and refrain from declaring certain functions or defining certain macros that the ANSI standard doesn't call for; this is to avoid interfering with any programs that might use these names for other things.

-traditional
    Attempt to support some aspects of traditional C compilers. Specifically:

    1. All extern declarations take effect globally even if they are written inside of a function definition. This includes implicit declarations of functions.
    2. The keywords typeof, inline, signed, const and volatile are not recognized.
    3. Comparisons between pointers and integers are always allowed.
    4. Integer types unsigned short and unsigned char promote to unsigned int.
    5. Out-of-range floating point literals are not an error.
    6. All automatic variables not declared register are preserved by longjmp. Ordinarily, GNU C follows ANSI C: automatic variables not declared volatile may be clobbered.
    7. In the preprocessor, comments convert to nothing at all, rather than to a space. This allows traditional token concatenation.
    8.
       In the preprocessor, macro arguments are recognized within string constants in a macro definition (and their values are stringified, though without additional quote marks, when they appear in such a context). The preprocessor always considers a string constant to end at a newline.
    9. The predefined macro __STDC__ is not defined when you use -traditional, but __GNUC__ is (since the GNU extensions which __GNUC__ indicates are not affected by -traditional). If you need to write header files that work differently depending on whether -traditional is in use, by testing both of these predefined macros you can distinguish four situations: GNU C, traditional GNU C, other ANSI C compilers, and other old C compilers.

-O
    Optimize. Optimizing compilation takes somewhat more time, and a lot more memory for a large function.

    Without -O, the compiler's goal is to reduce the cost of compilation and to make debugging produce the expected results. Statements are independent: if you stop the program with a breakpoint between statements, you can then assign a new value to any variable or change the program counter to any other statement in the function and get exactly the results you would expect from the source code.

    Without -O, only variables declared register are allocated in registers. The resulting compiled code is a little worse than that produced by PCC without -O.

    With -O, the compiler tries to reduce code size and execution time.

    Some of the -f options described below turn specific kinds of optimization on or off.

-g
    Produce debugging information in the operating system's native format (for DBX or SDB). GDB also can work with this debugging information.

    Unlike most other C compilers, GNU CC allows you to use -g with -O.
    The shortcuts taken by optimized code may occasionally produce surprising results: some variables you declared may not exist at all; flow of control may briefly move where you did not expect it; some statements may not be executed because they compute constant results or their values were already at hand; some statements may execute in different places because they were moved out of loops. Nevertheless it proves possible to debug optimized output. This makes it reasonable to use the optimizer for programs that might have bugs.

-gg
    Produce debugging information in GDB's own format. This requires the GNU assembler and linker in order to work.

-w
    Inhibit all warning messages.

-W
    Print extra warning messages for these events:

    * An automatic variable is used without first being initialized.

      These warnings are possible only in optimizing compilation, because they require data flow information that is computed only when optimizing. They occur only for variables that are candidates for register allocation. Therefore, they do not occur for a variable that is declared volatile, or whose address is taken, or whose size is other than 1, 2, 4 or 8 bytes. Also, they do not occur for structures, unions or arrays, even when they are in registers.

      Note that there may be no warning about a variable that is used only to compute a value that itself is never used, because such computations may be deleted by the flow analysis pass before the warnings are printed.

      These warnings are made optional because GNU CC is not smart enough to see all the reasons why the code might be correct despite appearing to have an error. Here is one example of how this can happen:

          {
            int x;
            switch (y)
              {
              case 1: x = 1;
                break;
              case 2: x = 4;
                break;
              case 3: x = 5;
              }
            foo (x);
          }

      If the value of y is always 1, 2 or 3, then x is always initialized, but GNU CC doesn't know this.
      Here is another common case:

          {
            int save_y;
            if (change_y)
              save_y = y, y = new_y;
            ...
            if (change_y)
              y = save_y;
          }

      This has no bug because save_y is used only if it is set.

      Some spurious warnings can be avoided if you declare as volatile all the functions you use that never return.

    * A nonvolatile automatic variable might be changed by a call to longjmp. These warnings as well are possible only in optimizing compilation.

      The compiler sees only the calls to setjmp. It cannot know where longjmp will be called; in fact, a signal handler could call it at any point in the code. As a result, you may get a warning even when there is in fact no problem, because longjmp cannot in fact be called at the place which would cause a problem.

    * A function can return either with or without a value. (Falling off the end of the function body is considered returning without a value.) For example, this function would inspire such a warning:

          foo (a)
          {
            if (a > 0)
              return a;
          }

      Spurious warnings can occur because GNU CC does not realize that certain functions (including abort and longjmp) will never return.

    * An expression-statement contains no side effects.

    In the future, other useful warnings may also be enabled by this option.

-Wimplicit
    Warn whenever a function is implicitly declared.

-Wreturn-type
    Warn whenever a function is defined with a return-type that defaults to int. Also warn about any return statement with no return-value in a function whose return-type is not void.

-Wunused
    Warn whenever a local variable is unused aside from its declaration, and whenever a function is declared static but never defined.

-Wcomment
    Warn whenever a comment-start sequence /* appears in a comment.

-Wall
    All of the above -W options combined.

-Wwrite-strings
    Give string constants the type const char[length] so that copying the address of one into a non-const char * pointer will get a warning.
    These warnings will help you find at compile time code that can try to write into a string constant, but only if you have been very careful about using const in declarations and prototypes. Otherwise, it will just be a nuisance; this is why we did not make -Wall request these warnings.

-p
    Generate extra code to write profile information suitable for the analysis program prof.

-pg
    Generate extra code to write profile information suitable for the analysis program gprof.

-llibrary
    Search a standard list of directories for a library named library, which is actually a file named $GNULIB\library.olb. The linker uses this file as if it had been specified precisely by name.

    The directories searched include several standard system directories plus any that you specify with -L.

    Normally the files found this way are library files and archive files whose members are object files. The linker handles an archive file by scanning through it for members which define symbols that have so far been referenced but not defined. But if the file that is found is an ordinary object file, it is linked in the usual fashion. The only difference between using an -l option and specifying a file name is that -l searches several directories.

-Ldir
    Add directory dir to the list of directories to be searched for -l.

-nostdlib
    Don't use the standard system libraries and startup files when linking. Only the files you specify (plus gnulib) will be passed to the linker.

-mmachinespec
    Machine-dependent option specifying something about the type of target machine. These options are defined by the macro TARGET_SWITCHES in the machine description. The default for the options is also defined by that macro, which enables you to change the defaults.

    These are the -m options defined in the 68000 machine description:

-m68000 & -mc68000
    Generate output for a 68000. This is the default on the Atari ST.

-m68020 & -mc68020
    Generate output for a 68020 (rather than a 68000).
-m68881
    Generate output containing 68881 instructions for floating point.

-msoft-float
    Generate output containing library calls for floating point. This is the default on the Atari ST.

-mshort
    Consider type int to be 16 bits wide, like short int. This option also causes the macro __MSHORT__ to be defined and the library library16.olb to be linked. (Also see section 3.2 [Predefined Macros], page 24, for more info.)

-mnobitfield
    Do not use the bit-field instructions. -m68000 implies -mnobitfield. This is the default on the Atari ST.

-mbitfield
    Do use the bit-field instructions. -m68020 implies -mbitfield.

-mrtd
    Use a different function-calling convention, in which functions that take a fixed number of arguments return with the rtd instruction, which pops their arguments while returning. This saves one instruction in the caller since there is no need to pop the arguments there.

    This calling convention is incompatible with the one normally used on U**x, so you cannot use it if you need to call libraries compiled with the U**x compiler.

    Also, you must provide function prototypes for all functions that take variable numbers of arguments (including printf); otherwise incorrect code will be generated for calls to those functions.

    In addition, seriously incorrect code will result if you call a function with too many arguments. (Normally, extra arguments are harmlessly ignored.)

    The rtd instruction is supported by the 68010 and 68020 processors, but not by the 68000.

-fflag
    Specify machine-independent flags. Most flags have both positive and negative forms; the negative form of -ffoo would be -fno-foo. In the table below, only one of the forms is listed--the one which is not the default. You can figure out the other form by either removing no- or adding it.

-ffloat-store
    Do not store floating-point variables in registers.
This prevents undesirable excess precision on machines such as the 68000 where the floating registers (of the 68881) keep more precision than a double is supposed to have. For most programs, the excess precision does only good, but a few programs rely on the precise definition of IEEE floating point. Use -ffloat-store for such programs.

-fno-asm  Do not recognize asm, inline or typeof as a keyword. These words may then be used as identifiers.

-fno-defer-pop  Always pop the arguments to each function call as soon as that function returns. Normally the compiler (when optimizing) lets arguments accumulate on the stack for several function calls and pops them all at once.

-fstrength-reduce  Perform the optimizations of loop strength reduction and elimination of iteration variables.

-fcombine-regs  Allow the combine pass to combine an instruction that copies one register into another. This might or might not produce better code when used in addition to -O. I am interested in hearing about the difference this makes.

-fforce-mem  Force memory operands to be copied into registers before doing arithmetic on them. This may produce better code by making all memory references potential common subexpressions. When they are not common subexpressions, instruction combination should eliminate the separate register-load. I am interested in hearing about the difference this makes.

-fforce-addr  Force memory address constants to be copied into registers before doing arithmetic on them. This may produce better code just as -fforce-mem may.

-fomit-frame-pointer  Don't keep the frame pointer in a register for functions that don't need one. This avoids the instructions to save, set up and restore frame pointers; it also makes an extra register available in many functions. It also makes debugging impossible. On some machines, such as the Vax, this flag has no effect, because the standard calling sequence automatically handles the frame pointer and nothing is saved by pretending it doesn't exist.
The machine-description macro FRAME_POINTER_REQUIRED controls whether a target machine supports this flag.

-finline-functions  Integrate all simple functions into their callers. The compiler heuristically decides which functions are simple enough to be worth integrating in this way. If all calls to a given function are integrated, and the function is declared static, then the function is normally not output as assembler code in its own right.

-fkeep-inline-functions  Even if all calls to a given function are integrated, and the function is declared static, nevertheless output a separate run-time callable version of the function.

-fwritable-strings  Store string constants in the writable data segment and don't uniquize them. This is for compatibility with old programs which assume they can write into string constants. Writing into string constants is a very bad idea; "constants" should be constant.

-fcond-mismatch  Allow conditional expressions with mismatched types in the second and third arguments. The value of such an expression is void.

-fno-function-cse  Do not put function addresses in registers; make each instruction that calls a constant function contain the function's address explicitly. This option results in less efficient code, but some strange hacks that alter the assembler output may be confused by the optimizations performed when this option is not used.

-fvolatile  Consider all memory references through pointers to be volatile.

-fshared-data  Requests that the data and non-const variables of this compilation be shared data rather than private data. The distinction makes sense only on certain operating systems, where shared data is shared between processes running the same program, while private data exists in one copy per process.

-funsigned-char  Let the type char be unsigned, like unsigned char. Each kind of machine has a default for what char should be. It is either like unsigned char by default or like signed char by default.
(Actually, at present, the default is always signed.) The type char is always a distinct type from either signed char or unsigned char, even though its behavior is always just like one of those two. Note that this is equivalent to -fno-signed-char, which is the negative form of -fsigned-char.

-fsigned-char  Let the type char be signed, like signed char. Note that this is equivalent to -fno-unsigned-char, which is the negative form of -funsigned-char.

-ffixed-reg  Treat the register named reg as a fixed register; generated code should never refer to it (except perhaps as a stack pointer, frame pointer or in some other fixed role). Reg must be the name of a register. The register names accepted are machine-specific and are defined in the REGISTER_NAMES macro in the machine description macro file. This flag does not have a negative form, because it specifies a three-way choice.

-fcall-used-reg  Treat the register named reg as an allocatable register that is clobbered by function calls. It may be allocated for temporaries or variables that do not live across a call. Functions compiled this way will not save and restore the register reg. Use of this flag for a register that has a fixed pervasive role in the machine's execution model, such as the stack pointer or frame pointer, will produce disastrous results. This flag does not have a negative form, because it specifies a three-way choice.

-fcall-saved-reg  Treat the register named reg as an allocatable register saved by functions. It may be allocated even for temporaries or variables that live across a call. Functions compiled this way will save and restore the register reg if they use it. Use of this flag for a register that has a fixed pervasive role in the machine's execution model, such as the stack pointer or frame pointer, will produce disastrous results. A different sort of disaster will result from the use of this flag for a register in which function values may be returned.
This flag does not have a negative form, because it specifies a three-way choice.

-pedantic  Issue all the warnings demanded by strict ANSI standard C; reject all programs that use forbidden extensions. Valid ANSI standard C programs should compile properly with or without this option (though a rare few will require -ansi). However, without this option, certain GNU extensions and traditional C features are supported as well. With this option, they are rejected. There is no reason to use this option; it exists only to satisfy pedants.

These options control the C preprocessor, which is run on each C source file before actual compilation. If you use the -E option, nothing is done except C preprocessing. Some of these options make sense only together with -E because they request preprocessor output that is not suitable for actual compilation.

-C  Tell the preprocessor not to discard comments. Used with the -E option.

-Idir  Search directory dir for include files.

-I-  Any directories specified with -I options before the -I- option are searched only for the case of #include "file"; they are not searched for #include <file>. If additional directories are specified with -I options after the -I-, these directories are searched for all #include directives. (Ordinarily all -I directories are used this way.) In addition, the -I- option inhibits the use of the current directory as the first search directory for #include "file". Therefore, the current directory is searched only if it is requested explicitly with -I. (that is, -I followed by a period). Specifying both -I- and -I. allows you to control precisely which directories are searched before the current one and which are searched after.

-nostdinc  Do not search the standard system directories for header files. Only the directories you have specified with -I options (and the current directory, if appropriate) are searched. Between -nostdinc and -I-, you can eliminate all directories from the search path except those you specify.
-M  Tell the preprocessor to output a rule suitable for make describing the dependencies of each source file. For each source file, the preprocessor outputs one make-rule whose target is the object file name for that source file and whose dependencies are all the files #included in it. This rule may be a single line or may be continued with \-newline if it is long. -M implies -E.

-MM  Like -M but the output mentions only the user-header files included with #include "file". System header files included with #include <file> are omitted. -MM implies -E.

-Dmacro  Define macro macro with the empty string as its definition.

-Dmacro=defn  Define macro macro as defn.

-Umacro  Undefine macro macro.

-T  Support ANSI C trigraphs. You don't want to know about this brain-damage. The -ansi option also has this effect.

3. The Preprocessor

3.1 Invoking the C Preprocessor

Most often when you use the C preprocessor you will not have to invoke it explicitly: the C compiler will do so automatically. However, the preprocessor is sometimes useful individually.

The C preprocessor expects two file names as arguments, infile and outfile. The preprocessor reads infile together with any other files it specifies with #include. All the output generated by the combined input files is written in outfile. Either infile or outfile may be -, which as infile means to read from standard input and as outfile means to write to standard output. Also, if outfile or both file names are omitted, the standard output and standard input are used for the omitted file names.

Here is a table of command options accepted by the C preprocessor. Most of them can also be given when compiling a C program; they are passed along automatically to the preprocessor when it is invoked by the compiler.

-P  Inhibit generation of #-lines with line-number information in the output from the preprocessor.
This might be useful when running the preprocessor on something that is not C code and will be sent to a program which might be confused by the #-lines.

-C  Do not discard comments: pass them through to the output file. Comments appearing in arguments of a macro call will be copied to the output before the expansion of the macro call.

-T  Process ANSI standard trigraph sequences. These are three-character sequences, all starting with ??, that are defined by ANSI C to stand for single characters. For example, ??/ stands for \, so '??/n' is a character constant for Newline. Strictly speaking, the GNU C preprocessor does not support all programs in ANSI Standard C unless -T is used, but if you ever notice the difference it will be with relief. You don't want to know any more about trigraphs.

-pedantic  Issue warnings required by the ANSI C standard in certain cases such as when text other than a comment follows #else or #endif.

-I directory  Add the directory directory to the end of the list of directories to be searched for header files. This can be used to override a system header file, substituting your own version, since these directories are searched before the system header file directories. If you use more than one -I option, the directories are scanned in left-to-right order; the standard system directories come after.

-I-  Any directories specified with -I options before the -I- option are searched only for the case of #include "file"; they are not searched for #include <file>. If additional directories are specified with -I options after the -I-, these directories are searched for all #include directives. In addition, the -I- option inhibits the use of the current directory as the first search directory for #include "file". Therefore, the current directory is searched only if it is requested explicitly with -I. (that is, -I followed by a period). Specifying both -I- and -I. allows you to control precisely which directories are searched before the current one and which are searched after.
-nostdinc  Do not search the standard system directories for header files. Only the directories you have specified with -I options (and the current directory, if appropriate) are searched.

-D name  Predefine name as a macro, with definition 1.

-D name=definition  Predefine name as a macro, with definition definition. There are no restrictions on the contents of definition, but if you are invoking the preprocessor from a shell or shell-like program you may need to use the shell's quoting syntax to protect characters such as spaces that have a meaning in the shell syntax.

-U name  Do not predefine name. If both -U and -D are specified for one name, the -U beats the -D and the name is not predefined.

-undef  Do not predefine any nonstandard macros.

-d  Instead of outputting the result of preprocessing, output a list of #define commands for all the macros defined during the execution of the preprocessor.

-M  Instead of outputting the result of preprocessing, output a rule suitable for make describing the dependencies of the main source file. The preprocessor outputs one make rule containing the object file name for that source file, a colon, and the names of all the included files. If there are many included files then the rule is split into several lines using \-newline. This feature is used in automatic updating of makefiles.

-MM  Like -M but mention only the files included with #include "file". System header files included with #include <file> are omitted.

-i file  Process file as input, discarding the resulting output, before processing the regular input file. Because the output generated from file is discarded, the only effect of -i file is to make the macros defined in file available for use in the main input.

3.2 Predefined Macros

The standard predefined macros are available with the same meanings regardless of the machine or operating system on which you are using GNU C. Their names all start and end with double underscores.
Those preceding __GNUC__ in this table are standardized by ANSI C; the rest are GNU C extensions.

__FILE__  This macro expands to the name of the current input file, in the form of a C string constant.

__LINE__  This macro expands to the current input line number, in the form of a decimal integer constant. While we call it a predefined macro, it's a pretty strange macro, since its "definition" changes with each new line of source code. This and __FILE__ are useful in generating an error message to report an inconsistency detected by the program; the message can state the source line at which the inconsistency was detected. For example,

    fprintf (stderr, "Internal error: negative string length "
                     "%d at %s, line %d.",
             length, __FILE__, __LINE__);

A #include command changes the expansions of __FILE__ and __LINE__ to correspond to the included file. At the end of that file, when processing resumes on the input file that contained the #include command, the expansions of __FILE__ and __LINE__ revert to the values they had before the #include (but __LINE__ is then incremented by one as processing moves to the line after the #include). The expansions of both __FILE__ and __LINE__ are altered if a #line command is used.

__DATE__  This macro expands to a string constant that describes the date on which the preprocessor is being run. The string constant contains eleven characters and looks like "Jan 29 1987" or "Apr  1 1905".

__TIME__  This macro expands to a string constant that describes the time at which the preprocessor is being run. The string constant contains eight characters and looks like "23:59:01".

__STDC__  This macro expands to the constant 1, to signify that this is ANSI Standard C. (Whether that is actually true depends on what C compiler will operate on the output from the preprocessor.)

__GNUC__  This macro is defined if and only if this is GNU C.
This macro is defined only when the entire GNU C compiler is in use; if you invoke the preprocessor directly, __GNUC__ is undefined.

__STRICT_ANSI__  This macro is defined if and only if the -ansi switch was specified when GNU C was invoked. Its definition is the null string. This macro exists primarily to direct certain GNU header files not to define certain traditional U**x constructs which are incompatible with ANSI C.

__VERSION__  This macro expands to a string which describes the version number of GNU C. The string is normally a sequence of decimal numbers separated by periods, such as "1.18". The only reasonable use of this macro is to incorporate it into a string constant.

__OPTIMIZE__  This macro is defined in optimizing compilations. It causes certain GNU header files to define alternative macro definitions for some system library functions. It is unwise to refer to or test the definition of this macro unless you make very sure that programs will execute with the same effect regardless.

__CHAR_UNSIGNED__  This macro is defined if and only if the data type char is unsigned on the target machine. It exists to cause the standard header file limits.h to work correctly. It is bad practice to refer to this macro yourself; instead, refer to the standard macros defined in limits.h.

__MSHORT__  This macro is defined if gcc.ttp is invoked with the -mshort option, which causes integers to be 16 bits wide. Please examine the prototypes in the system header files carefully for the types they use before using -mshort.

Apart from the macros listed above, there are usually some more to indicate what type of system and machine is in use. For example, unix is normally defined on all U**x systems. Other macros describe more or less the type of CPU the system runs on. GNU CC for the Atari ST has the following macros predefined:

    atarist
    gem
    m68k

Please keep in mind that these macros are only defined if the preprocessor is invoked from the compiler driver gcc.ttp.
These predefined symbols are not only nonstandard, they are contrary to the ANSI standard because their names do not start with underscores. However, the GNU C preprocessor would be useless if it did not predefine the same names that are normally predefined on the system and machine you are using. Even system header files check the predefined names and will generate incorrect declarations if they do not find the names that are expected. The -ansi option, which requests complete support for ANSI C, inhibits the definition of these predefined symbols.

4. The GNU Assembler (GAS)

Most of the time you will be programming in C. But there may be certain situations where it is worthwhile to write in assembler. Speed is usually the main reason to dive into assembler programming: you have to squeeze the last redundant machine cycle out of your routine to meet certain time limits. Another reason might be that you have to do very low-level work, like fiddling with bits in the registers of a peripheral chip.

If you already have some experience in assembler programming, you might miss the ability to create macros. This is not really a deficiency, given the fact that the assembler originated in a U**x environment. Under this operating system there is a tool for nearly every purpose. If you needed an extensive macro facility, you would use the M4 macro processor. A public domain version of the M4 macro processor exists. It should be no problem to port it to the Atari with GCC. For some macro processing tasks you can just as well use the C preprocessor. What I personally miss is the ability to produce a listing.

4.1 Invoking the Assembler

gcc-as.ttp supports the following command line options. The output is written to "a.out" by default.

-G  assembles the debugging information the C compiler included into the output. Without this flag the debugging information is discarded.
-L  Normally all labels that start with an L are discarded and don't show up as symbols in the object code module. They are local to that assembler module. If the -L option is given, all local labels will be included in the object code module.

-m68000, -m68010, -m68020  These options adapt the behavior of the assembler to the CPU in use. The M68020, for example, allows relative branches with a 32-bit offset.

-ofilename  writes the output to filename instead of a.out.

-R  The information which normally would be assembled into the data section of the program is moved into the text section.

-v  displays the version of the assembler.

-W  suppresses all warning messages.

4.2 Syntax

The assembler uses a syntax slightly different from the original Motorola syntax, which you might know from other 68000 assemblers. The next sections try to describe the syntax GAS uses.

The most obvious differences are the missing period (.) and the usage of the at sign (@). The original Motorola syntax uses the period to separate the size modifier (b, w, l) from the main instruction. In Motorola syntax one would write move.l #1,d0 to move a long word with value 1 into register d0. With GAS you simply write movel #1,d0. The @ is used to mark an indirection, equivalent to the Motorola parentheses. To move a long word of value 1 to the location addressed by a0, you have to write movel #1,a0@. The equivalent instruction expressed in Motorola syntax is move.l #1,(a0). The # indicates immediate data in both cases.

Register Names and Addressing Modes

The register mnemonics are d0-d7 for the data registers and a0-a7 for the address registers; sp names the stack pointer. pc is the program counter, sr the status register, ccr the condition code register and usp the user stack pointer.

The following table shows the operands GAS can parse. (The first part describes the abbreviations used. The second part shows the addressing modes, each with an equivalent C expression.)
    numb:   an 8 bit number
    numw:   a 16 bit number
    numl:   a 32 bit number
    dreg:   data register 0...7
    reg:    address or data register
    areg:   address register 0...7
    apc:    address register or PC
    num:    a 16 or 32 bit number
    num2:   a 16 or 32 bit number
    sz:     w or l; if omitted, l is assumed.
    scale:  1 2 4 or 8. If omitted, 1 is assumed.

Immediate Data
    #num                            --> NUM

Data or Address Register Direct
    dreg                            --> dreg
    areg                            --> areg

Address Register Indirect
    areg@                           --> *(areg)

Address Register Indirect with Postincrement or Predecrement
    areg@+                          --> *(areg++)
    areg@-                          --> *(--areg)

Address Register (or PC) Indirect with Displacement
    apc@(numw)                      --> *(apc+numw)

Address Register (or PC) Indirect with Index (8-Bit Displacement) (M68020 only)
    apc@(num,reg:sz:scale)          --> *(apc+num+reg*scale)
    apc@(reg:sz:scale)              --> same with num=0

Memory Indirect Postindexed (M68020 only)
    apc@(num)@(num2,reg:sz:scale)   --> *(*(apc+num)+num2+reg*scale)
    apc@(num)@(reg:sz:scale)        --> same with num2=0
    apc@(num)@(num2)                --> *(*(apc+num)+num2)   (previous mode without an index reg)

Memory Indirect Preindexed (M68020 only)
    apc@(num,reg:sz:scale)@(num2)   --> *(*(apc+num+reg*scale)+num2)
    apc@(reg:sz:scale)@(num2)       --> same with num=0

Absolute Address
    num sz                          --> *(num)
    num                             --> *(num)   (sz L assumed)

Labels and Identifiers

User defined identifiers are basically defined by the same rules as C identifiers. They may contain the digits 0-9, the letters A-Z and a-z, and the underscore, and must not start with a digit. Identifiers which end with : are labels. A special form of label starts with an L or consists of only a digit. Both types are local labels, which disappear when the assembly is complete (unless the -L option was specified). They can't be used to resolve external references. The L type labels are referenced by their name, just as any other label. The digit type labels form a special kind of local label. You might also call them temporary labels. They are especially useful when you have to create small loops which poll a peripheral or fill a memory area.
They are referenced by appending either an f, for a forward reference, or a b, for a backward reference, to the digit. Let's look at the following example, which is used to split a memory area starting at 0x80000. All data at even addresses is copied to the area starting at 0x70000; all data from odd addresses goes to the area starting at 0x78000.

    start:  lea     0x80000,a0
            lea     0x70000,a1
            lea     0x78000,a2
            movel   #0x7fff,d5
    0:                              | label 0 is defined
            moveb   a0@+,a1@+
            moveb   a0@+,a2@+
            dbra    d5,0b           | reference of label 0

The label 0 is referenced 3 lines later by 0b, since the reference is backward. You can use the label 0 again at a later time to construct more such loops. Since these temporary labels are restricted to one digit in length, a construct can use at most 10 temporary labels at the same time.

Comments

The above example also shows that comments start with a |. A # is also used to mark a comment. The C compiler and the preprocessor generate lines that start with a #.

Numerical and String Constants

Numerical values are given the same way as in C programs. By default numbers are taken to be decimal. A leading 0 denotes an octal and a leading 0x a hexadecimal value. Floating point numbers start with 0f. The optional exponent starts with an e or E.

String constants are the same as those defined by C. They are enclosed in double quotes ("). Some special character constants are written as a \ followed by a letter. These characters are possible:

    \b      Backspace           Code 0x08
    \t      Tab                 Code 0x09
    \n      Line Feed           Code 0x0a
    \f      Form Feed           Code 0x0c
    \r      Carriage Return     Code 0x0d
    \\      Backslash
    \"      Double Quote itself
    \num    where num is an octal number with up to 3 digits specifying the character code.

Assignments and Operators

An = is used to assign a value to a symbol.

    Lexp_frame = 8

This is equivalent to the equ directive other assemblers use. GAS supports addition (+), subtraction (-), multiplication (*), division (/), right shift (>), left shift (<), and (&), or (|), not (!), xor (^) and modulo (%) in expressions.
The order of precedence is:

    lowest  0   operand, (expression)
            1   + -
            2   & ^ ! |
            3   * / % < >

Parentheses are used to force the order of evaluation.

Segments, Location Counters and Labels

A program written in assembly language may be broken into three different segments: the TEXT, DATA and BSS sections. Pseudo opcodes are used to switch between the sections. The assembler maintains a location counter for each segment. When a label is used in the assembler input, it is assigned the current value of the active location counter. The location counter is incremented with every byte that the assembler outputs.

GAS actually allows you to have more than one TEXT or DATA segment. This eases code generation by high level compilers. The assembler concatenates the different sections in the end to form continuous regions of TEXT and/or DATA. When you do assembly programming by hand you would stick to the pseudo opcodes .text or .data, which use the text or data segment with number 0 by default.

Types

Symbols and labels can be of one of three types. A symbol is absolute when its value is known at assembly time. An assignment like "Lexp_frame = 8" gives the symbol "Lexp_frame" the absolute value 8. A symbol or label which contains an offset from the beginning of a section is called relocatable. The actual value of this symbol can only be determined after the linking process or when the program is running in memory. The third type of symbols are undefined externals. The actual value of such a symbol is defined in another program.
When different types of symbols are combined to form expressions the following rules apply (abs = absolute, rel = relocatable, ext = undefined external):

    abs + abs              => abs
    abs + rel = rel + abs  => rel
    abs + ext = ext + abs  => ext
    abs - abs              => abs
    rel - abs              => rel
    ext - abs              => ext
    rel - rel              => abs   (makes sense only when both relocatable
                                     expressions are relative to the same segment)

All other possible operators are only useful to form expressions with absolute values or symbols.

4.3 Supported Pseudo Opcodes (Directives)

All pseudo opcodes start with a period (.). They are followed by 0, 1 or more expressions separated by commas (depending on the directive). The following table omits the pseudo opcodes which include special information for debugging purposes (for GDB).

.abort  aborts the assembly at that point.

.align integer  pads the current segment so that its size is a multiple of 2 raised to the power integer. The maximum value of integer is 15. The lines

            .text
            some code
            .align 10       | 2^10 = 1024
            .data
            some more code
            .align 10       | 2^10 = 1024

will create text and data sections which both have the size 1024, although the actual code that goes into the sections may be smaller.

.ascii string[,string,]  includes the string(s) in the assembly output.

.asciz string[,string,]  This directive is the same as above, but additionally appends a 0 character to the string.

.byte expr[,expr,]  puts consecutive bytes with value expr into the output.

.comm identifier,integer  creates a common area of integer bytes in the current segment, which is referenced by identifier. The identifier is visible from the outside of the module. It can therefore be used to resolve external references from other modules.

.data [integer]  switches to DATA section integer. If integer is omitted, data section 0 is selected.

.desc  This sets the n_desc field in the symbol table. It is used to manipulate debugging information in the object file.

.double double[,double,]  puts consecutive doubles with value double into the output.
.even  sets the location counter of the current segment to the next even value.

.file
.line  If a file is assembled which was generated by a compiler or preprocessed by the C preprocessor, the input may contain lines like # 132 stdio.h. These lines are changed by the assembler to the form

            .line 132
            .file stdio.h

.fill count,size,expr  puts count areas of size bytes each into the output. Each area contains the value expr. Size may be an even number up to and including 8. The line

            .fill 3, 4, 0xa5a

would put the following byte sequence in the output:

            00 00 0a 5a | 00 00 0a 5a | 00 00 0a 5a

.float float[,float,]  puts consecutive floats with value float into the output.

.globl identifier[,identifier,]  When labels or identifiers are assigned, they are only locally defined. The .globl directive gives identifier external scope. The label can therefore be used to resolve external references from other modules. Identifiers don't have to be assigned in the current module, but can be defined in another module.

.int expr[,expr,]  puts consecutive integers (32 bit) with value expr into the output.

.lcomm identifier,integer  is basically the same as .comm, except that the area is allocated in the BSS segment. The scope of identifier is only local (only visible in the module where it is defined).

.long expr[,expr,]  same as .int.

.lsym identifier,expr  sets the local identifier to the value of expr. The identifier is referenced by preceding it with an L (Lidentifier).

.octa
.quad  Bignums are specified using this construct. The usual octal, hex or decimal conventions may be used.

.org expr  sets the location counter of the current segment to expr.

.set identifier,expr  sets identifier to the value of expr. If identifier is not explicitly marked external by the .globl directive, it has only local scope.

.short expr[,expr,]  puts consecutive shorts (16 bit) with value expr into the output.

.space count, expr  puts count consecutive bytes with value expr into the output.
The line .space 5,3 is equivalent to .byte 3, 3, 3, 3, 3. The .space directive is a special form of the .fill directive.

.text [integer]  switches to TEXT section integer. If integer is omitted, text section 0 is selected.

.word expr[,expr,]  same as .short.

5. The Utilities

This chapter describes the programs which don't actually convert source code into object code, but instead combine several object code modules into a runnable program or an object code library. Other programs can be used to print symbol information from either the object code or the executable. The last group of utility programs modifies the executables in terms of memory usage and startup time.

5.1 The Linker gcc-ld.ttp

A linker combines several object modules and extracts modules from a library to produce a runnable program. During this process all undefined symbol references are resolved. Additionally all sections from the object modules which belong to either the TEXT, DATA or BSS are moved to the correct program segment. For example, all areas of all the object code modules which have the type TEXT are moved to form one large TEXT section. The same applies to the DATA and BSS sections.

Most of the time you don't have to invoke the linker explicitly. The compiler driver does the job for you. But in case you have to, the general syntax is:

    gcc-ld [options] $GNULIB\crt0.o file.o -llibrary

or

    gcc-ld [options] $GNULIB\crt0.o @varlinkfile -llibrary

The above syntax assumes that the executable is produced from C source code, which normally makes it necessary to link a startup module and a library. If an executable is to be created from a self-contained assembler text, the startup module crt0.o and the library may be omitted. gcc-ld.ttp creates a file a.out by default. The linker also appends a DRI compatible symbol table to the executable. The second command line of the above examples uses the character @ to indicate a file which contains a list of all object modules to be linked.
This is especially useful if a program consists of a large number of modules and the command line would otherwise be too short to hold all the names.

gcc-ld.ttp supports the following command line options.

-llibrary
     Search library to satisfy unresolved references. The environment variable GNULIB is used to locate the library. GNULIB contains a comma- or semicolon-separated list of paths, each path without a trailing slash or backslash.
-Ldirectory
     Includes directory in the search path used to locate a library.
-M
     During the linking process extensive information about the encountered symbols is displayed.
-ofilename
     The resulting output of the linking process is written to filename instead of a.out.
-s
     prevents the linker from attaching a DRI compatible symbol table to the executable. This symbol table is only of limited use, since symbol names are restricted to eight characters in length. Actually you have only seven usable characters, since the C compiler precedes every symbol it generates with an underscore.
-t
     During the linking process the files loaded and the modules extracted from a library are displayed.
-x
     This option discards all local symbols from the DRI symbol table. All global symbols are left in place.

5.2 sym-ld.ttp

sym-ld.ttp is a special version of the linker. Its sole purpose is to create a special symbol file used by the GNU debugger. The following example shows the usage. ($ is the prompt of a CLI, * is the GDB prompt, # marks a comment)

     $gcc -c -gg foo.c
     $gcc -o foo.prg foo.o
     $sym-ld -r -o foo.sym $(GNULIB)\crt0.o foo.o -lgnugdb (or -lgdb)
     $gdb
     *exec-file foo.prg
     *symbol-file foo.sym
     *run
     *
     *q
     $# back

Note the line in the example where sym-ld.ttp is invoked. A library gnugdb.olb is used to create the symbol file. This is just like the normal library gnu.olb, except that it was compiled with the -gg option. If you don't have this library, use the normal library ("-lgnu").
In this case you can't single step through library functions at the source level.

5.3 The Archiver gcc-ar.ttp

The archiver's main purpose is to make programming life easier. It combines several object modules into one large library. At a later time the linker retrieves from the library the modules needed to resolve all references. Without the library you would have to supply all modules by hand on the command line, or the linker would have to search through all the files to resolve the references (the library gnu.olb contains around 150 modules). The general syntax for invoking gcc-ar.ttp is:

     gcc-ar option [position] library [module]

The option specifies the action to be taken on the library or on a module of that library. option also includes modifiers for the action. The optional position argument is a member of the library. It marks a specific position in the library; an add operation would then place a new module before or after that position. The next argument specifies the library. The recommended naming convention for new libraries is library.olb. If you don't use this convention, the compiler driver gcc.ttp will have trouble finding them. Module is usually an object code file generated by the compiler.

gcc-ar.ttp supports the following command line options. If you don't use a position, the named module is appended or moved to the end of the library.

a
     The add, replace or move operation should place the module after position.
b
     The add, replace or move operation should place the module before position.
c
     If the specified library does not exist, it is silently created. Without this option gcc-ar.ttp would give you a notice that it created a new library.
d
     deletes module from the library.
i
     This is the same as option b.
l
     This option is ignored.
m
     Move a member around inside the library.
o
     preserves the modification time of a module that is extracted from the library.
p
     This option pipes the specified module directly to the standard output.
q
     A quick append is performed.
r
     causes module to be replaced. If the named module is not already present, it is appended. This is also the default action when no option is given.
s
     creates a member in the library called __.SYMDEF, which contains the table of contents of global symbols in the archive. This table is used by the linker to quickly extract a module defining a symbol. This feature of gcc-ar eliminates the need for a ranlib utility commonly found on U**x systems.
t
     lists the members that are currently in the library. If the option v is also given, additional information about file permissions, user and group ids and last modification date of the members is displayed. Of course, file permissions and user and group ids don't make much sense on the Atari ST.
u
     If this option is given, an existing module in the library is only replaced if the modification time of the new module is newer than that of the one already in the library.
v
     gives you some additional information depending on the operation that is performed.
x
     Extract module from the library.

5.3 Listing Symbols

There are two programs available for printing symbols, each for symbols of a different kind. gcc-nm.ttp lists symbols in GNU object files and object libraries. cnm.ttp lists symbols from a DRI compatible symbol table attached to an executable.

gcc-nm.ttp

The output of gcc-nm.ttp looks like the following sample:

     00000870 b _Lbss
              U _alloca
     000003b4 t _glob_dir_to_array
     00000532 T _glob_filename
     00000248 T _glob_vector
              U _malloc
     0000086c D _noglob_dot_filenames
              U _opendir
              U _readdir
     00000000 t gcc_compiled.

The first column displays the relative address of the symbol in the object file. If the symbol has the type U (undefined external), the space is left blank. The next column shows the type of the symbol. In general, symbols which have external scope (visible to other object modules) are marked with an uppercase letter.
Symbols which are local to the object file are marked with lowercase letters. The following letters are possible:

C
     marks variables which are defined in that source module, but not initialized. A declaration like int variable; would create a line marked with a C. The first column would show the size of that variable in bytes instead of the relative address in the object module.
b
     Variables which are declared with static int variable; are displayed with a b.
D
     marks variables which are initialized at declaration time. A declaration like int variable = 1; would show as a line with a D in it.
d
     Static variables which are initialized at declaration time are displayed with a d. A declaration like static int variable = 1; would create a line marked with a d.
t,T
     mark text (in other words: actual program code). Functions in your C source which have the storage class static would be displayed with a t. All other functions in that source module, which are visible to other modules, would show up with a T.
U
     All functions which are defined in other modules and referenced in this module are displayed with a "U".

The last column shows the symbol name.

gcc-nm.ttp supports the following command line options.

-a
     If a file is compiled with the -g or -gg option, special information for debugging purposes is included in the object code. This information is listed by supplying the -a option.
-g
     This option restricts the output to symbols which have external scope.
-n
     Without any options the output is sorted in ASCII order. With the -n option, the listing is sorted in numerical order by the addresses in the first column.
-o
     If this option is given, every output line is preceded by a filename in the form file:, naming the file in which the symbol appears. If the file to be listed is an archive, the line begins in the form library(member):.
-p
     The symbols are listed in the order in which they appear in the object code module.
-r
     The output is sorted in reverse ASCII order.
-s
     Archives may contain a special member called __.SYMDEF. This member holds the table of contents of global symbols created by the s option of gcc-ar.ttp. Using this option shows the content of this member.
-u
     Only undefined symbols are listed.

cnm.ttp

cnm.ttp prints the symbols which are attached to an executable.

5.4 Modifying the Executables

The programs described in the following sections can be used to modify an already existing executable, but this only works if the symbol table is still attached to the executable. So if you want to modify the memory usage of a program at a later time, you should keep the unstripped executables around.

printstk.ttp

printstk.ttp prints the current stack size of an executable. It does this by looking up the symbol _stksize in the symbol table portion of the file and then prints the value stored at the location _stksize points to. The usage is:

     printstk [filename]

If filename is not specified, it defaults to gcc-cc1.ttp. If printstk.ttp is used on some of the executables of the GCC distribution, you should see a value of -1, which means that all available memory is used by the program (at least for the programs gcc-cpp.ttp and gcc-cc1.ttp).

fixstk.ttp

fixstk.ttp works basically the same way as printstk.ttp, but lets you modify the value at the location _stksize. The usage is:

     fixstk size [filename]

size is the stack size in bytes, Kbytes or Mbytes. To specify size in Kbytes or Mbytes, append a K or an M to the integer number.

     fixstk 128K gcc-as.ttp

sets the stack size of gcc-as.ttp to 128 Kbytes.

toglclr.ttp

When TOS launches an application, it clears all memory from above the BSS to the end of the TPA. With earlier TOS versions (before TOS 1.4) this could take a considerable amount of time. The clearing algorithm was improved over successive TOS releases, but the clearing itself is still performed, although most existing programs don't need cleared memory.
Well, most is not all; therefore, for compatibility's sake, the feature will stay in place. With TOS 1.4 you can keep the GEMDOS loader from clearing all memory. The least significant bit of the long word at offset 0x16 in the program header determines whether the memory is cleared or not. Setting this bit to 1 prevents the loader from clearing all memory. toglclr.ttp serves exactly that purpose, namely toggling this bit.
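
The bit manipulation described above can be sketched in C as follows. This is only an illustration and not the source of toglclr.ttp: the function name toggle_clear_bit and the helper names are hypothetical, and the sketch assumes that the caller has already read the first bytes of the executable into a buffer and that the long word at offset 0x16 is stored in the big-endian byte order of the 68000.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of the flag long word in the program header, as stated in the
   text above.  Everything else in this sketch is an assumption. */
#define PRG_FLAGS_OFFSET 0x16
#define PRG_FLAGS_END    (PRG_FLAGS_OFFSET + 4)

/* Read a 32 bit value stored most significant byte first. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Store a 32 bit value most significant byte first. */
static void write_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Toggle the least significant bit of the flag long word in a program
   header held in memory.  Returns the new state of the bit (1 means the
   loader should not clear memory), or -1 if the buffer is too short to
   contain the flag word. */
int toggle_clear_bit(unsigned char *header, size_t len)
{
    uint32_t flags;

    if (len < PRG_FLAGS_END)
        return -1;

    flags = read_be32(header + PRG_FLAGS_OFFSET);
    flags ^= 1u;                       /* flip bit 0, keep all other bits */
    write_be32(header + PRG_FLAGS_OFFSET, flags);
    return (int)(flags & 1u);
}
```

A complete tool would read the header from the executable, call a function like this one, and write the modified bytes back to the same position in the file. Accessing the flag word byte by byte keeps the sketch independent of the byte order of the machine it is compiled on.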