----------------------------------------------------------------------
Copyright (C) 1991 by Natürlich!
   This file is copyrighted!
   Refer to the documentation for details.
----------------------------------------------------------------------

======================================================================
                           Hacking NASM65

      *** READ THIS IF YOU WANT TO PORT OR IMPROVE NASM ***

RANDOM THOUGHTS                                    TABs @ 3, 6, 9 ...
======================================================================

STEP 1
   L O O K   A T   Y O U R   M A C H I N E .
   D O   Y O U   T R U S T   I T   T O   C O M P I L E   A N D
   E X E C U T E   A   L A R G E   C   P R O G R A M
   P R O P E R L Y ?

   H I N T :   I F   Y O U   A R E   A N   M S - D O S   U S E R
   Y O U   P R O B A B L Y   D O N ' T .
   S K I P   T O   T H E   E N D   O F   T H I S   T E X T

STEP 2
   E D I T   P O R T A B L E . C   A N D   T H E N   T R Y   T O
   C O M P I L E   I T   F I R S T !
   I F   Y O U   C A N ' T   G E T   P O R T A B L E   T O   R U N ,
   F O R G E T   A B O U T   C O M P I L I N G   T H E
   P A C K A G E !
   L O O K   C A R E F U L L Y   A T   T H E   O U T P U T !

STEP 3
   If you get mysterious compile warnings after portable compiled
   successfully, it may very well be that your system doesn't like
   the order in which the files are included. Especially bothersome
   are "nasm.h" and "xosbind.h"; try juggling them around 'til it
   works.

**** IMPORTANT **** IMPORTANT **** IMPORTANT **** IMPORTANT ****
If you port this to another machine, make sure that the object files
produced are compatible with the TOS version!!
THIS IS THE MOST IMPORTANT THING TO WATCH OUT FOR WHEN PORTING!!
IF YOU CAN'T MAKE IT object file compatible, but you get it working,
THEN YOU MUST SET #define INCOMPATIBLE 1. This will at least tell
other machinery with correct ports that they can't use your objects.
File compatible does not mean file identical. If it looks different
but it still links with ST produced object files and produces the
same binary, then that's OK.
**** IMPORTANT **** IMPORTANT **** IMPORTANT **** IMPORTANT ****

/* more random thoughts */

Use GCC if you are running Unix. The HP cc compiler I know doesn't
compile anything beyond "hello world".

Compile and run portable.c on a new system to see where the
differences between the Atari ST and the new machinery lie. Portable
produces "localdef.h" for your defines.h file. If your system doesn't
have signals, set SIGNAL to 0 (this doesn't matter for NASM65 itself,
but without signals you can only guess from crashes whether WORDs
have to lie on even boundaries). If you know that you don't have
UNIX-like IO, set IOCHECK to 0 as well.
If portable crashes, set #define WORD_EVEN 1 in the "localdef.h" file.

NASM65 puts heavy strain on the compiler and the preprocessor. If your
preprocessor is apparently NOT up to standards, you might want to use
an ANSI cpp to preprocess all .c files and compile those on your
lesser C-compiler.

Change the file xosbind.h to fit your system. Check out the way lseek
operates.

Files you may want to change are...
   defines.h
   xosbind.h
   nmalloc.h / nmalloc.sun

If you can't set "byte" to 8 bit, "word" to 16 bit or "lword" to
32 bit, then you need to look into structs.h AND INTO OBJECT.H!!
Also, most probably yylval will need retyping. (struct ??)
Some programs like disasm65 expect 16-bit datatypes and will produce
unsightly results otherwise.
Bottom line: If you don't have 8 and 16-bit things FORGET IT!

Almost every pointer has a "huge" modifier, which is completely
meaningless unless you are MS-DOS based. In the MS-DOS world far
doesn't give you true 32-bit performance, but huge comes much closer.

If you do mind the style this is written in, then go port/hack
something else. I do not (repeat: do not) want any improvements in
the visual style this is written in. I also don't think leaving out
the various little jokes makes your port a good port. If you can hack
up something better, GO AHEAD! (WITHNOISY not required BTW).
And please.. NO FUCKING PROTOTYPING!
I'd appreciate it if you could #if around the code that creates
problems, so that the source will still be usable for the ST and
other already ported machinery (except if your code is definitely
more correct, like, for instance, a missing cast or so..).

If you have ported successfully, please send me your diffs so that I
can update my sources with them. Indicate which version you hacked on
BTW.

For development set VERSION to 0, and all revisions to 0 (at first).
This makes things a lot easier, and enables debugging for those w/o a
source-level debugger.

Improving NASM to handle the C64 or AppleII ought to involve 'just'
changing the FF FF headers of the executables, and changing |MEMLO to
the right value. If they can't take segmented binary files, it might
be wise to disallow the '-r' flag.

Some files are compiled twice, once for the linker and once for the
assembler. Try to keep those as short as possible.

This is taken from NERD but it is valid nevertheless:
   One more note on the comments: don't trust them if they don't make
   sense to you. Most of them are there just for my entertainment and
   as reminders of historic failures. (Some of them are plain wrong.)
   To understand the comments, a complete CD-collection of the bands
   mentioned below and the books of R.A. Lafferty are essential.
   (In summary: the comments are for me, not for you. Their meaning
   can sometimes only be worked out with some background knowledge.
   Good luck anyway..)

Getting more speed out of NASM:
   This program has taught me something about what's really important
   in an assembler and what's not. All important is the character get
   routine!! Unfortunately, due to my efforts in OO-designing the I/O
   system, this is one of the slower components. INPUTFST.H &
   FASTEXPR.H yielded a speed increase of 10% (difference to
   VERSION 0). Needless to say this is less than satisfactory.
   (A further rewrite in v1.2, which had the looks that kill, didn't
   do anything...)
   Yacc might be a bit overkill for an assembler.
   Some code has not seen any optimization although it should, like
   the stuff in MACRO.C and PROCESS.C.
   With the help of the generic nmalloc and some inline #defines,
   creation and handling of all the lists should be quite speedy, but
   I suspect it isn't (More fun ahead).

   Strings could be aligned on odd addresses only, which would make
   hashing easier, since you'd just need to pick a LONG from
   &string + 3. This isn't that good though, 'cause that means that
   local labels only get 3 chars for hashing. The point of having 6
   characters is that many labels differ only at the end (like, for
   instance, COLPL0 COLPF1). On the other hand those aren't local
   labels, and the hash routine, although written in assembler, takes
   an estimated 300 cycles ~ 50 microsecs, so maybe it WOULD be
   worthwhile.

   There is one thing bugging me still: no matter what find_label
   does, at least one strcmp must be done when a matching label is
   found. Since the hash values are spread over several characters,
   using HASH alone for the first 6-7 characters is not conclusive.
   Maybe use 64 bit for hashing..

>version 1.2 has been successfully ported to MSDOS, (thanks to J.Richter
>for lending me his machine for a week and helping me read the
>80xxx ML code, he also pointed out some problems with prototyping).
>One good thing about this affair was that I got access to a profiler.
>The results have been quite pleasant so far, the bottle neck being the
>parser and not my code, as was feared. The find_label routine is quite
>speedy and doesn't warrant further optimization.

v1.2 probably was never a successful port. It just looked that way.

(THIS IS VERY OUTDATED AND JUST KEPT IN HERE FOR HISTORICAL AND
 AMUSEMENT VALUE (nat/10/91))
(SKIP UNTIL "<<>>" AND IGNORE THIS)
-------------------------------------------------------------------------
Some comments start off with **. This usually means that at this point
in the code some foreseeable problems may arise when porting this code.
**-PORT #1-**  means: casting was used just to get rid of warnings.
               The casts can be thrown out if needed.
**-PORT #2-**  means: Be very careful here. A structural change might
               lead to incompatibilities. (This should have been put
               into more places)
**-PORT #3-**  means: Change this value according to your machine.
**-PORT #4-**  means: There could be problems here with BITwise
               BIGENDIAN machinery.
**-PORT #5-**  means: This shouldn't be a problem, since $0000 is
               $0000 whether BIGENDIAN or LITTLEENDIAN.
**-PORT #6-**  means: If this doesn't work, omit it. Note it in the
               run notice though.
**-PORT #7-**  means: This (probably) only works with two's
               complement.

Some porting hints:
So you've got some bastard machinery that makes life miserable. I can
anticipate some problems and point you in the right direction...

Case1: Bytes are 7 Bit (or less ?).
       Jeeeeeesus. You really are some diehard fanatic. You probably
       are on some sort of mainframe and VERY bored with life.
       My advice: Forget it. If you make it nevertheless PLEASE
       PLEASE send me the source.

Case2: If you have a 68xxx or 80xxx machine, NASM should compile
       fine. Only the I/O-calls might give you problems.

Case3: Pointer trouble. If unsigned long is too small to pack any
       kind of pointer into, you will have to use a union for yylval
       in lexer.c asm65.y macro.c md_suck.c. I figured a compiler
       which can't implement unions as register variables (as I
       fathom some can't) would compile better code this way. (Who
       knows...). The rest should be OK, since I ought to have casted
       every pointer in sight.

Case4: Portable tells you: NO 16-Bits around. This is BAD!!
       Look around, maybe you can construe something out of two
       chars. If you take a look at code.h, you will see that this
       code depends VERY much on being able to put two 8-Bits
       together, calc with them as if they were 16-Bit (unsigned
       natch!) and then pull them apart again.
       If you can't make 16-Bits somehow, try for 24 or 32 bit and
       rewrite code.h to calc the new offsets (possibly + 2 bytes).
-------------------------------------------------------------------------
(<<>> OK, HERE COMES SOME LESS CRUFTY ASCII)

Adding your own directives is EASY. First make up a name. It must
start with a '.'. Check out the way I did it with .undef.
First I put in a line in ASM65.Y in

   nline:
      ...
      | T_UNDEF T_IDENT          { undefine( $2); }
      ...

This basically means: when a line contains a .undef directive token,
which must be followed by an identifier token, call undefine() with
the identifier. Now register T_UNDEF as a token with

   %token ... T_UNDEF    /* line 72 in ASM65.Y or so... */

This basically means: we don't care about the value of T_UNDEF, we DO
care about the value of T_IDENT (--> $2), but T_IDENT is already
handled so we don't need to do anything more in ASM65.Y.

If you want to be neat, add a little function declaration in NASM.H.
Make it:

   void  undefine();

Now we need to generate the T_UNDEF token. This is done in md_suck
(Macro & Directive SUCKer). Since there aren't yet any other
directives starting with U, we have to create a du_[] array for
ourselves that ends with an EOA (end of array) field. If there had
been a du_[] array already, we'd just have needed to insert our bit

   { "UNDEF", T_UNDEF, 0 }

at the lexically correct position. Since du_[] didn't exist before,
we also need to update *d_arr and insert du_ in between dt_ and dv_.
Then we also need to update (because we changed d_arr) d_starters.
We locate the free entry for U

/* A B C D E F G H I J K L M N  O  P  Q R  S  T  U V  W  X Y Z */
   1,2,3,4,5,6,0,0,7,0,0,8,9,10,11,12,0,13,14,15,0,16,17,0,0,18,

and update the table to

/* A B C D E F G H I J K L M N  O  P  Q R  S  T  U  V  W  X Y Z */
   1,2,3,4,5,6,0,0,7,0,0,8,9,10,11,12,0,13,14,15,16,17,18,0,0,19,

(we could have done this a bit faster, by appending du_ to d_arr and
putting a 19 into d_starters, but it would have looked ugly...)
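
For those who want to see the d_starters / d_arr idea in isolation,
here is a minimal sketch in C. It is NOT the actual md_suck code. The
struct layout, the function name find_directive, the token numbers,
the meaning of the third field, the choice of a NULL name as the EOA
marker, and the toy dw_[] table are all assumptions made up for this
illustration; only du_, d_arr, d_starters and the { "UNDEF", T_UNDEF,
0 } entry are taken from the text above. The table here knows only
two letters, so its d_starters values differ from the real ones.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token numbers; the real ones are generated by yacc. */
enum { T_NONE = 0, T_UNDEF = 300, T_WORD = 301 };

/* Assumed entry layout: name, token, one extra field. */
struct directive {
    const char *name;   /* upper case, without the leading '.' */
    int         token;
    int         extra;  /* meaning of the third field is a guess */
};

#define EOA { NULL, 0, 0 }   /* assumed end-of-array marker */

/* One array per starting letter, each kept in lexical order. */
static const struct directive du_[] = { { "UNDEF", T_UNDEF, 0 }, EOA };
static const struct directive dw_[] = { { "WORD",  T_WORD,  0 }, EOA };

/* Toy d_arr: only U and W are populated in this sketch. */
static const struct directive *const d_arr[] = { du_, dw_ };

/* 1-based index into d_arr for each first letter, 0 = no directives. */
static const unsigned char d_starters[26] = {
/* A B C D E F G H I J K L M N O P Q R S T U V W X Y Z */
   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,2,0,0,0
};

/* Map ".name" (any case) to its token, or T_NONE if unknown.
   Assumes an ASCII-like character set where 'A'..'Z' are contiguous. */
int find_directive(const char *text)
{
    char   up[32];
    size_t i;
    int    slot;
    const struct directive *p;

    assert(text != NULL);
    if (text[0] != '.' || !isalpha((unsigned char)text[1]))
        return T_NONE;

    /* Copy the name without the dot, folded to upper case. */
    for (i = 0; text[i + 1] != '\0' && i < sizeof up - 1; i++)
        up[i] = (char)toupper((unsigned char)text[i + 1]);
    up[i] = '\0';

    /* First letter selects the sub-table; 0 means nothing starts here. */
    slot = d_starters[up[0] - 'A'];
    if (slot == 0)
        return T_NONE;

    /* Walk the sub-table until the EOA marker. */
    for (p = d_arr[slot - 1]; p->name != NULL; p++)
        if (strcmp(p->name, up) == 0)
            return p->token;
    return T_NONE;
}
```

With this toy table, find_directive(".undef") gives T_UNDEF and
anything not starting with U or W falls out at the d_starters check
without a single strcmp. That early exit is the point of keeping
d_starters in sync with d_arr.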
Now we need to update two more files, which are ndebug.c, we just insert case T_UNDEF : x = "T_UNDEF"; break; anywhere it looks OK (in the large switch statement), and errorasm.c we insert a case T_UNDEF : return( ".UNDEF"); so that we can create somewhat more readable error messages. And we are done! (Don't forget to write the C-routine.. (har har)) PROBLEMS THAT I RUN INTO WHENEVER I AM TRYING TO PORT NASM ---------------------------------------------------------- o Forgot to run portable, when copying over new files and thereby clobbering LOCALDEF.H o Set TEST in nmalloc.h to 1 and expect big macros to be swallowed. ---------------------------------------------------------------------- A little bonus for those who read thru this cruft till the end: (for seasoned readers this is v1.2) of course you never could put in an AUTO folder, cause GEM isn't running yet... It's ABE encoded (really obscure) ;ABE ASCII-Binary-Encoding (by Brad Templeton) ;Use 'sort' and/or 'dabe' to decode T./z$$filecount=1 T.0N##S1000,1000,1000,ABE1 T.1N$$blocking=false T.2r$$uname=x.arc T.3a$$os=gemdos T.4c$$fname=x.arc T.55$$date=665419678 T.6d$$perm=0 T.7C$$size=1904 T.8V""%0+_(&,zO=&=:'j(2\B)--?Z T.AT""'@ABC%DEFG%HIJK1LMNO'PQRS(TUVW;XYZ[V\]^_i T.Bg""(`abc1defgDhijkGlmno5pqrs,tuvw-xyzukXpGSu T.CM""))YR*&VDUN8OPl[WqlSZW5-U^3+_tcMx:g>Jfirj2 T.Dp""*;n1v)[e342H`Nzk\FcKZ6s40G@baIZkLM`?9oihZ T.EH""+r>m8'10w@o?K.g6oxE7*X(EC>,M]mQhk:P>Qq;6W T.FJ"",TYCWlGV/HY8325Z'%y0"Q9q0:a+!ZS!J]|:t\WI{<.0="1))}_]p0-{YY4|12-Qf$QW{HJ T.JWg|[F{/0P!Mwc{)(0.P#evqwR=.{?g0`W!1K|?u"7FB0[~5wEn0~V,0|,A}kRe($Sj T.KKe#4,?Cv.h{9:COV{:7ih}+]C@+)]~a.M!y*d{dy3e@7"IC[H5r"y.X{IQb)!FLE}q,` T.LnmV}lj?N|H2T0d=NFNTL|OAD{.S000|;O00A"Aeu|9-|7\002..%}xSf2}i4o{Q)7|^ T.MEU2/6e{G4P!6(DjHG&~V2C|/\{%44cXV./{fod-.4R=_}bN> T.Nd7lh|,LH}Nx>7!dL|40]'\Z8e?)|xcwoB9"o:}ToZO,{:vt{bR8S'|Zj-TX[}:3MM!W2 T.OC"o.oq}sRSN1aA{1ocW#1i4|*A{+>C!4q{>9i|'t$pY!XE$.^"]V{+Agwx{:9K|8k{- T.PIg5Ma#ho!G\!ak&{@NJ{I[^":r!n'g{_'h$G'!'HMev.|G)H_m!m,|H.0jwS!08"HZ 
T.Qo"kQU1j$CP{M1&|V-J#LV{4QR(n'!e)}p?F*.!89+]"35Zv"U8!@y#3to$kA#*W*!j[ T.Re#I7o|cTA"@3*!;o}Cm/~8T@Z{afQU|BvABr-@:eO!VZx+{RP.}8X>/|n^u}+TDWC^YSj!7JQw/$+(U{fc))BG:u!eW!Wb)"e*B{)kHDIlAY{&Cew[{CE T.U3/})p^I|rXY;_+{).Uc!.6z}K0o(&{=HHM{,'/ck.F@,f`ED{O%,B2!.qJw{g*s{GW T.V0L!2ir_2{FMT}hDO}Usg!jmf|2RP.X)BI{7[rBI}5,,@Ek{nTF/{m?6!^yCgG!Nu"^i T.W6&Gg|XB}1@z"lL.J={?Xrm"\bO{G@E1{&tSi({5uOdk!`3#D1!E1|AOZ5}S-(\{8*7 T.XSFbpHbX0{??r_/[6|5'bt,$zK/{a[+so4|<&!Px.|\ T.Zi:~]b?{mO_~ue1![^Z_#)l{s@A|w^~rZ@<{:^>u0{Bj1!=G1=F}9+_b"BN$gcRyO4AEG^|(D|6S!+E}0P*$&5tbihM{6>QMy#2@ T.cs|9y!u*};c_Y"[8#Km|v[_"B4$'457!C'tED!q4/}8,_q!C,!0Da$-J}7l;9J|`RN",A T.di}`UT"*nn!Dpd$n?V#=:{YJ?PC-!;W$>%=JX;!/H$ER|\E!XT!z5!+E1!4?@|2.sv- T.er{@yu#OO!iE"fw!/Px\$wM|+h>T$/,$.Y"SS1)>;}nz*)7;"Lw!8l{o[B{*%)F#Bd` T.fCKe|.C{N^d&{^W3|\D!6L"=01y{V%VV,#>f!9r?nKg!-J2|%Ws{DQl{Nni>m-{c1W!Jk T.gWc1?#6=Q0E9Hq}22?!J'}^.*"A1`|_6={O+*B|L]{K'i}yHxQD~4F)Nj)7!6&0M|]' T.h46{r*8ksT"%T-!/2{O>*^!?*\{X7GB{wK\!T2!Q\L!t6|PL{-(8}B.\pu|F0B!b[L+ T.iV$Wmp"C*?i<}V>(`|a?}W:{I0f!.]z!zA5.OCIm"\&Hm)J.{mElA"%:p{v9pz9}tqZ/;&|H+#'S\5YT}9NK T.lKKI!QJgO;"h(|jrR$d332$X'I#IU{g;Q!3/!g\{1'N8i|7Ih`!jY!Tc#R^6)!xK~olf T.mLV>#0O|IU|_s~k70{B2={d/m"in!'cnDL{Rr6$7<~YVS38#W/z=_(4}3\U}KejL|AM T.ng}MQFRWf?C]~sS8x|J\a}B]0}wD7|`>n})L_f_a{'Zz~&yP$rT]{';7,"fo{p6U~P%b T.ox|r;o"Z=3Yy$CT"0G#f,3#:@!`w"uNGj#+0$7Y!5L{m8l!W4|3r|jF7!a]4Z"vz}3;F T.paV{.CK{PXB4I}&Q-@"s/Z}MDE|2x!h3[|5l!4c|&`{9^k"AaA5$%\ T.q41::!(Q!AR'p_<{l`V.TY:TpY