
                    >>  AMOS Code Optimisation Tips  <<
                       -----------------------------

Aminet - dev/amos/Optim_Tips.lha

Compiled by:

   Ben Wyatt       bwyatt@paston.co.uk

Contributors:

   Garfield Benjamin  gbenjam@ix.netcom.com
   Ben Wyatt          bwyatt@paston.co.uk
   Paul Hickman       ph@doc.ic.ac.uk
   Tim Lewis          T.Lewis@bton.ac.uk
   Petri Hakkinen     mystic@tlti.tokem.fi
   Mark de Jong       JP.de.Jong@inter.nl.net
   Paul Heald         u5o83@cc.keele.ac.uk

Here are a few optimisation tips for those of you trying to squeeze that
little bit more speed out of code. With the follow examples, the speed
increase may not be noticeable, but they will all make it that little bit
faster.

Important: Do not go through your entire AMOS program collection, converting
all your code so it's slightly faster. Only alter code which you think is
/too/ slow. For example, don't change your 3D renderer just to cut down the
rendering time 1/50 of a second or something silly. And similarly, don't
bother fiddling with the main loop of a game when it already runs in a Vbl!
A lot of these optimisation tips produce some messy code so often it isn't
worth changing things!

Note: Some of these speed increases only work when compiled, mainly with
the AMOSPRO Compiler. A lot of them are only tested this way. It must also
be stated that these tips may be incorrect for some CPUs - they have mainly
been tested on an 68020, but caches and different instruction cycle timings
could alter the results. Some sort of chart may be available in future
releases of this document - the best idea is to experiment on your system.

Each tip is marked by a *, followed by the amount of speed increase, shown
by a S, M, L or V. These stand for Small, Medium, Large and Varies.

------------------------------------------------------------------------------

                            >>  COMPARISONS  <<
                               -------------

*S Don't use Sgn(n), use n<0 and n>0 instead.

   Eg.
      Don't use:
         If Sgn(n)=-1 Then blah blah
         If Sgn(n)=1 Then blah de blah

      Use this instead:
         If n<0 Then blah blah
         If n>0 Then blah de blah

*M Use Abs(n)<v instead of n>-v and n<v.

   Eg.
      Don't use:
         If n>-5 and n<5 Then blah

      Use this instead:
         If Abs(n)<5 Then blah

*S Use < and > instead of <= and >= whenever possible. They are slightly
   faster. Strange, but true.

   Eg. When comparing immediate-values
      Don't use:
         If n<=10 Then blah

      Use this instead:
         If n<11 Then blah

   But, when using variables, don't use this kind of thing:
      A=10
      If B>A-1 Then etc
   instead of:
      A=10
      If B>=A Then etc

*M Don't do stuff like If n=True, If n=False, or If n<>0.

   Eg.
      Don't use:
         If a=True or b=False or c<>0 Then blah blah

      Use this instead:
         If a or Not(b) or c Then blah blah

   But ONLY do this if you are sure b is a boolean value. "Not" simply
   bitflips an integer so Not -1 is 0, but Not 2 is $FFFFFFFE.

   AMOSPro is rather buggy with the Not() command, so you may have to
   experiment a bit.

   Also, using =False instead of =0 is faster. Another trick you should be
   aware of is that you can use the optimised version of <>0 where you
   would normally use >0 (or <0) if the routine only returns values >=0.

   Eg.
      Don't use:
         If Instr(a$,"Something")>0 Then blah

      Do this instead:
         If Instr(a$,"Something") Then blah

------------------------------------------------------------------------------

                           >>  ARRAY ACCESS  <<
                              --------------

*M For your most used array of 8 (or less) elements, use Dreg() instead of
   it (it can still hold the same range of values). Note: The Dreg() array
   isn't actually using the data registers; it's an AMOS internal data
   structure. However, it's still faster than a standard array.

   Eg.
      Don't use:
         Dim n(7)
         For a=0 to 7:Print n(a):Next a

      Use this instead:
         For a=0 To 7:Print Dreg(a):Next a

*S Use single dimensional arrays instead of two dimensional ones if
   appropriate.

    Eg.
       Don't use:
          Dim A(2,100)

       Use this instead:
          Dim A0(100),A1(100),A2(100)

------------------------------------------------------------------------------

                          >>  MATH OPERATIONS  <<
                             -----------------

*L Whenever possible use powers of 2 when multiplying or dividing. The
   AMOSPro Compiler will automatically optimise these to lsl and lsr.
   DON'T use the lsl or lsr commands in extensions, as they are over
   TWICE as slow!

   Eg.
      Don't use:
         n=a*30

      Use this instead (if suitable):
         n=a*32

   As well as this, you can also do:
      n=x*160 -> n=x*(128+32) -> n=x*128+x*32 (tadaa)
      Also n=x*192 -> n=x*128+x*64 etc.

   But this kind of thing is not faster:
      a=32 : b=7 : c=a*b

*L Never use floating points. Use integers multiplied by a power of 2. You
   can actually decide what sort of precision you want in your decimal places
   and then multiply them out. For example, if you decide on about 2 point
   accuracy, you can multiply all your values by 128 (2^7, close enough to
   100) and then do the calculations. When you have the results simply divide
   by 128 to get the required result.

   Eg.
      Don't use:
         x#=x#*1.5
         Plot x#,100

      Use this instead:
         x=x*(3*128) : Rem Same as x=x*(1.5*256)
                       Rem 3*128 is calculated at compilation time
         Plot X/256,100

*L Predefine as much as possible. Especially useful for Sin and Cos etc.

   Eg.
      Don't use:
         Repeat
            If Sqr(x*x+y*y)=10 Then blah blah blah
         Until Something

      Use this instead:
         ' This code should be placed at the beginning of your code.
         xmax=Maximum value of x : ymax=Maximum value of y
         Dim QUICKSQR(xmax,ymax)
         For x=0 to xmax
            For y=0 to ymax
               QUICKSQR(x,y)=SQR(x*x+y*y)
            Next y
         Next x
         ' [Other code]
         Repeat
            If QUICKSQR(Abs(x),Abs(y))=10 Then blah blah blah
         Until Something

*S Use N=N+A instead of using Inc, Dec or Add. Although the manual claims
   they are faster, when using the AMOSPro Compiler, they actually turn
   out slightly slower.

   Eg.
      Don't use:
         Inc a : Dec b : Add c,10
      Use this instead:
         a=a+1 : b=b-1 : c=c+10

   Note: Although Add is slower in it's short form, the full version
   with the base To top part is faster than the equivilant code.

   This only applies to standard variables - DON'T use this with arrays!

------------------------------------------------------------------------------

                           >>  MISCELLANEOUS  <<
                              ---------------

*L Get yourself a fast extension, such as AMCAF or Turbo Plus. This will
   only speed things up where an extension command replaces an internal
   command with a faster one - Easylife and AMCAF will do this for many
   string operations and, AMCAF, Turbo and Powerbobs do it for graphics.

   Eg.
      Don't use:
         Plot x,y,c

      Use this instead:
         Turbo Plot x,y,c (for AMCAF)
      or F Plot x,y,c (for Turbo Plus)

   Many extensions can be found on Aminet (FTP from wuarchive.wustl.edu
   in systems/amiga/aminet/dev/amos).

*V Use assembler procedures for inner loops. Beware though, it does take
   a small amount of CPU time to jump to an assembler procedure, so with
   small routines, it's simply not worth it.

*S Use "Copper Off" to disable the screen while doing intense communication.
   The copper then stops stealing clock cycles from the processor (Unlike
   Multi No and such instructions, this really does work). Use "Copper On"
   to re-enable the normal system.

*L Use Fill to clear an array to a certain value, rather than the
   equivilant For / Next loop.

   Eg.
      Don't use:
         For n=0 to 1000
            A(n)=27
         Next n

      Use this instead:
         Fill Varptr(A(0)) To Varptr(A(1000)),27

*S Use global variables to return parameters from procedures, rather than
   Param.

   Eg.
      Don't use:
         _HELLO[10,20] : pram=Param : Print pram
         Procedure _HELLO[x,y]
            temp=0
            For n=x To y
               If a(n)=0 Then temp=n : Exit
            Next n
         End Proc[temp]

      Use this instead:
         Global pram
         _HELLO[10,20] : Print pram
         Procedure _HELLO[x,y]
            pram=0
            For n=x To y
               If a(n)=0 Then pram=n : Pop Proc
            Next N
         End Proc

   However, don't do things like this, which are actually slower.

   Don't do:
      _HELLO[10,20] : Print A
      Procedure _HELLO[X,Y]
         Shared A
         A=X*Y
      End Proc

   Do this:
      _HELLO[10,20] : Print Param
      Procedure _HELLO[X,Y]
      End Proc[X*Y]

*L Use subroutines instead of procedures. Although they're messier, they are
   several times faster! Of course, with this method, you can not pass
   parameters, but all the variables you were using will still be accessible.

   Eg.
      Don't use:
         _SOMETHING
         Procedure _SOMETHING
            Code
         End Proc

      Use this instead:
         Gosub _SOMETHING
         _SOMETHING:
            Code
         Return

------------------------------------------------------------------------------

                     >>  TRADITIONAL OPTIMISATIONS  <<
                        ---------------------------

*V Move invariant statement outside loops (Standard optimising compiler
   stuff).

   Eg.
      If you have something like this:
         Repeat
            If x<a*20 Then do something groovy
         Until something

      If the rest of the loop does not modify a, this can become:
         temp=a*20
         Repeat
            If x<temp Then do something groovy
         Until something

*V Unroll your Loops!! Cuts down on Stack-Access eliminating Loop-overhead
   and replacing it with dedicated code. Don't go TOO overboard though.
   Normally five iterations is enough to gain a bit of speed. Too many
   iterations unrolled will actually lose speed on systems with CPU caches.

   Eg.
      Don't use:
         For X=1 To 20
            UPDATE_ALIEN[X]
         Next X

       Use this instead:
         For X=1 To 16 Step 5
            UPDATE_ALIEN[X]
            UPDATE_ALIEN[X+1]
            UPDATE_ALIEN[X+2]
            UPDATE_ALIEN[X+3]
            UPDATE_ALIEN[X+4]
         Next X

   Note: You should only carry out this optimisation on small segments of
   code, such as plotting pixels, otherwise your code can get very bloated.

*M Eliminate stack access (the reason Procedures are slower than Gosubs)

   Eg.
      Don't use:
         For X=1 To 20
            Gosub MOVE_ALIEN : Rem This uses the stack 40 times
         Next X
         ' Other code
         MOVE_ALIEN:
         Moves Alien X
         Return

      Use this instead:
         Gosub MOVE_ALIENS : Rem This only uses the stack twice.
         ' Other code
         MOVE_ALIENS:
         For X=1 To 20
            Move Alien X
         Next X
         Return

   Also, try to move local variables out of procedures to make them global.
   They are significantly faster to use.

------------------------------------------------------------------------------

                             >>  GRAPHICS  <<
                                ----------

*M Use Cls col,x1,y1 To x2,y2 instead of Bar x1,y1 To x2,y2.
   Note: Use R Bar in Turbo instead if possible.

*S Use Polyline to draw boxes instead of Box.
   Note: Use R Box in Turbo instead if possible.

   Eg.
      Don't use:
         Box 10,10 To 10,10

      Use this instead:
         Polyline 10,10 To 20,10 To 20,20 To 10,20

------------------------------------------------------------------------------

                              >>  STRINGS  <<
                                 ---------

*L Use Peek(Varptr(a$)+x-1) instead of Asc(Mid$(a$,x,1)) and
   Poke Varptr(a$)+x-1,Asc(b$) instead of Mid$(a$,x,1)=b$.

   Eg.
      Don't use:
         A=Asc(Mid$(a$,3,1))
         Mid$(a$,6,1)=b$

      Use this instead:
         A=Peek(Varptr(a$)+2)
         Poke Varptr(a$)+5,Asc(b$) : Rem Asc(b$) could be predefined

   Note: This is just under three times as fast!

*M This routine is for making the string 1 character shorter. It may be
   helpful in an program to replace the input command (if you'd ever need
   to optimise such a routine which is unlikely).

   Eg.
      Don't use:
         S$=Space$(1000)
         For X=1 To 1000
            S$=Left$(S$,Len(S$)-1)
         Next X

      Use this instead:
         S$=Space$(1000)
         POS=Varptr(S$) : Rem Put this statement inside the loop if you
                          Rem do any standard string operations.
         For X=1 To 1000
            Doke POS-2,Deek(POS)-1
         Next X

   Note: This routine could be rather risky, as it is done outside the AMOS
   string handling system.

*S Don't use Chr$(code) in routines but replace them with a string.

   Eg.
      Don't use:
         I=Instr(ST$,Chr$(0))

      Use this instead:
         C0$=Chr$(0)      : Rem At the beginning of your prog
         I=Instr(ST$,C0$) : Rem In the routine

*V A quasi-optimisation: Use "n=Free" at points where you don't care about
   the speed, to reduce the chances of the AMOS string garbage collector
   being called when you do care about the speed. This will only have an
   effect when strings are used in the routine.

------------------------------------------------------------------------------

                         >>  STYLE OPTIMISING  <<
                            ------------------

*S Use a Repeat / Until loop instead of a For / Next loop.

   Eg.
      Don't use:
         For n=0 to 10
            [code]
         Next n

      Use this instead:
         n=0
         Repeat
            [code]
            n=n+1
         Until n=11 : Rem Must be 1 more than with the original For

*S Use "If condition Then code" instead of "If condition : code : End If",
   if the code is short.

   Eg.
      Don't use:
         If a=2
            la-de-da
         End If

      Use this instead:
         If a=2 Then la-de-da

*S Don't use Start(BANK), Screen Width or Screen Height in a routine that
   must be fast. In fact, do the same with any AMOS variable that won't
   change in the loop, including Joy(), JoyUp(), Fire(), etc. if you have
   several references to these in the same "loop".

   Eg.
      Don't use:
         For X=1 To 10000
            PE=Peek(Start(10)+X)
         Next X

      Use this instead:
         ST=Start(BANK)
         For X=1 To 10000
            PE=Peek(ST+X)
         Next X

*M Use the value of expressions, rather than testing what they mean.

   Eg.
      Don't use:
         If Jup(1) Then Y=Y-1
         If Jdown(1) Then Y=Y+1

      Use this instead:
         Y=Y+Jup(1)-Jdown(1)

    Remember: If it is True, -1 is returned, otherwise 0 is returned.

    The Turbo (Plus) extension has a useful command called "Texp" which
    returns specified values for True and False. Although it isn't very
    fast, it produces neater code.

*M Don't use parameters in Def Fns. As functions are local to procedures,
   any variables used, will be known in the function.

   Eg.
      Don't use:
         Def Fn C(x,y)=(x*y*y+x) mod 16
         For x=0 To 10
            For y=0 To 10
               Plot x,y,Fn C(x,y)
            Next y
         Next x

      Use this instead:
         Def Fn C=(x*y*y+x) mod 16
         For x=0 To 10
            For y=0 To 10
               Plot x,y,Fn C
            Next y
         Next x

*S Use as few brackets in equations and If structures as possible.

   Eg.
      Don't use:
         A=(B*C)-(D*E)-(F+G)
         C=(A=2 or B=100)

      Use this instead:
         A=B*C-D*E-F-G
         C=A=2 or B=100

*M The fastest way to display a map, using a few of the above tips:

      ' This is the slow way, that I've seen in multiple sources of games
      For Y=0 To FHV
         For X=0 To FWV
            Paste Icon 16+X*16,16+Y*16,Peek((FPOSY+Y)*FW+Start(BLS1)+FPOSX)+1
         Next X
      Next Y

      ' Now my quick version... Note that the start(bank)
      ' is placed in front of the routine.
      ' FHV   = Field Height View
      ' FHV   = Field Width  View
      ' FPOSX = Field POSition X (Where the player is looking in the map)
      ' FPOSY = Field POSition Y
      POS=Start(BLS1)+FPOSX
      For Y=0 To FHV
         MPY=(FPOSY+Y)*FW+POS
         SPY=16+Y*16
         For X=0 To FWV
            ' Using a Turbo Plus extension command
            F Paste Icon 16+X*16,SPY,Peek(MPY+X)+1
         Next X
      Next Y

*V Try various ways of writing code to see if you can find the quickest.
   THIS is probably the SINGLE most important TIP for blazing speed.
   You can fine-tune, even convert it to pure Assembler but, it may be MUCH
   slower than another method you have not yet tried!!

   Find that faster method!!

   Something like this is a nice way to test your improvements:

      Timer=0
      For n=0 To 10000
         [unoptimised code]
      Next n
      Print "Time for unoptimised code:";Timer

      Timer=0
      For n=0 To 10000
         Optimised code
      Next n
      Print "Time for optimised code:";Timer

   Note: This kind of test becomes quite useless if you have a CPU cache -
   The second loop may report to be slower even if it's exactly the same.
   When doing such tests, you should disable the cache if possible.

   The key to gaining the absolute BEST speed is to ask yourself WHY is this
   routine so slow? WHAT is slowing it down? The excellent tips above will
   probably provide you with the answer more often than not but the secret
   is you have to know what to look for. Think about what the computer does
   BEHIND THE SCENES.

   For example, why are two-dimensional arrays slower than single-dimensional
   arrays? Or, for that matter, why are even one-dimensional arrays slow?

   Think of it like this... When you access an element in a single-
   dimensional array the computer has to find the the Base address of the
   array. Then it must find the offset to reference to element you requested.

   If you had something like A=MyTable(10) the computer would process this
   along these lines:

      1. Find base address of array MyTable.
      2. Calculate offset of element 10 by multiplying 10 X 4 (4 is size of
         integer).
      3. Now add offset to array base to determine the element's memory
         address.
      4. Finally, extract the value held in this memory location.

   As you can see that's quite a bit of work (particularly in a loop) just
   to do something that appears so simple on this end.

   Now, for a two dimensional array, the computer has twice the work, two
   multiplies to calculate to determine the address of any given two
   dimensional array elements such as A=MyTable(10,10).

   Like the Gosub replacing Procedure tip, this is a bit messier but,
   everything has a price so try this:

   MyTableBase=VarPtr(MyTable)
   Now instead of accessing the array through AMOS with A=MyTable(10)
   access it directly yourself like this A=Leek(MyTableBase+40). The 40 is
   the offset for element 10, multiplied by the integer size, 4.

   Used in a loop this access a one-dimensional array 20% faster than
   the standard AMOS code (ie A=Array(element)) and is 33% faster at
   accessing two-dimensional arrays.

   If you don't need long-word size variables, you can substitute Deek
   for Leek for a TINY bit more speed (allowing 25% faster access of
   one-dimensional arrays and 40% faster access of two-dimensional arrays).

   Should be better but, AMOS's Peek, Deek and Leek are kind of slow
   themselves!!

------------------------------------------------------------------------------

If you find any general purpose tips, please send them to me (at
bwyatt@paston.co.uk) and I'll add them to the list!

If you want proof that these tips do work, then take a look at two of my
recent games, Borisball and Knockout 2. Apart from being tremendously good
fun, they demonstrate how the AMOS speed limit can be broken!! Their Aminet
positions are:
   game/demo/BorisBall.lha
   game/2play/KnockOut2.lha
