                          __
                         /_/\__
                         \ \ \ |\
                      ____\ \ \||__
                     / /  \\ \  __/
                    | | /\ \\ \ \
                    | | \/  \\_\/
                 ____\ \  /\ |
                / /  \\ \ \/ |
               | | /\/_\_\__/
               | | \/  \
            ____\ \  /\ |
           / /  \\ \ \/ |          ASP68K PROJECT
          /_/ /\ |\_\__/
          \ \ \/ |              Sixth Edition
       ____\ \  /
      / /  \\ \ \            by Michael Glew
     | | /\/_\ \ \           mglew@laurel.ocs.mq.edu.au
     | | \/  \\_\/           Technophilia BBS +61-2-8073563
  ____\_\__/\ |
 / /  \ /_/\/ |           January 1994
| | /\ \\_\__/
| | \/  \
 \ \  /\ \
  \ \ \_\/
   \ \ \
    \_\/


---------------------------------------------------------------------------
                          C O N T R I B U T O R S
---------------------------------------------------------------------------


Erik Bakke, Robert Barton, Bernd Blank, Kasimir Blomstedt, Frans Bouma,
David Carson, Nicolas Dade, Aaron Digulla, Irmen de Jong, Andy Duplain,
Denis Duplan, Steven Eker, Calle Englund, Alexander Fritsch, Charlie Gibbs,
Kurt Haenen, Jon Hudson, Kjetil Jacobsen, Olav Kalgraf, Makoto Kamada,
Markku Kolkka, John Lane, Jonathan Mahaffy, Dave Mc Mahan, Lindsay Meek,
Walter Misar, Boerge Noest, Gunnar Rønning, Jay Scott, Olaf Seibert,
Peter Simons.


---------------------------------------------------------------------------
                          I N T R O D U C T I O N
---------------------------------------------------------------------------


A while back, I was quite interested to find that there was an electronic
magazine called "howtocode" that included lots of interesting hints and
tips on coding.  In the fifth edition, there was a list of optimizations
that really got me thinking.  "What if there was a proggy that you could
put an assembler program through, that would speed it up, taking out all
the stupid things output by compilers, and over-tired coders?" 8).  I
started combing the networks, and came across one such program, called
the "SELCO Source Optimizer".  It only had four optimizations, so I set
to writing my own.

Step one was to collect as many optimization ideas as I could.  I posted
to Usenet and got an impressive response, and the contributors are listed
above.  I promised a report on the optimizations received, and here it
is.  My aim now is to write a program to make these optimizations, and
to distribute it.  Contributors will receive a copy of the final archive,
to thank them for their time and energy.  Further contributions will be
welcomed, so rather than making changes yourself, tell me what you want
changed, and I'll distribute it with the next update.
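
To give an idea of the kind of program meant here, below is a minimal
sketch in Python of a line-by-line source rewriter.  It is not asp68k
itself: the function names and the regular expression are made up for
illustration, and it only knows the two rules "add.x #n,* -> addq.x #n,*"
(1 <= n <= 8) and "add.x #n,* -> subq.x #-n,*" (-8 <= n <= -1) from the
table in this document.  It assumes decimal immediates and one bare
instruction per line, with no labels or comments.

```python
import re

# Hypothetical pattern: an "add.<size> #<decimal>,<dest>" line and nothing else.
ADD_IMM = re.compile(r"^(\s*)add\.([bwl])\s+#(-?\d+),(\S+)\s*$")

def optimize_line(line):
    """Rewrite one source line if it matches a known rule, else return it as is."""
    match = ADD_IMM.match(line)
    if not match:
        return line
    indent, size, dest = match.group(1), match.group(2), match.group(4)
    n = int(match.group(3))
    if 1 <= n <= 8:
        # add.x #n,* -> addq.x #n,*
        return f"{indent}addq.{size} #{n},{dest}"
    if -8 <= n <= -1:
        # add.x #n,* -> subq.x #-n,*
        return f"{indent}subq.{size} #{-n},{dest}"
    return line

def optimize(source):
    """Apply optimize_line to every line of a source text."""
    return "\n".join(optimize_line(line) for line in source.splitlines())

if __name__ == "__main__":
    print(optimize("    add.l #4,d0\n    add.w #-3,a1\n    add.w #100,d2"))
```

A real optimizer would also have to parse labels, comments and other
number formats, and check the side conditions (status flags, CPU type)
given in the notes under each entry.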


---------------------------------------------------------------------------
                               C H A N G E S
---------------------------------------------------------------------------


2nd Edition

The second edition incorporated a hell of a lot of corrections.  Duplicate
copies of some optimizations were merged into a single entry, and
a few additions were made.  Sorry that the first edition was not sent
out to all contributors, but I was a tad busy. 8)


3rd Edition

Due to the distribution of the second edition document, many comments were
received and a couple of the "optimizations" were found to be incorrect.
Analysis of the mul/div optimizations ended in a few modifications for
safety.  They still save a huge number of clock cycles, so it is better to
be safe than sorry.

Also, I have made it so that the number of words of space saved or
increased is shown.  Space savings are positive, increases are negative.
Zero means no change.


4th Edition

Some minor changes and additions, as well as the addition of columns for
'030 and '040 CPUs - a whole new format was required...


5th Edition

Erik Bakke released his docs on 020+ CPUs and 881/882 FPUs.  I have been
given permission to use these docs to further the capabilities of asp68k.
Thanks Erik...  I really would like to get a hold of the 020,030,040
Programmer Reference Cards or manuals, so if anyone has any copies they
wanna send me, let me know...  Local Motorola distributors are not too
helpful.


6th Edition

Aaron Digulla advised that it would be helpful if the optimizations were
sorted somehow.  I will sort by the first letters of the first line
of the optimizations.  Also, special thanks to Makoto Kamada for his
detailed contributions; without them this text would have died long ago.


7th Edition (UNOFFICIAL, but had to be done IMHO)

 Added 68060 timings
 Removed "RTD dx" and "ASL #n,az" like instructions, and some other
  useless optimizations.
 Added some tips for 68030/68040/68060 coding

As far as I know, 68020 instruction timings are very close to 68030
instruction timings. Since I'm not 100% sure about that, I leave the
68020 column empty...

---------------------------------------------------------------------------
                         O P T I M I Z A T I O N S
---------------------------------------------------------------------------


Note:-

    m?      = memory operand
    dx      = data register
    ds      = data register (scratch)
    ax      = address register
    rx      = either a data or address register
    #n      = immediate operand
    ??,?1,?2= address label
    *       = anything
    .x      = any size
    b<cc>   = branch commands

    Opt     = optimization
    Notes   = notes about where optimization is valid, and misc notes
    Speed   = are clock periods saved? ("Y" = yes
                                        "y" = in some cases
                                        "N" = no
                                        "*" = increase
                                        "-" = cannot be used on this cpu
                                        "!" = must be used on this cpu)
    Size    = how many bytes are saved?
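
As a worked example of how to read an entry and its note, take
"move.l #n,dx -> moveq #m,dx / not.b dx" with "128 <= n <= 255,
m = 255-n".  The sketch below models a few instructions on a 32-bit
register in Python so that such notes can be checked by brute force.
The helper names are made up for illustration and are not part of
asp68k; status flags and timings are not modelled at all.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Interpret a 32-bit pattern as a signed number."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def moveq(m):
    """moveq #m,dx: sign-extend an 8-bit immediate to 32 bits."""
    assert -128 <= m <= 127
    return m & MASK32

def not_b(dx):
    """not.b dx: invert the low byte only."""
    return dx ^ 0xFF

def not_w(dx):
    """not.w dx: invert the low word only."""
    return dx ^ 0xFFFF

def swap(dx):
    """swap dx: exchange the high and low words."""
    return ((dx << 16) | (dx >> 16)) & MASK32

def add_b(dx):
    """add.b dx,dx: double the low byte, upper 24 bits untouched."""
    return (dx & 0xFFFFFF00) | ((dx + dx) & 0xFF)

def reachable(op):
    """Map each constant n built by 'moveq #m,dx' followed by op to its m."""
    return {to_signed32(op(moveq(m))): m for m in range(-128, 128)}

if __name__ == "__main__":
    table = reachable(not_b)
    # Check the note of the not.b entry: m = 255-n for 128 <= n <= 255.
    assert all(table[n] == 255 - n for n in range(128, 256))
    print("not.b note holds for 128..255")
```

The same reachable() call with not_w, swap or add_b lists the constants
those entries can build, which is a quick way to double-check the ranges
given in their notes.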

-----------------------------------------------------------------
Opt                                            Speed         Size
                                     000 010 020 030 040 060
------------------------------------+---+---+---+---+---+---+----
* ??* -> * n(pc)*                   | Y | Y | ? | N | * | N | 2
------------------------------------+---+---+---+---+---+---+----
 n = ??-pc, n < 32768
------------------------------------+---+---+---+---+---+---+----
*0(ax)* -> *(ax)*                   | Y | Y | ? | y | y | y | 2
------------------------------------+---+---+---+---+---+---+----
add*.x #0,dx -> tst.x dx            | Y | Y | ? | Y | N | N | 2/4
------------------------------------+---+---+---+---+---+---+----
add.x #n,* -> addq.x #n,*           | Y | Y | ? | Y | y | y | 2/4
------------------------------------+---+---+---+---+---+---+----
 if 1 <= n <= 8
------------------------------------+---+---+---+---+---+---+----
add.x #n,* -> subq.x #-n,*          | Y | Y | ? | Y | N | N | 2/4
------------------------------------+---+---+---+---+---+---+----
 -8 <= n <= -1
------------------------------------+---+---+---+---+---+---+----
add.x #n,ax -> lea n(ax),ax         | Y | Y | ? | y | * | N | 0/2
------------------------------------+---+---+---+---+---+---+----
 -32767 <= n <= -9, 9 <= n <= 32767
------------------------------------+---+---+---+---+---+---+----
addq.l #n,ax -> addq.w #n,ax        | Y | Y | ? | N | N | N | 0
------------------------------------+---+---+---+---+---+---+----
addq.l #n,ry -> add.l #(n+m),ry     | Y | Y | ? | Y | * | * |-2
addq.l #m,ry                        |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
addq.x #2,ax   -> move.w *,(ax)     | Y | Y | ? | Y | Y | Y | 2
move.w *,-(ax)                      |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 .x is .w or .l
 * does not contain ax
------------------------------------+---+---+---+---+---+---+----
addq.x #4,ax   -> move.l *,(ax)     | Y | Y | ? | Y | Y | Y | 2
move.l *,-(ax)                      |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 .x is .w or .l
 * does not contain ax
------------------------------------+---+---+---+---+---+---+----
addq.x #6,ax    -> move.w *1,4(ax)  | Y | Y | ? | Y | y | y | 0
move.w *1,-(ax)    move.l *2,(ax)   |   |   |   |   |   |   |
move.l *2,-(ax)                     |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 .x is .w or .l
 *1 and *2 do not contain ax
------------------------------------+---+---+---+---+---+---+----
addq.x #6,ax    -> move.l *1,2(ax)  | Y | Y | ? | Y | y | y | 0
move.l *1,-(ax)    move.w *2,(ax)   |   |   |   |   |   |   |
move.w *2,-(ax)                     |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 .x is .w or .l
 *1 and *2 do not contain ax
------------------------------------+---+---+---+---+---+---+----
addq.x #8,ax    -> move.l *1,4(ax)  | Y | Y | ? | Y | y | y | 0
move.l *1,-(ax)    move.l *2,(ax)   |   |   |   |   |   |   |
move.l *2,-(ax)                     |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 .x is .w or .l
 *1 and *2 do not contain ax
------------------------------------+---+---+---+---+---+---+----
addq.x #4,sp -> move.l ax,(sp)      | Y | Y | ? | Y | Y | Y | 2
pea (ax)                            |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 .x is .w or .l
 ax,ay are not a7(=sp)
------------------------------------+---+---+---+---+---+---+----
addq.x #6,sp   -> move.w *,4(sp)    | Y | Y | ? | Y | Y | Y | 0
move.w *,-(sp)    move.l ax,(sp)    |   |   |   |   |   |   |
pea (ax)                            |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 .x is .w or .l
 ax,ay are not a7(=sp)
 * does not contain sp
------------------------------------+---+---+---+---+---+---+----
addq.x #6,sp   -> move.l ax,2(sp)   | Y | Y | ? | Y | Y | Y | 0
pea (ax)          move.w *,(sp)     |   |   |   |   |   |   |
move.w *,-(sp)                      |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 .x is .w or .l
 ax,ay are not a7(=sp)
 * does not contain sp
------------------------------------+---+---+---+---+---+---+----
addq.x #8,sp   -> move.l *,4(sp)    | Y | Y | ? | Y | Y | Y | 0
move.l *,-(sp)    move.l ax,(sp)    |   |   |   |   |   |   |
pea (ax)                            |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 .x is .w or .l
 ax,ay are not a7(=sp)
 * does not contain sp
------------------------------------+---+---+---+---+---+---+----
addq.x #8,sp   -> move.l ax,4(sp)   | Y | Y | ? | Y | Y | Y | 0
pea (ax)          move.l *,(sp)     |   |   |   |   |   |   |
move.l *,-(sp)                      |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 .x is .w or .l
 ax,ay are not a7(=sp)
 * does not contain sp
------------------------------------+---+---+---+---+---+---+----
addq.x #8,sp -> move.l ax,4(sp)     | Y | Y | ? | Y | Y | Y | 0
pea (ax)        move.l ay,(sp)      |   |   |   |   |   |   |
pea (ay)                            |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 .x is .w or .l
 ax,ay are not a7(=sp)
------------------------------------+---+---+---+---+---+---+----
and.l #n,dx -> bclr.l #b,dx         | Y | Y | ? | N | * | N | 2
------------------------------------+---+---+---+---+---+---+----
 not(n) = 2^b (only 1 bit off), status flags are wrong
------------------------------------+---+---+---+---+---+---+----
asl.b #2,dy -> add.b dy,dy          | Y | Y | ? | Y | Y | * |-2
               add.b dy,dy          |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
asl.b #n,dx -> clr.b dx             | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong, n>=8
------------------------------------+---+---+---+---+---+---+----
asl.l #16,dx -> swap dx             | Y | Y | ? | N | N | * |-2
                clr.w dx            |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
asl.l #n,dx -> asl.w #(n-16),dx     | Y | Y | ? | * | * | * |-4
               swap dx              |   |   |   |   |   |   |
               clr.w dx             |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong, 16<n<32
------------------------------------+---+---+---+---+---+---+----
asl.l #n,dx -> moveq #0,dx          | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong, n>=32
------------------------------------+---+---+---+---+---+---+----
asl.w #2,dy -> add.w dy,dy          | Y | Y | ? | Y | Y | * |-2
               add.w dy,dy          |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
asl.w #n,dx -> clr.w dx             | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong, n>=16
------------------------------------+---+---+---+---+---+---+----
asl.x #1,dy -> add.x dy,dy          | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
asr.b #n,dx -> clr.b dx             | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
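 status flags are wrong, n>=8
 only valid if the byte in dx is known to be non-negative: asr fills
 with the sign bit, so a negative byte ends up as -1, not 0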
------------------------------------+---+---+---+---+---+---+----
asr.l #16,dx -> swap dx             | Y | Y | ? | * | * | * |-2
                ext.l dx            |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
asr.l #n,dx -> moveq #0,dx          | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
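 status flags are wrong, n>=32
 only valid if dx is known to be non-negative: asr fills with the
 sign bit, so a negative dx ends up as -1, not 0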
------------------------------------+---+---+---+---+---+---+----
asr.l #n,dx -> swap dx              | Y | Y | ? | * | * | * |-4
               asr.w #(n-16),dx     |   |   |   |   |   |   |
               ext.l dx             |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong, 16<n<32
------------------------------------+---+---+---+---+---+---+----
asr.w #n,dx -> clr.w dx             | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
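 status flags are wrong, n>=16
 only valid if the word in dx is known to be non-negative: asr fills
 with the sign bit, so a negative word ends up as -1, not 0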
------------------------------------+---+---+---+---+---+---+----
b<cc>.w ?? -> b<cc>.s ??            | Y | Y | ? | Y | N | N | 2
------------------------------------+---+---+---+---+---+---+----
 abs(??-pc)<128
------------------------------------+---+---+---+---+---+---+----
bclr.l #n,dx -> and.w #m,dx         | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
 0 <= n <= 15, m = 65535-(2^n)
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
bra ?? -> (nothing)                 | Y | Y | Y | Y | Y | Y | 2/4
??        ??                        |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 remove null branches, but keep the label
------------------------------------+---+---+---+---+---+---+----
bset.b #7,m? -> tas m?              | y | y | ? | * | * | * | 2
beq ??          bpl ??              |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 m? must be address allowing read-modify-write transfer.
 Status flags are wrong
------------------------------------+---+---+---+---+---+---+----
bset.b #7,m? -> tas m?              | y | y | ? | * | * | * | 2
bne ??          bmi ??              |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 m? must be address allowing read-modify-write transfer.
 Status flags are wrong
------------------------------------+---+---+---+---+---+---+----
bset.b #7,m? -> tas m?              | y | y | ? | * | * | * | 2
------------------------------------+---+---+---+---+---+---+----
 m? must be address allowing read-modify-write transfer.
 Status flags are wrong
------------------------------------+---+---+---+---+---+---+----
bset.l #7,dx -> tas dx              | Y | Y | ? | Y | y | N | 2
beq ??          bpl ??              |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
bset.l #7,dx -> tas dx              | Y | Y | ? | Y | y | N | 2
bne ??          bmi ??              |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
bset.l #7,dx -> tas dx              | Y | Y | ? | Y | y | N | 2
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
bset.l #n,dx -> or.w #m,dx          | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
 0 <= n <= 15, m = 2^n
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
bsr ?? -> bra ??                    | Y | Y | ? | Y | Y | Y | 2
rts                                 |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 different stack depth
------------------------------------+---+---+---+---+---+---+----
btst.b #7,m? -> tst.b m?            | Y | Y | ? | Y | Y | y | 2
beq ??          bpl ??              |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 Status flags are wrong.  Not valid for Dn, d16(PC), d8(PC,Xn)
 dest address modes.
------------------------------------+---+---+---+---+---+---+----
btst.b #7,m? -> tst.b m?            | Y | Y | ? | Y | Y | y | 2
bne ??          bmi ??              |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 Status flags are wrong.  Not valid for Dn, d16(PC), d8(PC,Xn)
 dest address modes.
------------------------------------+---+---+---+---+---+---+----
btst.l #7,dx -> tst.b dx            | Y | Y | ? | Y | N | N | 2
beq ??          bpl ??              |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 Status flags are wrong.
------------------------------------+---+---+---+---+---+---+----
btst.l #7,dx -> tst.b dx            | Y | Y | ? | Y | N | N | 2
bne ??          bmi ??              |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 Status flags are wrong.
------------------------------------+---+---+---+---+---+---+----
btst.l #15,dx -> tst.w dx           | Y | Y | ? | Y | N | N | 2
beq ??           bpl ??             |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 Status flags are wrong.
------------------------------------+---+---+---+---+---+---+----
btst.l #15,dx -> tst.w dx           | Y | Y | ? | Y | N | N | 2
bne ??           bmi ??             |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 Status flags are wrong.
------------------------------------+---+---+---+---+---+---+----
btst.l #31,dx -> tst.l dx           | Y | Y | ? | Y | N | N | 2
beq ??           bpl ??             |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 Status flags are wrong.
------------------------------------+---+---+---+---+---+---+----
btst.l #31,dx -> tst.l dx           | Y | Y | ? | Y | N | N | 2
bne ??           bmi ??             |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
clr.b mn   -> clr.w mn              | Y | Y | ? | Y | Y | Y |2/4/6
clr.b mn+1                          |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 best if mn is longword aligned
 mn must be even on 68000/68010 (a word access to an odd address
 causes an address error)
------------------------------------+---+---+---+---+---+---+----
clr.l dx -> moveq #0,dx             | Y | Y | ? | N | N | N | 0
------------------------------------+---+---+---+---+---+---+----
clr.w mn   -> clr.l mn              | Y | Y | ? | Y | Y | Y |2/4/6
clr.w mn+2                          |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 best if mn is longword aligned
------------------------------------+---+---+---+---+---+---+----
clr.x -(ax) -> move.x ds,-(ax)      | Y | Y | ? | Y | N | N | 0
------------------------------------+---+---+---+---+---+---+----
 ds must equal zero
------------------------------------+---+---+---+---+---+---+----
clr.x n(ax,rx) -> move.x ds,n(ax,rx)| Y | Y | ? | Y | N | N | 0
------------------------------------+---+---+---+---+---+---+----
 ds must equal zero
------------------------------------+---+---+---+---+---+---+----
cmp.x #0,ax -> move.x ax,ds         | Y | Y | ? | Y | N | N | 2/4
------------------------------------+---+---+---+---+---+---+----
 move ax to scratch register
------------------------------------+---+---+---+---+---+---+----
cmp.x #0,ax -> tst.x ax             | - | - | ? | Y | N | N | ?
------------------------------------+---+---+---+---+---+---+----
 for .w and .l (cmp.w #0,ax compares all 32 bits of ax, so its exact
 replacement is tst.l ax)
------------------------------------+---+---+---+---+---+---+----
cmp.x #0,dx -> tst.x dx             | Y | Y | ? | N | N | N | 2/4
------------------------------------+---+---+---+---+---+---+----
cmp.x #0,m? -> tst.x m?             | Y | Y | ? | Y | Y | Y | 2/4
------------------------------------+---+---+---+---+---+---+----
 may not be legal on some early '000 CPUs
------------------------------------+---+---+---+---+---+---+----
divu.l #n,dx -> lsr.l #m,dx         | ! | ! | ? | ! | ! | ! | 4
------------------------------------+---+---+---+---+---+---+----
 n is 2^m, 1 <= m <= 8
------------------------------------+---+---+---+---+---+---+----
divu.l #n,dx -> moveq #0,dx         | ! | ! | ? | ! | ! | ! | 4
------------------------------------+---+---+---+---+---+---+----
 n is 2^m, m>=32
------------------------------------+---+---+---+---+---+---+----
divu.l #n,dx -> moveq #m,ds         | ! | ! | ? | ! | ! | ! | 2
                lsr.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 n is 2^m, 8<m<32
------------------------------------+---+---+---+---+---+---+----
divu.w #n,dx -> lsr.l #m,dx         | Y | Y | ? | ! | ! | ! | 2
------------------------------------+---+---+---+---+---+---+----
 n is 2^m, 1 <= m <= 8, ignore remainder
------------------------------------+---+---+---+---+---+---+----
divu.w #n,dx -> moveq #0,dx         | Y | Y | ? | ! | ! | ! | 2
------------------------------------+---+---+---+---+---+---+----
 n is 2^m, m>=32
------------------------------------+---+---+---+---+---+---+----
divu.w #n,dx -> moveq #m,ds         | Y | Y | ? | ! | ! | ! | 0
                lsr.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 n is 2^m, 8<m<32, ignore remainder
------------------------------------+---+---+---+---+---+---+----
eor.x #-1,* -> not.x *              | Y | Y | ? | Y | N | N | 2/4
------------------------------------+---+---+---+---+---+---+----
ext.w dx -> extb.l dx               | - | - | ? | ? | Y | Y | 2
ext.l dx                            |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
jmp ?? -> bra.w ??                  | Y | Y | ? | y | Y | y | 2
------------------------------------+---+---+---+---+---+---+----
 abs(??-pc) < 32768, same section
------------------------------------+---+---+---+---+---+---+----
jsr * -> jmp *                      | Y | Y | ? | Y | Y | Y | 2
rts                                 |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 different stack depth
------------------------------------+---+---+---+---+---+---+----
jsr ?1 -> pea ?2                    | y | y | ? | ? | Y | y | 0
jmp ?2    jmp ?1                    |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 same time if jsr is abs.l (68000/68010)
------------------------------------+---+---+---+---+---+---+----
jsr ?? -> bsr.w ??                  | Y | Y | ? | y | Y | Y | 2
------------------------------------+---+---+---+---+---+---+----
 abs(??-pc) < 32768, same section
------------------------------------+---+---+---+---+---+---+----
lea (ax),ax -> (nothing)            | Y | Y | Y | Y | Y | Y | 2
------------------------------------+---+---+---+---+---+---+----
 delete
------------------------------------+---+---+---+---+---+---+----
lea 0.w,ax -> sub.l ax,ax           | Y | Y | ? | Y | * | N | 2
------------------------------------+---+---+---+---+---+---+----
lea n(ax),ax -> addq.w #n,ax        | Y | Y | ? | Y | Y | N | 2
------------------------------------+---+---+---+---+---+---+----
 if 1 <= n <= 8
------------------------------------+---+---+---+---+---+---+----
lea n(ax),ax -> subq.w #-n,ax       | Y | Y | ? | Y | Y | N | 2
------------------------------------+---+---+---+---+---+---+----
 if -8 <= n <= -1
------------------------------------+---+---+---+---+---+---+----
lsl.b #2,dy -> add.b dy,dy          | Y | Y | ? | Y | N | * |-2
               add.b dy,dy          |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
lsl.b #n,dx -> clr.b dx             | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong, n>=8
------------------------------------+---+---+---+---+---+---+----
lsl.l #16,dx -> swap dx             | Y | Y | ? | N | * | * |-2
                clr.w dx            |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
lsl.l #n,dx -> lsl.w #(n-16),dx     | Y | Y | ? | * | * | * |-4
               swap dx              |   |   |   |   |   |   |
               clr.w dx             |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong, 16<n<32
------------------------------------+---+---+---+---+---+---+----
lsl.l #n,dx -> moveq #0,dx          | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong, n>=32
------------------------------------+---+---+---+---+---+---+----
lsl.w #2,dy -> add.w dy,dy          | Y | Y | ? | Y | N | * |-2
               add.w dy,dy          |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
lsl.w #n,dx -> clr.w dx             | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong, n>=16
------------------------------------+---+---+---+---+---+---+----
lsl.x #1,dy -> add.x dy,dy          | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
lsr.b #n,dx -> clr.b dx             | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong, n>=8
------------------------------------+---+---+---+---+---+---+----
lsr.l #16,dx -> clr.w dx            | Y | Y | ? | Y | N | * |-2
                swap dx             |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
lsr.l #n,dx -> clr.w dx             | Y | Y | ? | * | * | * |-4
               swap dx              |   |   |   |   |   |   |
               lsr.w #(n-16),dx     |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong, 16<n<32
------------------------------------+---+---+---+---+---+---+----
lsr.l #n,dx -> moveq #0,dx          | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong, n>=32
------------------------------------+---+---+---+---+---+---+----
lsr.w #n,dx -> clr.w dx             | Y | Y | ? | Y | Y | N | 0
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong, n>=16
------------------------------------+---+---+---+---+---+---+----
move.b #-1,(ax) -> st (ax)          | Y | Y | ? | * | * | N | 2
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
move.b #-1,(ax)+ -> st (ax)+        | N | N | ? | * |Y/*| N | 2
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
move.b #-1,-(ax) -> st -(ax)        | N | N | ? | * |Y/*| N | 2
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
move.b #-1,?? -> st ??              | Y | Y | ? | * |Y/*| Y | 2
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
move.b #-1,dx -> st dx              | Y | Y | ? | N | * | N | 2
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
move.b #-1,n(ax) -> st n(ax)        | Y | Y | ? | * |Y/N| Y | 2
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
move.b #-1,n(ax,rx) -> st n(ax,rx)  | Y | Y | ? | * | * | Y | 2
------------------------------------+---+---+---+---+---+---+----
 status flags are wrong
------------------------------------+---+---+---+---+---+---+----
move.b #x,mn   -> move.w #xy,mn     | Y | Y | ? | Y | Y | Y |4/6/8
move.b #y,mn+1                      |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 best if mn is longword aligned
 mn must be even on 68000/68010 (a word access to an odd address
 causes an address error)
------------------------------------+---+---+---+---+---+---+----
move.l #n,-(sp) -> pea n.w          | Y | Y | ? | N | N | N | 2
------------------------------------+---+---+---+---+---+---+----
 -32767 <= n <= 32767
------------------------------------+---+---+---+---+---+---+----
move.l #n,ax -> move.w #n,ax        | Y | Y | ? | Y | N | N | 2
------------------------------------+---+---+---+---+---+---+----
 -32767 <= n <= 32767
------------------------------------+---+---+---+---+---+---+----
move.l #n,dx -> moveq #-128,dx      | Y | Y | ? | Y | * | * | 2
                subq.l #-(n+128),dx |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 -136 <= n <= -129
------------------------------------+---+---+---+---+---+---+----
move.l #n,dx -> moveq #m,dx         | Y | Y | ? | Y | * | * | 2
                not.b dx            |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 128 <= n <= 255, m = 255-n
------------------------------------+---+---+---+---+---+---+----
move.l #n,dx -> moveq #m,dx         | Y | Y | ? | Y | * | * | 2
                not.w dx            |   |   |   |   |   |   |
                                    |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 65408 <= n <= 65534, m = 65535-n
 -65536 <= n <= -65409, m = -65537-n
------------------------------------+---+---+---+---+---+---+----
move.l #n,dx -> moveq #m,dx         | Y | Y | ? | N | * | * | 2
                swap dx             |   |   |   |   |   |   |
                                    |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 65536 <= n <= 8323072, n = m*65536
 -8323073 <= n <= -65537, n = m*65536+65535
------------------------------------+---+---+---+---+---+---+----
move.l #n,dx -> moveq #n,dx         | Y | Y | ? | Y | N | N | 4
------------------------------------+---+---+---+---+---+---+----
 if -128 <= n <= 127
------------------------------------+---+---+---+---+---+---+----
move.l #n,dx -> moveq #y,dx         | * | * | ? | N | * | * | 2
                lsl.l #z,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 n = y * 2^z, -128 <= y <= 127, 1 <= z <= 8
------------------------------------+---+---+---+---+---+---+----
move.l #n,dx -> moveq #m,dx         | Y | Y | ? | Y | * | * | 2
                add.b dx,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 (128 <= n <= 254 or -256 <= n <= -130) and n is even, m = n/2
------------------------------------+---+---+---+---+---+---+----
move.l #n,dx -> moveq #m,dx         | Y | Y | ? | * | * | * | 2
                bchg.l dx,dx        |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 n = -32881 -> m = -113
 n = -32849 -> m = -81
 n = -32817 -> m = -49
 n = -32785 -> m = -17
 n = -16498 -> m = -114
 n = -16466 -> m = -82
 n = -16434 -> m = -50
 n = -16402 -> m = -18
 n = -8307 -> m = -115
 n = -8275 -> m = -83
 n = -8243 -> m = -51
 n = -8211 -> m = -19
 n = -4212 -> m = -116
 n = -4180 -> m = -84
 n = -4148 -> m = -52
 n = -4116 -> m = -20
 n = -2165 -> m = -117
 n = -2133 -> m = -85
 n = -2101 -> m = -53
 n = -2069 -> m = -21
 n = -1142 -> m = -118
 n = -1110 -> m = -86
 n = -1078 -> m = -54
 n = -1046 -> m = -22
 n = -631 -> m = -119
 n = -599 -> m = -87
 n = -567 -> m = -55
 n = -535 -> m = -23
 n = -376 -> m = -120
 n = -344 -> m = -88
 n = -312 -> m = -56
 n = -280 -> m = -24
 n = 264 -> m = 8
 n = 296 -> m = 40
 n = 328 -> m = 72
 n = 360 -> m = 104
 n = 521 -> m = 9
 n = 553 -> m = 41
 n = 585 -> m = 73
 n = 617 -> m = 105
 n = 1034 -> m = 10
 n = 1066 -> m = 42
 n = 1098 -> m = 74
 n = 1130 -> m = 106
 n = 2059 -> m = 11
 n = 2091 -> m = 43
 n = 2123 -> m = 75
 n = 2155 -> m = 107
 n = 4108 -> m = 12
 n = 4140 -> m = 44
 n = 4172 -> m = 76
 n = 4204 -> m = 108
 n = 8205 -> m = 13
 n = 8237 -> m = 45
 n = 8269 -> m = 77
 n = 8301 -> m = 109
 n = 16398 -> m = 14
 n = 16430 -> m = 46
 n = 16462 -> m = 78
 n = 16494 -> m = 110
 n = 32783 -> m = 15
 n = 32815 -> m = 47
 n = 32847 -> m = 79
 n = 32879 -> m = 111
------------------------------------+---+---+---+---+---+---+----
move.l #n,dx -> moveq #m,dx         | N | N | ? | * | * | * | 2
                bchg.l dx,dx        |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 n = -2147483617 -> m = 31
 n = -2147483585 -> m = 63
 n = -2147483553 -> m = 95
 n = -2147483521 -> m = 127
 n = -1073741922 -> m = -98
 n = -1073741890 -> m = -66
 n = -1073741858 -> m = -34
 n = -1073741826 -> m = -2
 n = -536871011 -> m = -99
 n = -536870979 -> m = -67
 n = -536870947 -> m = -35
 n = -536870915 -> m = -3
 n = -268435556 -> m = -100
 n = -268435524 -> m = -68
 n = -268435492 -> m = -36
 n = -268435460 -> m = -4
 n = -134217829 -> m = -101
 n = -134217797 -> m = -69
 n = -134217765 -> m = -37
 n = -134217733 -> m = -5
 n = -67108966 -> m = -102
 n = -67108934 -> m = -70
 n = -67108902 -> m = -38
 n = -67108870 -> m = -6
 n = -33554535 -> m = -103
 n = -33554503 -> m = -71
 n = -33554471 -> m = -39
 n = -33554439 -> m = -7
 n = -16777320 -> m = -104
 n = -16777288 -> m = -72
 n = -16777256 -> m = -40
 n = -16777224 -> m = -8
 n = -8388713 -> m = -105
 n = -8388681 -> m = -73
 n = -8388649 -> m = -41
 n = -8388617 -> m = -9
 n = -4194410 -> m = -106
 n = -4194378 -> m = -74
 n = -4194346 -> m = -42
 n = -4194314 -> m = -10
 n = -2097259 -> m = -107
 n = -2097227 -> m = -75
 n = -2097195 -> m = -43
 n = -2097163 -> m = -11
 n = -1048684 -> m = -108
 n = -1048652 -> m = -76
 n = -1048620 -> m = -44
 n = -1048588 -> m = -12
 n = -524397 -> m = -109
 n = -524365 -> m = -77
 n = -524333 -> m = -45
 n = -524301 -> m = -13
 n = -262254 -> m = -110
 n = -262222 -> m = -78
 n = -262190 -> m = -46
 n = -262158 -> m = -14
 n = -131183 -> m = -111
 n = -131151 -> m = -79
 n = -131119 -> m = -47
 n = -131087 -> m = -15
 n = -65648 -> m = -112
 n = -65616 -> m = -80
 n = -65584 -> m = -48
 n = -65552 -> m = -16
 n = 65552 -> m = 16
 n = 65584 -> m = 48
 n = 65616 -> m = 80
 n = 65648 -> m = 112
 n = 131089 -> m = 17
 n = 131121 -> m = 49
 n = 131153 -> m = 81
 n = 131185 -> m = 113
 n = 262162 -> m = 18
 n = 262194 -> m = 50
 n = 262226 -> m = 82
 n = 262258 -> m = 114
 n = 524307 -> m = 19
 n = 524339 -> m = 51
 n = 524371 -> m = 83
 n = 524403 -> m = 115
 n = 1048596 -> m = 20
 n = 1048628 -> m = 52
 n = 1048660 -> m = 84
 n = 1048692 -> m = 116
 n = 2097173 -> m = 21
 n = 2097205 -> m = 53
 n = 2097237 -> m = 85
 n = 2097269 -> m = 117
 n = 4194326 -> m = 22
 n = 4194358 -> m = 54
 n = 4194390 -> m = 86
 n = 4194422 -> m = 118
 n = 8388631 -> m = 23
 n = 8388663 -> m = 55
 n = 8388695 -> m = 87
 n = 8388727 -> m = 119
 n = 16777240 -> m = 24
 n = 16777272 -> m = 56
 n = 16777304 -> m = 88
 n = 16777336 -> m = 120
 n = 33554457 -> m = 25
 n = 33554489 -> m = 57
 n = 33554521 -> m = 89
 n = 33554553 -> m = 121
 n = 67108890 -> m = 26
 n = 67108922 -> m = 58
 n = 67108954 -> m = 90
 n = 67108986 -> m = 122
 n = 134217755 -> m = 27
 n = 134217787 -> m = 59
 n = 134217819 -> m = 91
 n = 134217851 -> m = 123
 n = 268435484 -> m = 28
 n = 268435516 -> m = 60
 n = 268435548 -> m = 92
 n = 268435580 -> m = 124
 n = 536870941 -> m = 29
 n = 536870973 -> m = 61
 n = 536871005 -> m = 93
 n = 536871037 -> m = 125
 n = 1073741854 -> m = 30
 n = 1073741886 -> m = 62
 n = 1073741918 -> m = 94
 n = 1073741950 -> m = 126
 n = 2147483551 -> m = -97
 n = 2147483583 -> m = -65
 n = 2147483615 -> m = -33
 n = 2147483647 -> m = -1
------------------------------------+---+---+---+---+---+---+----
move.l #n,m? -> moveq  #n,ds        | Y | Y | ? | Y |N/*| y | 2
                move.l ds,m?        |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 -128 <= n <= 127
------------------------------------+---+---+---+---+---+---+----
move.l (ax),ay -> move.x ([ax],n),dz| - | - | ? | * | * | * | 0
move.x n(ay),dz                     |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
move.l (ax),ay -> move.x ([ax]),dz  | - | - | ? | * | * | * | 0
move.x (ay),dz                      |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
move.l (bd.x,ax),dy ->              | - | - | ? | Y | Y | Y | 2
                     move.l bd.x,dy |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
move.l (n.w,ax),dy ->               | N | N | N | N | N | N | 0
                    move.l n(ax),dy |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
move.l (sp),(n,sp) -> rtd #n        | - | - | ? | Y | Y | Y | 6
lea (n,sp),sp                       |   |   |   |   |   |   |
rts                                 |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
move.l 12(ax),12(ay) -> move16      | - | - | - | - | y | ? | 22
move.l 8(ax),8(ay)      (ax)+,(ay)+ |   |   |   |   |   |   |
move.l 4(ax),4(ay)                  |   |   |   |   |   |   |
move.l (ax)+,(ay)+                  |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
move.l ax,-(sp) -> link ax,#n       | Y | Y | ? | Y | N | Y | 4
move.l sp,ax                        |   |   |   |   |   |   |
add.w #n,sp                         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 -32767 <= n <= 32767
------------------------------------+---+---+---+---+---+---+----
move.l ax,-(sp) -> pea -n(ax)       | Y | Y | ? | Y | Y | N | 0/4
sub*.l #n,(sp)                      |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
move.l ax,-(sp) -> pea n(ax)        | Y | Y | ? | Y | Y | N | 0/4
add*.l #n,(sp)                      |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
move.l ax,sp -> unlk ax             | Y | Y | ? | N | y | N | 2
move.l (sp)+,ax                     |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
move.w #x,mn   -> move.l #xy,mn     | Y | Y | ? | Y | Y | Y |2/4/6
move.w #y,mn+2                      |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 best if mn is longword aligned
------------------------------------+---+---+---+---+---+---+----
move.x #0,ax -> sub.l ax,ax         | Y | Y | ? | Y | * | N | 2/4
------------------------------------+---+---+---+---+---+---+----
move.x #n,ax -> lea n,ax            | Y | Y | ? | N | N | N | 0
------------------------------------+---+---+---+---+---+---+----
 n <> 0
------------------------------------+---+---+---+---+---+---+----
move.x ax,ay -> lea n(ax),ay        | Y | Y | ? | Y | Y | Y | 2/4
add.x #n,ay                         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 -32767 <= n <= 32767
------------------------------------+---+---+---+---+---+---+----
move.x ax,az -> lea -n(ax,dx),az    | Y | Y | ? | Y | Y | Y | 2
sub.x #n,az                         |   |   |   |   |   |   |
add.x dx,az                         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 az=ax+dx-n, n<=32767 (68000/010: 8-bit displacement only)
------------------------------------+---+---+---+---+---+---+----
move.x ax,az -> lea n(ax,dx),az     | Y | Y | ? | Y | Y | Y | 2
add.x #n,az                         |   |   |   |   |   |   |
add.x dx,az                         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 az=n+ax+dx, n<=32767 (68000/010: 8-bit displacement only)
------------------------------------+---+---+---+---+---+---+----
movem.l (ax)+,registers             | * | * | ? | ? | Y | N | *
                 -> move.l (ax)+,ry |   |   |   |   |   |   |
                       for each reg |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
movem.x *,@ -> move.x *,@           | Y | Y | ? | Y | Y | N | 2
                                    |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 @ = a single register, not (@=dx & .x=.w)
------------------------------------+---+---+---+---+---+---+----
movem.x @,* -> move.x @,*           | Y | Y | ? | Y | Y | N | 2
                                    |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 @ = a single register, status flags are wrong
------------------------------------+---+---+---+---+---+---+----
moveq #n,az -> lea n(ax,ay.l*2),az  | - | - | ? | Y | Y | Y | 4
add.x ay,az                         |   |   |   |   |   |   |
add.x ax,az                         |   |   |   |   |   |   |
add.x ay,az                         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 az=n+ax+2*ay, -128<=n<=127
------------------------------------+---+---+---+---+---+---+----
mul*.l #1,dx -> (nothing)           | ! | ! | ! | ! | ! | Y | 6
------------------------------------+---+---+---+---+---+---+----
 delete
------------------------------------+---+---+---+---+---+---+----
mul*.l #10,dx -> add.l dx,dx        | ! | ! | ? | Y | Y | * |-2
                 move.l dx,ds       |   |   |   |   |   |   |
                 asl.l #2,dx        |   |   |   |   |   |   |
                 add.l ds,dx        |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mul*.l #12,dx -> asl.l #2,dx        | ! | ! | ? | Y | Y | * |-2
                 move.l dx,ds       |   |   |   |   |   |   |
                 add.l dx,dx        |   |   |   |   |   |   |
                 add.l ds,dx        |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mul*.l #2,dx -> add.l dx,dx         | ! | ! | ? | ! | ! | Y | 4
------------------------------------+---+---+---+---+---+---+----
mul*.l #3,dx -> move.l dx,ds        | ! | ! | ? | ! | ! | * | 0
                add.l dx,dx         |   |   |   |   |   |   |
                add.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mul*.l #5,dx -> move.l dx,ds        | ! | ! | ? | ! | ! | * | 0
                asl.l #2,dx         |   |   |   |   |   |   |
                add.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mul*.l #6,dx -> add.l dx,dx         | ! | ! | ? | ! | ! | * |-2
                move.l dx,ds        |   |   |   |   |   |   |
                add.l dx,dx         |   |   |   |   |   |   |
                add.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mul*.l #7,dx -> move.l dx,ds        | ! | ! | ? | ! | ! | * | 0
                asl.l #3,dx         |   |   |   |   |   |   |
                sub.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mul*.l #9,dx -> move.l dx,ds        | ! | ! | ? | ! | ! | * | 0
                asl.l #3,dx         |   |   |   |   |   |   |
                add.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mul*.l #n,dx -> moveq #m,ds         | ! | ! | ? | ! | ! | N | 2
                asl.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 n is 2^m, 8<m<14
------------------------------------+---+---+---+---+---+---+----
muls.l #0,dx -> moveq #0,dx         | ! | ! | ? | ! | ! | Y | 4
------------------------------------+---+---+---+---+---+---+----
muls.l #n,dx -> asl.l #m,dx         | ! | ! | ? | ! | ! | Y | 4
------------------------------------+---+---+---+---+---+---+----
 n is 2^m, 1 <= m <= 8
------------------------------------+---+---+---+---+---+---+----
muls.w #0,dx -> moveq #0,dx         | Y | Y | ? | ! | ! | Y | 2
------------------------------------+---+---+---+---+---+---+----
muls.w #1,dx -> ext.l dx            | Y | Y | ? | ! | ! | Y | 2
------------------------------------+---+---+---+---+---+---+----
muls.w #10,dx -> ext.l dx           | Y | Y | ? | ! | ! | * |-6
                 add.l dx,dx        |   |   |   |   |   |   |
                 move.l dx,ds       |   |   |   |   |   |   |
                 asl.l #2,dx        |   |   |   |   |   |   |
                 add.l ds,dx        |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
muls.w #11,dx -> ext.l dx           | Y | Y | ? | ! | ! | * |-8
                 move.l dx,ds       |   |   |   |   |   |   |
                 add.l dx,dx        |   |   |   |   |   |   |
                 add.l dx,ds        |   |   |   |   |   |   |
                 asl.l #2,dx        |   |   |   |   |   |   |
                 add.l ds,dx        |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
muls.w #12,dx -> ext.l dx           | Y | Y | ? | ! | ! | * |-6
                 asl.l #2,dx        |   |   |   |   |   |   |
                 move.l dx,ds       |   |   |   |   |   |   |
                 add.l dx,dx        |   |   |   |   |   |   |
                 add.l ds,dx        |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
muls.w #2,dx -> ext.l dx            | Y | Y | ? | ! | ! | N | 0
                add.l dx,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
muls.w #3,dx -> ext.l dx            | Y | Y | ? | ! | ! | * |-4
                move.l dx,ds        |   |   |   |   |   |   |
                add.l dx,dx         |   |   |   |   |   |   |
                add.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
muls.w #5,dx -> ext.l dx            | Y | Y | ? | ! | ! | * |-4
                move.l dx,ds        |   |   |   |   |   |   |
                asl.l #2,dx         |   |   |   |   |   |   |
                add.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
muls.w #6,dx -> ext.l dx            | Y | Y | ? | ! | ! | * |-6
                add.l dx,dx         |   |   |   |   |   |   |
                move.l dx,ds        |   |   |   |   |   |   |
                add.l ds,dx         |   |   |   |   |   |   |
                add.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
muls.w #7,dx -> ext.l dx            | Y | Y | ? | ! | ! | * |-4
                move.l dx,ds        |   |   |   |   |   |   |
                asl.l #3,dx         |   |   |   |   |   |   |
                sub.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
muls.w #9,dx -> ext.l dx            | Y | Y | ? | ! | ! | * |-4
                move.l dx,ds        |   |   |   |   |   |   |
                asl.l #3,dx         |   |   |   |   |   |   |
                add.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
muls.w #n,dx -> ext.l dx            | Y | Y | ? | ! | ! | N | 0
                asl.l #m,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 n is 2^m, 1 <= m <= 8
------------------------------------+---+---+---+---+---+---+----
muls.w #n,dx -> moveq #m,ds         | Y | Y | ? | ! | ! | * |-2
                ext.l dx            |   |   |   |   |   |   |
                asl.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 n is 2^m, 8<m<14
------------------------------------+---+---+---+---+---+---+----
muls.w #n,dx -> swap dx             | Y | Y | ? | ! | ! | * |-2
                clr.w dx            |   |   |   |   |   |   |
                asr.l #(16-m),dx    |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 n is 2^m, 8 <= m <= 14
------------------------------------+---+---+---+---+---+---+----
mulu.l #0,dx -> moveq #0,dx         | ! | ! | ? | ! | ! | Y | 4
------------------------------------+---+---+---+---+---+---+----
mulu.l #n,dx -> lsl.l #m,dx         | ! | ! | ? | ! | ! | Y | 4
------------------------------------+---+---+---+---+---+---+----
 n is 2^m, 1 <= m <= 8
------------------------------------+---+---+---+---+---+---+----
mulu.w #0,dx -> moveq #0,dx         | Y | Y | ? | ! | ! | Y | 2
------------------------------------+---+---+---+---+---+---+----
mulu.w #1,dx -> swap dx             | Y | Y | ? | ! | ! | * |-2
                clr.w dx            |   |   |   |   |   |   |
                swap dx             |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mulu.w #12,dx -> swap dx            | Y | Y | ? | ! | Y | * |-10
                 clr.w dx           |   |   |   |   |   |   |
                 swap dx            |   |   |   |   |   |   |
                 asl.l #2,dx        |   |   |   |   |   |   |
                 move.l dx,ds       |   |   |   |   |   |   |
                 add.l dx,dx        |   |   |   |   |   |   |
                 add.l ds,dx        |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mulu.w #2,dx -> swap dx             | Y | Y | ? | ! | ! | * |-4
                clr.w dx            |   |   |   |   |   |   |
                swap dx             |   |   |   |   |   |   |
                add.l dx,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mulu.w #3,dx -> swap dx             | Y | Y | ? | ! | ! | * |-8
                clr.w dx            |   |   |   |   |   |   |
                swap dx             |   |   |   |   |   |   |
                move.l dx,ds        |   |   |   |   |   |   |
                add.l dx,dx         |   |   |   |   |   |   |
                add.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mulu.w #5,dx -> swap dx             | Y | Y | ? | ! | ! | * |-8
                clr.w dx            |   |   |   |   |   |   |
                swap dx             |   |   |   |   |   |   |
                move.l dx,ds        |   |   |   |   |   |   |
                asl.l #2,dx         |   |   |   |   |   |   |
                add.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mulu.w #6,dx -> swap dx             | Y | Y | ? | ! | ! | * |-10
                clr.w dx            |   |   |   |   |   |   |
                swap dx             |   |   |   |   |   |   |
                add.l dx,dx         |   |   |   |   |   |   |
                move.l dx,ds        |   |   |   |   |   |   |
                add.l ds,dx         |   |   |   |   |   |   |
                add.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mulu.w #7,dx -> swap dx             | Y | Y | ? | ! | ! | * |-8
                clr.w dx            |   |   |   |   |   |   |
                swap dx             |   |   |   |   |   |   |
                move.l dx,ds        |   |   |   |   |   |   |
                asl.l #3,dx         |   |   |   |   |   |   |
                sub.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mulu.w #9,dx -> swap dx             | Y | Y | ? | ! | ! | * |-8
                clr.w dx            |   |   |   |   |   |   |
                swap dx             |   |   |   |   |   |   |
                move.l dx,ds        |   |   |   |   |   |   |
                asl.l #3,dx         |   |   |   |   |   |   |
                add.l ds,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
mulu.w #n,dx -> swap dx             | Y | Y | ? | ! | ! | * |-4
                clr.w dx            |   |   |   |   |   |   |
                swap dx             |   |   |   |   |   |   |
                lsl.l #m,dx         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 n is 2^m, 1 <= m <= 8
------------------------------------+---+---+---+---+---+---+----
mulu.w #n,dx -> swap dx             | Y | Y | ? | ! | ! | * |-2
                clr.w dx            |   |   |   |   |   |   |
                lsr.l #(16-m),dx    |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 n is 2^m, 8 <= m <= 15
------------------------------------+---+---+---+---+---+---+----
neg.x dx    -> add.x dx,dy          | Y | Y | Y | Y | Y | Y | 2
sub.x dx,dy                         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 dx is trashed
------------------------------------+---+---+---+---+---+---+----
neg.x dx    -> eor.x #n,dx          | Y | Y | ? | Y | Y | Y | 2
add.x #n,dx                         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 n is 2^m-1, 0 <= dx <= n
------------------------------------+---+---+---+---+---+---+----
neg.x dx    -> sub.x dx,dy          | Y | Y | Y | Y | Y | Y | 2
add.x dx,dy                         |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 dx is trashed
------------------------------------+---+---+---+---+---+---+----
nop -> (nothing)                    | Y | Y | ? | Y | Y | Y | 2
------------------------------------+---+---+---+---+---+---+----
 remove nops
------------------------------------+---+---+---+---+---+---+----
or.l #n,dx -> bset.l #b,dx          | Y | Y | ? | Y | * | N | 2
------------------------------------+---+---+---+---+---+---+----
 n = 2^b (only 1 bit set)
------------------------------------+---+---+---+---+---+---+----
sub*.x #0,dx -> tst.x dx            | Y | Y | ? | Y | N | N | 2/4
------------------------------------+---+---+---+---+---+---+----
sub.x #n,* -> addq.x #-n,*          | Y | Y | ? | Y | y | N | 2/4
------------------------------------+---+---+---+---+---+---+----
 -8 <= n <= -1
------------------------------------+---+---+---+---+---+---+----
sub.x #n,* -> subq.x #n,*           | Y | Y | ? | Y | y | N | 2/4
------------------------------------+---+---+---+---+---+---+----
 if 1 <= n <= 8
------------------------------------+---+---+---+---+---+---+----
sub.x #n,ax -> lea -n(ax),ax        | Y | Y | ? | Y | y | N | 0/2
------------------------------------+---+---+---+---+---+---+----
 -32767 <= n <= -9, 9 <= n <= 32767
------------------------------------+---+---+---+---+---+---+----
subq.l #n,ax -> subq.w #n,ax        | Y | Y | ? | N | N | N | 0
------------------------------------+---+---+---+---+---+---+----
subq.w #1,dx -> db<cc> dx,??        | y | y | ? |y/*|N/*|y/*|-2
b<cc> ??        b<cc> ??            |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 if dx=0 then will be slower
------------------------------------+---+---+---+---+---+---+----
subq.w #1,dx -> dbf dx,??           | Y | Y | ? |y/*|N/*|y/*|-2
bra ??          bra ??              |   |   |   |   |   |   |
------------------------------------+---+---+---+---+---+---+----
 if dx=0 then will be slower
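
The moveq-based constant loads in the table above can be cross-checked
mechanically.  The following Python sketch is not part of ASP68K; it is a
minimal model, assuming 32-bit data registers and that bchg with a data
register destination takes its bit number modulo 32.  All function names
(moveq, find_sequences and so on) are made up for this illustration.

```python
MASK32 = 0xFFFFFFFF

def to_signed(v):
    """Interpret a 32-bit pattern as a signed longword."""
    v &= MASK32
    return v - (1 << 32) if v & 0x80000000 else v

def moveq(m):
    """moveq #m,dx: sign-extend an 8-bit immediate to 32 bits."""
    assert -128 <= m <= 127
    return m & MASK32

def not_b(d):
    return d ^ 0xFF                            # not.b dx

def not_w(d):
    return d ^ 0xFFFF                          # not.w dx

def swap(d):
    return ((d << 16) | (d >> 16)) & MASK32    # swap dx

def add_b(d):
    return (d & 0xFFFFFF00) | ((d + d) & 0xFF) # add.b dx,dx

def bchg(d):
    return d ^ (1 << (d & 31))                 # bchg.l dx,dx

FIXUPS = {
    "not.b dx": not_b,
    "not.w dx": not_w,
    "swap dx": swap,
    "add.b dx,dx": add_b,
    "bchg.l dx,dx": bchg,
}

def find_sequences(n):
    """Return every (m, second instruction) pair that produces n."""
    return [(m, name)
            for m in range(-128, 128)
            for name, op in FIXUPS.items()
            if to_signed(op(moveq(m))) == n]
```

For example, find_sequences(264) includes the pair (8, "bchg.l dx,dx"),
which agrees with the n = 264 -> m = 8 entry in the bchg list.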

---------------------------------------------------------------------------
                          H I N T S   &   T I P S
---------------------------------------------------------------------------


This new section is for hints that do not fit in the tables above, such
as pipelining optimizations.

020+    Sequential memory accesses can cause pipeline stalls, so try to
        rearrange code so that memory accesses do not immediately follow
        each other.  The same problem occurs if an address register updated
        in one line is accessed in the next line.

ALL     Include small routines as macros, because inline routines will
        be much faster, and in extreme cases smaller.

ALL     If a subroutine is only called from one position, either move
        it inline, or only use jmp/bra commands.

# 7th edition :

The 020+ hint stated above should only apply to the 68020. Since the 68030,
68040 and 68060 have a data cache, sequential memory accesses speed things up.

ALL	Keep all data aligned on its natural boundary: word aligned for
	words, longword aligned for longwords.

ALL	Align branch target addresses on a multiple of 8 bytes.

040+	Don't use FPU registers for temporary values: you should expect
	rounding problems, and fmove dx,fpx/fmove fpx,dx are slower than
	move dx,mem/move mem,dx when mem is cached.

060	Use mulu.x and muls.x when the factor is not a power of 2.
	This takes only 2 cycles!

030+	Use LSL instead of ASL when possible.

020+	Scale factors in indexed addressing modes do not add any time
	penalty.

040	Don't use PC indirect addressing modes.

040+	Avoid the NOP instruction for timing purposes. It takes 8 cycles
	(040) or 9 cycles (060). On the 68040, this instruction also
	interlocks the effective address calculate and execute stages and
	synchronizes some portions of the processor before execution. It may
	have the same side effect on the 68060 (I'm not sure about that).

040+	Only use move16 on large blocks. It's not really faster than 4
	successive moves, but it bypasses the data cache (this avoids
	multiple reads/writes of cache lines, and keeps the cache valid for
	further local/global data accesses). Data blocks should be aligned
	on a 16-byte boundary.
	Note that move16 still reads/writes data that is already cached.

030	B<cc>.s takes less time when the branch is not taken (4 cycles). All
	other cases take 6 cycles.

040	Branches taken need 2 cycles. Branches not taken need 3 cycles.

020+	Avoid bit field instructions, although some of them may be faster
	in a few rare situations.

060	Avoid using the same register(s) in two consecutive lines. This
	may prevent the second instruction from being dispatched to the
	second pipeline. Only most of the arithmetical/logical instructions,
	and moves from/to registers, can be dispatched.
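
The 060 hint above weighs the hardware multiply against the shift-and-add
replacements listed in the tables.  Before using such a replacement it is
worth checking it exhaustively.  The Python sketch below is only an
illustration, not part of ASP68K: it models a handful of instructions on
32-bit registers, and the names run, check and MULS_W are hypothetical.
Status flags are not modelled at all.

```python
MASK32 = 0xFFFFFFFF

def ext_l(d):
    """ext.l dx: sign-extend the low word of dx to a longword."""
    w = d & 0xFFFF
    return (w - 0x10000) & MASK32 if w & 0x8000 else w

def run(seq, dx):
    """Execute a list of mnemonics on dx; ds is the scratch register."""
    ds = 0
    for ins in seq:
        if ins == "ext.l dx":
            dx = ext_l(dx)
        elif ins == "move.l dx,ds":
            ds = dx
        elif ins == "add.l dx,dx":
            dx = (dx + dx) & MASK32
        elif ins == "add.l ds,dx":
            dx = (dx + ds) & MASK32
        elif ins == "sub.l ds,dx":
            dx = (dx - ds) & MASK32
        elif ins.startswith("asl.l #"):
            count = int(ins[len("asl.l #"):ins.index(",")])
            dx = (dx << count) & MASK32
        else:
            raise ValueError("unsupported instruction: " + ins)
    return dx

# Replacement sequences copied from the muls.w rows of the table.
MULS_W = {
    7:  ["ext.l dx", "move.l dx,ds", "asl.l #3,dx", "sub.l ds,dx"],
    10: ["ext.l dx", "add.l dx,dx", "move.l dx,ds",
         "asl.l #2,dx", "add.l ds,dx"],
    12: ["ext.l dx", "asl.l #2,dx", "move.l dx,ds",
         "add.l dx,dx", "add.l ds,dx"],
}

def check(factor, seq):
    """True if seq matches muls.w #factor,dx for every 16-bit input."""
    for x in range(-32768, 32768):
        if run(seq, x & 0xFFFF) != (x * factor) & MASK32:
            return False
    return True
```

Calling check(10, MULS_W[10]) walks all 65536 word values, so a wrong
shift count or a swapped instruction shows up immediately.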


030+ Instruction and data cache tips.
-------------------------------------

This is a small explanation of how the caches work. This may help you to
optimize your code to take advantage of these caches.

These processors always fill their caches on a line basis: they load a
16-byte memory block, aligned on a 16-byte boundary, for each cache
read/write operation. These caches are 4Kb (68040) or 8Kb (68060) in
size.
When a cache line read is initiated, the first memory cycle attempts to load
the line entry corresponding to the instruction half-line (8 bytes) or data
item requested by the integer unit. Subsequent transfers are for the
remaining entries in the line.
WriteThrough and CopyBack cache modes work differently when a write occurs.
In WriteThrough mode, the integer unit updates both the cache entry and the
memory. A write miss never causes a cache line fill.
In CopyBack mode, the integer unit only updates the cache entry. The memory
is updated when the line is replaced by another line. A write miss
causes an entire line read, then the corresponding cache entry is updated.

The 68030 data cache burst mode works as stated above. If the cache burst
mode is disabled, the 68030 only reads/writes partial cache lines.
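
The difference between the two write modes can be illustrated with a toy
model.  The Python sketch below is a simplification, not a description of
the real hardware: it assumes 16-byte lines as described above and an
unlimited number of lines (nothing is ever evicted), and it only counts
line fills.  The names line_of and count_line_fills are hypothetical.

```python
LINE_SIZE = 16  # bytes per cache line, as described in the text

def line_of(addr):
    """Index of the 16-byte line that contains addr."""
    return addr // LINE_SIZE

def count_line_fills(accesses, copyback):
    """Count line fills for a list of ("r" or "w", address) accesses.

    Assumption: an unbounded cache, so a line is filled at most once.
    """
    cached = set()
    fills = 0
    for kind, addr in accesses:
        line = line_of(addr)
        if line in cached:
            continue              # hit: no memory line transfer
        if kind == "r" or copyback:
            cached.add(line)      # read miss, or CopyBack write miss
            fills += 1
        # WriteThrough write miss: memory is written, no line fill
    return fills

# Two scattered writes followed by one read near the first write.
trace = [("w", 0x1000), ("w", 0x1020), ("r", 0x1004)]
writethrough_fills = count_line_fills(trace, copyback=False)
copyback_fills = count_line_fills(trace, copyback=True)
```

In this trace the CopyBack run fills a line for each write miss, while
the WriteThrough run only fills the line that is actually read.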


By now, we can see that (assuming cache burst mode for the 68030):
- After a data read operation, subsequent accesses within the same data cache
  line are faster. The same is true after a write access in CopyBack mode.
  BTW, these subsequent accesses should be deferred to be sure that the
  entire cache line is ready.
- Sporadic writes should be done in a WriteThrough memory page to avoid
  unneeded cache fills. WriteThrough mode can be enabled on 040/060 based
  machines: you have to modify the MMU page descriptors to do so.
- Keep your local data together. Doing so, you need fewer cache lines to
  cache this data. E.g., if you read/write the same 4 longwords in a loop,
  it's better to have them in the same cache line.
- Keep your functions together too (040/060). Since the MMU is used to select
  the cache mode on a page basis, this may avoid unneeded MMU page descriptor
  fetches. These MMU pages are most often 4Kb in size, aligned on a 4Kb
  boundary.
- Linear code may be worse than a loop, due to instruction cache line
  read timings. As a side note, it may be better to use small slow
  instructions than large quicker ones. The 68060 can execute an entire
  instruction cache line in less than 4 cycles, which is not enough to load
  the next cache entry: the instruction unit will have to wait for the next
  instruction!
- Don't flush the caches all the time. (5394 cycles for the whole data
  cache, on 68060, no wait states)
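
The advice about keeping local data in one cache line comes down to
simple address arithmetic.  As a rough sketch (Python, with made-up
helper names align_up and lines_touched, and assuming the 16-byte line
size given above), one can count how many lines a data block touches and
see what aligning it buys:

```python
def align_up(addr, boundary=16):
    """Round addr up to the next multiple of boundary (a power of 2)."""
    return (addr + boundary - 1) & ~(boundary - 1)

def lines_touched(addr, size, line=16):
    """Number of cache lines covered by size bytes starting at addr."""
    first = addr // line
    last = (addr + size - 1) // line
    return last - first + 1

# Four longwords (16 bytes) placed at an address that is only
# longword aligned, then at the next 16-byte boundary.
unaligned = lines_touched(0x1008, 16)
aligned = lines_touched(align_up(0x1008), 16)
```

Here the block starting at $1008 straddles two lines, while the same
block moved up to $1010 fits in a single line.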


	 
---------------------------------------------------------------------------
                            C O N C L U S I O N
---------------------------------------------------------------------------


These are the optimizations I've come up with so far.  If you could check
what I've done, and report any errors, that would make this list better.  I
only have so much time to spend on this, and many hands make light work.
Also, stats (and more optimizations) for 68020+ CPUs would be welcomed.
Currently this list is only for simple peephole optimization stuff, but I
will hopefully get around to more extensive optimizations.  Pipeline
optimization is on the way, so look out.  Any info on the 68020+ pipelines
would be appreciated.

For optimizations with question marks (?) in the boxes next to them, I do
not have the data to check yet.

The latest version of the asp68k archive is available by anonymous ftp from
ftp.mq.edu.au in the /home/mglew/ directory or by calling Technophilia BBS
on +61 2 807 3563 (or (02) 807 3563 in Australia).


===========================================================================
EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF
===========================================================================
