Article 10885 of comp.sys.amiga.tech:
Path: liuida!sunic!mcsun!uunet!aplcen!uakari.primate.wisc.edu!sdd.hp.com!ucsd!helios.ee.lbl.gov!nosc!crash!jcs
From: jcs@crash.cts.com (John Schultz)
Newsgroups: comp.sys.amiga.tech
Subject: Re: Using CPU instead of Blitter for speed
Message-ID: <3022@crash.cts.com>
Date: 6 Jun 90 20:50:03 GMT
References: <1990Jun3.164446.12193@ameristar> <1990Jun4.134811.12142@watdragon.waterloo.edu> <30974@ut-emx.UUCP>
Distribution: comp
Organization: Crash TimeSharing, El Cajon, CA
Lines: 45
X-Local-Date: 6 Jun 90 13:50:03 PDT
In article <30974@ut-emx.UUCP> jonabbey@walt.cc.utexas.edu (Jonathan Abbey) writes:
>I'd like some hard data on the blitter vs. the CPU for polygon rendering..
>I've heard people here assert that the CPU is faster, even on a 68000.
>
>How much faster?  Are you using the blitter to copy the polygon into the
>appropriate bitplanes once it is drawn, or are you using the processor to
>fill the bitplanes in parallel?

  I compute the mask for each long word a span touches, then write that
mask into the bitplanes (move.l, or.l, or not/and.l, as appropriate). The
blitter is never used.
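
  A minimal sketch of that per-longword write, in C rather than 68000
assembly for readability. This is an illustration, not the code from the
post: the name write_mask, the DEPTH constant, and the rule that each
plane's color bit selects between setting and clearing are assumptions.
Bit 31 of a mask is taken to be the leftmost pixel, as on the Amiga.

```c
#include <assert.h>
#include <stdint.h>

#define DEPTH 4  /* assumed: four bitplanes, as in the post's example */

/* Write one 32-bit span mask into every bitplane at longword index idx.
   For each plane, the matching bit of `color` decides whether the masked
   pixels are set or cleared. */
static void write_mask(uint32_t *planes[DEPTH], int idx,
                       uint32_t mask, unsigned color)
{
    assert(idx >= 0);
    for (int p = 0; p < DEPTH; p++) {
        int bit = (int)((color >> p) & 1u);
        if (mask == 0xFFFFFFFFu)
            planes[p][idx] = bit ? 0xFFFFFFFFu : 0u;  /* like move.l */
        else if (bit)
            planes[p][idx] |= mask;                   /* like or.l */
        else
            planes[p][idx] &= ~mask;                  /* like not + and.l */
    }
}
```

  A full mask needs no read of the destination, which is why the plain
move is the cheap case; partial masks at the ends of a span have to
preserve the neighboring pixels.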

>I'm assuming you're using an ordered edge list?

  I'm using a highly efficient table fill algorithm I developed. The
trick is to fill the minX/maxX tables as rapidly as possible, then
draw lines from minX to maxX, for the polygon's minY to maxY.
This only works for convex polygons.
The "lines" are 32-bit masks, so the worst case for a 320-pixel-wide
screen is ten 32-bit writes per scanline. The reason the processor is
faster than the blitter at filling polygons is that the blitter approach
must clear a temprast the size of the extent of the polygon, draw the
outline with the blitter, xor the maximum y points with the processor,
fill the mask with the blitter, then blit the mask to each bitplane. If
you want polygons that don't have broken or coarse edges, you must
re-outline the mask with lines using the blitter.  For a four-bitplane
display, this is equivalent to writing to a six-bitplane display, as well
as having to draw a wireframe polygon to two bitplanes. Furthermore, the
blitter must work with a rectangle, and thus much data movement is wasted,
as most polygons are not rectangles. The processor only moves data where
the polygon exists.
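
  The table fill described above can be sketched roughly as follows. This
is a hedged illustration in C, not the code from the post: the names
(Point, scan_edge, fill_span, fill_convex), the 200-line screen height,
and the simple integer edge interpolation are all assumptions, and it
fills a single bitplane with no clipping, so every vertex must be on
screen.

```c
#include <assert.h>
#include <stdint.h>

#define SCREEN_W      320
#define SCREEN_H      200                /* assumed screen height */
#define LONGS_PER_ROW (SCREEN_W / 32)    /* ten longwords per scanline */

typedef struct { int x, y; } Point;      /* hypothetical vertex type */

static int min_x[SCREEN_H], max_x[SCREEN_H];

/* Widen the extent of scanline y so that it includes x. */
static void widen(int y, int x)
{
    if (x < min_x[y]) min_x[y] = x;
    if (x > max_x[y]) max_x[y] = x;
}

/* Walk one edge top to bottom and record its x on every scanline. */
static void scan_edge(Point a, Point b)
{
    if (a.y > b.y) { Point t = a; a = b; b = t; }
    int dy = b.y - a.y;
    if (dy == 0) { widen(a.y, a.x); widen(a.y, b.x); return; }
    for (int y = a.y; y <= b.y; y++)
        widen(y, a.x + (b.x - a.x) * (y - a.y) / dy);
}

/* Set pixels xl..xr (inclusive) in one scanline of one bitplane, one
   32-bit longword at a time. Returns the number of longword writes. */
static int fill_span(uint32_t *row, int xl, int xr)
{
    int first = xl >> 5, last = xr >> 5, writes = 2;
    uint32_t lmask = 0xFFFFFFFFu >> (xl & 31);         /* xl to word end */
    uint32_t rmask = 0xFFFFFFFFu << (31 - (xr & 31));  /* word start to xr */
    if (first == last) { row[first] |= lmask & rmask; return 1; }
    row[first] |= lmask;
    for (int i = first + 1; i < last; i++, writes++)
        row[i] = 0xFFFFFFFFu;                          /* full interior word */
    row[last] |= rmask;
    return writes;
}

/* Fill a convex polygon in one bitplane; returns total longword writes. */
static int fill_convex(uint32_t *plane, const Point *pts, int n)
{
    assert(n >= 3);
    int ymin = SCREEN_H - 1, ymax = 0, writes = 0;
    for (int i = 0; i < n; i++) {
        if (pts[i].y < ymin) ymin = pts[i].y;
        if (pts[i].y > ymax) ymax = pts[i].y;
    }
    for (int y = ymin; y <= ymax; y++) { min_x[y] = SCREEN_W; max_x[y] = -1; }
    for (int i = 0; i < n; i++)
        scan_edge(pts[i], pts[(i + 1) % n]);
    for (int y = ymin; y <= ymax; y++)
        writes += fill_span(plane + y * LONGS_PER_ROW, min_x[y], max_x[y]);
    return writes;
}
```

  Because a convex polygon crosses each scanline in exactly one run, a
single minX/maxX pair per line is enough, and a full-width span costs
ten writes, matching the worst case mentioned above. A concave polygon
would need a list of runs per scanline instead.
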
  I just tested out my latest code on a nofastmem 500, and it was about
1 frame/sec slower for small polygons, and 1-2 frames/sec faster on large
polygons.  I measure speed differences in real time with a frame rate
counter. I just press a key to toggle between rendering methods. Using the
blitter I'll get a frame rate of 17 f/s, while the processor cranks out
24 f/s. That's about a 41% improvement. In some cases the blitter will
bog down to 8 f/s while the processor cranks out 24 f/s, which is three
times the frame rate (a 200% improvement). This is comparing against my
custom blitter code that *does not* re-outline the blitter mask, so the
polygons look ugly. The ROM code is much slower as it re-outlines the
masks. The processor filled polygons look great. On an Amiga 3000, the
processor is at least another 40% faster than my 25 MHz GVP because of the
32-bit chip RAM.


  John


