From: dstamp@watserv1.waterloo.edu (Dave Stampe-Psy+Eng)
Subject: BSP and sorting discussion (longish)
Date: Fri, 20 Dec 1991 17:34:48 GMT
Message-ID: <1991Dec20.173448.17862@watserv1.waterloo.edu>
Organization: University of Waterloo


Some thoughts on BSP trees:

This discussion is based on the use of BSP trees as a poly sorting
technique for VR drawing.  All timings are based on 16-bit assembly
code on a 486/25 PC. (No, I haven't written the code-- this is an
estimate).

The idea behind a BSP tree is to subdivide the 3D space based on the
planes of the polys to be drawn (i.e consider the poly's surface to
be part of a plane that extends to the limits of the "world".  Now,
the cost of building this structure (and the fact that it tends to
chop up all the polys) is offset by the fact that you can use the
tree structure developed to "walk" thru the world, and quickly find
out what order to draw the polys in to get proper overlapping from
any viewpoint (by an inorder tree traversal).

So much for the introduction.  Now the technical bits, based on these
papers:

Near Real-time Display of Rigid Objects
Fuchs, Abram, Grant
Computer Graphics, July '83, v17n3

Optimization of the Binary Space Partition Algorithm (BSP)
for the Visualization of Dynamic Scenes
Enric Torres
EuroGraphics '90

Computer Graphics: Principles and Practice
Foley, van Dam, Feiner, Hughes
pp 673-680

The "decision" technique used in BSP trees is the concept of the front
and rear "space" of a poly.  Given a point (either a viewpoint or one
of the vertices of another poly) that you wish to find which space it's
in: subtract the point from one point on the poly (creating a vector)
then take the dot product of this vector with the normal of the poly.
We will assume that the normal of a poly is precomputed and stored with
the poly data.	This will only change if the poly's object is rotated,
so it's the "mover's" responsibility to keep the normals current.

The above operation takes 3 multiplies and 6 adds/subtracts.  This can be
done in 5 uS in assembler (assuming 16 bit math for now).  This is the
basis of much of the BSP timing.

Walking the tree:

Assume we have a BSP tree: how long does it take to "walk" it?	For
a simple walk (find every poly's order, decide whether to draw it later)
we do one test and some recursive stuff, so call it 7.5 uS per poly.
For a tree of 1000 polys, that's 7500 uS (not bad).  But you must still
eliminate polys outside the visible space, clip polys, and translate to
screen coords.	The BSP walk can eliminate back-of-object polys (as they
"face" away), and you can drop about 40% of the polys this way.

Now, why am I talking about 1000 polys when the world database might be
only 500 polys typically?  Well, during creation of the BSP tree, you
end up with about twice the number of polys that you started with: some
polys get split.  This is one of the disadvantages of the BSP method, as
there is some overhead (about 30-50 us) if you draw 2 polys instead of 1.

You can eliminate some of the polys (in fact, entire subtrees) that
aren't in the viewed space by testing to see if the plane of the poly
currently being traversed intersects the view "volume".  This volume
is typically a pyramid with the "point" at the viewpoint and sides
going out to some great depth (not infinity, as you need to have 4
points at the base of the pyramid).  If the plane of the poly does
not intersect this pyramid (test to see if all 5 points of the pyramid
are on the same "side" of the poly's plane) then ignore one subbranch
of the tree from that node.  This adds 4 point tests (an average of 2,
if you use a self-treminating test sequence) because the viewpoint has
been tested for the tree traversal.  I suspect this test will only
eliminate 1/2 of the poly tests, and that many of the polys cut would
have been dropped in the first clip test anyway.  You now need 13000 uS
to walk the tree.  Overall, it may not be worth it, depending on how
much clipping time is saved.

Now that the polys are sorted into drawing order, and clipped etc. you
need to draw ALL of them.  This means that the screen may be drawn over
about 3.5 times (2 times if backward polys are dropped) so you have an
added cost over algorithms such as scan-line, which draw the screen once.
This cost is 35 mS (estimated) for 320x200x16 screen modes.

Building the tree:

The big problem with a BSP tree is that if anything in the world moves,
you must regenerate the BSP tree.  There are methods to reduce the cost
of rebuilding the tree, but let's look at a "from scratch" build with
500 polys.

The most efficient build (from the 1983 paper) is:
- select a poly from the list of unused polys to use as a "root".
  Try several at random, and check against the other polys to see
  if this poly would force them to be split (the plane of the
  candidate root passes between the vertices of the poly).  This
  takes 2.5 tests per poly, or 3x2.5x(nxLOG(n)) in total.  You can reuse
  these test results to build the tree.  But every poly is split
  (on average) once, you you have (avg) 1.5x500 or 750 polys to
  test.  That's a total of 56250 tests or 280 mS.

- build a tree from the polys, dividing them into 2 halves based on
  which side of the selected "root" they are.  Some polys will have
  to be split, and put into both lists (say 20 uS for the split, and
  one split for each poly adds 10mS.  List stuff for the polys adds
  1 uS x n x LOG(n) or 7.5 mS

- repeat till done.  Total time: 280+7.5+10 or (about) 300 mS.	This is
  the best case: the BSP tree is rarely balanced, so assume twice the
  build time (600 mS).

This is far too expensive!  So you need methods that let things move about
properly.  For example, you can generate BSP trees for single objects
and either divide the objects with seperating planes (a bit costly to
find these planes maybe?)  or use some other method to decide what order
to draw the objects in, then use the BSP tree of the object to draw it.
This will result in some visibility errors, unless you keep objects either
convex or keep them from getting too close to (or even inside) each other.
In those cases, you might generate a new tree for each object (or
combination of 2 objects).

Let's look at the cost of generating a BSP tree for a 50-poly object.
We have 2.5x3xnxLOG(n) or 2250 tests, and 50 splits and 300 list ops.
This is about 3.5 mS for 50 polys, or 8 mS for 100 polys.  Not great,
but not too bad.

The paper by Torres goes into a number of methods to generate BSP trees
using seperating planes, object BSP trees, etc.  I can't comment yet
on how feasible some of this stuff is.	But it would be useful if you had
areas of the "world" where nothing ever moved, or if you could determine
what areas to spilt up beforehand.

Other methods:

Z-buffer algorithms just are'nt an option for this project.  It adds about
0.6 uS per pixel per poly (tripling rendering time) to do the Z-buffer stuff.

Priority sorting first sorts the polys by deepest point.  This in itself
is not enough, so it must perform a series of tests on all polys that
overlap each poly in depth.  This can be costly for large polys: for
example, say 25 polys ovelap a poly in depth.  The test for X overlap
checks 50 "extents" (max and min X) leaving (say) 6 polys to be checked
for Y extent (another 12 extents). This leaves an average of 2 polys
to be tested for slope etc.  Total time is about 80 uS per poly, or
40 mS for 500 polys.  This excludes the Z sort time (another 10 mS or
so).  This is too long to get 10 frames/sec.

Scan-line algorithms guarantee that each screen pixel is drawn to once,
saving 35 mS or more over other methods.  Also, you can spend more time
drawing each pixel, so texture-mapping or Gourand shading becomes an
option.  The Hamlin-Gear "crossing" algorithm looks to be the best bet.
The problem with scanline algorithms is that you have to process each poly
edge once per line of vertical extent.	If the average height of a poly is
25 pixels, then you have 2x25x300 edge process steps for 300 polys (a
clipped and back-poly processed 800 or so poly database), or 15000 steps!
You can only afford 3.3 uS per step on average!  Need very careful coding
(I'm looking at it now).

I'd appreciate any feedback on the above, and discussion of other
techniques.  I will be gone till Dec. 28, so have a virtual Xmas and a 3D
New Year.


--------------------------------------------------------------------------
| My life is Hardware,                    |                              | 
| my destiny is Software,                 |         Dave Stampe          |
| my CPU is Wetware...                    |                              | 
| Anybody got a SDB I can borrow?         | dstamp@watserv1.uwaterloo.ca |
__________________________________________________________________________