From: dstamp@watserv1.waterloo.edu (Dave Stampe-Psy+Eng) Subject: BSP and sorting discussion (longish) Date: Fri, 20 Dec 1991 17:34:48 GMT Message-ID: <1991Dec20.173448.17862@watserv1.waterloo.edu> Organization: University of Waterloo Some thoughts on BSP trees: This discussion is based on the use of BSP trees as a poly sorting technique for VR drawing. All timings are based on 16-bit assembly code on a 486/25 PC. (No, I haven't written the code-- this is an estimate). The idea behind a BSP tree is to subdivide the 3D space based on the planes of the polys to be drawn (i.e consider the poly's surface to be part of a plane that extends to the limits of the "world". Now, the cost of building this structure (and the fact that it tends to chop up all the polys) is offset by the fact that you can use the tree structure developed to "walk" thru the world, and quickly find out what order to draw the polys in to get proper overlapping from any viewpoint (by an inorder tree traversal). So much for the introduction. Now the technical bits, based on these papers: Near Real-time Display of Rigid Objects Fuchs, Abram, Grant Computer Graphics, July '83, v17n3 Optimization of the Binary Space Partition Algorithm (BSP) for the Visualization of Dynamic Scenes Enric Torres EuroGraphics '90 Computer Graphics: Principles and Practice Foley, van Dam, Feiner, Hughes pp 673-680 The "decision" technique used in BSP trees is the concept of the front and rear "space" of a poly. Given a point (either a viewpoint or one of the vertices of another poly) that you wish to find which space it's in: subtract the point from one point on the poly (creating a vector) then take the dot product of this vector with the normal of the poly. We will assume that the normal of a poly is precomputed and stored with the poly data. This will only change if the poly's object is rotated, so it's the "mover's" responsibility to keep the normals current. The above operation takes 3 multiplies and 6 adds/subtracts. This can be done in 5 uS in assembler (assuming 16 bit math for now). This is the basis of much of the BSP timing. Walking the tree: Assume we have a BSP tree: how long does it take to "walk" it? For a simple walk (find every poly's order, decide whether to draw it later) we do one test and some recursive stuff, so call it 7.5 uS per poly. For a tree of 1000 polys, that's 7500 uS (not bad). But you must still eliminate polys outside the visible space, clip polys, and translate to screen coords. The BSP walk can eliminate back-of-object polys (as they "face" away), and you can drop about 40% of the polys this way. Now, why am I talking about 1000 polys when the world database might be only 500 polys typically? Well, during creation of the BSP tree, you end up with about twice the number of polys that you started with: some polys get split. This is one of the disadvantages of the BSP method, as there is some overhead (about 30-50 us) if you draw 2 polys instead of 1. You can eliminate some of the polys (in fact, entire subtrees) that aren't in the viewed space by testing to see if the plane of the poly currently being traversed intersects the view "volume". This volume is typically a pyramid with the "point" at the viewpoint and sides going out to some great depth (not infinity, as you need to have 4 points at the base of the pyramid). If the plane of the poly does not intersect this pyramid (test to see if all 5 points of the pyramid are on the same "side" of the poly's plane) then ignore one subbranch of the tree from that node. This adds 4 point tests (an average of 2, if you use a self-treminating test sequence) because the viewpoint has been tested for the tree traversal. I suspect this test will only eliminate 1/2 of the poly tests, and that many of the polys cut would have been dropped in the first clip test anyway. You now need 13000 uS to walk the tree. Overall, it may not be worth it, depending on how much clipping time is saved. Now that the polys are sorted into drawing order, and clipped etc. you need to draw ALL of them. This means that the screen may be drawn over about 3.5 times (2 times if backward polys are dropped) so you have an added cost over algorithms such as scan-line, which draw the screen once. This cost is 35 mS (estimated) for 320x200x16 screen modes. Building the tree: The big problem with a BSP tree is that if anything in the world moves, you must regenerate the BSP tree. There are methods to reduce the cost of rebuilding the tree, but let's look at a "from scratch" build with 500 polys. The most efficient build (from the 1983 paper) is: - select a poly from the list of unused polys to use as a "root". Try several at random, and check against the other polys to see if this poly would force them to be split (the plane of the candidate root passes between the vertices of the poly). This takes 2.5 tests per poly, or 3x2.5x(nxLOG(n)) in total. You can reuse these test results to build the tree. But every poly is split (on average) once, you you have (avg) 1.5x500 or 750 polys to test. That's a total of 56250 tests or 280 mS. - build a tree from the polys, dividing them into 2 halves based on which side of the selected "root" they are. Some polys will have to be split, and put into both lists (say 20 uS for the split, and one split for each poly adds 10mS. List stuff for the polys adds 1 uS x n x LOG(n) or 7.5 mS - repeat till done. Total time: 280+7.5+10 or (about) 300 mS. This is the best case: the BSP tree is rarely balanced, so assume twice the build time (600 mS). This is far too expensive! So you need methods that let things move about properly. For example, you can generate BSP trees for single objects and either divide the objects with seperating planes (a bit costly to find these planes maybe?) or use some other method to decide what order to draw the objects in, then use the BSP tree of the object to draw it. This will result in some visibility errors, unless you keep objects either convex or keep them from getting too close to (or even inside) each other. In those cases, you might generate a new tree for each object (or combination of 2 objects). Let's look at the cost of generating a BSP tree for a 50-poly object. We have 2.5x3xnxLOG(n) or 2250 tests, and 50 splits and 300 list ops. This is about 3.5 mS for 50 polys, or 8 mS for 100 polys. Not great, but not too bad. The paper by Torres goes into a number of methods to generate BSP trees using seperating planes, object BSP trees, etc. I can't comment yet on how feasible some of this stuff is. But it would be useful if you had areas of the "world" where nothing ever moved, or if you could determine what areas to spilt up beforehand. Other methods: Z-buffer algorithms just are'nt an option for this project. It adds about 0.6 uS per pixel per poly (tripling rendering time) to do the Z-buffer stuff. Priority sorting first sorts the polys by deepest point. This in itself is not enough, so it must perform a series of tests on all polys that overlap each poly in depth. This can be costly for large polys: for example, say 25 polys ovelap a poly in depth. The test for X overlap checks 50 "extents" (max and min X) leaving (say) 6 polys to be checked for Y extent (another 12 extents). This leaves an average of 2 polys to be tested for slope etc. Total time is about 80 uS per poly, or 40 mS for 500 polys. This excludes the Z sort time (another 10 mS or so). This is too long to get 10 frames/sec. Scan-line algorithms guarantee that each screen pixel is drawn to once, saving 35 mS or more over other methods. Also, you can spend more time drawing each pixel, so texture-mapping or Gourand shading becomes an option. The Hamlin-Gear "crossing" algorithm looks to be the best bet. The problem with scanline algorithms is that you have to process each poly edge once per line of vertical extent. If the average height of a poly is 25 pixels, then you have 2x25x300 edge process steps for 300 polys (a clipped and back-poly processed 800 or so poly database), or 15000 steps! You can only afford 3.3 uS per step on average! Need very careful coding (I'm looking at it now). I'd appreciate any feedback on the above, and discussion of other techniques. I will be gone till Dec. 28, so have a virtual Xmas and a 3D New Year. -------------------------------------------------------------------------- | My life is Hardware, | | | my destiny is Software, | Dave Stampe | | my CPU is Wetware... | | | Anybody got a SDB I can borrow? | dstamp@watserv1.uwaterloo.ca | __________________________________________________________________________