From: decwrl!well.sf.ca.us!well!lilj@uunet.UU.NET (Joshua Neil Rubin)
Subject: Re: VR/Video
Date: 20 Apr 91 02:58:22 GMT
Organization: Whole Earth 'Lectronic Link, Sausalito, CA

Let's put aside for a minute the problem of hidden surfaces. I readily concede a single stereopair has insufficient information to allow you to synthesize alternate perspectives of such surfaces.

Taking solely the information from a single stereopair of an object with no hidden surfaces, you can synthesize *any* new view of the object from *any* perspective you might wish. Solely with technology that is 100 years old.

Take the surface of Mars as an example: Assume that 100 years ago you had a stereopair looking straight down onto a bumpy, craggy, mountainous part of Mars. The only really unusual thing about this stretch of terrain is that every bit of surface was in direct line of sight with each of the two cameras taking the stereopair. (This eliminates the hidden surface problem.)

Using only these two photos, by using some basic principles of stereoscopic arithmetic which have been known since at least the time of Wheatstone in the 1800's (before even the invention of photography, actually), an accurate ruler, a calculator, and some clay, you could easily (albeit tediously) create a perfectly accurate three-dimensional model of that terrain. And you could look at it from any angle you chose.

As I see it, the problem in quickly synthesizing a new computer-generated virtual perspective of a scene from a single stereopair isn't that you need skillabytes of data. The problem is that you need sophisticated object recognition programs to recognize what stereographers call the "homologous points" in the two images which make up the stereopair. These are, as the name implies, the two points, one per image in a stereopair, which represent the same location in actual space.
(You derive depth information from a stereopair by comparing the distance between two points on one image of the stereopair with the distance between the two homologous points in the other image.)

Humans can pick out homologous points easily enough. In fact we do it automatically whenever we use depth perception. Computers currently have a harder time than we humans do parsing scenes into objects and recognizing analogies between imperfectly-matched patterns.

Once the homologous points have been identified, however, it is a simple matter to do the arithmetic required to reconstruct the relative depths of the various points in the scene. I'll grant you that we're talking about immense amounts of computing speed and power and memory to do all of that object recognition so fast.
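
To make the "simple arithmetic" concrete, here is a minimal sketch in Python. It is not from the original post and it assumes more than the post states: two identical pinhole cameras with parallel optical axes, a known focal length and a known baseline (the distance between the cameras). Under those assumptions, depth is focal length times baseline divided by disparity. The function names and all the measurements are hypothetical, chosen only for illustration.

```python
def depth_from_disparity(x_left, x_right, focal_length, baseline):
    """Depth of one scene point from the x-coordinates of its two
    homologous image points (assumes parallel-axis pinhole cameras)."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length * baseline / disparity


def separation_difference(a_left, b_left, a_right, b_right):
    """Separation of points A and B in the left image minus their
    separation in the right image. Algebraically this equals
    (disparity of B) - (disparity of A)."""
    return (b_left - a_left) - (b_right - a_right)


# Hypothetical measurements, all in millimetres on the image plane.
FOCAL_LENGTH = 50.0
BASELINE = 65.0
a_left, a_right = 10.0, 8.0    # point A: disparity 2.0
b_left, b_right = 14.0, 13.0   # point B: disparity 1.0

depth_a = depth_from_disparity(a_left, a_right, FOCAL_LENGTH, BASELINE)
depth_b = depth_from_disparity(b_left, b_right, FOCAL_LENGTH, BASELINE)
sep_diff = separation_difference(a_left, b_left, a_right, b_right)

print("depth of A:", depth_a)
print("depth of B:", depth_b)
print("separation difference:", sep_diff)
```

The second function mirrors the ruler method described in the parenthetical above: measuring how far apart two points are in each image and subtracting gives the difference of their disparities, which is what tells you which point is farther away. Here B has the smaller disparity, so it comes out deeper than A.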