From: John Costella <jpc@tauon.ph.unimelb.edu.au>
Subject: PAPER: Galilean Antialiasing for VR, Part 01/04
Date: Mon, 26 Oct 92 5:17:23 EET


[Co-mod (Mark):
	John Costella (jpc@tauon.ph.unimelb.edu.au) recently sent me this 
	paper to peruse and post, and I have found it very enlightening...
	Hopefully, you will as well!  
	This paper has been divided into 4 sections.  To create the 
	document, save these 4 messages to files, trim off the headers,
	and then concatenate the files with your favorite command.
	( cat file1 file2 file3 file4 > file.tex, or use an editor )
	Then, run latex on the resulting file, *twice*, in order to 
	process all of the references correctly.  

	Or, if you don't feel like going through that, just anonymously
	ftp to ftp.u.washington.edu, cd to directory public/virtual-worlds/
	currentPapers, and you'll find the .tex, .dvi, and .ps versions
	of this paper.  Feel free to mail me at deloura@cs.unc.edu if you are
	unfamiliar with anonymous ftp.

	I hope you enjoy it!
		---Mark
]

%  File 1 of 4.  NOTE: All four files MUST be concatenated
%                before this document can be LaTeXed.
%
%
%  Galilean Antialiasing for Virtual Reality Displays
%  --------------------------------------------------
%  John P. Costella, School of Physics, The University of Melbourne
%
%  Abstract:
%  --------
%  In this paper, a method is described that improves the perceived 
%  "smoothness" of motion depicted on rasterised Virtual Reality
%  displays, by utilising the powerful information already contained 
%  in the virtual-world engine. Practical implementation of this 
%  method requires a slight realignment of one's view of the nature 
%  of a rasterised display, together with modest modifications to 
%  current image generation and rasterisation hardware. However, the 
%  resulting improvement in the quality of the perceived real-time 
%  image is obtained for only modest computational and hardware cost
%  --offering the possibility of increasing the *apparent* graphical 
%  capabilities of existing technology by up to an order of magnitude.
%
%
%  Copyright (C) 1992 John P. Costella.
%
%  NOTE: The body of this document is written in the LaTeX format.
%        It can be "read" reasonably with a text editor, but for full
%        formatting should be LaTeXed, and thence viewed or printed.
%
%
%  Bibliographical information:
%  ---------------------------
%  Title: "Galilean Antialiasing for Virtual Reality Displays"
%  Author: John P. Costella
%  Institution:  School of Physics, The University of Melbourne,
%      Parkville, Victoria 3052, Australia
%  E-mail: jpc@tauon.ph.unimelb.edu.au
%  Telephone: +61 3 543-7795 (voice); +61 3 347-4783 (fax)
%  Length of LaTeXed document: 92 pages
%  Submitted to: Usenet Sci.virtual-worlds (electronic)
%  Internal revision number: 1.0
%  Date of revision 0.0: 16 October 1992
%  Date of this revision: 25 October 1992
%
%  -------------------------------------------------------------------
%
%  The following commands are macros that simplify and improve the
%  consistency of LaTeX papers. If using a text editor, skip to the 
%  next ``horizontal line'' (like the one above) for the body of the 
%  paper proper.
%  
%
%  Document type.
%
\documentstyle{article}
%
%  Shorter versions of newcommands.
%
\newcommand{\nc}{\newcommand}
\nc{\rnc}{\renewcommand}
%
%  Type of document.
%
\nc{\typeofdoc}{paper}
\nc{\Typeofdoc}{Paper}
%
%  Equation, section and reference macros.
%
\nc{\chname}{}
\nc{\beqn}[1]{\begin{equation}\label{Eqn:\chname#1}} 
\nc{\eeqn}{\end{equation}}  
\nc{\beqnarr}[1]{\begin{eqnarray}\label{Eqn:\chname#1}}  
\nc{\eeqnarr}{\end{eqnarray}}  
\nc{\beqnarrn}{\begin{eqnarray}}
\nc{\eeqnarrn}{\end{eqnarray}} 
\nc{\beqnarrnn}{\begin{eqnarray*}}  
\nc{\eeqnarrnn}{\end{eqnarray*}}  
\nc{\nline}{\nonumber \\} 
\nc{\eql}[1]{\label{Eqn:\chname#1}}  
\nc{\eq}[1]{(\ref{Eqn:\chname#1})} 
\nc{\fareq}[2]{(x.x)}  
\nc{\fArEqsUb}[2]{(\ref{Eqn:#1-#2})}  
\nc{\dummy}{\mbox{}}   
%
\nc{\newchap}[2]{\chapter{#2}\label{Chap:#1}\rnc{\chname}{#1-}}  
\nc{\chap}[1]{\ref{Chap:#1}}  
%
\nc{\newsect}[2]{\section{#2}\label{Sec:\chname#1}}
\nc{\sect}[1]{\ref{Sec:\chname#1}}  
\nc{\farsect}[2]{x.x} 
\nc{\fArsEctsUb}[2]{\ref{Sec:#1-#2}}  
%
\nc{\newssect}[2]{\subsection{#2}\label{SubSec:\chname#1}}  
\nc{\ssect}[1]{\ref{SubSec:\chname#1}}  
\nc{\farssect}[2]{x.x}  
\nc{\fArssEctsUb}[2]{\ref{SubSec:#1-#2}}  
%
%  Latin and other italicised phrases.
%
\nc{\e}[1]{\/{\em #1\/}}  
\nc{\newterm}[1]{\e{#1}}  
\nc{\ie}{\e{i.e.}}
\nc{\eg}{\e{e.g.}}
\nc{\viz}{\e{viz.}}
\nc{\etal}{\e{et al.}}
\nc{\etc}{\e{etc.}}
\nc{\apriori}{\e{a priori}}
%
%  Put accents and diereses back in to some words.
%
\nc{\coord}{co\"{o}rdinate}
\nc{\noone}{no\"{o}ne}
\nc{\role}{r\^{o}le}
\nc{\debacle}{d\'{e}b\^{a}cle}
\nc{\naive}{na\"{\i}ve}
\nc{\coin}{co\"{\i}n}
\nc{\coo}{co\"{o}}
\nc{\rei}{re\"{\i}}  
\nc{\ree}{re\"{e}}  
\nc{\rea}{re\"{a}}  
%
%  Set up some mathematical symbols that either aren't done
%  or not done too well or are inconvenient with native LaTeX.
%
\nc{\tb}{&\!\!\!\!\dummy}  
\nc{\tbnd}{&\!\!\!\!}  
\nc{\paren}[1]{\left(#1\right)}  
\nc{\leftparen}[1]{\left(#1\right.}  
\nc{\rightparen}[1]{\left.#1\right)}  
\nc{\brac}[1]{\left[#1\right]} 
\nc{\leftbrac}[1]{\left[#1\right.}  
\nc{\rightbrac}[1]{\left.#1\right]}  
\nc{\braces}[1]{\left\{#1\right\}} 
\nc{\leftbrace}[1]{\left\{#1\right.}  
\nc{\rightbrace}[1]{\left.#1\right\}}  
\nc{\modsign}[1]{\left|#1\right|}  
\nc{\rightmod}[1]{\left.#1\right|}  
\nc{\leftmod}[1]{\left|#1\right.}  
%
\nc{\txt}[1]{{\rm#1}}  
\nc{\vdot}{\!\cdot\!}  
\nc{\br}[1]{\overline{#1}}  
\nc{\dotpr}[2]{(#1\vdot#2)} 
\nc{\littlefrac}[2]{{\scriptstyle\frac{#1}{#2}}}  
\nc{\f}[2]{{\displaystyle\frac{#1}{#2}}} 
\nc{\vect}[1]{\mbox{\boldmath{$#1$}}}  
\nc{\vcapdot}[1]{\dot{\vect{#1}\,}\!}  
\nc{\vcapddot}[1]{\ddot{\vect{#1}\,}\!}  
\nc{\gcapdot}[1]{\dot{#1\,}\!} 
\nc{\gcapddot}[1]{\ddot{#1\,}\!}  
\nc{\gcap}[1]{{\it#1}}  
%
\nc{\ten}[1]{10^{#1}}  
\nc{\byten}[1]{\times10^{#1}} 
\nc{\degrees}{^{\circ}}  
\nc{\cross}{\!\times\!} 
\nc{\half}{\frac{1}{2}}  
\nc{\quarter}{\frac{1}{4}}  
\nc{\parenfracpower}[3]{\paren{\f{#1}{#2}}^{\!\!#3}} 
\nc{\pard}{\partial}  
\nc{\scr}[1]{{\cal#1}}  
\nc{\id}{\equiv}  
\nc{\dual}[1]{\widetilde{#1}}  
\nc{\what}[1]{\widehat{#1}} 
\nc{\del}{\nabla}  
\nc{\dash}{\prime} 
\nc{\dAlem}{\Box^2}  
\nc{\artan}{\txt{artan}}
\nc{\arsin}{\txt{arsin}}      
%
%  Some short-cuts for greek letters. Capitals are also italicised. 
%
\nc{\al}{\alpha}
\nc{\be}{\beta}
\nc{\g}{\gamma}
\nc{\de}{\delta}
\nc{\eps}{\varepsilon}
\nc{\epsil}{\epsilon}
\nc{\z}{\zeta}
\nc{\et}{\eta}
\nc{\th}{\theta}
\nc{\varth}{\vartheta}
\nc{\io}{\iota}
\nc{\k}{\kappa}
\nc{\la}{\lambda}
\nc{\m}{\mu}
\nc{\n}{\nu}
\nc{\x}{\xi}
\nc{\p}{\pi}
\nc{\varp}{\varpi}
\nc{\r}{\rho}
\nc{\varr}{\varrho}
\nc{\s}{\sigma}
\nc{\vars}{\varsigma}
\nc{\ta}{\tau}
\nc{\ups}{\upsilon}
\nc{\ph}{\phi}
\nc{\varph}{\varphi}
\nc{\ch}{\chi}
\nc{\ps}{\psi}
\nc{\om}{\omega}
%
\nc{\G}{\gcap{\Gamma}}
\nc{\D}{\gcap{\Delta}}
\nc{\Th}{\gcap{\Theta}}
\nc{\La}{\gcap{\Lambda}}
\nc{\X}{\gcap{\Xi}}
\nc{\Py}{\gcap{\Pi}}
\nc{\Si}{\gcap{\Sigma}}
\nc{\Ups}{\gcap{\Upsilon}}
\nc{\Ph}{\gcap{\Phi}}
\nc{\Ps}{\gcap{\Psi}}
\nc{\Om}{\gcap{\Omega}}
%
%  Reference list commands. 
%
\nc{\bib}[1]{\bibitem{#1}}  
%
\nc{\paper}[5]{#1, {\it #2\/}\ {\bf #3} (#4) #5}   
\nc{\book}[7]{#1, {\it #2\/}#3\ (#4, #5, #6)#7} 
\nc{\booknoyr}[6]{#1, {\it #2\/}#3\ (#4, #5)#6}  
\nc{\booknocity}[6]{#1, {\it #2\/}#3\ (#4, #5)#6}  
% 
%  Paper-specific macros.
%
\nc{\Galn}{Galilean}
\nc{\Gal}[1]{G^{(#1)}}
\nc{\Anti}{Antialias}
\nc{\anti}{antialias}
\nc{\VR}{Virtual Reality}
\nc{\RR}{Real Reality}
\nc{\FS}{Flinders Street}

\nc{\vx}{\vect{x}}
\nc{\vv}{\vect{v}}  
\nc{\vi}{\vect{i}}
\nc{\vj}{\vect{j}}
\nc{\va}{\vect{a}}
\nc{\vb}{\vect{b}}
\nc{\vn}{\vect{n}}
\nc{\floor}{\mbox{floor}}
%
% --------------------------------------------------------------------
%
%     ---->    The LaTeX-formatted paper begins here...   <----
%
\begin{document}
%
\title{\Galn\ \Anti ing for \VR\ Displays}
\author{John P.\ Costella \\
  {\small\it School of Physics, The University of Melbourne, 
  Parkville, Vic.\ 3052, Australia}}
\date{25 October 1992}
\maketitle
%
\begin{abstract}
In this \typeofdoc, a method is described that improves the perceived 
``smoothness'' of motion depicted on rasterised \VR\ displays, by
utilising the powerful information already contained in the
virtual-world engine.
Practical implementation of this method requires a slight \rea lignment
of one's view of the nature of a rasterised display, together with 
modest modifications to current image generation and rasterisation 
hardware.
However, the resulting improvement in the quality of the perceived
real-time image is obtained for only modest computational 
and hardware cost---offering the possibility of 
increasing the \e{apparent} graphical capabilities of existing
technology by up to an order of magnitude.
\end{abstract}
%
\newsect{Intro}{Introduction}

Definitions of the term \e{\VR} abound these days.
However, they all share one common thread: a successful \VR\ system
\e{convinces} you that you are somewhere other than where you really are.
Despite engineers' wishes to the contrary, just \e{how} convincing  
this experience is seems to depend only
weakly on raw technical statistics like megabits per second; 
rather, more important is how well this information is ``matched''
to the expectations of our physical senses.
While our uses for \VR\ might encompass virtual worlds bearing little
resemblance to the real world, our physical senses nevertheless still
expect to be stimulated in the same ``natural'' ways that they have for 
millions of years.

Two particular areas of concern for designers of current \VR\ systems
are the \e{latency} and \e{update rate} of the visual display hardware
employed.
Experience has shown that poor performance in either of these two areas 
can quickly destroy the ``realness'' of a \VR\ session, 
even if all of the 
remaining technical specifications of the hardware are impeccable.
This is perhaps not completely surprising, given that the images of
objects that humans interact with in the natural world are never delayed 
by more than fractions of milliseconds, nor are they ever ``sliced'' into 
discrete time frames (excluding the intervention of the technology of the 
past hundred years).
On the other hand, the relative psychological importance of latency and 
update rate can be used to great
advantage: with suitably good 
performance on these two fronts, the ``convinceability factor'' of a \VR\ 
system can withstand relatively harsh degradation of its other 
capabilities (such as image resolution).
``If it \e{moves} like a man-eating beast, it probably \e{is} a 
man-eating beast---even if I didn't see it clearly''
is no doubt a hard-wired feature of our internal image processing 
subsystem that is responsible for us still being on the planet today.

A major problem facing the \VR\ system designer, however, is that 
presenting a sufficiently ``smooth'' display to fully convince the viewer
of ``natural motion'' often seems to require an unjustifiably high 
computational cost.
As an order-of-magnitude estimate, a rasterised display update rate of 
around 100 updates per second
is generally fast enough that the human visual system cannot
distinguish it from continuous vision.
But the actual amount of information gleaned from these images by
the viewer in one second is nowhere near the amount of information 
containable in 100 static images---as can be simply verified by
watching a ``video montage'' of still photographs presented 
rapidly in succession.
The true rate of ``absorption'' of detailed visual information is probably
closer to 10 updates per second---or worse, depending on how much actual
detail is taken as a benchmark.
Thus, providing a completely ``smooth'' display takes roughly an order of
magnitude more effort than is ultimately appreciated---not unlike 
preparing a magnificent dinner for twelve and then having \noone\ else
turn up.

For this reason, many \VR\ designers make an educated compromise between
motion smoothness and image sophistication, by choosing an update
rate that is somewhere between the $\sim\!10$ updates per 
second rate that we
absorb information at, and the $\sim\!100$ updates per second rate needed
for smooth apparent motion. 
Choosing a rate closer to 10 updates per second requires that the
participant mentally ``interpolate'' between the images 
presented---not
difficult, but nevertheless requiring some conscious processing, which
seems to leave fewer ``brain cycles'' for appreciating the virtual-world
experience.
On the other hand, choosing a rate closer to 100 updates per second
results in natural-looking motion, but the reduced time available
for each update reduces the sophistication of the graphics---fewer
``polygons per update''.
The best compromise between these two extremes depends on the application
in question, the expectations of the participants, and, probably most
importantly, the opinions of the designer.

In the remaining sections of this \typeofdoc, we outline enhancements to
current rasterised display technology that permit the motion of objects
to be consistently displayed at a high update rate, while allowing the
image generation subsystem to run at a lower update rate, and 
hence provide
more sophisticated images.
Section~\sect{BasicPhilosophy} presents an overview of the issues that
are to be addressed, and outlines 
the general reasoning behind
the approach that is taken.
Following this, in section~\sect{MinimalImplementation}, detailed (but
platform-independent)
information is provided that would allow a \VR\ designer to
``retrofit'' the techniques outlined in section~\sect{BasicPhilosophy}
to an existing system.
For these purposes, as much of the current image generation 
design philosophy as
possible is retained, and only those minimal changes required to
implement the techniques immediately are described.
However, it will be shown that the full benefit of the methods
described in this \typeofdoc, in terms of the specific needs of
\VR, will most fruitfully be obtained
by subtly changing the way in which the image generation process is
currently structured.
These changes, and more advanced topics not
addressed in section~\sect{MinimalImplementation}, are 
considered in section~\sect{Enhancements}.

\newsect{BasicPhilosophy}{The Basic Philosophy}
We begin, in section~\ssect{CurrentRasters}, 
by reviewing the general methods by which current rasterised 
displays are implemented, to appreciate more fully why the problems
outlined in section~\sect{Intro} are present in the first place,
and to standardise the terminology that will be used in later sections.
Following this, in section~\ssect{Motion}, we review some of 
the fundamental physical principles underlying our understanding of
motion, to yield further insight into the problems of 
depicting it accurately on display devices.
These deliberations are used in section~\ssect{GalAnti} to pinpoint
the shortcomings of current rasterised display technology, and to
formulate a general plan of attack to rectify these problems.
A brief introduction to the terminology used for the 
general structures required to carry out these
techniques is given in section~\ssect{Galpixels}---followed, in
section~\ssect{GalpixmapStructure}, by a careful
consideration of the level of sophistication required for practical
yet reliable systems.
Specific details about the hardware and software modifications
required to implement the methods outlined are deferred to 
sections~\sect{MinimalImplementation} and~\sect{Enhancements}.

\newssect{CurrentRasters}{Overview of Current Rasterised Displays}
Rasterised display devices for computer applications, while ubiquitous 
in recent years, are relatively new devices.
Replacing the former \e{vector display} technology, they took advantage
of the increasingly powerful memory devices that became
available in the early 1970s, to represent the display as a
digital matrix of pixels, a \e{raster} or \e{frame buffer},
which was scanned out to the CRT line-by-line in the same fashion 
as the by then well-proven technology of \e{television}.

The electronic circuitry responsible for scanning the image out from
the frame buffer to the CRT (or, these days, to whatever display 
device is being used)
is referred to as the \e{video controller}---which may be as simple 
as a few interconnected electronic
devices, or as complex as a sophisticated microprocessor.
The \e{refresh rate} may be defined as the reciprocal
of the time required for the video
controller to refresh the display from the frame buffer, and is
typically in the range 25--120~Hz, 
to both avoid visible flicker, and to allow smooth
motion to be depicted.
(Interlacing is required for CRTs at the lower end of this range to avoid
visible flicker; for simplicity, we assume that the display is 
\e{non-interlaced} with a suitably high refresh rate.)
Each complete image, as copied 
by the video controller from the frame buffer to the 
physical display device,
is referred to as a \e{frame}; this term can also be used to refer to
the time interval between frame refreshes, \ie\ the reciprocal of the
refresh rate.

Most \VR\ systems employ two display devices to present a stereoscopic
view to the participant, one display for each eye.
Physically, this may be implemented as two completely separate display
subsystems, with their own physical output devices (such as
a twin-LCD head-mounted display).
Alternatively,
hardware costs may be reduced by interleaving the two video signals 
into a single physical output device, and relying on physically
simpler (and sometimes less ``face-sucking'') demultiplexer techniques
to separate the signals,
such as time-domain switching (\eg\ electronic shutter glasses),
colour-encoding (\eg\ the ``3-D'' coloured glasses of the
1950s), optical polarisation (for which we are fortunate that
the photon is a vector boson, and that we have only two eyes), 
or indeed any other physical attribute capable of distinguishing
two multiplexed optical signals.
For the purposes of this \typeofdoc, however, we define a \e{logical
display device} to be \e{one} of the video channels in a twin-display
device (say, the left one), or else the corresponding \e{effective} 
monoscopic display device for a multiplexed system.
For example, a system employing a single (physical)
120-Hz refresh-rate CRT, 
time-domain-multiplexing the left and right video channels into
alternate frames, is considered to possess two \e{logical} display
devices, each running at a 60~Hz refresh rate.
In general, we ignore completely 
the engineering problems (in particular, \e{cross-talk})
that multiplexed systems must contend with.
Indeed, for most of this \typeofdoc, we shall ignore 
the stereoscopic nature of \VR\ displays altogether,
and treat each video channel separately; 
therefore, all references to the term
``display device'' in
the following sections refer to \e{one} of the two
logical display devices, with the understanding that the other video channel
simply requires duplicating the hardware and software of the first.

We have described above how the video controller scans frames out
from the frame buffer to the physical display device.
The frame buffer, in turn, receives its information from 
the \e{display processor} (or, in simple
systems, the CPU itself---which we shall also refer to as ``the display
processor'' when acting in this \role).
Data for each pixel in the frame buffer is retained unchanged
from one frame
to the next, unless it is overwritten by the display processor in the
intervening time.
There are many applications for computer graphics for which this
``sample-and-hold'' nature of the frame buffer is very useful:
a background scene can be painted into the frame buffer once, and
only those objects that change their position or shape from frame to
frame need be redrawn (together with repairs to the background area
thus uncovered).
This technique is often well-suited to traditional computer hardware
environments---namely, those in which the display device is 
physically fixed in
position on a desk or display stand---because a constant background
view accords well with the view that we are using the display as a
(static) ``window'' on a virtual world.
However, this technique is, in general, ill-suited to \VR\ 
environments, in which the display
is either affixed to, or at least in some way ``tracks'', the 
viewer, and thus must change even
the ``background'' information constantly as the viewer moves her head.

There are several problems raised by this requirement of constant updating  
of the entire frame buffer. 
Firstly, if the display processor proceeds to write into the frame buffer
at the same time as the video controller is refreshing the display,
the result will often be an excessive amount of visible flicker, as 
partially-drawn (and, indeed, partially-erased) objects are
``caught with their pants down'' during the refresh.
Secondly, once the proportion of the image requiring regular updating rises
to any significant fraction, it becomes more computationally cost-effective
to simply erase and redraw the entire display than 
to erase and redraw only the individual
objects that have changed.
Unless the new scene can be redrawn in significantly less time than
a single frame (untrue in any but the most trivial situations),
the viewer will see not a succession of complete views of a
scene, but rather a succession of scene-building drawing operations.
This is often acceptable for ``non-immersive'' applications such as CAD
(in which this ``building process'' can indeed often be most informative); 
it is not acceptable, however, for any convincing ``immersive'' 
application such as \VR.

The standard solution to this problem is \e{double buffering}.
The video controller reads its display information from one frame buffer,
whilst at the same time a second frame buffer is written to by the
display processor.
When the display processor has finished rendering a complete image, 
the video controller is instructed to switch to the second frame buffer 
(containing the new image), 
simultaneously switching the display processor's focus
to the first frame buffer (containing the now obsolete image that the
video controller was formerly displaying). 
With this technique, the viewer sees one constant 
image for a number of frames, until the new image has been completed.
At that time, the view is instantaneously switched to the new image,
which remains in view until yet another complete image is available.
Each new, completed image is referred to as an \e{update}, and the rate
at which these updates are forthcoming from the display processor is
the \e{update rate}.

It is important to note the difference between \e{refresh} rate and 
\e{update} rate, and the often-subtle physical 
interplay between the two.
The refresh rate is the rate at which the video controller reads images
from the frame buffer to the display device, and is
typically a constant for a given hardware configuration (\eg\ 70~Hz).
The update rate, on the other hand, is the rate at which complete
new images of the scene in question are rendered; it is generally lower
than the refresh rate, and usually depends to a greater or lesser
extent on the complexity
of the image being generated by the display processor.

It is often preferable to change the video controller's focus only 
\e{between} frame refreshes to the physical display device---and 
not mid-frame---especially if
the update rate of the display processor is comparable 
to the refresh rate.
This is because switching the video controller's focus mid-frame 
``chops'' those objects in the particular scan line that is being 
scanned out by the video controller at the time, which leads to
visible discontinuities in the perceived image.
On the other hand, this restriction means that the display processor
must wait until the end of a frame to begin drawing a new image (unless
a third frame buffer is employed), effectively forcing the refresh-to-update
rate ratio up to the next highest integral value.
This is most
detrimental when the update rate is already high; for example,
if an update takes only 1.2 refresh periods to be drawn, 
``synchronisation'' with the frame buffer means that the remaining 
0.8 of a refresh period is unusable for drawing operations.
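
As a rough worked illustration of this rounding effect (the symbols $r$,
$f_\txt{refresh}$, $f_\txt{update}$ and $\ta_\txt{idle}$ are introduced here 
for convenience only, and the 70~Hz figure is merely an example value), 
suppose that an update takes $r$ refresh periods to draw.
With frame-synchronised switching and only two frame buffers, the
refresh-to-update ratio becomes $\lceil r\rceil$, so that
\[
f_\txt{update}=\f{f_\txt{refresh}}{\lceil r\rceil},
\qquad
\ta_\txt{idle}=\paren{\lceil r\rceil-r}\ta_\txt{refresh},
\]
where $\ta_\txt{refresh}=1/f_\txt{refresh}$ is the refresh period and
$\ta_\txt{idle}$ is the time per update that is lost to waiting.
For $r=1.2$ and $f_\txt{refresh}=70$~Hz, this gives
$f_\txt{update}=70/2=35$~Hz, compared with the $70/1.2\approx58$~Hz that
unsynchronised switching would allow; the idle time is
$0.8\,\ta_\txt{refresh}\approx11$~ms per update.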

Regardless of whether the frame-switching circuitry is synchronised
to the frame rate or not, if the update rate of the display processor 
is in fact slower than the refresh
rate (the usual case), then the same static image persists on the display 
device for a number of frames,
until a new image is ready for display.
For \e{static} objects on the display, this ``sample-and-hold'' 
technique is ideal: 
the image's motion (\ie\ no motion at all!) is correctly
depicted at the (high) refresh rate, even though the image itself is
only being generated at the (lower) update rate.
This phenomenon, while appearing quite trivial in today's 
rasterised-display world, 
is in fact a major advance over the earlier vector-display technology: 
the video controller, utilising the frame buffer, effectively \e{fills
in the information gaps} between the images supplied by the display
processor.
Recognition of the remarkable power afforded by this feat of
``interpolation''---and, more importantly, a critical
assessment of how this ``interpolation'' is currently carried out---is
essential to appreciating the modifications that will be suggested  
shortly.

As mentioned in section~\sect{Intro}, the \e{latency} (or ``time lag'')
of a \VR\ system in
general, and the display system in particular, is crucial for the
experience to be convincing (and, indeed, non-nauseous).
There are many potential and actual sources of latency in such 
systems; in this \typeofdoc, we are concerned only with those 
introduced by the 
image generation and display procedures themselves.
Already, the above description of a double-buffered display system
contains a number of potential sources of lag.
Firstly, if the display processor computes the apparent positions of 
the objects in the image based on positional information valid at the
\e{start} of its computations, these apparent positions will already
be slightly out of date by the time the computations are complete.
Secondly, the rendering and scan-conversion of the objects takes more
time, and is based on the (already slightly outdated) positional information.
Finally---and perhaps most subtly---the very ``sample-and-hold'' nature of
the video controller's frame buffer leads to a
significant average time lag itself, equal to \e{half the update period}.
While a general mathematical proof of this figure
is not difficult, a ``hand-waving''
argument is easily constructed.
For simplicity, assume that all other lags in the graphical pipeline are 
magically removed, so that,
upon the first refresh of a new update, it describes the
virtual environment at that point in time accurately.
By the time of the second refresh of the same image, it is now one
frame out-of-date; by the third refresh, it is two frames out-of-date;
and likewise for all remaining refreshes of the
same image until a new update is provided.
By the ``hand-waving'' argument of simply averaging the out-of-datedness
of each refresh across the entire update period, one obtains
\[
\left<\ta_\txt{\,lag}\right>\sim\f{1}{\ta_\txt{update}}
  \int_0^{\ta_\txt{update}}t\,dt
  =\half\ta_\txt{update},
\]
where $\ta_\txt{update}$ is the update period.
Thus, a long update period not only affects the \e{smoothness} of the
perceived display, but also its \e{latency}---making it
a particularly insidious enemy of real-time \VR\ systems, and
a doubly worthy target of our attention.
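
The same estimate can be sketched for discrete refreshes; the symbols $N$,
$k$ and $\ta_\txt{refresh}$ are introduced here purely for illustration.
If each update is displayed for $N$ refreshes, so that
$\ta_\txt{update}=N\ta_\txt{refresh}$, and the $k$-th refresh
($k=0,1,\ldots,N-1$) is $k\,\ta_\txt{refresh}$ out of date, then
\[
\left<\ta_\txt{\,lag}\right>
  =\f{1}{N}\sum_{k=0}^{N-1}k\,\ta_\txt{refresh}
  =\f{N-1}{2}\,\ta_\txt{refresh}
  =\half\ta_\txt{update}-\half\ta_\txt{refresh}.
\]
This sum counts the staleness only at the instant each refresh begins;
the integral above instead lets it grow continuously, which accounts
for the extra $\half\ta_\txt{refresh}$.
As an example, a 60~Hz refresh rate with $N=4$ (a 15~Hz update rate)
gives $1.5\times16.7\approx25$~ms from the sum, and
$\half\times66.7\approx33$~ms from the integral.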

It is this undesirable feature of conventional display methodology that we
will aim to remove in this \typeofdoc.
However, to provide suitable background for the approach we shall take,
and to put our later specifications into context, we first review some
quite general considerations on the nature of physical motion.

\newssect{Motion}{The Physics of Motion}
As noted in section~\sect{Intro}, while our applications for \VR\ 
technology may encompass virtual worlds far removed from the laws of
physics, our physical senses nevertheless expect to be stimulated more or
less in the same way that they are in the real world.
It is therefore useful to review briefly the evolution of man's knowledge
about the fundamental nature of motion, and note how well these views have
or have not been incorporated into real-time computer graphics. 

Some of the earliest questions about the
nature of motion that have survived to this day are due to Zeno of Elea.
His most \e{famous} 
paradox---that of Achilles and the Tortoise---is amusing to
this day, but is nevertheless more a question of mathematics than physics.
More interesting is his paradox of the Moving Arrow: At any instant in
time, an arrow occupies a certain position.
At the next instant of time, the arrow has moved forward somewhat.
His question, somewhat paraphrased, was: How does the arrow know how to get
to this new position by the very next instant?
It cannot be ``moving'' at the first instant in time, because an instant
has no duration---and motion cannot be measured except over some duration.

Let us leave aside, for the moment, the flaws that can be so quickly
pointed out in this argument by anyone versed in modern physics.
Consider, instead, what Zeno would say to us if we travelled back
in time in our Acme Time Travel Machine, and showed him a television
receiver displaying a broadcast of an archery
tournament.
(Ignore the fact that, had television programmes been in existence
two and a half thousand years ago, Science as we know it would 
probably not exist.)
Zeno would no doubt be fascinated to find that the arrows that moved
so realistically across the screen were, in fact, a \e{series of
static images} provided in rapid succession---in full agreement (or
so he would think) with his ideas on the nature of motion.
The question that would then spring immediately to his lips: \e{How
does the television know how to move the objects on the screen?}

Our response would, no doubt, be that the television \e{doesn't}
know how to move the objects; it simply waits for the next frame (from
the broadcasting station) which shows the objects in their new positions.
Zeno's follow-up: How does the \e{broadcasting station}
know how to move them?
Answer: It doesn't either; it just sends whatever 
images the video camera measures.
And eventually we return to Zeno's original question: How does the 
real arrow 
itself ``know'' how to move?
Ah, well, that's a question that even television cannot answer.

Ignoring for the moment the somewhat ridiculous nature of this 
hypothetical exchange, consider Virtual Zeno's first question
from first principles.
Why \e{can't} the television move the objects by itself?
Surely, if the real arrow somehow knows how to move, then it is not
unreasonable that the television might obtain this knowledge too. 
The only task then, is to determine this information, and tell it to the
television!
Of course, this is a little simplistic, but let us fast-forward our
time machine a little and see what answers we obtain.

Our next visit would most likely be to Aristotle.
Asking him about Zeno's arrow paradox would yield his well-known
answer---that would, in fact, be regarded as the ``right answer'' for the
next 2300 years: Zeno wrongly assumes that indivisible
``instants of time'' exist at all.
Granting Aristotle this explanation of Zeno's mistake, what would his
opinions be regarding ``teaching'' the television how to move the
objects on its own?
His response, no doubt, would be to explain that every object has its
\e{natural place}, and that its \e{natural motion} is such that it moves
towards its natural place, thereafter remaining at rest 
(unless subsequently subjected to \e{violent motions}).
Heartened by this news, we ask him for a mathematical formula
for this natural motion, so that we can teach it to our television.
``Ah, well, I don't think too much of mathematical formul\ae,'' he
professes, engrossed in a re-run of \e{I Love Lucy}, ``although I can
tell you that heavier bodies fall faster than light ones.''
So much for an Aristotelian solution to our problem.

Undaunted, we tweak our time machine forward somewhat---2000 years, in
fact.
Here, we find the ageing Galileo Galilei ready and willing to answer our 
questions. 
On asking about Zeno's arrow paradox, we find a general agreement with
Aristotle's explanation of Zeno's error.
On the other hand, on enquiring how a television might be taught how
to move objects on its own, we obtain these simple answers: 
If the body is in \e{uniform motion}, it moves according to $x=x_0+vt$; if
it is \e{uniformly accelerated}, it moves according to $x=x_0+v_0t+
\half at^2$. 
Furthermore, \e{gravity} acts as a uniform acceleration---changing the
velocity of an object smoothly (and not discontinuously, as earlier
propounded); and, what is more, this rate of acceleration is a constant
for every object.
To this, one must add any accelerations other than gravity (such as that
imparted by someone throwing a ball).
If we teach these principles to our television, 
he explains, it \e{will} then know how to move objects by itself.
And thus, from the first modern physicist, we get the information we 
desire---meanwhile leaving him fascinated by images of 
small white projectiles 
following parabolic paths, subtitled ``British Open Highlights''.

The tale woven in this section is, admittedly, a little fanciful,
but nevertheless illustrates most clearly the thinking behind the
methods to be expounded.
Very intriguing, but omitted from this account, is the fact that 
Aristotle's solution to Zeno's arrow paradox, which remained 
essentially unchanged throughout the era of Galilean relativity and
Newtonian mechanics, suffered a mortal blow three-quarters of a century
ago.
We now know that, ultimately, the ``smooth'' nature of space-time 
recognised by Galileo and Newton, and which underwent a relatively
benign ``warping'' in Einstein's classical General Relativity, must
somehow be fundamentally composed of quantum mechanical ``gravitons'';
unfortunately, \noone\ knows exactly how.
Zeno's very question, ``How does anything move at all?'', is again
\e{the} unsolved problem of physics.
But that is a story for another place.
Let us therefore return to the task at hand, and utilise the method
we have gleaned from seventeenth century Florence.

\newssect{GalAnti}{\Galn\ \Anti ing}
Consider the rasterised display methodology reviewed in 
section~\ssect{CurrentRasters}.
How does its design philosophy fit in with the above historical figures'
views on motion?
It is apparent that the ``slicing up'' in time of the images
presented on the display device, considered simplistically, 
only fits in well with Zeno's ideas on motion.
However, we have neglected the human side of the equation: clearly,
if frames are presented at a rate exceeding the viewer's visual system's 
temporal resolution, then the effective integration performed by 
the viewer's brain combines with the ``time-sampled'' images to
reproduce continuous motion---that is, at least for motion that is 
slow enough for us to follow visually.

Consider now the ``interpolation'' procedure used by the video processor
and frame buffer.
Is this an optimal way to proceed?
Aristotle would probably have said ``no''---the objects, in the 
intervening time between updates, should seek their ``natural places''.
Galileo, on the other hand, would have quantified this criticism: the
objects depicted should move with either constant velocity if free, 
or constant acceleration if they are falling; if subject to ``violent
motion'', this would also have to be programmed.
Instead, the sample-and-hold philosophy of section~\ssect{CurrentRasters} 
keeps each
object at one certain place on the display
for a given amount of time, and then makes
it \e{spontaneously jump} by a certain distance; and so on.
In a sense, the pixmap \e{has no inertial properties}.
As noted, this \e{is} the ideal behaviour for an object that is not moving
at all; its manifest incorrectness for a moving object is revealed even more
plainly by simple \Galn\ mechanics:
Consider how an object travelling at constant apparent velocity $\vv$, with
respect to the display, is depicted with this system.
Mathematically, the trajectory displayed by the video
processor is
\beqn{SampleAndHoldConstV}
\vx(t)=\vx(0)+\vv\,\floor(t),
\eeqn
where $\vx(0)$ is the two-dimensional pixel-position vector at $t=0$,
$t$ is measured in units of frame-periods,  
$\vv$ is the (constant) velocity of the real object being simulated 
(in units of pixels per frame period), and
$\floor(y)$ returns the greatest integer that is smaller than or equal to
$y$.
Now consider applying a \Galn\ transformation of velocity $\vv$
to the \e{viewer}     
of the system, in the same direction that the object is moving
(\eg\ by having the viewer standing on a ``moving
footpath'' purloined from LA International Airport, which travels
past the display device).
The new trajectory seen by this moving
viewer, $\vx'(t)$, is obtained from
the stationary-viewer trajectory $\vx(t)$ by the Galilean transformation
\beqn{GalXfn}
\vx'(t)=\vx(t)-\vv t.
\eeqn
The \e{correct} trajectory of the object, of course, should simply be
\[
\vx'(t)=\vx'(0)\id\vx(0),
\]
\ie\ a stationary object.
On the other hand, application of the transformation \eq{GalXfn} to
the video-controller trajectory \eq{SampleAndHoldConstV} yields
\beqn{SpuriousConstV}
\vx'(t)=\vx'(0)+\vv\braces{\floor(t)-t}.
\eeqn
The function $f(t)\id\floor(t)-t$ appearing here can be
recognised as simply a ``saw-tooth'' function, ramping
linearly down from $f(0)=0$ towards $f(1^-)=-1$, at which instant it jumps
back up to $f(1)=0$; it then ramps linearly back down towards $f(2^-)=-1$, and
jumps back up to $f(2)=0$;
and so on. 
\e{It is this spurious motion, and this motion alone, that causes the
sample-and-hold display philosophy to perform poorly for 
uniformly moving
objects.}
The basic idea of a frame buffer is not the problem: rather,
the fault lies with the \naive\ way in which it is used.
It is also seen why a longer update period worsens the effect: the
object ``wanders'' further---and for a longer time---before 
``jumping'' back to
its correct position.
It is not surprising that such an effect is nauseous; the amount of
inebriation required to simulate this effect in \RR\ is more
than enough to separate the participant from his or her last meal.
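
The spurious motion \eq{SpuriousConstV} is easily tabulated numerically.
The following Python fragment is only a minimal sketch for a single
component of the motion; the function names, and the sample values
$x(0)=10$ and $v=3$ pixels per frame period, are illustrative assumptions
rather than anything prescribed above.
\begin{verbatim}
```python
import math


def sample_and_hold(x0, v, t):
    """Displayed position of a sample-and-hold controller (one component)."""
    return x0 + v * math.floor(t)


def residual(x0, v, t):
    """Spurious displacement seen by a viewer moving with velocity v."""
    # Galilean transformation x'(t) = x(t) - v t, measured relative to x'(0) = x0.
    return sample_and_hold(x0, v, t) - v * t - x0


# Sample two frame periods at quarter-frame intervals.
times = [k / 4 for k in range(9)]
errors = [residual(10.0, 3.0, t) for t in times]
print(errors)
```
\end{verbatim}
Sampled in this way, the residual slides from $0$ to $-2.25$ pixels during
each frame period and then snaps back to zero: the moving viewer sees the
object lag steadily behind its correct position and then jump forward.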

This spurious motion can also be viewed in another light.
If one draws a \e{space-time diagram} of the trajectory of the object
as depicted by the sample-and-hold video display, one obtains a 
staircase-shaped path.
The \e{correct} path in space-time is, of course, a straight line.
The saw-tooth error function derived above is the difference between
these two trajectories; the ``jumping'' is the exact spatio-temporal
analogue of \e{the jaggies}---the (spatial) ``staircase'' effect 
observable when  straight lines are rendered in the simplest way 
on bitmapped (or rectangular-grid-sampled) displays.
The mathematical description of this general problem with
sampled signals is \e{aliasing};
in rough terms, high-frequency components of the original 
image ``masquerade as'', 
or \e{alias}, low-frequency components when ``sampled'' by the
bitmapping procedure, rendering the displayed image a 
subtly distorted and misleading version of the original.

As is well-known, however, aliasing \e{can} be avoided in a sampled
signal, 
by effectively filtering out the high-frequency components of the
original signal before they get aliased by the sampling procedure.
This technique, applied to any general sampled signal, is
termed \e{\anti ing}; 
in the field of computer
graphics, reference is often made to \e{spatial \anti ing} techniques
used to remove ``the jaggies'' from scan-converted images.
(This is often shortened, in that field, 
to the unqualified term ``\anti ing''; we shall
reject this trend and \rei nstate the adjective ``spatial''.)
For the same reasons,
the ``jerky motion'' of sample-and-hold video controllers is thus
most accurately referred to as \e{spatio-temporal aliasing};
any method seeking to remove or reduce it is \e{spatio-temporal 
\anti ing}.

One form of spatio-temporal antialiasing is performed every time we view
standard television images.
Generally, television cameras have an appreciable \e{shutter time}: 
any motion
of an object in view during the time the (electronic) shutter is ``open''
results in \e{motion blur}.
That such blur is in fact a \e{good} thing---and not a shortcoming---may
be surprising to those unfamiliar with sampling theory.
However, the fact that the human eye easily detects the 
weird effects of spatio-temporal aliasing if motion blur is \e{not}
present, even at
the relatively high field rate of 50~Hz (or 60~Hz in the US), can be
appreciated by viewing any footage from a modern sporting event, such as
the Barcelona Olympics.
To improve the quality of the now-ubiquitous slow-motion replay
(for which motion blur is stretched to an unnatural-looking extent), 
such events are usually shot with cameras
equipped with \e{high-speed} electronic shutters, \ie\ electronic shutters
that are only ``open'' for a small fraction of the time between frames.
The resulting images, played at their natural rate of 50 fields per 
second, have a surreal, ``jerky'' look (often called the ``fast-forward
effect'' because the fast picture-search methods of conventional video
recorders lead to the same unnatural repression of motion blur).
This effect is, of course, simply spatio-temporal aliasing; that it
is noticeable to the human eye at 50 fields per second (albeit only
25 \e{frames} per second) illustrates our visual sensitivity.
(For computer-generated displays, for which simulating motion blur
may be relatively computationally expensive, increasing the
refresh and update rates to above 100~Hz and relying on integration
by the CRT phosphor or LCD pixel, and our visual system, 
may be the simplest solution.)

This \e{frame}-rate spatio-temporal aliasing, which is relatively easy
to deal with, is not usually a severe problem.
Our immediate concern, on the other hand, is a
much more pronounced phenomenon: 
the \e{update}-rate spatio-temporal
aliasing produced by the sample-and-hold nature
of conventional video
controllers (the spurious motion described by \eq{SpuriousConstV}).
Correcting the video controller's procedures to remove this spurious
motion is thus our major task.
Again, we recall Zeno's question: how does the arrow know where to move,
if it only knows where it is, not where it's going?
The answer, supplied first by Galileo (albeit in a somewhat long-winded
form, in pre-calculus days), 
is that we need to know the \e{instantaneous time derivative
of the position} (\ie\ instantaneous velocity) of the object at that
particular time, in addition to its position.
We shall refer to the use of such information 
(or, in general, any arbitrary number of temporal
derivatives of an object's motion) to perform update-rate spatio-temporal
antialiasing as \e{\Galn\ \anti ing}.
Suggested methods for carrying out this procedure with existing technology
are described in the remainder of this \typeofdoc.

To carry out this task, we need to \ree xamine the
video controller philosophy
described in section~\ssect{CurrentRasters}. 
The most obvious observation that strikes one is that, using that 
design methodology,
\e{velocity information is
not provided to the video controller at all!} 
The reason for this omission is easily understood in historical perspective.
\e{Television} applications for CRTs preceded computer 
graphics applications by decades.
At least initially, all television images were generated by
simply transmitting the signal from a video camera, or one of a number
of available cameras.
However, normal video cameras have no facilities for determining 
the \e{velocity} of the objects they view. 
(Although this is not, in
principle, impossible, it would be technically challenging, and quite
possibly of no practical use.)
Rather, the high frame and field rate of a television picture 
alone, together with suitable motion blur, were sufficient
to convince the viewer of the television image that they were
seeing continuous, smooth motion.

When CRTs were first used for computer applications, in vector displays,
the voltages applied to the deflection magnets were directly controlled
by the video hardware; such displays' only relation to television
displays was that they both used CRT technology.
However, when simple \e{rasterised} computer displays became feasible in the
early 1970s, it was only natural that their development was built on the
vast experience gathered from television technology---which, as noted,
has no notion of storing velocity information.
In fact, 
it is only in very recent years that memory technology has been 
sufficiently advanced that the \e{physics of the display devices}---rather
than the amount of video memory feasible---is now
the limiting 
factor in developing ever more sophisticated displays at a reasonable
price.
To even contemplate storing the velocity information of a frame---even
if it \e{were} possible to determine such information---is something
that would have been unthinkable ten years ago.
It is, of course, no \coin cidence that the field of \VR\ has also 
just recently become cost-effective: the immature state 
of processor and memory technology was the critical factor that 
limited Sutherland's pioneering efforts twenty-five years ago. 
It is thus no surprise that the fledgling commercial field of
\VR\ requires new approaches to traditional problems.

Of course, the very nature of \VR, while putting us in the position
of requiring rapid updates to the entire display,
conversely provides us with the \e{very} information
about displayed objects we need: 
namely, velocities, accelerations, and so on,
rather than just the simple \e{positional} information that a television
camera provides.
Now, it is of course
a trivial observation that all virtual-world engines 
already ``know'' about the laws of Galilean mechanics, or Einsteinian
mechanics, or nuclear physics---or indeed any system of mechanics 
that we wish to program into them, 
either based on the real universe, or
of a completely fictional nature.
In that context, our rehash of the notions behind Galilean mechanics
may seem trivial and unworthy of the effort spent.
What existing virtual-world engines do \e{not} do, however, 
is \e{share some
of this information with the video controller}.
On this front, apparent triviality is magnified to 
enormous importance; our neglect of these same physical laws is, in fact, 
creating an artificial and unnecessary
degradation of performance in many existing \VR\ hardware
methodologies.

There is no reason for this omission to continue; the physics has been 
around for over three hundred and fifty years; and, fortunately,
the technology is now ripe.
The following sections will provide, it is hoped, at least a very crude 
and simplistic outline of the
paths that must be travelled to produce a fully-functional
\VR\ system employing \Galn\ \anti ing.

\newssect{Galpixels}{Galpixels and $\Gal{n}$ pixmaps}                       
Historically, rasterised computer graphics came into existence as soon as
solid-state memories of suitable capacity could be
fabricated.
It is therefore not difficult to guess the number of bits that were
initially allocated to each pixel: one.
Such technology was, for this reason, also referred to as 
\e{bitmapped graphics}: the bits in the memory device 
provided a rectangular ``map''
of the graphics to be displayed---which, however, could therefore only 
accommodate bi-level displays.

As memory---and the processor power necessary to use it---became 
even more plentiful, rasterised display options
``fanned out'' in a number of
ways.
At one extreme, the additional memory could be used 
to simply improve the spatial 
resolution of the display, while maintaining its bitmapped nature.
At the other extreme, the additional memory could be used exclusively
to generate
a multi-level response for each pixel position---for grey-scale, say,
or a choice of colours---without increasing the resolution of the
display at all; the resulting memory map, now no longer
accurately described as a ``bit'' map, is preferentially referred to
as a \e{pixmap}.
In between these two extremes lies a range of flexible alternatives; to
this day, hardware devices often still provide a number of different
``video modes'' in which they can run.

Increasing the memory availability yet further led, in the 1980s, to the
widespread use of \e{$z$-buffers}, both in software and, increasingly,
hardware implementations (whereby the ``depth'' of each object displayed
on the display is stored along with its intensity or colour).
We can see here already an extension to the concept of a pixel: not
only do we store on--off information (as in bitmaps), nor simply
intensity or colour shading information (as in early pixmaps), we
also include additional, \e{non-displayed} information that assists
in the image generation process.
(Current display architectures also routinely store several more bits of
\e{control} information for each pixel.)

We now extend this concept of a ``generalised pixel'' still further,
with our goal of \Galn\ \anti ing firmly in our sights.
As well as storing the pixel shading, $z$-buffer and control information,
we shall also store the \e{apparent velocity} of each pixel in 
the pixmap.
We use the term \e{apparent motion} to describe the motion of objects
in terms of display Cartesian
\coord s: $x$ horizontal, increasing to the right;
$y$ vertical, increasing as we move upwards; and $z$ normal to the display,
increasing as we move out from the display towards our face.
This motion will typically be related to the \e{physical motion} of the
object (\ie\ its motion through the 3-space that the system is 
simulating) by perspective and rotational transformations; however,
in section~\sect{Enhancements}, more sophisticated transformations
are suggested between the apparent and physical spaces.

Thus, for $z$-buffered displays (assumed true for the remainder of
this \typeofdoc), three apparent velocity components must be stored for each
pixel---one component for
each of the $x$, $y$ and $z$ directions.
The motional information stored with a pixel, however, 
need not be limited to simply its apparent velocity.
In general, we are free to store as many instantaneous 
derivatives of the object's motion as we desire. 
The rate of change of velocity, the \e{acceleration} vector $\va$,
is an obvious candidate.
The \e{rate of change of acceleration}, 
which the physicist Rich\-ard~P.\ Feynman christened
the \e{jerk}, may likewise be stored; as can the rate of change of
jerk (for which the author knows no proposed name); the rate of change
of the rate of change of jerk; and so on.
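
As a sketch of how such stored derivatives might be used (the truncated
Taylor series below is the standard one, and is not quoted from any
particular implementation), the apparent position a short time $\D t$
after an update can be estimated as
\[
\vx(t+\D t)\approx\vx(t)+\vv(t)\,\D t+\half\va(t)\,(\D t)^2+\cdots,
\]
where each further stored derivative supplies the next term of the series.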

We shall defer to the next section the process of deciding
just how many such motional derivatives we should store with each pixel.
For the moment, 
we shall simply refer to any pixmap containing motional information
about its individual pixels as a \e{\Galn\ pixmap}, or \e{galpixmap}.
The individual pixels within a galpixmap will be referred to as 
\e{\Galn\ pixels}, or \e{galpixels}.
Of course, in situations where
distinctions need \e{not} be made between these objects 
and their traditional counterparts, the additional prefix \e{gal-} may 
simply be omitted.

Finally, it will be useful to have some shorthand way of denoting
the highest order derivative of (apparent) motion that is stored within 
a particular galpixmap.
To this end, we (tentatively) use the notation \e{$\Gal{n}\!$ pixmap},
or, more verbosely, \e{Galilean pixmap of order $n$}, where $n$ is the
order of the highest time-derivative of the apparent position of each
galpixel that is stored in the galpixmap.
Thus, a conventional pixmap, which only records the position of each pixel
(encoded by its position in the pixmap,
together with its $z$-buffer information), and \e{no} higher time
derivatives, may be described as a $\Gal{0}$ pixmap.
Galpixmaps that store velocity information as well are $\Gal{1}$ pixmaps;
those that additionally store acceleration information are $\Gal{2}$
pixmaps; and so on.
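
One possible in-memory layout for such galpixels is sketched below in
Python.
The class and field names are hypothetical, chosen purely for
illustration; the order $n$ is simply the number of motional derivatives
stored with the galpixel.
\begin{verbatim}
```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vector3 = Tuple[float, float, float]  # apparent (x, y, z) components


@dataclass
class Galpixel:
    shade: int   # displayed intensity or colour value
    z: float     # z-buffer depth (apparent z position)
    # derivatives[0] is the apparent velocity, derivatives[1] the
    # acceleration, derivatives[2] the jerk, and so on.
    derivatives: List[Vector3] = field(default_factory=list)

    @property
    def order(self) -> int:
        """The n of a G^n pixmap built from galpixels of this kind."""
        return len(self.derivatives)


static = Galpixel(shade=255, z=0.0)
moving = Galpixel(shade=255, z=0.0, derivatives=[(1.0, 0.0, 0.0)])
falling = Galpixel(shade=255, z=0.0,
                   derivatives=[(0.0, 0.0, 0.0), (0.0, -2.0, 0.0)])
```
\end{verbatim}
Here the three example galpixels have orders 0, 1 and 2 respectively; a
real implementation would presumably fix the order for the whole pixmap
and pack the fields into a fixed number of bits.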

As will be seen shortly, additional pieces of
information, over and above 
mere motional derivatives, will also be
required in order to effectively carry
out \Galn\ \anti ing in practical situations.
Although the amount of information thus encoded may vary from
implementation to implementation, we shall not at this stage propose
any notation to describe it; if indeed necessary, such notation will 
evolve naturally in the most appropriate way.

\newssect{GalpixmapStructure}{Selecting a Suitable Galpixmap Structure}
We now turn to the question of determining \e{how much} 
additional
information should be stored with a galpixmap, in order to maximise the
overall improvement in visual capabilities of the system
that are perceived by
the viewer.
Such questions are only satisfactorily answered by considering 
\e{psychological} and \e{technological} factors in equal proportions. 
That a purely technological approach fails dismally is simply shown:
consider the sample-and-hold video controller
philosophy described in section~\ssect{CurrentRasters}, as
(successfully) applied to static objects on the display.
We noted there that the video controller effectively boosted the
perceived information rate of the display from the \e{update} rate
up to the \e{refresh} rate, simply by repeatedly showing the same image.
Shannon's
information theory, however, tells us that this procedure \e{does not},
in fact, increase the information rate one bit: the repeated frames
contain no new information---as, indeed, can be recognised by noting
that the viewer could, if she wanted to, reconstruct these replicated
frames ``by hand'' even if they were not shown.
Thus, even though we \e{know} that frame-buffered rasterised displays
``look better'' than display systems without such buffers (\eg\ vector
displays), information theory tells us that, in a raw
mathematical sense, the frame buffer itself doesn't do anything
at all---a
fact that must be somewhat ironically
amusing to at least one of Shannon's former
PhD students.

Raw mathematics, therefore, does not seem to be answering the questions
we are asking.
A better way to view this \e{apparent} increase in information rate is
to examine the viewer's subconscious prejudices about what her eyes see.
She may not, in fact, even realise that the display processor 
\e{is} only generating
one update every so often: to her, each frame looks just as fair dinkum
as any other.
All of this visual information---a static image---is 
simply preprocessed by her
visual system, and compared against both ``hard-wired'' and ``learnt'' 
consistency checks.
Is a static image a reasonable thing to see? 
Did I really see that?
Was I perhaps blinking at the time?
Am I moving or am I stationary?
What do I \e{expect} to see?
It is the lightning-fast evaluation of these types of question that
ultimately determines the ``information'' that is abstracted from the
scene and passed along for further cogitation.
In the case described, assuming (say) a stationary viewer sitting in
front of a fixed monitor, all of the consistency checks balance: there
appears to be a fair-dinkum object sitting in front of her.
In other words,
the display is providing sufficient information for her brain to
conclude that the images seen are consistent with what would be seen if
a real object were sitting there and reflecting photons through a
transparent medium in the normal way; 
that is all that ultimately registers.

We now turn again to our litmus test: an object 
with a \e{uniform apparent 
velocity} being depicted on the display.
Using a $\Gal{0}$ display, such as described in 
section~\ssect{CurrentRasters}, results in the 
motion depicted in equation \eq{SampleAndHoldConstV}.
What does the viewer's visual system say now?
Is that an object that keeps disappearing and popping up somewhere
else?
Is it really something moving so jerkily?
Why doesn't it move like any animal I've ever seen before?
The answers to these questions depend on just how slow the update rate is,
the context that the images are presented in, and, most likely, the
past experiences of the viewer.
Let us assume, however, that her brain \e{does} decide that the scene
is, in fact, depicting a single object in motion, rather than the
spontaneous and repetitive destruction and creation of similar-looking
objects.
Immediately after this decision is reached,
her visual processing system performs a hard-wired 
\Galn\ transformation to that part of the scene in which the object
moves, such as described in section~\ssect{GalAnti}.
Why? 
Because as animals we learnt the hard way 
that it isn't enough to simply know
whether something is moving as a whole---one also needs to know what
the motions of the \e{parts} of the object are.
Is that human walking towards us with arms swinging by its sides,
or with arms outstretched ready to throttle us?
The \Galn\ transformation 
\eq{GalXfn} removes the (already-decided-on) uniform
motion of the object, to let the viewer then determine the relative 
motion of its constituent parts.
The result, as we have already shown in section~\ssect{GalAnti}, is
a weird-looking saw-tooth motion, involving spontaneous teleportations
every time the frame buffer is updated.
Does this look like any animal we have ever seen?
No, not really.
OK, then, maybe we didn't see it too well?
Yes---that must be it---I probably didn't see it properly.
How well this rationalisation can be tolerated depends on how
low the update rate is: seeing \e{is} believing---but only if you see it
for at least 100 milliseconds.

Let us now assume that the display system is not a $\Gal{0}$ device
at all,
but is rather one of the freshly-unpacked $\Gal{1}$ models.
At some initial time, the object appears on the display; the frame buffer
has been updated to show that it is there.
(Ignore, for the moment, this instantaneous birth.)
One frame later, the display processor is still busy redrawing things;
the video controller must decide itself what to do with the image.
Firstly, it clears a \e{third} frame buffer 
(in addition to 
the one that it has just finished
scanning from, 
and the one that
the display processor is talking to), 
into which it is going to generate a new image.
Secondly,
it goes through the entire frame buffer that it has just displayed, 
galpixel by galpixel.
At each pixel, it retrieves the velocity information for that galpixel.
It then adds this velocity (measured in pixels per frame)
to the current position, to find out where that galpixel would be one
frame later. 
It then writes this information into the new frame buffer at the
appropriate position; and repeats the process for all the galpixels in the
original frame buffer.
Thirdly, it worries a bit about those galpixels in the new frame buffer
that didn't get written to; let us ignore these worries for the moment,
and just leave the ``background'' colour in those pixels. 
Finally, it scans the new frame buffer onto the display device.
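
The steps just described can be sketched in a few lines of Python.
This is a minimal illustration only, under assumptions not made in the
text: the frame buffer is modelled as a dictionary keyed by integer pixel
position, each galpixel is a hypothetical (shade, vx, vy) triple, the $z$
component is omitted, sub-pixel positions are not carried from frame to
frame, and when two galpixels land on the same pixel the later write
simply wins.
\begin{verbatim}
```python
import math


def extrapolate_frame(frame, width, height):
    """Build the next displayed frame from a G1 frame buffer.

    frame maps (x, y) -> (shade, vx, vy), velocities in pixels per frame.
    Pixels missing from the returned dict are taken to show the background.
    """
    new_frame = {}  # first: start from a cleared buffer
    for (x, y), (shade, vx, vy) in frame.items():  # second: every galpixel
        # Add the velocity to the current position, rounding half up.
        nx = math.floor(x + vx + 0.5)
        ny = math.floor(y + vy + 0.5)
        if 0 <= nx < width and 0 <= ny < height:
            # The velocity travels with the galpixel to its new position.
            new_frame[(nx, ny)] = (shade, vx, vy)
    return new_frame  # third: unwritten pixels stay as background


# One galpixel moving 2 pixels right and 1 pixel down per frame
# (y increases upwards, so "down" is a negative y velocity).
frame0 = {(3, 5): ("white", 2.0, -1.0)}
frame1 = extrapolate_frame(frame0, width=16, height=16)
frame2 = extrapolate_frame(frame1, width=16, height=16)
```
\end{verbatim}
Because the rounded position is fed back in at each step, a fractional
velocity would accumulate rounding error in this sketch; carrying the
unrounded position alongside each galpixel is one way to avoid that.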

Ignore, for the moment, that the procedure described seems to double the
amount of time for the video controller to do its work.
(A moment's reflection reveals that, in any case, there is no 
fundamental reason why the new-frame-drawing procedure cannot occur
at the same time that the \e{previous} frame is being scanned to the
display device.)
What will the viewer think that she is seeing?
Well, the object will clearly
jump a small distance each frame---with each
jump exactly the same size as the last (at least, to the nearest pixel),
until the new update is available.
If the object really \e{is} travelling with constant apparent velocity
(our assumption so far), then upon receipt of the 
new image update, the
object will jump \e{the same} small distance from the last 
(video-controller-generated) frame as it has been jumping in the mean time
(assuming focus-switching is appropriately synchronised, of course).
Now, the \e{refresh} rate of the system is assumed to be significantly
faster than the visual system's temporal resolution; therefore,
the motion will look convincingly like uniform motion.
Uniform motion has been Galilean antialiased!

Let us examine, now, what ``residual'' motion we are left with when this
uniform motion is ``subtracted off'', via a \Galn\ transformation, from
the perceived motion.
We now---thankfully---do not end up with the horrific expression
\eq{SpuriousConstV}, but rather with an expression that is 
\e{almost} zero.
In the setup described the error is not \e{precisely} zero---if the apparent
velocity of the object does not happen to be some integral 
number of pixels per frame, then the best we can do is move the pixel
to the ``closest'' computed position---leading to a small pseudo-random
saw-tooth-like
error function in space-time, \ie\ we are hitting the fundamental 
physical limits of our display system.
However, the fact that the \e{amplitude} of the error is at most one pixel
in the spatial direction, and one frame in the temporal direction, means
that it is a vastly less obtrusive form of aliasing than the gross
behaviour described by \eq{SpuriousConstV}. 
(If so desired, however, even this 
small amount of spatio-temporal aliasing can be removed with 
suitable trickery in the video controller; but we shall not worry about
such enhancements in this \typeofdoc.)
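
To make this bound concrete, consider a single component of the motion,
starting from an integral pixel position $x_0$, and assume (the
description above implies this but does not specify it) that the video
controller keeps the unrounded position $x_0+vt$ and displays the nearest
pixel.
At frame $t$ the residual seen by the co-moving viewer is then
\[
x'(t)-x'(0)=\floor(x_0+vt+\half)-(x_0+vt),
\]
which never exceeds half a pixel in magnitude, whatever the value of $v$.
For example, $x_0=0$ and $v=0.3$ pixels per frame give displayed positions
$0,0,1,1,1,2,\ldots$ against true positions
$0,0.3,0.6,0.9,1.2,1.5,\ldots$.
By contrast, the residual in \eq{SpuriousConstV} grows to a magnitude of
almost $|\vv|$ pixels between successive updates.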

Having successfully convinced the viewer of near-perfect constant motion,
let us now worry about what happens if the object in question is, in
fact, being \e{accelerated} (in terms of display \coord s), rather
than moving with constant velocity.
For simplicity, let us assume that the object is undergoing \e{uniform
acceleration}. 
Fortunately, such a situation is familiar to us all: excluding air
resistance, all objects near the surface of the earth ``fall'' by
accelerating (``by the force of gravity'', in 19th century terminology)
at the same constant rate.
How do our various display systems cope with this situation?

Let us assume that the object in question is initially stationary, 
positioned near
the ``top'' of the display.
Let us further assume that the acceleration has the value 2~pixels per
frame per frame. 
Firstly, let us consider the optimal situation: the display
processor is sufficiently fast to update the object each frame.
Clearly, if we shift our axes in such a way
that $y=0$ corresponds to the initial
position of the object, its vertical position in successive frames
will be given by 
\beqn{AccelBest}
-y=0,1,4,9,16,25,36,49,64,81,\ldots, 
\eeqn
as is verified from the formula
$y=y_0+v_0t+\half at^2$, where in this case $y_0=0$, $v_0=0$ and $a=-2$,
and $t$ is measured in frame periods.
Since this motion is depicted at the \e{refresh} rate---which, by our
assumptions, is sufficiently fast that our visual system does not
perceive any inherent temporal aliasing---the object should look like a real
object falling, without air resistance, to the ground.
(We defer a more complete discussion of \e{motion blur} to 
another place.)

Let us now examine how this object is depicted on a $\Gal{0}$ display
device, such as described in section~\ssect{CurrentRasters}.
For simplicity, assume that the display processor takes precisely \e{two} 
frame periods to
render each completed image.
Clearly, the sample-and-hold nature of the video controller will simply
yield the positions 
\beqn{AccelG0Display}
-y=0,0,4,4,16,16,36,36,64,64,\ldots.
\eeqn
The error in each frame can be obtained by simply subtracting 
from the values in
\eq{AccelG0Display} their counterparts in \eq{AccelBest},
yielding
\[
\D y_\txt{error}=0,1,0,5,0,9,0,13,0,17,\dots.
\]
It is apparent that the error gets worse as the object accelerates:
it does not simply ``ramp'' between two bounds as was the case for uniform
velocity.
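(As a quick check on these figures, under the assumptions of this example and
with $k$ introduced here merely to label the updates: at an odd frame $t=2k+1$
the display still shows the position rendered for $t=2k$, so that
\[
\D y_\txt{error}=(2k+1)^2-(2k)^2=4k+1,\qquad k=0,1,2,\ldots,
\]
which reproduces the non-zero values $1,5,9,13,17$ above and grows linearly
with time, without bound.)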
Whether our viewer can recognise such a depiction as smooth motion
depends on the resolution of the device.
However, it should be noted that the
\e{psychological} mismatches
that are accumulating here are of a worse nature than 
for simply uniform motion.
The reason for this is that, firstly, the viewer's visual preprocessing
system must decide, at each point in time, if the object on the display
really is moving at all---\ie, whether it has a \e{velocity}, such as
described above.
Secondly, her visual processing system
must then subconsciously determine whether the object is in fact
\e{accelerating}. 
How do we know that she cares about acceleration?
\e{Toss a tennis ball her way.}
The fact that humans can catch projectiles moving under the
acceleration of gravity shows
that we are capable (by some mechanism) of mentally
computing the effects of acceleration.
This is not surprising, given that everything on the surface of the
earth not held up by something else accelerates downwards at a constant
rate.
Whether \e{space}-born and -bred humans would develop their visual systems
in the same way, or whether they would ``evolve'' to reduce the
psychological importance of acceleration in their thinking, is a 
question that is beyond the author's reckoning; but it is nevertheless
a hypothetical (or, perhaps in time, a not-so-hypothetical) question
that emphasises the all-important 
fact that \e{the way we perceive the world depends
greatly on how we are used to seeing it behave}.

Let us now return to the case of the falling object, and determine how
its motion will be depicted on the $\Gal{1}$ display device that
served our purposes so admirably in our uniform-velocity 
thought experiment above.
Again, assume that the display processor updates the image
only once every two
frames.
On each update, the \e{velocity} of each galpixel must also be computed;
for the object in question,
the formula $v=v_0+at$, with $v_0=0$, yields the velocity for each
successive frame:
\beqn{AccelG1Vel}
-v_\txt{stored}=0,0,4,4,8,8,12,12,16,16,\ldots.
\eeqn
We have here copied the velocity from every even-numbered frame (the ones
being updated) to the odd-numbered frames: the video controller, having
no better information, simply continues to assume that the velocity of the
galpixel is constant until the next update arrives, and thus copies
this information from frame to frame as it copies the galpixel.
In the current example, this ``propagated'' velocity 
(\ie\ the velocity that is assumed constant from frame to frame)
is not actually used to compute anything
(as the video controller only fills in \e{one} 
frame itself after each update),
but we shall shortly examine a case in which it \e{is} used
(namely, when the display processing takes longer than two frames
to generate each update).

It is straightforward to compute how the $\Gal{1}$ display system will
depict our uniformly-accelerated 
object's motion: using the velocities in \eq{AccelG1Vel}
to extrapolate the object's \e{position} for every odd frame, we obtain
\[
-y=0,0,4,8,16,24,36,48,64,80,\ldots. 
\]
Subtracting from this the sequence of 
\e{correct} positions, \eq{AccelBest}, we find that
\[
\D y_\txt{error}=0,1,0,1,0,1,0,1,0,1,\dots.
\]
Obviously, we have here a vast improvement over the $\Gal{0}$ display:
the errors, while not zero, are at least \e{bounded}.

Let us now consider making these last two
examples just a tad more realistic. 
Keeping the other parameters constant, let us assume that the display
processor is now only fast enough to generate one update every \e{three}
frames.
Our dust-gathering $\Gal{0}$ display system will 
successively show the object to be at the
positions
\[
-y=0,0,0,9,9,9,36,36,36,81,\ldots.
\]
The errors represented by these positions are, respectively,
\[
\D y_\txt{error}=0,1,4,0,7,16,0,13,28,0,\dots.
\]
Obviously, the errors are worse than even the horrible performance
with two frames per update.
Let us, therefore, turn our attention immediately away from these
horrible figures (in the best spirit of politicians), and consider
instead our now-worn-in $\Gal{1}$ video controller. 
The velocity values stored with the galpixmap, as copied across by 
the video controller where necessary, will now be
given by
\[
-v_\txt{stored}=0,0,0,6,6,6,12,12,12,18,\ldots.
\]
The position values used to display the 
object, as determined by the video controller, can then be computed 
to be
\beqn{AccelG1Pos3}
-y=0,0,0,9,15,21,36,48,60,81,\ldots,
\eeqn
which represent errors of
\[
\D y_\txt{error}=0,1,4,0,1,4,0,1,4,0,\dots.
\]
We now see the precise capabilities of a $\Gal{1}$ system when dealing
with a uniformly-accelerated object.
In between updates, the positional 
error increases \e{quadratically}---\ie\ in a 
parabolic shape.
Once the display processor provides an update, the 
positional error returns (as it always does) to zero.
It then proceeds to increase quadratically to the \e{same} maximum
error as before (not to ever-worse values, as is the case for a $\Gal{0}$
system); and repeats the cycle.
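(This can be seen in general with a short calculation; the symbols $t_0$, for
the frame of the most recent update, and $j$, for the number of frames elapsed
since then, are introduced here only for illustration.
The video controller displays the extrapolated position
$y(t_0)+v(t_0)\,j$, whereas the correct position is
$y(t_0)+v(t_0)\,j+\half aj^2$, so that
\[
\D y_\txt{error}=-\half aj^2,
\]
which equals $j^2=0,1,4$ for $a=-2$ and $j=0,1,2$, independently of $t_0$.)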
This bounded-error feature of a $\Gal{1}$ display system with uniformly
\e{accelerated} motion resembles, to some extent, the (bounded) 
positional errors associated with a $\Gal{0}$ system for an object with uniform
\e{velocity} (apart from the change of shape from ramp to parabola, of
course).
This is not very surprising when one considers that, in both of these
cases, the video controller ``knows'' about all of the
non-zero temporal derivatives of the object's motion save for the highest
order one.
On the other hand, this raises a worrying problem: in general, the
apparent motion of an object on the display will be an arbitrary
analytical function of time (excluding, for the moment, object
``teleporting'',
and any non-physical exotic mathematical functions introduced by a 
devious programmer).
The problem is that an arbitrary analytical function has an \e{infinite
number} of non-zero positive-exponent Taylor series coefficients, 
\ie\ one needs to know \e{all} of its derivatives to extrapolate its motion
indefinitely.
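(In symbols, and assuming that a $\Gal{n}$ system extrapolates using the
derivatives of the motion up to order $n$, as the $\Gal{0}$ and $\Gal{1}$
cases suggest: writing $\tau$ for the time elapsed since an update at $t_0$,
both symbols being introduced here only for illustration, the Taylor series
\[
y(t_0+\tau)=\sum_{k=0}^{\infty}\frac{y^{(k)}(t_0)}{k!}\,\tau^k
\]
is truncated after the term $k=n$, leaving a positional error whose leading
term has magnitude $|y^{(n+1)}(t_0)|\,\tau^{n+1}/(n+1)!$.)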
How can we possibly justify going on to $\Gal{2}$ systems, then 
$\Gal{3}$ systems, then $\Gal{4}$ systems, \e{et cetera ad infinitum}?
Are we doomed?

Clearly, this \typeofdoc\ would not be in public circulation were this
problem a real one, rather than a case of mathematics-gone-wild. 
Already, we have seen that using a $\Gal{1}$ display system vastly improves
upon the performance of a $\Gal{0}$ system, even if it cannot \e{fully}
extrapolate accelerated motion.
If all we are interested in is \e{some} low-cost performance improvement,
rather than full mathematical rigour in extrapolation, then we already
know how to obtain it.
However, we shall shortly see that we can do better than this:
there is clearly a point of diminishing returns beyond which it is
not worthwhile
going to the next $\Gal{n+1}$ system.
To justify this statement, 
however, we must first remove our attention from the 
abstract world of mathematics, and focus again
on the most important component in our physical system: \e{the 
participant}.

We first return to the above thought experiment of
using a $\Gal{1}$ display system for uniformly accelerated motion.
We earlier obtained results for the \e{positional} error of the
display (the repeated-parabola): what about the \e{velocity} error?
To assess this, we must first 
invent some crude model for the way in which the viewer's
brain computes velocities in the first place.
The simplest suggestion for such a model, based (perhaps very  
erroneously) on traditional mathematical methods, 
might be that the visual system simply takes
\e{differences in positions} between ``brain ticks''---or, here, between 
frames---to compute an average velocity for each ``tick''.
Admitting, for the moment, that this model is not too outrageous,
we can proceed to compute the values that the 
viewer's brain would compute, using such an algorithm,
when viewing the $\Gal{1}$ display results in 
\eq{AccelG1Pos3}.
Taking these differences between positions at adjacent times,
we obtain
\beqn{ComputeVel}
-v=0,0,9,6,6,15,12,12,21,\ldots.
\eeqn
The problem is, however, the following: how do we ascribe a particular
\e{time} to each of these
values, being as they are differences between \e{two} adjacent times?
For example, if we take the difference $y(5)-y(4)$, does this velocity
``belong'' to $t=4$ or $t=5$?
The least complicated choice is to simply decide that this velocity
evaluation ``belongs'' to $t=4.5$---``split the difference'', as it were.
With this ansatz,
the sequence of values \eq{ComputeVel} corresponds to the perceived
velocity computed at the times $t=0.5,1.5,2.5,3.5,\ldots$.
The \e{correct} velocity values, on the other hand, evaluated at these
particular times, are simply computed via $v=v_0+at$ as
\beqn{ActualVel}
-v=1,3,5,7,9,11,13,15,17,\ldots.
\eeqn
Taking the difference between \eq{ActualVel} and \eq{ComputeVel} thus
yields the errors in the computed velocities:
\[
\D v_\txt{error}=-1,-3,+4,-1,-3,+4,-1,-3,+4,\ldots.
\]
Now, this is the first error sequence that we have encountered that has
had \e{both positive and negative values}.
In fact, it is clear that the \e{time-average} of this sequence is
zero: $(-1)+(-3)+(+4)=0$.
In other words (as would have already been blindingly obvious to
anyone who has ever designed a feedback-control system), using
$\Gal{1}$ antialiasing
results in the correct \e{average} velocity being depicted for an
object (where we average over a complete update period), \e{even if
the motion itself contains higher derivatives}.
It is this observation that will tell us when to stop increasing the $n$
in $\Gal{n}$.
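
That the velocity errors cancel over each update period is no accident of the
numbers chosen, as the following sketch suggests (the symbols $t_0$, for an
update frame, $N$, for the number of frames per update, and
$y_\txt{shown}$, for the displayed position, are introduced here purely for
illustration).
Summing the perceived velocities over one update period gives a telescoping
sum,
\[
\sum_{i=0}^{N-1}\bigl[y_\txt{shown}(t_0+i+1)-y_\txt{shown}(t_0+i)\bigr]
=y_\txt{shown}(t_0+N)-y_\txt{shown}(t_0)
=y(t_0+N)-y(t_0),
\]
since the displayed and correct positions coincide at every update.
For uniform acceleration, the correct velocity at each half-frame time is
exactly $v(t+\half)=y(t+1)-y(t)$, so the correct velocities sum to the
same quantity, and the errors over the period sum to zero.
For the first update period above, with $N=3$, both sums have magnitude
$0+0+9=1+3+5=9$.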

Let us now entertain the notion that we could, at this stage, be
forever more happy with our $\Gal{1}$ display technology.
Excluding the fact, for the moment, 
that modern computer equipment already has a half-life
shorter than most pairs of socks,
would the general \e{principles} of $\Gal{1}$ antialiasing be such
that we would never desire to venture to any $\Gal{n}$ for $n>1$?
That this is not an unreasonable notion is supported by the simple fact 
that $\Gal{0}$ technology has been widespread for about twenty years,
and \noone\ seems to be complaining about \e{it}.
(Yet.)
On the other hand, since this section does not end at the end of this
sentence, the reader is no doubt anticipating that the story of our
hypothetical viewer is not yet complete.

The main problem with our simplistic examples above, which will show 
why they are not completely representative of the real world,
is that they are all \e{one-dimensional}.
Of course, we can transform the ``falling'' example to a ``projectile''
situation by simply superimposing an initial horizontal velocity on
the vertical motion; this would simply require applying the uniform motion
and uniform acceleration cases conjointly.
However, we have also simplified the world by only talking about some
unspecified, featureless, Newtonian-billiard-ball-like ``object'',
without worrying about the extended 
three-dimensional structure of the object.
A 2-D world, of course, is infinitely more interesting than
a 1-D world, not least because it becomes possible to 
step \e{around} other people, instead of simply bouncing into them.
In a ``$2\half$-D'' system---such as in many video games,
or Microsoft Windows---\e{three}-dimensional
structure is represented by painting
the object with depth-emulating visual cues; but 
the objects still only
move, effectively, in two dimensions.
On the other hand, in a manifestly three-dimensional environment such
as \VR, the two-dimensional display is a subtle \e{perspective
projection} of the virtual three-space.
Objects in such virtual worlds, unlike their 
2-D or $2\half$-D counterparts,
are free to rotate about arbitrary axes, as well as to move closer to
or further from the participant.
The \e{apparent} motion of the objects on the display device is
a complicated interplay between the physical motion of the objects,
the physical motion of the participant, and the mathematical transformations
yielding the perspective view. 
It is important that we consider more general forms of motion, 
such as this, before
drawing any general conclusions about whether $\Gal{1}$ antialiasing
is good enough for practical purposes.

%
%  ...File 2 of 4 should be concatenated here ...