From: John Costella <jpc@tauon.ph.unimelb.edu.au>
Subject: PAPER: Galilean Antialiasing for VR, Part 02/04
Date: Mon, 26 Oct 92 5:18:12 EET


%  File 2 of 4.  NOTE: All four files MUST be concatenated
%                before this document can be LaTeXed.
%
%
%  ... Continuation of "Galilean Antialiasing for Virtual Reality Displays"
%  
%  (The following line *must* be left blank.)

In fact, a fairly simple example will suffice to 
illustrate the problems that are encountered when we move up from 
1-D to 2-D or 3-D motion.
Consider a two-dimensional square, in the plane of the display,
which is also \e{uniformly rotating} in that same plane about the centre
of the square.
Choose any point on the boundary of this square.
The path traced out by this point in time will be a \e{circle} (as is
simply verified by considering that point alone, without the complication
of the rest of the square).
Now, a $\Gal{0}$ display will represent this rotating point as a series
of discrete points on the circle: the overall display will show successive
renderings of a square in various stages of rotation, from which the
viewer will (hopefully) be convinced that it is a single square undergoing
a somewhat jerky rotation.

Now consider how a $\Gal{1}$ display will render this rotating square.
The \e{velocity} of our chosen point on the square will depend on how
far the point is from the centre of the square; its direction will be
perpendicular to the line joining it to the centre.
Consider what happens when we propagate forward in time from this
position, using only the velocity information: the point moves in
a \e{straight line}. 
But we really want it to travel in a circle!
What will happen to the square?
Well, for very small times, the square will indeed be rotating 
beautifully---a vast improvement on the $\Gal{0}$ situation.
If we keep propagating the square further, however, we find a
disconcerting feature: \e{the square is slowly growing larger as it
rotates!}
Propagate yet further and this growth dominates, and the rotation slows
to a trickle.
Of course, when the next update comes along, the square spontaneously
shrinks again---not a good solution.
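
As a rough check on this picture (a sketch only, with symbols introduced here for illustration), take the centre of the square as origin, let the chosen point start at distance $R$ from the centre along the $x$-axis, and let the angular velocity be $\omega$.
The true path and its $\Gal{1}$ propagation from $t=0$ are then
\[
\vx_{\rm true}(t) = R\,(\cos\omega t,\ \sin\omega t), \qquad
\vx_{1}(t) = R\,(1,\ \omega t),
\]
so that the propagated point sits at distance and angle
\[
|\vx_{1}(t)| = R\sqrt{1+\omega^{2}t^{2}}, \qquad
\theta_{1}(t) = \arctan(\omega t)
\]
from the centre.
Every point of the square is scaled by the same factor $\sqrt{1+\omega^{2}t^{2}}$ and turned through the same angle, so the whole square grows without bound, while its rotation angle creeps towards (but never reaches) a quarter-turn.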

It might be argued that this example merely shows that you cannot 
stretch the \Galn\ \anti ing procedure indefinitely; ultimately, you 
must update the display at a \e{reasonable} rate, even if that is
considerably lower than the refresh rate.
To a certain extent, this is indeed true.
However, rejecting $\Gal{1}$ antialiasing as an optimal solution,
due to its poor performance for rotation, can be justified on the basis
of a good ``rule of thumb'' in Science: Approximating an arbitrary
curve by a straight line is usually pretty bad; approximating it
with a parabola is infinitely better; approximating it with a cubic \e{may}
be worthwhile; using anything of higher order 
is probably a waste of resources,
unstable, or both.
For sure, this is not a definitive law of Nature, but in any case 
it suggests that
we may be much better off going to $\Gal{2}$, or perhaps $\Gal{3}$,
rather than sticking simply with $\Gal{1}$ technology.
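
The rotating square gives a concrete feel for this rule of thumb (a sketch only; $R$ and $\omega$ are symbols introduced here for the distance of a point from the centre and the angular velocity).
Starting the point on the $x$-axis at $t=0$, the $\Gal{1}$ and $\Gal{2}$ propagations are
\[
\vx_{1}(t) = R\,(1,\ \omega t), \qquad
\vx_{2}(t) = R\,(1-\frac{1}{2}\omega^{2}t^{2},\ \omega t),
\]
whose distances from the centre are
\[
|\vx_{1}| = R\sqrt{1+\omega^{2}t^{2}}, \qquad
|\vx_{2}| = R\sqrt{1+\frac{1}{4}\omega^{4}t^{4}} .
\]
For a rotation of $\omega t = 0.1$ radians between updates, the square has grown by about $0.5\%$ in the first case, but by only about $0.001\%$ in the second.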

This purely mathematical line of reasoning, 
while most helpful in our deliberations, nevertheless again overlooks
the fact that, ultimately, all that matters is what the viewer \e{thinks}
she is seeing on the display, not how fancy we are with our
mathematical prowess.
To investigate this question more fully, it is necessary to perform
some investigations of a psychological nature that are not
completely quantitative.
These deliberations, however, shall require an additional piece of
equipment, readily available to the author, but (unfortunately) not
to all \VR\ workers: a \e{Melbourne tram}.
(Visitors to the \VR\ mecca of Seattle can, however, make use of the
authentic Melbourne trams that the city of Seattle bought from the
Victorian Government ten years ago, which now trundle happily along the
waterfront with a view of Puget Sound rather than Port Phillip Bay.)
Melbourne trams (the 1920s version, not the new ``torpedo'' variety)
have the unique property that, no matter how slowly and carefully
they are moving, they always seem to be able to
hurl standing passengers spontaneously 
into the lap of the nearest seated passenger (which may or may
not be an enjoyable experience, depending on the population of the tram).
This intriguing (if slightly frivolous) property of such vehicles
can actually be
used as a reasonably quantitative experimental investigation of the
kinematical capabilities of humans.

Consider a Melbourne tram located in the Bourke Street Mall, sitting
at the traffic lights controlling its intersection with 
the new Swanston Street Mall.
A passenger standing inside the tram looks out the window.
Apart from wondering why on earth Melbourne needs so many malls---or 
indeed why one needs traffic lights at all at the intersection of
two malls---the 
passenger is unperturbed; she can take a good look at the 
surrounding area.
She drops a cassette from her Walkman into her handbag: it falls
straight in.

Now consider the same tram thirty seconds later, as it is
moving at a constant velocity along Bourke Street.
The standing passenger is again simply standing around; cassettes
drop straight down; if it were not for the passing scenery,
she wouldn't even know she was moving.
And, of course, physics assures us that this is always the case: 
inertial motion cannot be distinguished from ``no'' motion, except with
reference to another moving object.
The laws of physics are the same.

We now take a further look at our experimental
tram: it is now accelerating
across the intersection at Russell Street.
Old Melbourne trams, it turns out, have quite a constant rate of
acceleration when their speed controls are left on the same ``notch'', at 
least over reasonable time periods.
Let us assume that this acceleration is indeed constant.
With our knowledge of physics, we might predict that our standing
passenger might have to take some action to avoid falling over:
the laws of Newtonian physics \e{change} when we move to an accelerated
frame.
Inertial objects do not move in straight lines.
Cassettes do not fall straight down into handbags.
Surely this is a difficult environment in which to be merely standing
around?

Somewhat flabbergasted, we find our passenger standing in the accelerated
tram, not holding onto anything, unperturbedly reading a novel.
How is this possible?
Upon closer examination, we notice that our passenger is \e{not} 
standing exactly as she was before: \e{she is now leaning forward}.
How does this help?
Well, to remain in the same position in the tram, our passenger must
be accelerated at the same rate as the tram.
To provide this acceleration, she merely leans forward a little.
The force on her body due to gravity would, in a stationary tram,
provide a torque that would topple her forwards. 
Why does this not also happen in the accelerating-tram case then?
The answer is that the additional forward frictional force of the tram's
flooring on her \e{shoes} provides both a counter-torque that stops her
toppling and the forward force necessary to accelerate her
at the same rate as the tram. 
Looked at another way, were she to \e{not} lean forward, the frictional
force forward of the accelerating tram would produce an unbalanced
torque on her that would topple her \e{backwards}.
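
The required angle follows from a simple balance (a sketch that treats the passenger as a rigid body; the symbols $m$, $\ell$ and $\theta$, and the numerical value below, are introduced here for illustration).
Let her have mass $m$, with her centre of mass a distance $\ell$ from her feet, and let her lean forward at angle $\theta$ from the vertical in a tram of acceleration $a$.
The floor pushes up on her shoes with force $mg$ and forward with frictional force $ma$; taking torques about her centre of mass, these balance when
\[
m g\,\ell\sin\theta = m a\,\ell\cos\theta ,
\qquad\mbox{so that}\qquad
\tan\theta = \frac{a}{g} .
\]
For $a = 1~{\rm m/s^2}$, say, this is a lean of about $6^\circ$.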

Leaning forward at the appropriate angle is, 
indeed, a fine trick on the part of our passenger.
Upon questioning, however, we find to our dismay that she knows
nothing about Newtonian mechanics at all.
We must therefore conclude that ``learning'' this trick must be
something that humans do spontaneously---everyone seems to get the
hang of it pretty quickly.

It should be noted that \e{this trick would not work in space}.
Without borrowing the gravitational force, there is no way to produce
a counter-torque to that provided by the friction on one's shoes.
Of course, this friction \e{itself} should not be relied on too much:
without any gravitational force pushing one's feet firmly into the
floor, there may not be any friction at all!
This means that there will, in general, be no unbalanced torque
to topple one backwards anyway: the passengers in an accelerating
space vehicle, (literally) hanging around in mid-air, will simply
continue to move at constant velocity.
But the accelerating vehicle will then catch up to them: they will
slam against the \e{back} wall of the craft!
Of course, this is precisely Einstein's argument for the Equivalence
Principle between gravity and acceleration---if you are a 
standing passenger, you had better find 
what will become the ``floor'' in your accelerated 
spacecraft quick smart---but it shows that life on earth has prepared
us with different in-built navigation 
systems from those that our descendants might
require.

What lessons do we learn from this?
Firstly, it is unwise to put a Melbourne tram into earth orbit.
More importantly, it shows that humans are quite adept both at
extrapolating the effects of constant acceleration (as shown by our ability
to catch projectiles) and at functioning quite easily
in an accelerated environment (as shown by our tram passenger).

Let us now examine the tram more closely \e{before} it accelerates away.
It is now sitting at the traffic lights at Exhibition Street.
The lights turn green.
The driver releases the tram's air-brakes with a loud hiss.
Our passenger spontaneously takes hold of one of the overhanging
stirrups!
More amazingly, another 
passenger spontaneously starts to fall forwards!
The tram jerks, and accelerates away across Exhibition Street.
Our passenger, grasping the stirrup, absorbs the initial jerk and,
as the tram continues on with a constant acceleration, lets go in
order to turn the page of her novel.
The second passenger, the spontaneously-falling character, magically did
not fall down at all: the tram accelerated at just the right moment to
hold him up---and there he is, still leaning forward like our first
passenger!
However, all is \e{not} peaceful: a Japanese tourist, who boarded the
tram at Exhibition Street, has tumbled into the lap of a (now frowning)
matronly figure, and is struggling to regain his tram-legs.

What do we learn, then, from this experience?
Clearly, the Japanese tourist represents a ``normal'' person.
Accustomed to acceleration, but \e{not} to the discontinuous way that
it is applied in Melbourne trams, he became yet another veteran of
lap-flopping.
Our first passenger, on the other hand,
who grabbed the stirrup upon hearing the release
of the air-brakes, had clearly suffered the same fate in the distant
past, and had learnt to recognise the audible clue that the world
was about to shake: such is the Darwinian evolution of a Melburnite.
The second passenger, who spontaneously fell forward, appears to be
an even more experienced tram-dweller: by simply falling forward he had 
no need to grasp for a stirrup or the nearest solid structure.
This automatic response, which relies for its utility on the fact that
Melbourne tram-driving follows a fairly standard set of procedures,
is perhaps of interest to behavioural scientists, but does \e{not} indicate
that the passenger had any infallible ``trick'' for avoiding the effects of
jerks (to employ Feynman's term)---the ``falling'' method 
does not work at all if the jerk is significantly delayed for some unknown
reason.
(A somewhat mischievous tram driver once confessed that his favourite
pastime was releasing the air-brake and then not doing 
anything---and then watching
all of the passengers fall over.)
Of course, the Melbourne trams on the Seattle waterfront have an 
extra reason for
unexpected deceleration: in a city covered by decrepit, unused 
\e{train} tracks, motorists turning across the similar-looking \e{tram} 
tracks
get the fright of their lives when they find a green,
five-eyed monster bearing down on them!

Returning, now, to the task at hand, our admittedly simplified examples
above show that people are, in general, relatively adept at handling
\e{acceleration} (not surprising, considering our need to deal with
gravity), but not too good when it comes to \e{rate of change} of
acceleration, or \e{jerk}.
Numerous other examples of this general phenomenon can be constructed:
throw a ball and a person can usually catch it; but 
half-fill it with a 
liquid, so that it ``swooshes around'' in the air, and it can be very
difficult to grab hold of.
Sitting in a car while it is accelerating at a high rate
``feels'' relatively smooth;
but if the driver suddenly lets off the accelerator just 
a little,
your head and shoulders go flying forward---despite the fact that you
are still being accelerated in the \e{forward} direction!
In each of these examples, it is the (thus appropriately
named) \e{jerk} that throws our
inbuilt kinematical systems out-of-kilter; not too surprisingly,
it is difficult to formulate uncontrived examples in the \e{natural}
world in which jerks are prevalent (apart from falling out of a tree,
of course---but 
repeatedly hitting the ground is not a technique 
well suited to evolutionary survival). 

We now have two pieces of information on which to base a decision about
what $n$ should be in a practical $\Gal{n}$ display system.
Firstly, we have a purely mathematical 
``rule of thumb'': $n$ should be
either $2$ or $3$ to represent most efficiently (and stably) arbitrary
motion around $t=0$.
Secondly, we have a psychological criterion:
we have recognised that, in rough terms,
the human kinematical-computation system
is comfortable with dealing with motional derivatives up to the second,
but is quite uncomfortable dealing with the third derivative; this 
suggests that $n$ should probably be $2$, or, at the most, $3$.
There is, however, a \e{third} consideration to be taken into
account in this question, one that has to do with neither
mathematics nor psychology, but rather with basic physics and information
theory: How many derivatives of the motion can we \e{accurately 
specify}
in a realistic \VR\ situation, in which many of the relevant
physical quantities are not
merely generated by the computer, but are, in fact, measured by
physical transducers?
To answer this question, we need to look a little more closely 
at the physical transducers that are used in real-life 
\VR\ systems, as well
as the way that the data they produce is analysed by the system
itself.

Clearly, positional--rotational data is the common
denominator among existing \VR\ transducer technology: find some
physical effect that lets you determine how far away the participant
is from a number of fixed sensors, as well as her orientation with
respect to these sensors, and you ``know where she is''.
Transducers for \e{velocity} information---which tell you 
``where she's going'' (but not ``where she is'')---are also 
commonplace in
commercial industry, albeit less common in \VR.
However, even if such transducers are \e{not} used, quite a reasonable
estimate of the true velocity of an object may be obtained by taking
differences in positional data (as was used to calculate
the results in \eq{ComputeVel}).
On the other hand, this ``numerical differentiation'' carries two
inherent dangers: firstly,
computing \e{any} differentiation on physical data
enhances any high-frequency noise present; and, secondly, performing
a \e{discrete-time} numerical differentiation introduces lags into the
data (\ie, in rough terms, you need to wait until $t=5$ to compute
$f(5)-f(4)\approx f'\!(4.5)$). 
The first problem can be somewhat ameliorated by low-pass filtering;
the second by ``extrapolating'' the data forwards half a time interval;
unfortunately, these two solutions are largely mutually exclusive.
However, in practice, quite reasonable velocity information \e{can} in
fact be obtained---as long as it is treated with care.
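
To put rough numbers on these two dangers (a sketch under simple assumptions introduced here, not a model of any particular transducer), suppose that positions are sampled at interval $\Delta t$, and that each sample $x_k$ carries an independent error of standard deviation $\sigma$.
The simple difference estimate
\[
v_k = \frac{x_k - x_{k-1}}{\Delta t} \approx x'\!\left(t_k - \frac{1}{2}\Delta t\right)
\]
then has an error of standard deviation $\sqrt{2}\,\sigma/\Delta t$, and refers to a time half an interval in the past.
Sampling faster shortens the lag but amplifies the noise in inverse proportion to $\Delta t$; filtering the noise away adds delay of its own, which is one way of seeing why the two remedies pull against each other.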

In a similar way, acceleration can either be measured directly, or
computed from the velocity data by discrete numerical 
differentiation.
In many respects, the laws of Nature make accelerometers \e{easier} to
make than speedometers.
This is, of course, due to the fact that \e{uniform motion is 
indistinguishable from no motion at all}, as far as Newtonian mechanics
is concerned: it is vitally necessary to measure velocity ``with respect
to something'' (such as the road, for an automobile; or the surrounding
air, for an aeroplane---``ground speed'' being much more difficult
to measure because there is 
[hopefully] no actual contact with the ground!).
On the other hand, \e{accelerations} cause the laws of physics to
change in the accelerated frame, and can be measured without needing
to ``refer'' to any outside object.
(Of course, this is not completely true: if (classical) laws of physics
are written in a \e{generally relativistic} way, they will also hold true
in accelerated frames; but that is only of academic interest here.)
Nevertheless, even though these properties make acceleration inherently
easier to measure than velocity, the \VR\ designer must ultimately worry
about both minimising the cost of the system, and minimising the number of 
gadgets physically attached to the participant---and it is unlikely that
\e{both} velocity and acceleration transducers would be deemed necessary;
one or the other (or, indeed, both) would be omitted.
Of course, acceleration may instead be deduced numerically from
velocity data, but this carries the same dangers as deducing velocity
numerically from positional data.
Most dangerous of all is if acceleration data must be 
numerically obtained
from velocity data that was \e{itself} obtained numerically
from positional data; the errors compound.
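
The compounding can be made explicit with a simple estimate (under assumptions introduced here: uniform sampling at interval $\Delta t$, and independent positional errors of standard deviation $\sigma$).
Differencing the positions twice gives
\[
a_k = \frac{x_k - 2x_{k-1} + x_{k-2}}{(\Delta t)^{2}} \approx x''(t_k - \Delta t),
\]
whose error has standard deviation $\sqrt{6}\,\sigma/(\Delta t)^{2}$, since the squared weights $1$, $4$ and $1$ add to $6$.
The estimate is thus a full sample interval out of date, and its noise grows as the \e{square} of the sampling rate.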

Of course, it is also possible actually to \e{omit} measuring positional
information altogether, 
and instead obtain it by integrating measured velocities.
This integration actually \e{reduces} high frequency noise---but what one
gains on the swings one loses on the roundabouts: the \e{low} frequency
noise is boosted---manifested, of course, in ``drift'' in the measured
origin of the \coord\ system, which must be regularly calibrated by some
other means.
Alternatively, the relative simplicity of the physics may lead a
designer to simply use \e{accelerometers} as transducers, integrating
this information once to obtain velocity data, and a second time
to obtain positional data.
Of course, this double-integration suppresses high-frequency noise
even further, but
requires regular calibration of not only the 
origin of the \e{positional} \coord\ system, but also of the origin
of the \e{velocity} information (\ie\ knowing when the transducer is
``stationary'' with respect to the laboratory)---which is again
a manifestation of the general Wallpaper Bubble Conservation Law
(\ie\ whenever 
you get rid of one problem it'll usually pop up somewhere else).
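
The drift is easy to estimate in the simplest case (a sketch, with symbols and numbers chosen here purely for illustration).
If an accelerometer reads $a(t)+b$, where $b$ is a small constant offset, then integrating once and twice from a calibrated start at $t=0$ gives errors
\[
\delta v(t) = b\,t, \qquad \delta x(t) = \frac{1}{2}\,b\,t^{2}
\]
in the velocity and position respectively.
An offset of only $b = 0.01~{\rm m/s^2}$ would, on this estimate, displace the computed origin by half a metre after ten seconds.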

Keeping in mind the above technical and design concerns involved in
determining even positional, velocity and acceleration information
from physical transducers, what chance is there for us to extend this
methodology to measuring \e{jerk} data?
On the physics side, there are few (if any) genuinely simple physical
effects that could inspire the design of a \e{jerkometer} (to coin a
somewhat obscene-sounding term).
It is not even clear that a jerkometer would be all that reliable 
an instrument anyway: since (by Newton's Second Law, $\vect{F}=m\va$)
\e{accelerations} are caused by \e{forces}, we see that \e{jerks}
are caused by \e{rates of change of forces}.
Think, now, of what happens when (for example) you lift your leg: at some 
instant in time you decide, ``Hmm, I'd like to lift my leg''; your
leg muscles then start to apply a force which overcomes gravity and
begins to accelerate your leg upwards.
The point is that \e{the force itself is applied rather abruptly}---in
other words, the \e{jerk} is almost like an \e{impulse function}
(or an ``infinite prick'' as it is sometimes derogatorily termed).
The main problem with impulse functions is that, due to the fact that they
reach extremely high peak values (for short values of time, such that the
area underneath their curve is constant), many physical devices 
encountering them either \e{clamp} them to their peak-allowable value
(rendering the area under the impulse inaccurate), or get driven 
into \e{non-linear behaviour} (which can not only scramble the 
area-under-curve information, but may indeed drive the whole system
into instability).
Thus, one would need to be very careful in implementing directly a
jerk transducer.
Of course, even if one \e{were} able to manufacture such a jerkometer,
incorporating it into \VR\ equipment would again come up against
the abovementioned barriers of transducer overpopulation and excessive cost.
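
A crude model makes the difficulty with impulse-like jerks concrete (the linear ramp and the symbols $F_0$, $\tau$ and $m$ are assumptions made here for illustration).
Suppose the muscles raise the net force on a limb of mass $m$ from zero to $F_0$ linearly over a short rise time $\tau$.
During the rise the jerk is
\[
j = \frac{1}{m}\,\frac{dF}{dt} = \frac{F_0}{m\,\tau},
\]
and it is zero before and after, so the area under the jerk curve is $F_0/m$ however short $\tau$ may be.
As $\tau\to0$ the peak value grows without bound while the area stays fixed, which is the impulse-like behaviour described above.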

On the numerical-differentiation side, on the other hand,
whether it would be wise or not to
perform an extra differentiation of the acceleration data
to obtain the jerk data depends largely on where
the acceleration information originates.
If it is obtained from an actual physical accelerometer, such a numerical 
differentiation would probably be reasonable.
However, if the physical transducer is in fact a velocity- or 
position-measuring device, then one would not wish to place too much
trust on a second- or third-order numerical derivative
for a quantity that is already subject to concerns in terms
of basic physics: most likely,
all one would get would be 
a swath of potentially damaging instabilities in
the closed-loop system.
Thus, a numerical approach depends intimately on what order of
positional information is actually yielded by the physical transducers.

We now make the following suggestion: 
The visual display architecture of \VR\ technology should make
only \e{minimal} assumptions about the nature of the
physical transducers used elsewhere in the system.
This suggestion is based, of course, on the concept of \e{modularity}:
if groups of functional components in any system cohere, by their
very intrinsic nature, into readily
identifiable ``modules'', then any one
``module'' should not be made unnecessarily and arbitrarily
dependent on the internal nature of
another ``module'' \e{unless} the benefits gained by the whole
outweigh the loss of encapsulation of the one.
If this suggestion is accepted (which, in some proprietary situations,
may require consideration of the future health of the industry rather
than short-term commercial leverage), then it is clear that it would
be inappropriate for a display system
to assume that a given \VR\ environment obtains anything
more than raw positional--rotational information from physical
transducers.
With such a minimalist assumption, our above considerations show
that it would \e{not} be wise, in general, to insist that jerk information
about the physical motion of the \VR\ participant be provided to the
display system.
Of course, this does not prevent jerk information being obtained about
the other \e{computer-generated} objects in the virtual 
world---their trajectories in space are (in principle) knowable to
arbitrary accuracy; 
arbitrary orders of temporal derivative may be computed with relative
confidence.
However, if a display system \e{did} use jerk information for 
virtual objects, but not for the participant herself, it would all be
for naught anyway.
To appreciate this fact, it is only necessary to note that \e{all} of the 
visual information generated in a virtual world scenario 
\e{depends solely on the relative positions of the observer and the
observed}.
Differentiating the previous sentence
an arbitrary number of times, it is
clear that the \e{only} relevant velocities, accelerations, jerks, 
\etc, in a Galilean antialiased display environment are the \e{relative}
velocities, accelerations, jerks, \etc, of the observer
and the observed.
Of course, this property of \e{relativity} 
is a fundamentally deep and general principle of physics,
whether it be \Galn\ Relativity, or Einstein's Special Relativity or
General Relativity; it will probably not, however, 
be an in-built and intuitively obvious part of humanity's subconscious
until we more regularly part company with \e{terra firma}, and
travel around more representative areas of our universe.
(Ever played \e{Wing Commander}? 
Each Terran spacecraft has a maximum speed.
But \e{with respect to what?!}
Galileo would turn in his grave\ldots.)
Thus, using jerk information for one half of the system (the virtual
objects) but not the other half (the participant) brings us
no benefits at all---and, indeed, the inconsistencies in the virtual
world that would
result may well be a significant degradation.
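
In symbols (the subscripts are labels introduced here, and rotations of the observer are ignored for simplicity), writing $\vx_{\rm rel} = \vx_{\rm obj} - \vx_{\rm obs}$ for the position of an observed object relative to the observer, each temporal derivative is simply
\[
\frac{d^{n}\vx_{\rm rel}}{dt^{n}} =
\frac{d^{n}\vx_{\rm obj}}{dt^{n}} - \frac{d^{n}\vx_{\rm obs}}{dt^{n}} .
\]
For $n=3$, taking the observer's unmeasured jerk to be zero leaves the relative jerk in error by exactly the observer's true jerk, no matter how accurately the object's own jerk is known.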

Returning, again, to the task at hand, the above deliberations
indicate that, all in all, the most appropriate order of Galilean
antialiasing that should be used, 
at least for \VR\ applications, is probably $\Gal{2}$.
Of course, much of the above relies on only back-of-the-envelope,
first-principles arguments, which may well be rejected upon a more
careful investigation; and, of course, anyone is free to develop
technology to whatever specifications they like, if they believe
that their end product will be marketable.
However, for the purposes of the remaining sections of this
\typeofdoc, we shall assume that $\Gal{2}$ antialiasing is used
exclusively; the results obtained, and conclusions drawn, would need to
be extended by the reader if a different order of antialiasing 
were desired.

\newsect{MinimalImplementation}{A Minimal Implementation}
The previous sections of this \typeofdoc\ have been concerned
with the development of the underlying philosophy of,
and abstract planning for, \Galn\ \anti ing in general.
In this section, we turn directly to the practical
question of how one might
retrofit these methods to existing \VR\ technology.
Section~\ssect{MinHardwareMods} outlines the minimal modifications
to existing display control hardware that must be implemented;
section~\ssect{MinSoftwareMods} describes, 
in general terms,
the corresponding 
software enhancements necessary to drive the system.
More advanced enhancements to the general visual-feedback methodology
of \VR---which would, by their nature, be more amenable to 
implementation on new, ground-up developments---are 
deferred to section~\sect{Enhancements}.

\newssect{MinHardwareMods}{Hardware Modifications and Additions}                     
Our first task in modifying an existing \VR\ system 
is to determine precisely what changes must be made
to its hardware:
if such changes are technically, financially or politically 
unattainable,
then a retrofit will not be possible at all, and further speculation
would be pointless.
Clearly, the area of existing $\Gal{0}$ technology that has
been subjected to most scrutiny in this \typeofdoc\ is the 
\e{video controller subsystem}.
As would by now be obvious, the sample-and-hold frame buffer 
methodology of conventional display systems,
as described in section~\ssect{CurrentRasters},
must be completely gutted; 
in its stead must be installed
a tightly-integrated, relatively intelligent video controller
subsystem capable of correctly propagating from frame to frame
the moving, accelerated (but pixelated) objects passed to it by the
display processor.
We shall assume, in this \typeofdoc, 
that \e{refresh-rate \Galn\ \anti ing} is to
be retrofitted; in other words, the video controller computes a
new, correctly propagated frame for \e{each} physical refresh of the
display device.
This is, of course, the most desirable course of action: the motion
depicted on the display device should then (if its physical refresh
rate is suitably high) be practically indistinguishable from 
smooth motion.
However, in a retrofit environment, memory- and circuitry-speed 
constraints may quite possibly render this goal unachievable.
In such a situation, \e{sub-refresh-rate \Galn\ antialiasing}
may be opted for: the actual ``frame period'' used for the 
propagation circuitry would, in that case, be chosen 
to be some integral multiple of the
physical refresh period, \ie\ the propagator update rate would then
be a sub-harmonic of the physical refresh rate.
For example, for a display system with a 60~Hz physical refresh rate, a
``\Galn\ refresh rate'' of 30~Hz, or 20~Hz, or 15~Hz, \etc, 
may be chosen instead
of the full 60~Hz.
Such an approach, however, carries the potential for many headaches:
the existing pixmap
frame buffers used by the scan-out circuitry may (or may not) need to be
retained, with \e{additional} pixmap buffers for the propagation process;
the results may then need to be copied at high speed from the propagation
frame buffer to one of the scan-out frame buffers; and so on.
In any case, sub-refresh-rate \Galn\ \anti ing will not be specifically
treated in the following; practitioners wishing to 
use this technique will need to take
references to ``refresh rate'' and ``frame buffers'' to mean
the corresponding objects in the \e{propagation} circuitry; 
requirements for the
\e{scan-out} versions of these objects must then be inferred by the
practitioner.

As noted in section~\ssect{Galpixels}, the galpixmaps that will be stored
in the (now necessarily multiple) frame buffers extend significantly
on the simple intensity or colour information stored in a regular pixmap.
However, there is clearly no need for a 
galpixmap to be \e{physically} configured as a rectangular array of
galpixel structures (in terms of physical memory); 
rather, a much more sensible configuration---especially in a retrofit
situation---is to maintain the existing hardware for the frame buffer
pixmap (and duplicate it, where necessary), and construct new memory
device structures for storing the additional information, such as
velocity and acceleration, that the galpixmap requires.
The advantage of this approach is that, properly implemented, the 
detailed circuitry responsible for actually scanning out the frame
buffer to the physical display device may be able to be left unchanged 
(apart, perhaps, from including a frame-buffer
multiplexer if hardware double-buffering 
is not already employed by the system).
This is a particularly important simplification for retrofit
situations, since, in general, the particular methodology employed
in the scan-out circuitry depends largely on the precise nature of the
display technology used.

We now turn to the question of what information \e{does} need to be
stored in the extended memory structures that we are adding to each
frame buffer.
Clearly, the (``display'', or ``apparent'') 
\e{position vector} of the $i$-th galpixel, $\vx_i$ 
(where $i$ runs from $1$ up to
the number of pixels on the display), 
is already well-spoken for: its $x$ and $y$ components are,
by definition, already encoded by the galpixel's physical location in the
pixmap matrix; and its $z$ component is stored in the hardware
$z$-buffer memory structure.
(If a hardware $z$-buffer is already implemented in the system, it
can be used unchanged; if a hardware $z$-buffer is \e{not} present,
it \e{must} be added to the system at this point in time; 
its presence is vital for \Galn\ antialiasing, as will soon be apparent.)
Is this all the positional information that we require?
What might be surprising at first sight is that, in fact, 
\e{it is not}; we must, however, first turn to the other memory
structures required, and the propagation algorithm itself, 
to see why this is so.

The \e{velocity vector} of the $i$-th galpixel, $\vv_i$, 
is a new three-component piece
of data that the display processor must now provide for each galpixel;
likewise, the \e{acceleration vector} of the $i$-th galpixel,
$\va_i$, must also be supplied.
With a $\Gal{2}$ system, of course, the acceleration of a particular
galpixel does not change at all while the video controller propagates
the scene from frame to frame; however, it would be wrong to think that
the \e{matrix} of galpixel acceleration values similarly stays
constant under propagation.
The reason, of course, is that \e{each galpixel carries its velocity and
acceleration data along with it}; in physics terms, these quantities are
\e{convective} temporal 
derivatives (``carried along'' with the flow), \e{not}
partial temporal derivatives (which stay put in space).
It should be noted that this concept---that a ``\e{gal}pixel'' 
is a little
pixelated logical object moving around the display, whereas
a plain ``pixel'' is simply a static position in the frame buffer
matrix---will be used extensively in the following description.

The next questions that must be considered are: 
What numerical format should we store the velocity and acceleration
information in?
How many bits will be needed?
These questions are of extreme importance for the
implementation of any \Galn\ \anti ed technology.
That this is so can be recognised by calculating just how many such
quantities need to be stored in the video hardware subsystem:
We need both a velocity and an acceleration value for every galpixel in a
frame buffer.
Velocity and acceleration each have three components.
We need at least three such frame buffers for each display device
(two for the video controller to propagate between, and one for the
display processor to play with at the same time);
and, for stereoscopic displays, we will need two (logical) display devices.
Even for a bare-minimum display resolution of (say) $320\times200$,
we are looking down the barrel at 2.3~million quantities that need
to be stored somewhere;
increasing the resolution to $640\times480$ blows this
out to 11~million quantities.
Of course, with RAM currently on the street for A\$45 per megabyte,
even a relatively wasteful implementation of memory would not blow
out the National Deficit on RAM chips alone; however, configuring
such a memory structure in terms of electronic devices, 
in such a way that it can be 
processed at video-rate speeds, is a challenging enough task without
having to deal with an explosion of complexity.
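
As a check on these figures, the tally can be written out explicitly; 
this is merely the arithmetic implied by the counts just listed, with
nothing new assumed:
\[
\begin{array}{rcl}
320\times200\times(3+3)\times3\times2&=&2\,304\,000\approx2.3\times10^6,\\
640\times480\times(3+3)\times3\times2&=&11\,059\,200\approx1.1\times10^7,
\end{array}
\]
where the factors are, in order: the galpixels in one frame buffer;
the three components each of velocity and acceleration; 
the three frame buffers per display device;
and the two (logical) display devices of a stereoscopic system.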

Let us, therefore, make some crude estimates as to how we would 
like our physical system to perform.
Assume, for argument's sake, that we are implementing a 50~Hz refresh
rate display system;
each frame period is then 20 milliseconds.
Propagating quantities forward with a finite numerical accuracy
leads to accumulated errors that increase with time.
In particular, the position of an object will be ``extrapolated'' poorly
if we retain too few significant figures in the velocity and
acceleration---even ignoring the fact that the acceleration of the
object may have, in fact, changed in the meantime.
How poor a positional error, arising from numerical accuracy alone,
can we tolerate?
Let us say that this error should be no worse than a single pixel or so.
But the error in position will, in a worst case scenario, increase
linearly with time, \ie\ number of extrapolated frames.
How many frames will we want to extrapolate forwards while still
maintaining one-pixel accuracy?
Well, since we will be using binary arithmetic eventually, let's choose
a power of 2---say, 16 frames.
This corresponds to an inter-update 
time 
(\ie\ the time for which the video controller itself happily propagates
the motion of the galpixels between display processor updates)  
of 320~milliseconds---which should
be \e{more} than enough, considering that the participant's acceleration 
(not to mention that of the objects in the virtual world) 
will have no doubt
changed by a reasonable amount by then---and the view, if it has 
not been updated by the display processor, will thus be reasonably 
inaccurate anyway.
Of course, the whole display system won't suddenly fall over if, in 
some situation, we don't actually get a display processor
update for more than 16 frames---it is just that the inherent
numerical inaccuracy
of the propagation equations will simply grow larger than 1~pixel.
We shall say that the display system is \e{rated for a $N_\txt{prop}=16$
propagation time}---a number that would appear in its ``List of
Specifications'' at the back of the Instruction Manual.

OK, then, how do we use our design choice of $N_\txt{prop}$ (16, in our
case) to determine the accuracy of the velocity and
acceleration that we need to store?
The answer to that question is, in fact, inextricably intertwined
with the particular equation that 
our hardware will use to propagate galpixels across the display---and,
in particular, how it is implemented with finite-accuracy 
arithmetic---which must therefore be brought to the focus of our attention.
Now, we have learnt from our Virtual Galileo, in section~\ssect{Motion},
that the formula for the trajectory of a uniformly accelerated object
is given by
\beqn{UniformAccelPosVecVersion}
\vx(t)=\vx(0)+\vv(0)t+\half\va t^2,
\eeqn
and, from this, the equation of motion for its velocity is given by
\beqn{UniformAccelVelVecVersion}
\vv(t)=\vv(0)+\va t.
\eeqn
Since our video controller only knows about our galpixel's initial 
position $\vx_i(0)$, 
initial velocity $\vv_i(0)$
and initial acceleration $\va_i(0)$
anyway, assuming a \e{constant} acceleration $\va\id\va_i(0)$
for the galpixel
(until the next display processor update) 
is about the most reasonable thing to do.
If we measure $t$ in units of the frame period---which we shall always
do in the following---we can compute 
\e{exactly} where the uniformly-accelerated object would be, and 
what its velocity would be, at the
time of the next frame, by simply inserting $t=1$ into 
\eq{UniformAccelPosVecVersion} and \eq{UniformAccelVelVecVersion}, 
giving
\beqn{PropPos}
\vx_i(1)=\vx_i(0)+\vv_i(0)+\half\va_i
\eeqn
and
\beqn{PropVel}
\vv_i(1)=\vv_i(0)+\va_i
\eeqn
respectively.
We now note that, to our delight, the arithmetical operations needed
to compute \eq{PropPos} and \eq{PropVel} are not only simple---\e{they
can actually be hard-wired!}
The most ``complicated'' operation we need to perform is (signed)
addition, for which a hardware implementation is trivial.
And since we will be employing binary numbers, taking half of $\va_i$,
as required in \eq{PropPos}, amounts to simply shifting it right one
bit---or, in the hard-wired-adder version
of \eq{PropPos}, simply connecting
the signals appropriately shifted.
Thus, we can already see why it \e{is} technologically feasible to 
propagate 
motion using $\Gal{2}$ antialiasing, even
at video rates---the required
computations are, by \coin cidence, trivial for the digital devices
that we have at our disposal.
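
It may be worth checking, as a short worked step that is not spelled
out above, that repeated application of \eq{PropPos} and \eq{PropVel} 
keeps the galpixel \e{exactly} on the parabola 
\eq{UniformAccelPosVecVersion}, rather than merely close to it.
Suppose that, after $n$ frames (with $n$ simply our label for the
frame count), the galpixel has
$\vx_i(n)=\vx_i(0)+\vv_i(0)n+\half\va_in^2$ and
$\vv_i(n)=\vv_i(0)+\va_in$.
One further step then gives
\[
\begin{array}{rcl}
\vx_i(n+1)&=&\vx_i(n)+\vv_i(n)+\half\va_i\\
&=&\vx_i(0)+\vv_i(0)(n+1)+\half\va_i(n^2+2n+1)\\
&=&\vx_i(0)+\vv_i(0)(n+1)+\half\va_i(n+1)^2,
\end{array}
\]
and likewise $\vv_i(n+1)=\vv_i(0)+\va_i(n+1)$, so that, in exact 
arithmetic at least, no error is introduced by stepping one frame at
a time.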

Now, using \eq{PropPos} and \eq{PropVel}, how many fractional bits do
we need to store for $\vx_i$, $\vv_i$ and $\va_i$ to ensure 1-pixel accuracy
at $N_\txt{prop}=16$?
At this point, we come to a horrible realisation: \e{we cannot have
any fractional bits at all!} 
Why do we reach this conclusion?
Because, as noted earlier, the $x$ and $y$ components of the
$\vx_i$ information for the galpixel
are already encoded in its position in the pixmap matrix\ldots and 
there's no fractional part to a position in a matrix!
And without any fractional bits in $\vx_i$, performing the addition in
\eq{PropPos} with fractional bits in $\vv_i$ or $\va_i$ would be
a complete waste of time: the fractional bits would be thrown away
each frame as we write the galpixel into its new position!
\e{A catastrophe!}

Recovering our composure, we ask:
What happens if we \e{are} restricted to having integral $\vv_i$
and $\va_i$?
Is that good enough?
Well, let us consider just the velocity for the moment, and assume the
acceleration is zero.
Clearly, if $\vv_i=0$ also, the galpixel won't move anywhere at all
until the next display processor update.
This is appropriate if the galpixel wasn't supposed to move more than
half a pixel in any direction anyway.
OK, then, what is the next smallest speed possible?
Let us consider just the $x$ direction, for simplicity.
The smallest value for $v^x_i$, in integer arithmetic, is, obviously,
$v^x_i=\pm1$.
What will the video controller do with such a galpixel?
It will move it one pixel per frame until the next update.
But this is terrible---that's a whole 16 pixels (if we wait 16 frames
for a display processor update); the thing is moving at 50 pixels per
second!
And that's the \e{smallest} non-zero speed we can define!
What happens if the galpixel should have been moving at 24 pixels per
second?
Too bad---it stays put.
And if it should have been going at 26 pixels per second?
Sorry, 50 is all I can give you. 
You'll just have to overshoot.
\e{A catastrophe!}

Regaining our composure again, let us reconsider the reason why the
proverbial hit the fan in the first place.
Our basic problem is that a pixmap matrix has no such thing as a
``fractional row'' or ``fractional column''.
Maybe we could increase the resolution of our display\ldots and call
the extra pixels ``fractions''?
Hardly a viable proposition in the real world---and 
in any case we'd be simply palming off the
problem to the \e{new} pixels.
Maybe we could leave the display at the same resolution, but replace
each entry in the galpixmap with a little galpixmap matrix of its own?
Then we could have ``fractional rows and columns'' no problems!
Well, how big would the little matrix need to be?
Seeing as a velocity of $v^x_i=1$ pixel per frame moves us by
$16$ pixels in $16$ frames---and we only want to move one pixel, 
max., in this time period---we should therefore reduce the minimum
computable velocity to $1/16$ of a pixel per frame.
This, then, requires a $16\times16$ sub-galpixmap for each display pixel.
We'd need $256$ times as much memory as we've already computed 
before---2.3~million quantities blows out to almost 600~million!
\e{A catastrophe!}

Regaining our composure for the third (and final) time, let us consider
this last proposal a little more rationally.
We uncontrollably assumed that what was needed at each pixel location
was a little galpixmap, with all the memory that that requires.
Do we really need all this information?
What would it mean, for example, to have a whole lot of little ``baby
galpixels'' moving around on this hugely-expanded grid?
What if the babies of two different (original-sized) 
galpixels end up on the same (original-sized) galpixel submatrix---does
such a congregation of babies make any conceptual sense?
Well, our display device only has \e{one} pixel per submatrix: so who
gets it?
Do we add, or average, the colour or intensity values for each of
the baby galpixels?
No---the object \e{closer} to the viewer should obscure the other.
Should it be ``most babies wins''?
No, for the same reason.
Then is there any reason for having baby galpixels at all?
It seems not.

Let us, therefore, look a little 
more closely at these last considerations.
Since the display device only has one physical pixel per stored 
galpixel (of the original size, that is---the baby galpixels 
having now been adopted out), then, obviously, 
a galpixel can only move one pixel at a time anyway.
But we only want to move the galpixel by one pixel
every 16 frames---or every 3 or 7 or 13 frames,
or whatever time period will, on the average, give us the right average 
apparent velocity of the galpixel in question.
So how does one specify that the video controller is to ``sit around''
for some number of frames before moving the galpixel?
Simple---put in a little counter, and tell it how many frames to wait.
Of course, we need a little counter for each galpixel, but it need only
count up to 16, so it only needs 4 bits anyway (in each of the
$x$ and $y$ directions)---not a large price.
Thus, we \e{can} get sub-pixel-per-frame velocities, with only a handful
of extra bits per galpixel!

Let us look at this ``counter'' idea from a slightly different direction.
Just say that we have told the video controller to count up to 16 before
moving this particular galpixel one pixel to the right.
Why not \e{pretend} that, on each count, the galpixel 
``really \e{is}'' moving
$1/16$ of a pixel to the right---just that we don't actually see
it move because
our display isn't of a high enough resolution.
Rather, the galpixel says to itself on each count, ``Hmm, this display
device doesn't have any fractional positions; I'll just throw away the
fraction and stay here.''
But then, upon reaching the count of 16, the galpixel says, ``Hey, now
I'm supposed to be $16/16$ pixels to the right---but that's one \e{whole}
pixel, and I can do that!''
Clearly, this is a better description for the counter than our original
one---we now know what to do if counting, say, by 3s---namely, we count
up $3,6,9,12,15,18$\ldots, whoops!\ total is over 16---so 
move right one pixel and
``clock over'' to a count of $2,5,8,11,\ldots$; and so on.
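
The counting-by-3s example can be summarised in a formula; the symbols
$c_n$ (the counter reading after $n$ frames) and $m_n$ (the number of
whole pixels moved by then) are labels introduced here for illustration
only, and the counter is assumed to start at zero:
\[
c_n=3n \bmod 16, \qquad m_n=\left\lfloor\f{3n}{16}\right\rfloor.
\]
The galpixel therefore steps right on frames $6$, $11$ and $16$ (where
$3n$ first reaches $18$, $33$ and $48$), and after 16 frames it has
moved exactly three pixels with the counter back at zero, 
\ie\ an average of $3/16$ pixels per frame.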

And so we come---by a rather roundabout route, to be sure---to the
conclusion that the \e{simplest} way to allow sub-pixel-per-frame
velocities is to ascribe to each galpixel two additional attributes:
a \e{fractional position} in each of the $x$ and $y$ directions.
The above roundabout explanation has, as a consolation prize, already
told us how many bits of fractional
positional information we require for a
$N_\txt{prop}=16$ rated system, namely, 4, for each of the $x$ and $y$
directions.
Clearly, the number of bits required in the general case
is just equal to $(\log_2\!N_\txt{prop})$, 
\ie\ 3 bits for $N_\txt{prop}=8$;
5 bits for $N_\txt{prop}=32$, and so on.

There is, however, a slightly undesirable feature of the above 
specification of the action of fractional position, that we must now
repair.
In the example given, the galpixel ``moved'' $1/16$ of a pixel
each frame; on the 16th frame it moved to the right by one physical
display pixel.
Is this appropriate behaviour? 
Consider the situation if the galpixel had in fact
been moving with an $x$-direction velocity of \e{minus} one-sixteenth
of a pixel per frame.
On the first frame, it would have \e{decremented} its fractional 
position---initially zero---and ``reverse clocked'' back to a count of
15, simultaneously moving to the \e{left} by one physical display pixel.
But this is crazy---if it takes 16 frames to move one pixel right,
why does it only take one frame to move one pixel left, if it is supposed
to be moving at the \e{same speed} (\viz\ $1/16$ pixels per frame)?
Clearly, we have been careless about our arithmetic: we have been
\e{truncating} the fractional part off when deciding where to put
the pixel on the physical display; we should have been \e{rounding}
the fraction off.
Implementing this repair, then, we deem that, if a fractional position
is greater than or equal to one-half, the physical position of the galpixel
in the galpixmap matrix is incremented, and the fractional part is
decremented by 1.0.
(Of course, there is no need to \e{actually} do any decrementing in the
fractional bits---one just proclaims that $1000_2$ represents a fraction
$-8/16$; $1001_2$ represents $-7/16$; and so on, up to $1111_2$ 
representing $-1/16$.)
With this repair, a galpixel with a constant
speed of $1/16$ pixels per frame
(and initial fractional position of zero, by definition!) will
take 8 frames to move one pixel if moving to the right, and 9 frames
(by reverse clocking over $-8/16\rightarrow-9/16\id+7/16$)
if moving to the left---as symmetrical a treatment as we are going to
get (or, indeed, care about).
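
To make the rounding rule concrete, here is a sketch of the timing it
implies, assuming (as our own notation) that $x$ is the exact position
in pixels and $X$ the displayed pixel column, so that the stored
fraction $x-X$ always lies in the range $-8/16$ to $+7/16$:
\[
X=\left\lfloor x+\half\right\rfloor.
\]
For $x=+n/16$ this gives $X=1$ first at $n=8$, and thereafter a jump
every 16 frames; for $x=-n/16$ it gives $X=-1$ first at $n=9$ (since 
at $n=8$ the fraction $-8/16$ is still representable), and again a jump
every 16 frames thereafter.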

There is one objection, however, that might be raised at this point:
we originally constructed our fractional position carefully, with the
correct number of bits, so that the galpixel would be able to wait up 
to 16 frames before having to move one pixel.
Why have we now restricted this to only 8 (or 9) frames?
The answer is that we haven't, really; the motion still \e{is} at a speed
of $1/16$ pixels per frame. 
To see this, one need only continue the motion on for a longer time
period: the displayed pixel ``jumps'' at frames $8,24,40,\ldots$, and
so on.
It is only in order to assure a left--right (and up--down) symmetric 
treatment of the motion that the \e{first} eight or nine
frames seem strange, at
first sight.
Looked at another way, consider \e{successive} display processor updates,
spaced 16 frames apart, 
for a galpixel that is, in fact, moving at the constant velocity
of $1/16$ pixels per frame.
On the first update, the galpixel is at $x=0$ (say); on frame number $8$
it moves one physical pixel to the right, to $x=1$.
For the remaining 7 extrapolated
frames of the first update period, it doesn't
move; at the end of this time, its fractional position is $-1/16$,
with respect to its physical position of $x=1$.
Then the second update comes.
The galpixel is now painted in at $x=1$, with a fractional position of
zero---exactly as it would have been if the previous frame had simply
been propagated.
\e{This} galpixel now waits 8 frames before moving to $x=2$.
Does this mean that it is actually moving at a 
rate of $1/8$ pixels per frame, rather than $1/16$ as desired?
No---one must add the 7 frames of the \e{previous} update period,
plus the update frame, to these 8 frames---making a total of 16 frames
since it had last moved, just as we wanted!
Of course, the display processor in some sense ``destroys'' the
previous galpixel when it draws the new one in the update frame, and so
in that sense it's not truly the ``same'' galpixel that accumulates the
two halves of the 16-frame waiting period.
However, the whole idea of a $\Gal{2}$ display system is precisely to 
make it \e{look} as if this is the same galpixel moving along---which,
if the true motion is reasonably well approximated by uniform
acceleration, and if the inter-update time is not too excessive,
will indeed be the case.
On the other hand, if the motion is \e{not} well approximated by
uniform acceleration, then it must necessarily be ``jerking around''
a bit---and now we rely on the \e{psychological} observation
that, in such circumstances, small details are difficult to discern!
Of course, if the viewer puts herself in a position to ``take
a better look'' at the object represented by the galpixel, then, by
the very \e{definition} of ``taking a better look'' (\viz\ stabilising the
apparent motion of the object in one's field of view), the
gross jerkiness has been transformed away, and the display is again 
accurate!
It should be now apparent why psychological criteria played such
an important \role\ in the decisions made in the previous section---they
effectively ``save our bacon'' when things get technically difficult!

We now turn to the question of determining how many bits of accuracy
are required for the velocity and acceleration components themselves,
for the example of a $N_\txt{prop}=16$ system.
It might be thought that, since the position information is only
accurate to four bits itself, then equation~\eq{PropPos} means that
any more than four bits in $\vv_i$ or $\va_i$ would be wasted.
However, this conclusion would be erroneous: one must take into
account equation~\eq{PropVel} as well.
It is clear that $\vv_i(t)$ or $\va_i$ might well continue to be 
propagated from frame to frame at a \e{higher} accuracy than four bits,
which would then feed through, indirectly, to equation~\eq{PropPos},
through the velocity $\vv_i(t)$.

Let us, therefore, examine this question a little more closely.
Consider, now, not the $i$-th galpixel travelling with constant 
\e{velocity},
but, rather, travelling under the effect of a constant 
\e{acceleration} $\va_i$.
Imagine, for simplicity, that the initial velocity of the galpixel,
$\vv_i(0)$, is zero, as is its initial position $\vx_i(0)$,
and that its acceleration $\va_i$ is purely in the
$x$-direction: $\va_i=(a,0,0)$.
The $x$-motion of this galpixel will clearly then be given by
\beqn{Parabolic}
x(t)=\half at^2.
\eeqn
Now consider how far this galpixel travels from the time $t=0$ to
the time $t=N_\txt{prop}=16$ frame periods: this distance
is clearly $\half\cdot a\cdot(16)^2=128a$ pixels. 
If we want the minimum specifiable distance travelled after
$t=N_\txt{prop}=16$ frame periods to be 1~pixel, we therefore
require a \e{minimum specifiable acceleration} 
of $1/128$ pixels per frame per frame, 
or, in other words, we require \e{seven fractional
bits} for the acceleration,
not four.
In the general case, of arbitrary $N_\txt{prop}$, the number
of bits required for the acceleration is clearly 
$(2\log_2\!N_\txt{prop}-1)$; the factor of 2 arises from the fact that
the time is \e{squared} in equation~\eq{Parabolic} 
(since $\log a^2\id2\log a$),
and the subtraction of 1 from the result arises from the fact that
we multiply the acceleration by $\half$ in equation~\eq{Parabolic}
(since $\log_2(a/2)\id\log_2(a)-1$).
Thus, for example,
for $N_\txt{prop}=8$ we would require 5 fractional
bits for the acceleration;
for $N_\txt{prop}=32$ we would require 9 bits; and so on.
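
The same count can be obtained in one line.
Requiring the smallest specifiable acceleration, which we shall call
$a_\txt{min}$ (a label introduced here), to move the galpixel by
exactly one pixel in $N_\txt{prop}$ frames gives
\[
\half a_\txt{min}N_\txt{prop}^2=1
\quad\Longrightarrow\quad
a_\txt{min}=\f{2}{N_\txt{prop}^2}=2^{-(2\log_2\!N_\txt{prop}-1)},
\]
which is $1/32$, $1/128$ and $1/512$ pixels per frame per frame for
$N_\txt{prop}=8$, $16$ and $32$ respectively.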

What, then, does this requirement of 7 bits for 
the fractional part of each component of $\va_i$ (for our
example of $N_\txt{prop}=16$) mean for the required accuracy of
the \e{velocity} vector, $\vv_i$?
Clearly, \eq{PropPos} cannot carry more than four bits of information,
since the position is only stored from frame
to frame with this accuracy.
The responsibility for 
propagating the information from the full 7 fractional bits of the
acceleration \e{must} therefore be carried by equation
\eq{PropVel}.
Thus, \e{the velocity also needs to have 7 bits of fractional
accuracy}---or, in the general case, $\vv_i$ must have the same
number of bits as $\va_i$, namely, $(2\log_2\!N_\txt{prop}-1)$.

We must now consider the problem 
of how we should add together the various differing-accuracy
numbers in \eq{PropPos}: $\vx_i$ has 4 fractional bits, $\vv_i$
has 7, and $\half\va_i$ has 8 (once we have multiplied it by the half,
\ie\ shifted it right one bit).
The \naive\ thing to do would be to simply compute it at the accuracy
of $\vx_i$---or 4 fractional bits in our example.
However, this would \e{unnecessarily} throw away information that is
already contained in the last three bits of $\vv_i$, and 
the last four bits of $\half\va_i$.
The \e{correct} 
procedure is to add $\vv_i$ and $\half\va_i$ together with
\e{a full 8 bits} of accuracy; then to \e{round} this number off
to the nearest 4-bit-fraction number.
(This \e{rounding} can be effected in the same step by simply adding 
the number $0.00001000_2$ to the sum of $\vv_i$ and $\half\va_i$, and
then \e{truncating} the result to four bits.)
This four-bit number should then be added to the current fractional
position, as in \eq{PropPos}.
In this way we utilise the information stored in the $\Gal{2}$ frame
buffer optimally.
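
A small numerical illustration may help; the particular bit patterns
are made up for the purpose and carry no special significance.
Take one component with $v=0.0010110_2$ (seven fractional bits) and
$\half a=0.00000011_2$ (eight fractional bits).
Then
\[
\begin{array}{rcl}
v+\half a&=&0.00101100_2+0.00000011_2=0.00101111_2,\\
v+\half a+0.00001000_2&=&0.00110111_2,
\end{array}
\]
and truncating to four fractional bits leaves $0.0011_2=3/16$, which is
indeed the nearest sixteenth to the true value $47/256\approx2.94/16$.
Truncating $v$ and $\half a$ to four bits \e{before} adding would
instead have given $0.0010_2=2/16$.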

We now turn to the question of the accuracy required for the $z$-buffer
position, velocity and acceleration information.
Clearly, there is no advantage to thinking in terms of
``fractional'' bits 
in the $z$ direction \e{per se}---because 
the visual information is not 
matricised in that direction anyway.
Rather, one must simply allocate a sufficient number of
bits for the $z$-buffer
to ensure that the finest movement in this direction that the 
application software requires
can be accurately \e{propagated}
over the rated propagation time $N_\txt{prop}$. 
It should be noted that, in general, this will require a 
\e{greater} number of bits for the $z$-buffer than for the same system
without \Galn\ \anti ing, for the same reasons as applied to the
use of fractional positional data in the $x$ and $y$ directions.

An interesting problem arises when an existing hardware $z$-buffer is
in place which, for \Galn\ \anti ing purposes, is not of a sufficiently
high number of bits to meet the design specifications for the
applications intended for use.
In such a case, it \e{is} useful to think of adding ``fractional bits''
to the $z$-buffer; these extra bits are then stored in a new
\e{physical} memory device, but in all respects are \e{logically}
appended to the trailing end of the corresponding $z$-buffer values
already implemented in hardware.
The controlling software may then choose to either compute $z$ values
to the full accuracy of this extended $z$-buffer; or, for backwards
compatibility with older applications, may choose to simply specify
only integral $z$-buffer values, using the fractional bits purely
to ensure rated performance under \Galn\ propagation.

We now turn to the question of how many \e{integral} bits are
required for the velocity $\vv_i$ and acceleration $\va_i$ of the
$i$-th galpixel.
This question is less straightforward than determining the number
of fractional bits, as above, and to some extent depends on the
experience and opinions of the designer of the \VR\ system.
Clearly, the maximum velocity portrayable on the display device
is equal to one display width or height in the period of one frame.
Any higher velocity than this and the galpixel in question either
wasn't visible on the previous frame, or else won't be on the
subsequent frame.
Since practical \VR\ displays are currently limited, in rough 
terms, to a maximum
resolution of around $1000\times1000$ at best, this suggests
that no more than 10 integral bits need be stored for each
Cartesian component of $\vv_i$
(since $2^{10}\approx1000$).
This estimate, however, is too generous, because the human visual
system cannot even \e{see}
an image that appears on a display for only a single
frame, let alone recognise it.
More useful would be to consider as a limiting velocity
a traversal of the entire display
\e{over a period of $N_\txt{prop}$ frames}.
For $N_\txt{prop}=16$, this suggests that about six integral 
bits for each component of velocity
may be sufficient to portray the maximum visualisable
apparent velocity;
in the general case of a display of linear dimension $D$ pixels, this 
estimated number of integral bits is simply given by the
formula $(\log_2\!D-\log_2\!N_\txt{prop})$.

One must, however, also take into account the \e{acceleration}, $\va_i$, 
in these considerations,
as may be seen in the following example:
Imagine that there is a virtual projectile being displayed that
is shot up from the bottom of the display, rises \e{just} to the 
top of the display under the simulated effect of gravitation,
and then falls back to the bottom of the display.
If we assume that the most rapid motion of this sort visualisable by
the viewer should also take place over a time interval of $N_\txt{prop}$
frames, we can compute the corresponding maximum velocity and
acceleration of the galpixel exactly.
Using Galileo's 
equations of motion $y_i(t)=y_i(0)+v_i^y(0)t+\half a_i^yt^2$
and $v_i^y(t)=v_i^y(0)+a_i^yt$,
the above motion can be described by the constraints $y_i(0)=0$ (say),
$y_i(N_\txt{prop}/2)=D$, and $v_i^y(N_\txt{prop}/2)=0$.
The first constraint tells us that $y_i(0)=0$ (obviously); the
second that 
\[
D=v_i^y(0)\f{N_\txt{prop}}{2}
  +\half a_i^y\!\parenfracpower{N_\txt{prop}}{2}{2},
\]
or, on \rea rranging, 
\beqn{ConstrAcc1}
a_i^yN_\txt{prop}^2+4v_i^y(0) N_\txt{prop}-8D=0;
\eeqn
and the third constraint tells us that 
\beqn{ConstrAcc2}
0=v_i^y(0)+a_i^y\f{N_\txt{prop}}{2}.
\eeqn
Solving the pair of linear simultaneous equations \eq{ConstrAcc1}
and \eq{ConstrAcc2} for the two unknowns $v_i^y(0)$ and $a_i^y$,
we find
\[
v_i^y(0)=\f{4D}{N_\txt{prop}}
\]
and
\[
a_i^y=-\f{8D}{N_\txt{prop}^2}.
\]
This analysis suggests that the maximum relevant velocity is in
fact four times that which we computed for uniform motion,
or, in other words, the number of required integral bits for
$\vv_i$ is approximately $(\log_2\!D-\log_2\!N_\txt{prop}+2)$ (since 
$\log_2(4b)\id\log_2(b)+2$).
Similarly, the number of required integral bits for acceleration
will then be $(\log_2\!D-2\log_2\!N_\txt{prop}+3)$ (since
$\log_2\!b^2\id2\log_2\!b$ and $\log_2(8b)\id\log_2(b)+3$).
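
As a numerical check of these expressions, take the illustrative values
$D=1024$ and $N_\txt{prop}=16$:
\[
v_i^y(0)=\f{4\times1024}{16}=256, \qquad
a_i^y=-\f{8\times1024}{16^2}=-32,
\]
in units of pixels per frame and pixels per frame per frame
respectively.
At the half-way time $t=8$ these give
$y_i(8)=256\times8-\half\times32\times8^2=1024=D$ and
$v_i^y(8)=256-32\times8=0$, as required; the magnitudes $2^8$ and $2^5$
correspond to the (approximate) eight and five integral bits that the
formul\ae\ above give for these values.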

We can now start to plug in some typical, real-life figures for $D$ 
and $N_\txt{prop}$ to get a feel for how many bits, \e{in total},
we shall require for each of the $x$ and $y$ components of each
galpixel's motion.
For $D=1024$ and $N_\txt{prop}=16$, we require 4 fractional
position bits, 7 fractional velocity bits, 7 fractional acceleration
bits, 8 integral velocity bits, and 5 integral acceleration bits:
total, 31 bits.
Thus, we can fit each of the $x$ and $y$
components of motional information for a given
galpixel into less than four bytes (\ie\ eight bytes in total,
for $x$ and $y$ combined; 
$z$ motion information, which we have not yet considered in this
respect, must also be added).
If we take $D=512$ to be closer to the mark of the resolution of
our device, we save two bits for each Cartesian
component; or, if $D=256$ is roughly the
resolution, we can save another two bits, bringing the total to 27
bits for each of the $x$ and $y$ components.
On the other hand, \e{increasing} $N_\txt{prop}$ seems, 
according to our formul\ae, to \e{decrease}
the number of integral bits required for the velocity and acceleration.
This does not seem right: after all, to propagate
further forwards in time, do we not require greater accuracy, not less?
The reason we have obtained this misleading result is that
we have assumed $N_\txt{prop}$ to be the \e{minimum recognisability
time} of the human visual system, 
as well as the rated propagation time, without any good reason
for assuming these two quantities to be equal (other than the fact
that, numerically, they seemed to be so for the example we chose).
Clearly, a \VR\ system designer will want to consider these two
parameters independently;
we should replace $N_\txt{prop}$ where used above 
in its recognisability \role\ by a new
(psychological) parameter, the \e{recognisability time} $N_\txt{recog}$, 
being the smallest
time interval for which an object must be shown for it to be registered
by the conscious brain (subliminal virtual advertising being
ignored, for the moment---although this tactic might be worthwhile
when funding agency bureaucrats come around ``for a play'').
The
\e{total} number of bits required for \e{each} of the $x$ and $y$ 
components of motional information is then given by
\[
N_\txt{bits}=2\log_2\!D+5\log_2\!N_\txt{prop}-3\log_2\!N_\txt{recog}+3.
\]
Thus, the rules of thumb are the following: doubling the display
resolution in each direction adds two bits to each component; 
doubling the propagation time adds five bits to each;
\e{halving} the recognition time adds three bits to each.
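
For reference, the total can be assembled term by term.
The sketch below assumes that it is in the two integral-bit estimates
that $N_\txt{prop}$ played its recognisability \role, so that only those
are rewritten in terms of $N_\txt{recog}$:
\[
\begin{array}{lcl}
\mbox{fractional position}&:&\log_2\!N_\txt{prop}\\
\mbox{fractional velocity}&:&2\log_2\!N_\txt{prop}-1\\
\mbox{fractional acceleration}&:&2\log_2\!N_\txt{prop}-1\\
\mbox{integral velocity}&:&\log_2\!D-\log_2\!N_\txt{recog}+2\\
\mbox{integral acceleration}&:&\log_2\!D-2\log_2\!N_\txt{recog}+3
\end{array}
\]
Adding the five rows reproduces the formula for $N_\txt{bits}$ above;
for $D=1024$ and $N_\txt{prop}=N_\txt{recog}=16$ it gives
$20+20-12+3=31$ bits, in agreement with the earlier count.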

We have not, so far, considered the $z$ buffer motional information.
As already noted, the $z$ buffer comes in for different treatment,
because it is not matricised or displayed as the $x$ and $y$ components
are.
It \e{may} prove convenient, from a hardware design point of view,
to simply use the \e{same} motional structure for the $z$ direction
as is used for the $x$ and $y$ directions.
However, in practice, if memory constraints are a concern,
one can allocate fewer bits to the $z$ buffer motional data.
For example, for the 31-bit-per-component example given above (where
$D=1024$ and $N_\txt{prop}=N_\txt{recog}=16$),
quite a workable system can be developed
using only sixteen bits for the $z$ direction motional information and
fractional position, in conjunction with an existing 16-bit
$z$-buffer (thus yielding four bytes for $z$ information in total).
It should be noted, however, that \e{some} sort of $z$-motion
information \e{must} be stored with the galpixmap, as will
become clear shortly.

Thus, we find that, as a very rough estimate, we shall need about
12 bytes to store the motional and $z$-buffer data for each galpixel.
This may seem high; but at $320\times200$ resolution this only
amounts to three-quarters of a megabyte; for $512\times512$ it
amounts to three megabytes.
This is not a lot of memory even by today's standards; it will seem
even less significant as time goes by. 
The important point to note is that memory demands 
are \e{not} a barrier to implementing \Galn\ \anti ing
immediately.

We have, of course, not yet considered the \e{colour} or \e{shading}
information that must be stored with each galpixel.
Clearly, in time, this will be universally stored as 24-bit RGB colour
information, as display devices improve in their capabilities.
This is, of course, another three bytes of data per galpixel
that must be stored.
But the colour--shading question is, in fact, more subtle than this.
Consider what occurs as one walks past a wall in real life---which may,
say, be represented as a Gouraud-shaded polygon in the virtual
world: the light reflected from the 
wall \e{changes in intensity} as we view it from successively
different angles.
Now, in the spirit of \Galn\ \anti ing, we should really provide
a method for the video controller to keep up this ``colour-changing
inertia'' as time progresses, 
until the next update arrives, so that our beautifully
rendered or textured objects do not suddenly change colours, in
a ``flashing'' manner, whenever a new display update arrives.
How, then, should we encode this information?
And how do we compute it in the first place?
The answer to the latter question is relatively simple, 
at least in principle: 
we only
need to compute the \e{time-derivatives} of the primitive-shading
algorithm we are applying, and program this information into the
display processor as well; colour temporal derivatives may then
be generated at scan-conversion time.
The former question, however---encoding the information
efficiently---is a little more subtle.
The \naive\ approach would be to simply store the instantaneous
time derivatives of the red, green and blue components of the colour
data for each pixel.
However, this \naive\ approach ignores the fact that, \e{most} of the
time, the change in colour of the object will be solely a change
in \e{intensity}---hue and saturation will stay relatively constant.
(Most \RR\ violations 
of this statement---such as a red LED changing to 
green---are due to
artificial man-made objects anyway; our visual systems
do not respond well to such shifts, evolving as we have under the light
from a single star.) 
It is therefore prudent to encode our RGB information in such a way
that \e{intensity} derivatives can be given a relatively generous number
of bits (or, indeed, even second-derivative information),
whereas the remaining, hue- and saturation-changing pieces of information
can be allocated a very small number of bits.
On the other hand, this information must be regenerated, at video
scan-out speeds, into RGB information that can be added to the current
galpixel's colour values; we should not, therefore, harbour any
grand plans of implementing this encoding in a terribly clever,
but in practice unimplementable, way.
A good solution, with hardware in mind, would be to define three
new signals, $A$, $B$ and $C$, related to the $r$, $g$ and $b$ signals
via the relations
\beqnarr{ABCFromRGB}
A\tb=\tb\f{1}{3}\paren{r+g+b}, \nline
B\tb=\tb\f{1}{3}\paren{2r-g-b}, \nline
C\tb=\tb\f{1}{3}\paren{r-2g+b}.
\eeqnarr
Clearly, $A$ represents the intensity of the pixel in this coding
scheme: a generous number of bits may be allocated to storing
$dA/dt$---and maybe even a few for $d^2\!A/dt^2$, if deemed worthy.
On the other hand,
$dB/dt$ and $dC/dt$ would be encoded with a very small number of bits,
allowing gross changes of colour to be 
reasonably interpolated, but
otherwise not caring too much about inter-update hue and saturation
shifts.
The set of transformations \eq{ABCFromRGB}, while not 
fundamentally optimal, is especially amenable to hardware decoding:
the reverse transformations may be verified as being
\beqnarr{RGBFromABC}
r\tb=\tb A+B,  \nline
g\tb=\tb A-C,  \nline
b\tb=\tb A-B+C,
\eeqnarr
which, of course, also apply to the time derivatives of these 
quantities. 
In this way, we can efficiently encode colour derivative
information for storage, while at the same time
allowing a hard-wired reconstruction process in
the video controller to be possible.
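As a quick check, substituting the definitions \eq{ABCFromRGB} into the
right-hand sides of \eq{RGBFromABC} gives
\begin{eqnarray*}
A+B &=& \frac{1}{3}(r+g+b)+\frac{1}{3}(2r-g-b) \;=\; r, \\
A-C &=& \frac{1}{3}(r+g+b)-\frac{1}{3}(r-2g+b) \;=\; g, \\
A-B+C &=& \frac{1}{3}(r+g+b)-\frac{1}{3}(2r-g-b)+\frac{1}{3}(r-2g+b) \;=\; b,
\end{eqnarray*}
so that decoding needs only additions and subtractions.
As an illustration only (the symbols $t_0$ and $r_0$, denoting the time
of the last update and the red value stored at that update, are
introduced here and are not part of the scheme above), a first-order
reconstruction of the red signal between updates might then read
\[
r(t) \approx r_0 + \left(\frac{dA}{dt}+\frac{dB}{dt}\right)(t-t_0),
\]
with analogous expressions for $g$ and $b$.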
Further technical specifications for the optimal number of bits of
storage to be allocated to each of the quantities $A$, $B$ and $C$
will be left for subsequent researchers to determine; 
a full analysis of the situation requires a careful consideration
of the algorithmics employed in the shading process, the capabilities
of the particular display device employed, and, most importantly,
the nature of the human colour visual system.
(The author, being short of 33\% of 
experiential information in the last regard,
disqualifies himself from research on this topic, lest he be tempted
to encode all hues most efficiently to his own isochromatic line in order
to save on RAM requirements.)

We have now outlined, in broad terms, most of the hardware additions and
modifications necessary to retrofit an existing \VR\ system with
a \Galn\ \anti ing display system.
There are, however, two final, important questions that must be addressed,
that are slightly more algorithmic in nature: What happens when two
galpixels are propagated to the \e{same} pixel in the new frame?
And what happens if a pixel in the new frame is not occupied by \e{any}
galpixel propagated from the previous frame?
There must, of course, be answers supplied
to these questions for any
\Galn\ \anti ing system to work at all; however, the \e{best}
answers depend both on how much ``intelligence'' can be 
reliably crammed
into a high-speed video
controller, and on the nature of the images being displayed.
In section~\ssect{LocalUpdate}, suggested changes
to image generation philosophy will subtly 
shift our viewpoint on this question yet again; however,
general considerations,
at least as far as \VR\ applications are concerned, 
will remain roughly the same.

We first consider the question of \e{galpixel clash}: when two
galpixels are propagated forward to the same physical display
pixel.
Clearly, the video controller must know which galpixel should
``win'' in such a situation.
This is the reason that we earlier insisted that hardware
$z$-buffering \e{must} be employed: without this information,
the video controller would be left with a dilemma.
\e{With} $z$-buffer information, on the other hand, the video
controller's task is simple: just employ the standard $z$-buffering
algorithm.

Having dealt so effortlessly with our first question, let us now
turn to the second: what happens when there is an unoccupied galpixel
in the new frame buffer?
It might appear that the video controller cannot do \e{anything}
in such a situation: there simply is not enough information in
the galpixmap; whatever \e{should} now be in view must have been
obscured at the time of the last update.
While this is indeed true when the unoccupied pixel \e{does},
in fact, correspond to an ``unobscured'' object---and will, 
of course, require
some sort of acceptable treatment---it is \e{not} the only
situation in which empty pixels can arise.
Firstly, consider the finite nature of our arithmetic: it may well be
that a particular galpixel just happens to ``slip'' onto its
neighbour; where there were before two galpixels, there is now
only one.
This ``galpixel fighting'' leads to a mild attrition in the number
of galpixels as the video controller propagates from frame to frame;
this is not a serious problem, but it must 
nevertheless be kept in mind.
Secondly, and more importantly, it must be remembered that we
are here rendering true, three-dimensional perspective images---\e{not}
simply $2\half$-dimensional computer graphics---and must consider
what this means for the apparent motion of the objects in the scene.
Obviously, motion \e{towards} or \e{away from} the observer will lead
to a \e{change in apparent size} of the object in question.
If the object is moving \e{away} from the observer, the object's apparent
image gets smaller; in this situation, some galpixels will ``fight it off''
in a $z$-buffer duel; visible detail will, of course,
be reduced; but all of the pixels within the object's boundaries will
\e{still} remain filled in the new frame.
However, if the object is moving \e{towards} the observer,
then it will ``expand'' in apparent size: ``holes'' will appear
in the new frame, since we only have a constant number of galpixels
to fill the ever-increasing number of pixels contained within the
boundaries of the object.
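
To get a rough feel for the size of this effect, consider a sketch
under simplifying assumptions made here purely for illustration:
a flat surface facing the observer, moving directly along the line
of sight under simple perspective projection, with no galpixels lost
to clashes.
Let $z_0$ and $z_1<z_0$ be its distances from the observer at the last
update and at the current frame, and let $N$ be the number of galpixels
it occupied at the update (all three symbols are introduced here
and appear nowhere else).
Apparent linear size scales as $1/z$, so the number of pixels
inside the object's boundaries grows to $N(z_0/z_1)^2$, while the
number of galpixels available remains $N$.
The fraction of pixels left as holes is then
\[
1-\frac{N}{N(z_0/z_1)^2} \;=\; 1-\left(\frac{z_1}{z_0}\right)^{2}.
\]
For example, an approach of ten per cent, $z_1=0.9\,z_0$, leaves
$1-0.81=0.19$, or roughly one pixel in five, unfilled.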

How, then, are we to treat this latter case?
Clearly, the best idea would be to simply ``fill in'' the holes,
with some smooth interpolation of colour, so that we get some
sort of consistent ``magnification'' of the (admittedly still
low-resolution) object that is approaching the observer.
But before we implement such a strategy, it is necessary to note that
\e{we must have some way of distinguishing unobscuration and expansion}.
Why?
Because, in general, we would like to apply a 
different solution to each of these
two cases.
Is it not possible to apply one general fix-all solution?
After all, we have a fundamental deficit of information in the
unobscuration case anyway---why not just use the expansion
algorithm there too so that at least we have \e{something} to
show?
This sounds reasonable, and, indeed, might be the wisest course
of attack in highly constrained retrofit situations.
However, in general, we can do much better than this, for essentially
no extra effort; we shall now outline these procedures.

We shall first take a nod at history, and consider the case of
a \e{wire-frame} \VR\ display system.
While destined to slip nostalgically into the long-term memories
of workers in the field of computer graphics, and likely to be
seen at all by the
year 2001 only by watching Stanley Kubrick's masterpiece of the same name,
wire-frame graphics nevertheless provides a simple proof-of-concept
test-bed for
prototyping \Galn\ \anti ed displays, and is so simple to
implement that it is worth including here.
In a wire-frame display system, the vast majority of the display
is covered by a suitable background colour (black, or dark blue);
on top of this background, lines are drawn to represent the edges
of objects.
That such graphics can look realistic at all is a testament
to our edge-detection and pattern-recognition visual senses:
quite literally, almost all of the visual information in the scene has
been removed.
This removal of information, however, is what makes unoccupied-pixel
regeneration so simple in wire-frame graphics: the best solution is
to simply \e{display the background colour} for such pixels.
For sure, a part of a line might disappear if it happens to 
\coin cide with one that is closer to the observer in any given frame
(although this problem can in fact be removed by using the enhancements
of section~\ssect{LocalUpdate}); however, we assume that the rendering
system is not too sluggish about providing updates anyway: 
the piece of line will
only be AWOL for a short time.
Overall, the accurate portrayal of \e{motion} far outweighs, in 
psychological terms, any problems to do with disappearing bits of lines.

Let us now turn to the case of \e{shaded}-image systems.
Will the approach taken for the wire-frame display system work 
now?
It is clear that it will not: in general, there is no such thing as
a ``background'' colour: each primitive must be ``coloured in''
appropriately.
Thus, \e{even for the unobscuration case}, we must come up with
some reasonable image to put in the empty space: painting it
with black or dark blue would, in most situations, look simply shocking.
The solution that we propose is the following: \e{if a portion
of the display is suddenly unobscured, just leave 
whatever image was there last frame}.
Why should we do this?
Well, for starters, we have nothing better to paint there.
Secondly, if the display update is not too long in coming, this
trick will tend to mimic CRT phosphor persistence: it looks as if
that part of the image just took a little bit longer to decay away.
This ``no-change'' approach thus fools the viewer into thinking
nothing too bad has happened, since the fact that that part of the
display ``does not look up-to-date'' will not truly register,
consciously, for (say) five or ten frames anyway.
Thirdly, and most revealingly, \e{this approach yields no worse
a result than is already true for conventional
sample-and-hold $\Gal{0}\!$ display devices!}
If a participant can be convinced of the reality of the virtual
world with a lousy $\Gal{0}$ display, she can \e{surely} be convinced
of it when only an occasional, small piece of world defaults
back to $\Gal{0}$ technology, with the overwhelming majority of the 
virtual world charging ahead with full $\Gal{2}$ 
persuasion!

Of course, this explanation is a tad more mischievous than
it appears: mixing $\Gal{0}$ and $\Gal{2}$ algorithms together
does \e{not quite} have the same effect as either algorithm on its own.
There is a slight ``edge-matching'' inconsistency when they 
are used conjointly: an object propagates away, under the 
$\Gal{2}$ algorithm---yet a part of it may simultaneously
be ``left behind'' on
the display, under the $\Gal{0}$ ``no-pixel'' fallback provisions!
Fortunately, however, the nature of the human visual system
makes this seeming inconsistency quite mild: the object simply
appears to be moving, even though bits of it keep getting left
behind.
That this, in fact, works quite well in practice can be verified
by observing the Mouse Trails that Microsoft Windows 3.1 can
provide for laptop displays: they look a little funky, but nevertheless
one does \e{not} become psychologically confused when seeing dozens of
little mouse pointers left behind.
Over longer time scales, but with lower pixel-per-frame apparent
velocities, the effect
resembles to some extent that of the 
NCC--1701 upon its violation of Einsteinian relativity.
At these time scales, the effect is becoming noticeable, but, again,
is not overwhelmingly disturbing.
In any case, the author has not been able to think of any better
way (apart from the wholesale changes outlined in 
section~\ssect{LocalUpdate}) to deal with the unobscuration problem.

We now consider in more detail how this mixing of $\Gal{0}$ 
and $\Gal{2}$ methodologies can be carried out in practice.
The general idea is as follows: Consider the video controller's
task as it updates a ``new'' frame buffer from an ``old'' frame
buffer.
The first task it does is to copy across the \e{static colour} 
information of each galpixel to the
\e{same} position in the new frame buffer as it occupied in the
old. 
This is the $\Gal{0}$ methodology at work. 
The velocities, accelerations, fractional positions, \etc, of these
copied-across pixels are all zeroed, just as if they were inhabitants
of a standard $\Gal{0}$ display.
These copied-across pixels will be termed \e{debris}.
Each pixel of debris must be further subjected to an additional operation:
it must be stamped with a \e{debris indicator} to denote its
status.
The debris indicator may be a special $z$-buffer value reserved 
specifically for this purpose, say (although this will cause
problems with the methods of section~\ssect{LocalUpdate}, and should
probably be avoided), or it may instead
be a \e{special value} for one of the lesser-used colour derivatives
(such as $dC/dt$)
that is reserved specifically for this purpose, and not considered
to be
a valid physical value for that derivative by the operating software.
If, on the other hand, 
the \VR\ system designer is prepared to dedicate a \e{whole} bit of
memory to this
purpose, for each galpixel of each frame buffer, 
the debris indicator will then have its own exclusive \e{debris flag};
such an approach uses more memory, but it may be more efficient,
in terms of scanning speed, if a simple flag must be tested, rather
than an equality test being performed between a multi-bit number
and an (admittedly hard-wired) debris code.

The next task for the video controller is to simply perform the
$\Gal{2}$ propagation procedure from the old frame buffer to the
new.
Fully-fledged card-carrying galpixels \e{always} replace any
debris that they encounter ``sleeping in \e{my} bed''; if they
clash with another galpixel, standard $z$-buffering takes place.

At the end of this two-pass process, the new frame buffer will
contain as much true \Galn\ \anti ed information as possible; the
remaining unoccupied galpixels simply let the debris ``show through''.
(Of course, we have not yet considered the surfaces that are 
``expanding''; this will come shortly.)

The prospective \Galn-\anti ed display designer might, by this point,
be worrying about the fact that we have introduced a \e{two-pass}
process into the video controller's propagation procedure:
nanoseconds are going to be tight anyway; 
do we really need to double the procedure just for the sake
of copying across debris?
Fortunately, the practical procedure need only be a \e{one}-pass
one, when done carefully, as follows:
Firstly, the new frame buffer must have all of its debris
indicators set to true; nothing else need be cleared.
This will be especially easy when the debris indicator is a
dedicated set of debris flags that are all located on the same
memory chip: flags can be hardware-set \e{en masse}.
Next, the video controller scans through the old frame buffer, galpixel
by galpixel, retrieving information from each in turn.
Two parallel arms of the video controller's circuitry now 
come into play.
The first proceeds to check
to see whether the \e{debris} from that particular
galpixel may be copied across:
it checks the corresponding galpixel in the new frame buffer; if it
is debris, it overwrites it (because debris is copied across one-to-one
anyway, so that the debris that is there
must be simply the cleared-buffer message);
if it is a non-debris galpixel, it leaves it alone.
Simultaneously, the second parallel arm of the video controller circuitry
propagates the galpixel according to $\Gal{2}$ principles:
it computes (among other things) the new position of the galpixel
in the new frame buffer, and checks the corresponding galpixel
that is currently there (after checking for out-of-view conditions,
of course).
If that galpixel is debris, it overwrites it; if it is a non-debris
galpixel, it performs its usual $z$-buffer test to determine if it
should be overwritten.
Furthermore, there must be circuitry to \coord\ between 
these two parallel processes in the case that they are trying to
look at the \e{same} galpixel in the new frame: clearly, in this
case, the old galpixel takes precedence over its debris image; the
latter need not proceed further.
Finally, in the case that the \e{original} pixel is itself
debris, the galpixel-arm of the parallel circuitry should be
disabled: debris may be copied across from frame to frame 
indefinitely (if so required), but it has lost all of its
inertial properties.
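
The one-pass procedure just described can be sketched in software;
the following Python fragment is a minimal illustration only, not
a specification of the hardware.
It makes several assumptions that are not part of the description above:
the names {\tt Galpixel} and {\tt propagate} are hypothetical;
a dedicated debris flag is used as the debris indicator;
smaller $z$ is taken to mean nearer the observer;
the full $\Gal{2}$ propagation is replaced by a constant velocity in
whole pixels per frame (no acceleration or fractional position);
and the two parallel arms are emulated sequentially, the debris arm
first, so that a galpixel landing on its own debris image
automatically takes precedence.
\begin{verbatim}
from dataclasses import dataclass

@dataclass
class Galpixel:
    colour: object        # static colour, e.g. an (r, g, b) tuple
    z: float              # depth; smaller z is nearer (assumed convention)
    vx: int = 0           # velocity in whole pixels per frame (simplification)
    vy: int = 0
    debris: bool = False  # dedicated debris flag

def propagate(old, width, height):
    """One sequential pass from the old frame buffer to a new one."""
    # The new buffer starts with every debris flag set; nothing else matters.
    cleared = Galpixel(colour=None, z=float("inf"), debris=True)
    new = [[cleared] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src = old[y][x]
            # Debris arm: leave the static colour behind at the same
            # position, unless a true galpixel has already landed there.
            if new[y][x].debris:
                new[y][x] = Galpixel(src.colour, src.z, debris=True)
            # Galpixel arm: disabled when the source is itself debris.
            if src.debris:
                continue
            nx, ny = x + src.vx, y + src.vy
            if not (0 <= nx < width and 0 <= ny < height):
                continue  # propagated out of view
            dst = new[ny][nx]
            # Galpixels always replace debris; otherwise z-buffer test.
            if dst.debris or src.z < dst.z:
                new[ny][nx] = Galpixel(src.colour, src.z, src.vx, src.vy)
    return new
\end{verbatim}
For instance, in a one-row buffer, a galpixel that moves one pixel to
the right and loses the $z$-buffer test to a nearer, stationary
neighbour leaves only its debris behind at its old position;
on the following pass that debris is copied across unchanged, and
is no longer propagated.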

%
%  ...File 3 of 4 should be concatenated here ...
