SPECTROGRAM (2.2)

1.  PRINCIPLES OF OPERATION

	Most ordinary sounds are complex combinations of individual
frequency components or harmonics which cover a wide frequency range and
vary in intensity over time.  A spectrogram is simply a plot of the
frequency content of such an audio signal as a function of time.  In
this program, digital audio recordings (PCM format) are analyzed to
produce a plot of frequency versus time, with harmonic intensity
represented by a variable color scale.  These spectrograms reveal the
fascinating hidden frequency structure of audio signals and can be used
for identifying or classifying particular sounds.  When used to analyze
recorded voice, spectrograms have also been known as 'voice prints'.

	Spectrogram uses a Fast Fourier Transform (FFT) to perform the
frequency analysis.  FFTs are usually specified by the number of input
data points used in each calculation.  For a sampling rate of F (Hz),
an N point FFT analyzes the frequency range from 0 Hz to F/2 and
calculates signal amplitude at N/2 evenly spaced frequencies in that
range, F/N apart.  For example, for a digital signal sampled at
10000 Hz, a 512 point FFT calculates signal amplitude at 256
frequencies between 0 Hz and 5000 Hz, about 19.5 Hz apart.  This will
become clear as you calculate and observe different spectrograms.
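
	The arithmetic above can be sketched in a few lines of Python.
This is only an illustration of the F and N relationship, not code from
the program; the function name fft_bin_frequencies is made up for this
example.

```python
def fft_bin_frequencies(sample_rate_hz, fft_points):
    """Return the frequency (Hz) of each of the N/2 analysis points."""
    spacing = sample_rate_hz / fft_points  # F/N Hz between adjacent points
    return [k * spacing for k in range(fft_points // 2)]

bins = fft_bin_frequencies(10000, 512)
print(len(bins))   # 256
print(bins[1])     # 19.53125
print(bins[-1])    # 4980.46875, the last point below F/2 = 5000 Hz
```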

	Contrary to popular opinion, higher sampling rates are not
always necessary for high fidelity recording.  The choice of sampling
rate depends entirely on the highest frequencies in the audio signal.
The rule of thumb is to use a sampling rate that is at least twice the
highest frequency in the audio signal.  That is, if you expect to have
no frequency components above 11 kHz, then a sampling rate of 22 kHz is
adequate.  If you examine a spectrogram and see that all of the signal
is concentrated in lower frequency components at the bottom of the
display, then it is a good bet that the recording was sampled at too
high a rate, wasting a significant amount of memory.  This program
produces the highest quality spectrograms of digital recordings which
have been sampled at the appropriate rate.
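
	As a rough illustration of this rule of thumb, the sketch below
computes the lowest adequate sampling rate and estimates how much of
the displayed 0 to F/2 range carries no signal.  Both function names
are hypothetical and exist only for this example.

```python
def minimum_sample_rate(highest_frequency_hz):
    """Rule of thumb: sample at twice the highest frequency present."""
    return 2 * highest_frequency_hz

def unused_fraction(sample_rate_hz, highest_frequency_hz):
    """Fraction of the 0..F/2 display range lying above the signal."""
    top_of_display = sample_rate_hz / 2
    return max(0.0, 1 - highest_frequency_hz / top_of_display)

print(minimum_sample_rate(11000))                  # 22000
print(round(unused_fraction(44100, 5000), 2))      # 0.77
```

	In the second case, a signal with nothing above 5000 Hz that was
recorded at 44100 Hz fills only the bottom quarter or so of the display.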

2.  SYSTEM REQUIREMENTS

	Spectrogram will run on any Windows 3.1 equipped machine.
However, the intensive calculations required to develop the frequency
spectrum demand the fastest processor available.  In addition, large
sound files will require much memory for analysis and display, so the
more memory the better.  Spectrogram will process any 8 or 16 bit audio
data in PCM format including ".wav" files or raw data files.
Spectrogram cannot process the compressed audio data occasionally found
in large .wav files.
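
	If you are unsure whether a .wav file holds PCM data, the format
tag stored in its header can be inspected.  The sketch below is one way
to do that outside the program, assuming a standard RIFF/WAVE layout in
which a format tag of 1 means uncompressed PCM.  The function name
wav_format_tag is made up for this example.

```python
import io
import struct
import wave

WAVE_FORMAT_PCM = 1  # format tag used for uncompressed PCM data

def wav_format_tag(data):
    """Return the format tag from the 'fmt ' chunk of RIFF/WAVE bytes."""
    if data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            return struct.unpack_from("<H", data, pos + 8)[0]
        pos += 8 + size + (size & 1)  # chunks are padded to even length
    raise ValueError("no 'fmt ' chunk found")

# Build a tiny PCM file in memory to exercise the check.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(1)
    w.setframerate(11025)
    w.writeframes(bytes([128] * 16))

print(wav_format_tag(buf.getvalue()) == WAVE_FORMAT_PCM)  # True
```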

3.  COMPUTING AND DISPLAYING A SPECTROGRAM

	Choose "Open" from the "File" menu to load a digital sound
sample file.  Once a file has been selected, Spectrogram will present
the "Analysis Options" dialog box where you will specify the parameters
of the frequency analysis.  To select the default values, just press the
space bar.  To tailor the calculations to your own preferences, see
below.

    a.  Sample Characteristics

	You may enter any sample rate from 8000 Hz to 44100 Hz.
If you have selected a .wav file, the sample rate displayed will be the
rate used in the original recording.  If you have selected a raw data
file, a sample rate of 11025 Hz will be initially assumed, and you
should enter the correct value if necessary.

	You may also select the beginning and ending location in the
selected file (in bytes) to be analyzed.  Initially, the starting and
ending location of the entire file will be displayed.  If you make no
change here, the entire file will be analyzed.

	You also have a choice of 8 bit or 16 bit data resolution.  Pick
the value which you know corresponds to the data file you are analyzing.
If you are loading a .wav file, the correct value will already be shown.
If this is a raw data file, 16 bit data will be assumed, but it is up to
you to specify the correct value.
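
	To see why the resolution setting matters, the sketch below
decodes a byte range of PCM data into sample values.  It assumes the
usual .wav conventions (8 bit data is unsigned and centered on 128,
16 bit data is signed and little-endian); a raw data file may use a
different layout.  The function decode_pcm is hypothetical and is not
part of the program.

```python
import struct

def decode_pcm(raw, bits, start=0, end=None):
    """Convert bytes start..end of PCM data to signed integer samples."""
    chunk = raw[start:end]
    if bits == 8:
        return [b - 128 for b in chunk]       # unsigned, centered on 128
    if bits == 16:
        count = len(chunk) // 2               # ignore a trailing odd byte
        return list(struct.unpack("<%dh" % count, chunk[:count * 2]))
    raise ValueError("bits must be 8 or 16")

print(decode_pcm(bytes([128, 255, 0]), 8))           # [0, 127, -128]
print(decode_pcm(b"\x00\x00\xff\x7f\x00\x80", 16))   # [0, 32767, -32768]
```

	Decoding 16 bit data as 8 bit (or the reverse) yields values that
look like noise, which is why the correct value must be chosen.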

    b.  FFT Selection

	You have a choice of 512, 1024, or 2048 point FFTs for the
frequency analysis.  Use 512 points routinely.  Use 1024 and 2048 point
FFTs only for high resolution analysis.  The higher resolution FFTs
require more time to compute the spectrogram.  For this reason, if
increased frequency resolution is needed, it is sometimes preferable to
decrease the sampling rate rather than to use a higher resolution FFT.
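
	The trade-off follows from the spacing between analysis
frequencies, which is the sampling rate divided by the FFT size.  The
short sketch below (with a made-up function name) shows that halving
the sampling rate gives the same resolution as doubling the FFT size.

```python
def frequency_resolution(sample_rate_hz, fft_points):
    """Spacing in Hz between adjacent analysis frequencies (F/N)."""
    return sample_rate_hz / fft_points

for n in (512, 1024, 2048):
    print(n, frequency_resolution(22050, n))
# 512 43.06640625
# 1024 21.533203125
# 2048 10.7666015625

# Halving the sampling rate matches doubling the FFT size.
print(frequency_resolution(11025, 512) == frequency_resolution(22050, 1024))  # True
```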

    c.  Horizontal Scale Selection

	You may select a horizontal scale of 2, 4, 8, or 16 ms per line.
Each vertical line in the spectrogram display represents the output of
one FFT calculation.  The FFT data input window is stepped sequentially
through the data, performing an FFT calculation at each step.  The
horizontal scale selected determines the length of the step between each
FFT and thus the total number of FFTs required. Experiment with these
values to pick the horizontal scale you prefer.
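
	The relationship between horizontal scale, step length, and the
number of FFTs can be sketched as follows.  How the program rounds the
step and treats the end of the file is not documented here, so the
rounding and the handling of the final partial window are assumptions,
and fft_count is a made-up name.

```python
def fft_count(total_samples, sample_rate_hz, fft_points, ms_per_line):
    """Return (step in samples, number of FFTs) for a given scale."""
    step = max(1, round(sample_rate_hz * ms_per_line / 1000))
    if total_samples < fft_points:
        return step, 0
    # Count only windows that fit entirely inside the data (assumption).
    return step, (total_samples - fft_points) // step + 1

# One second of audio at 11025 Hz, analyzed with a 512 point FFT.
for ms in (2, 4, 8, 16):
    print(ms, fft_count(11025, 11025, 512, ms))
# 2 (22, 478)
# 4 (44, 239)
# 8 (88, 120)
# 16 (176, 60)
```

	Note that at these settings the step is shorter than the FFT
length, so successive FFT input windows overlap.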

    d.  Display Threshold Selection

	You are also given a choice of display threshold in order to
reduce clutter in noisy digital recordings.  A threshold of -3 dB or
-6 dB reduces the input signal level to eliminate background clutter.
Use a threshold of 0 dB regularly, and select signal reduction only if
necessary to reduce clutter.
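
	For reference, a decibel figure converts to a scale factor as
sketched below.  This assumes the threshold is applied as an amplitude
ratio to the input samples; the program's exact method is not described
here, and apply_threshold is a hypothetical name.

```python
def apply_threshold(samples, threshold_db):
    """Scale samples by the amplitude ratio for threshold_db."""
    gain = 10 ** (threshold_db / 20)  # dB to amplitude ratio
    return [s * gain for s in samples]

print(round(10 ** (-3 / 20), 3))   # 0.708
print(round(10 ** (-6 / 20), 3))   # 0.501
print(apply_threshold([100, -50], 0))   # [100.0, -50.0]
```

	Under this assumption, -6 dB roughly halves the signal level, so
weak background components fall below the lowest displayed intensity.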

    e.  Color Palette Selection

	Finally, you have a choice of color or grayscale display.
For a color display, red represents the highest signals and dark blue
the lowest.  For a grayscale display, the darker the display, the higher
the signal level.
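
	One simple way to picture the two palettes is a mapping from a
signal level between 0 and 1 to a pixel value.  The sketch below is
purely illustrative: the endpoints match the description above, but the
intermediate colors, the linear blend, and both function names are
assumptions, not the program's actual palette.

```python
def gray_pixel(level):
    """Darker pixels for higher levels; level is clamped to 0..1."""
    level = min(1.0, max(0.0, level))
    v = round(255 * (1 - level))
    return (v, v, v)

def color_pixel(level):
    """Blend from dark blue (lowest) to red (highest); RGB tuple."""
    level = min(1.0, max(0.0, level))
    return (round(255 * level), 0, round(128 * (1 - level)))

print(gray_pixel(0.0), gray_pixel(1.0))    # (255, 255, 255) (0, 0, 0)
print(color_pixel(0.0), color_pixel(1.0))  # (0, 0, 128) (255, 0, 0)
```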
	
Once you are satisfied with the Analysis Options, click "OK" to begin
processing and display of a spectrogram of the audio data file.  The
program will step sequentially through the audio file, calculate an FFT 
at each step, and display the results in the Spectrogram window.  You
can stop the process at any time by clicking the "Stop" button.
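
	The stepping process described above can be sketched in Python.
This is a plain floating point FFT written for clarity, with no data
windowing, and is not the fast integer FFT the program actually uses.
The names fft and spectrogram are chosen for this example only.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def spectrogram(samples, fft_points, step):
    """Return one list of N/2 magnitudes per step through the samples."""
    columns = []
    for start in range(0, len(samples) - fft_points + 1, step):
        spectrum = fft(samples[start:start + fft_points])
        columns.append([abs(c) for c in spectrum[:fft_points // 2]])
    return columns

# A 1000 Hz test tone sampled at 8000 Hz.
rate, n = 8000, 512
tone = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(2048)]
cols = spectrogram(tone, n, step=32)   # 32 samples = 4 ms at 8000 Hz
peak = max(range(n // 2), key=lambda k: cols[0][k])
print(len(cols), peak * rate / n)      # 49 1000.0
```

	Each entry of cols corresponds to one vertical line of the
display, and the strongest value in every line falls at 1000 Hz.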

4.  The Spectrogram Display

	The spectrogram display reveals the digital signal as a
frequency versus time plot with signal amplitude at each frequency
represented by intensity (or color).  A continuous readout of cursor
position in frequency (Hz) and time (milliseconds) is displayed at the
bottom left of the window.  A coordinate grid can also be added by 
clicking the "Toggle Grid" button.

	The width of the spectrogram display is limited only by the
display screen.  Maximizing the spectrogram window will expand the
display horizontally to fill the screen.  If the spectrogram width is
greater than screen width, you can use the horizontal scroll bar at the
bottom of the display to position the spectrogram side-to-side.

	The height of the spectrogram display is limited by the size of
the FFT chosen for analysis.  Only 256 vertical display points are
needed for a 512 point FFT.  The 1024 and 2048 point FFTs require 512
and 1024 points respectively.  Maximizing the spectrogram window will
expand the display vertically to the size required by the FFT if not
limited by the screen height.  If the spectrogram height is greater than
the screen height, use the vertical scroll bar at the right of the
window to position the spectrogram top-to-bottom.
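
	The display geometry can be summarized with a little arithmetic.
The sketch below assumes one display row per analysis frequency, with
the lowest frequency at the bottom, and one display column per FFT.
The function names are hypothetical.

```python
def display_height(fft_points):
    """Vertical display points needed for an N point FFT (N/2)."""
    return fft_points // 2

def cursor_readout(column, row_from_bottom, sample_rate_hz,
                   fft_points, ms_per_line):
    """Return (frequency in Hz, time in ms) for a display position."""
    frequency = row_from_bottom * sample_rate_hz / fft_points
    time_ms = column * ms_per_line
    return frequency, time_ms

print([display_height(n) for n in (512, 1024, 2048)])  # [256, 512, 1024]
print(cursor_readout(100, 64, 8000, 512, 4))           # (1000.0, 400)
```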
 
5.  Modifying Spectrograms

	Once you have computed a spectrogram, you may want to make
changes to its length, vertical or horizontal scale, threshold or color
to improve the frequency analysis.  The menu bar across the top of the
display gives options for FFT size, horizontal scale, display threshold,
and color palette.  Choosing any of these options will cause the
spectrogram to be recomputed with the new value you have chosen.
If you want to change more than one parameter before recomputing the
spectrogram, choose "Modify" from the File Menu to bring up another
Analysis Options dialog box to make your selections.

	Frequently you will want to select a portion of the entire 
spectrogram for recomputation rather than recompute the entire length.
You can drag-select this section from the spectrogram display.  Position
the mouse pointer at the desired starting point, press the left mouse
button, drag the mouse to the desired ending point, and then release
the mouse button.  The Analysis Options dialog box will then appear with
the starting and ending locations filled according to your selection.
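
	The conversion from a selected span of the display to byte
locations might look like the sketch below.  It assumes single channel
data, one display column per FFT step, and byte locations counted from
the start of the audio data; the program's own bookkeeping may differ,
and selection_to_bytes is a made-up name.

```python
def selection_to_bytes(start_col, end_col, sample_rate_hz,
                       ms_per_line, bits):
    """Map a dragged column range to (start, end) byte locations."""
    bytes_per_sample = bits // 8
    samples_per_line = round(sample_rate_hz * ms_per_line / 1000)
    first, last = sorted((start_col, end_col))  # allow right-to-left drags
    return (first * samples_per_line * bytes_per_sample,
            last * samples_per_line * bytes_per_sample)

# Columns 50 to 200 of a 16 bit, 11025 Hz recording shown at 4 ms per line.
print(selection_to_bytes(50, 200, 11025, 4, 16))   # (4400, 17600)
```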


6.  Direct Recording and Analysis

	If you have a Windows compatible sound card installed, you will
be able to directly record and analyze an audio sample through a
microphone attached to your sound card.  Choose "Record New" from the
File menu to initiate recording.  You will again be presented with
the Analysis Options dialog box to select the parameters of the
frequency analysis.  When recording is complete, computation of the
spectrogram will begin.    

7.  Spectrogram Playback

	If you have a Windows compatible sound card installed, you will
also be able to play back the spectrogram by clicking the 'Play' or
'Play Wdw' buttons.  The Play button plays back the entire length of 
the .wav file, while the Play Wdw button plays back only that portion of 
the spectrogram which is visible in the Spectrogram Window.

8.  Saving Audio and Bitmap Files

	You can save a .wav file of the digital audio of your
spectrogram by choosing "Save Wave" from the File Menu.  You can also
save a bitmap of the visible portion of the Spectrogram Window by
choosing "Save Bitmap" from the File Menu.
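
	For readers who want to produce a compatible .wav file by other
means, the sketch below writes 8 bit single channel PCM data using
Python's standard wave module.  It is independent of the program's own
"Save Wave" function, and save_wave is a name chosen for this example.

```python
import io
import wave

def save_wave(target, samples, sample_rate_hz):
    """Write 8 bit mono PCM; samples are signed values in -128..127."""
    with wave.open(target, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(1)               # 1 byte per sample = 8 bit
        w.setframerate(sample_rate_hz)
        w.writeframes(bytes(s + 128 for s in samples))  # store unsigned

buf = io.BytesIO()                      # a file name also works here
save_wave(buf, [0, 127, -128, 0], 11025)
print(buf.getvalue()[:4], len(buf.getvalue()))   # b'RIFF' 48
```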

9.  Problem Reporting

	Programs can only be improved if users provide feedback to the
author.  To report bugs or to provide comments or feedback, and for any
questions, I can be reached at:

			DELPHI  -  RSHORNE

			INTERNET - RSHORNE@DELPHI.COM

10.  DISTRIBUTION

	Spectrogram is Copyright 1994 by R.S. Horne and may be
distributed as freeware.

11.  CREDITS

	So many interested Internet users have given good comments and
suggestions that I can't list everyone.  However, the contribution of
Philip VanBaren, who provided the fast integer FFT code, has been
vital to the improved performance of this update.  Greg Walker and 
Henrik Clausen provided invaluable suggestions and debugging help.