Document Number 2250
Recognition Hints and Tips for Perceive Personal
7/13/92

1. Always scan in line-art or black and white mode.

2. Selecting The Right Resolution

Most magazines and textbooks should be scanned at 300 dpi. Use 400 dpi if 
the text you are scanning is smaller than standard magazine size.  For 
normal-sized text (9 to 15 points), 300 dpi is the best resolution. Use 200
dpi if you are scanning large blocks of headline-style type.

      Text Size          Scanner Resolution (DPI)
    ----------------------------------------------
    6  to 8  Points     400 DPI gives best results
    9  to 15 Points     300 DPI gives best results
    16 to 20 Points     200 DPI gives best results
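
The table above can be written as a small lookup.  This is only a sketch in
Python, not part of Perceive Personal: the function name recommended_dpi is
hypothetical, and treating fractional sizes (for example 8.5 points) as
belonging to the smaller band is an assumption, since the table lists whole
point sizes only.

```python
def recommended_dpi(point_size):
    """Return the resolution the table suggests for a text size in points.

    Returns None for sizes outside the 6 to 20 point range the table covers.
    Assumption: fractional sizes fall into the smaller-text band.
    """
    if point_size < 6 or point_size > 20:
        return None   # the table gives no recommendation here
    if point_size < 9:
        return 400    # 6 to 8 points
    if point_size < 16:
        return 300    # 9 to 15 points
    return 200        # 16 to 20 points


for size in (7, 12, 18):
    print(size, "points:", recommended_dpi(size), "dpi")
```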

3. Adjusting Contrast

Contrast control (the relative amounts of black and white) is critical
in obtaining a good scan.  If the text to be scanned is too faint and the 
scanned image shows many broken characters, recognition will be poor 
because the broken characters cannot be read.  If the scan is too dark, the 
letters will run together and cause poor recognition.  Whenever you scan a 
new image for the first time, expect to adjust the contrast.  Perceive 
Personal tends to prefer a darker contrast, so start with the contrast 
setting slightly darker than the middle notch on the scanner.
		  
4. Scanning Speed
 
For the best image quality, experiment with the scanning speed to 
determine the best rate for your computer.  Scan slowly enough that the 
speed indicator light on the scanner head does not flash.  When you scan 
too fast, the characters may look compressed, which causes character size 
inconsistencies.  Try to scan a bit slower than the normal rate recommended 
by the scanner software, perhaps at about 1/2 to 1 inch per second.  This 
way the system can receive and process all of the incoming scanned data in 
time.

Make sure your scanning does not get too far ahead of the real-time
display.  On slower machines the disk access speed cannot keep up with
the rate of incoming data passed from the scanner to the machine channel.  
Before you hit RETURN to activate Recognition, view the entire image to check
whether all of the data was captured on screen.  If you scanned too fast 
for the system, the final image will show data loss, i.e., chunks of 
paragraphs will be missing.  In this case, rescan more slowly.  Scanning too 
slowly will not affect the quality of the scanned image.
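
To see why scanning speed matters, the amount of data a scan produces can be
estimated with a little arithmetic.  The sketch below is in Python and rests
on assumptions that are not stated in this document: a scanning window 4
inches wide, 1 bit per pixel (line-art mode), and no compression.  The
function name scan_data_rate is hypothetical.

```python
def scan_data_rate(dpi, head_width_in, speed_in_per_sec):
    """Estimate bytes per second produced by a 1-bit (line-art) scan.

    Assumes uncompressed data and one bit per pixel.
    """
    pixels_per_line = dpi * head_width_in      # dots across the window
    lines_per_sec = dpi * speed_in_per_sec     # scan lines passed each second
    bits_per_sec = pixels_per_line * lines_per_sec
    return bits_per_sec / 8


# Assumed 4-inch window; compare the speeds and resolutions mentioned above.
for dpi in (200, 300, 400):
    for speed in (0.5, 1.0):
        rate = scan_data_rate(dpi, 4, speed)
        print(dpi, "dpi at", speed, "in/sec:", int(rate), "bytes/sec")
```

Under these assumptions, halving the scanning speed halves the data rate,
while moving from 300 to 400 dpi raises it by a factor of about 1.8, because
the resolution counts both across and along the scan.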

5. Scanning Straight

When the actual scan is slanted, skewed, or jagged, the text will not 
be recognized fully.  Use a hard-edged ruler or the edge of a book as a 
guide to help scan straight.  Alternatively, press your left index finger 
on the left edge of the scanner window casing while your right hand holds 
and moves the scanner.  This balances the pressure on the scanner and helps 
keep the scan direction straight down the material.  

6. Practice!

As you become more familiar with the way Perceive works, what it 
recognizes and what it doesn't, you will find your results improve 
dramatically.  

7. Full Page Scanning

When attempting to scan a full page, scan the page in two strips with an 
overlap of 1/2 to 1 inch.  Scan both strips straight, at the same speed, 
and make sure both strips begin and end at the same height on the page.
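
Whether two strips will overlap by the suggested amount depends on the width
of the text and the width of the scanning window.  The following Python
sketch checks this.  It assumes the two strips are lined up with the left and
right edges of the text; the 4-inch window width in the example and the
function names strip_overlap and overlap_ok are hypothetical.

```python
def strip_overlap(text_width_in, head_width_in):
    """Overlap in inches when two strips, each head_width_in wide, are
    aligned with the left and right edges of text text_width_in wide."""
    return 2 * head_width_in - text_width_in


def overlap_ok(text_width_in, head_width_in):
    """True if the overlap falls in the suggested 1/2 to 1 inch range."""
    overlap = strip_overlap(text_width_in, head_width_in)
    return 0.5 <= overlap <= 1.0


# Assumed 4-inch scanning window with text columns of different widths.
for width in (7.0, 7.5, 8.0):
    print(width, "in text:", strip_overlap(width, 4.0), "in overlap,",
          "ok" if overlap_ok(width, 4.0) else "outside suggested range")
```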


TROUBLESHOOTING

The following are the most common reasons why an OCR program may misread text:

Unrecognizable fonts or characters.  Omnifont technology recognizes standard 
fonts by comparing text characters to features programmed into the software.
The software will not read characters which don't match programmed features.
It may also misread similar characters, for example 1 and l.

Original document quality directly affects accuracy.  Letters which are 
faint, touching or otherwise illegible may be read inaccurately.  Newspapers, 
copies and faxes are often poor originals.

Uneven or fast scanning, or pausing in mid-scan, may result in skewed, 
stretched or compressed text.  Follow the tips above to avoid these causes 
of misrecognition.
