What is a Search Engine?
A search engine is a tool that adds search capability to your web site. You specify a
collection of documents that web users are allowed to search. Upon receiving a search
query, the search engine scans the collection and returns the most relevant documents
to the user.
A search engine differs from a simple pattern-matching tool: rather than merely
returning every document that contains the query, it selects the documents most
relevant to the user. This feature is crucial when the document set is large, which
makes scanning through the whole document set tedious.
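The difference between matching and ranking can be sketched in a few lines of Java. The example below is only an illustration: the class name RankingSketch, its methods and the scoring rule (term occurrences divided by word count) are assumptions made for this sketch and are not the Cybotics ranking algorithm. The word splitting is also ASCII-only, so it would not segment Chinese, Japanese or Korean text.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative only: contrasts plain pattern matching with a toy relevance score.
// This is NOT the Cybotics ranking algorithm; all names here are hypothetical.
class RankingSketch {

    // Pattern matching gives a yes/no answer with no notion of "more relevant".
    static boolean matches(String text, String query) {
        return text.toLowerCase(Locale.ROOT).contains(query.toLowerCase(Locale.ROOT));
    }

    // Toy relevance score: occurrences of the query term divided by the word count.
    static double score(String text, String query) {
        String term = query.toLowerCase(Locale.ROOT);
        int hits = 0;
        int total = 0;
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (word.isEmpty()) {
                continue;
            }
            total++;
            if (word.equals(term)) {
                hits++;
            }
        }
        return total == 0 ? 0.0 : (double) hits / total;
    }

    // Returns the names of documents containing the term, most relevant first.
    static List<String> rank(Map<String, String> docs, String query) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, String> entry : docs.entrySet()) {
            if (score(entry.getValue(), query) > 0) {
                names.add(entry.getKey());
            }
        }
        names.sort(Comparator.comparingDouble(
                (String name) -> score(docs.get(name), query)).reversed());
        return names;
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("a.html", "Java servlets and more about many other unrelated topics");
        docs.put("b.html", "Java search in Java");
        docs.put("c.html", "Nothing relevant here");
        System.out.println(rank(docs, "java")); // prints [b.html, a.html]
    }
}
```

Both a.html and b.html match the query, but b.html is listed first because a larger share of its words are the query term.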
What makes Cybotics Search Engine different from other search engines?
Several search engines already exist in the marketplace. So what sets Cybotics
Search Engine apart from the others?
Cybotics Search Engine is the first pure Java™ multilingual search engine. Written in
Java with the latest servlet API, Cybotics Search Engine is totally platform independent.
Using sophisticated searching algorithms developed by Cybotics Technologies Ltd. for
statistical document ranking, the Cybotics Search Engine supports searching in English,
Chinese, Japanese and Korean. With a simple and well-organized web-browser
administration interface, you can set up the search service for your web site in minutes.
Some advanced features are highlighted below:
- Multilingual Support - Supports English, Chinese (GB, CNS11643 and Big5), Japanese (EUC, JIS and Shift-JIS) and Korean (KSC5601).
- Search while indexing - You can continue searching while the index is being updated. You can also choose to view a summary report or a detailed report of the indexing status.
- Multiple Collections Support - You can create multiple collections and select which one to search on your search page.
- Statistical Document Ranking - Based on a language independent statistical document ranking algorithm, each document is assigned a relevance score to indicate how relevant the document is to the search query.
- Browser-based Administration Tool - A web-browser-based administration tool makes search engine administration and configuration as easy as browsing the web, whether locally or remotely over a network.
- Generation of Search Pages - Generating search pages for your web site to access the search engine is as easy as choosing a few options and clicking a button by using the "Generate Search Pages" function.
- Customizable Result Pages - You can change the look and feel of the search result page by importing HTML result page templates.
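The "Search while indexing" feature can be pictured with a small Java sketch. One common way to keep searches available during an index update is to build the new index on the side and publish it in a single atomic step. The document does not say how Cybotics Search Engine does this, so the class SnapshotIndex, its methods and the map-based index below are purely hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of "search while indexing" using snapshot swapping.
// Not the Cybotics implementation; the index is reduced to a term-to-document map.
class SnapshotIndex {

    // The snapshot that searches currently read from.
    private final AtomicReference<Map<String, String>> live =
            new AtomicReference<>(Collections.<String, String>emptyMap());

    // Searches read whichever snapshot is published; they never wait for a rebuild.
    String lookup(String term) {
        return live.get().get(term);
    }

    // A rebuild copies the fresh entries and publishes the copy in one atomic step,
    // so a concurrent search sees either the old index or the new one, never a mix.
    void rebuild(Map<String, String> freshEntries) {
        Map<String, String> next = new HashMap<>(freshEntries);
        live.set(Collections.unmodifiableMap(next));
    }
}
```

A search thread calling lookup while another thread runs rebuild keeps getting answers from the previous snapshot until the new one is published.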
Quick Start: Set Up the Search Service for a Sample Document Collection
1. Start the administration interface
Invoke the servlet CyboticsAdmin.class. A typical invocation would be:
http://hostname%adminURL%
where hostname is the host on which Cybotics Search Engine is installed, and servlet is
the alias of the servlet invoker. [Screen Snapshot - Main Login Page]
2. Log in to the Administration Interface
If this is your first time logging in to the Cybotics Search Engine, enter "admin"
as the password.
3. Change Administration Password
Select "Security Setting" on the upper panel and enter the passwords in the three input boxes. Click the "Change Password" button.
[Screen Snapshot - Change Password]
4. Create Collection
Select "Collection Management" on the upper panel and choose the "Create Collection" option on the left.
Enter a name for the document collection, say "Chinese Docs Test", and specify the
language encoding of this collection. [Screen Snapshot - Create Collection]
5. Edit Collection
Specify the documents to include in the collection. For example, to include all
files under the directory "d:\www\big5", the screen would look like this: [Screen Snapshot - Edit Collection]
Click the "Add" button to add the file/directory. The entry should now appear in the list box.
You can also control which kinds of files are included in the collection. Simply click the "Filter" button and specify the file extensions to index.
[Screen Snapshot - Index Filter] By default, files with the extensions "txt", "html" and "htm" are
indexed. You can override the default settings by entering your own set of comma-delimited file extensions.
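As an illustration of how a comma-delimited extension list selects files, here is a minimal Java sketch. The class ExtensionFilter and its behaviour (trimming spaces, ignoring a leading dot, comparing extensions without regard to case) are assumptions made for this example; the document does not say how the product parses the list.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of an index filter driven by a comma-delimited extension list.
// Case-insensitive matching and whitespace trimming are assumptions, not documented behaviour.
class ExtensionFilter {

    private final Set<String> extensions = new LinkedHashSet<>();

    // Parses a list such as "txt,html,htm".
    ExtensionFilter(String commaDelimited) {
        for (String part : commaDelimited.split(",")) {
            String ext = part.trim().toLowerCase(Locale.ROOT);
            if (ext.startsWith(".")) {
                ext = ext.substring(1);
            }
            if (!ext.isEmpty()) {
                extensions.add(ext);
            }
        }
    }

    // Expects a bare file name (no directory part); returns true if it should be indexed.
    boolean accepts(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension at all
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return extensions.contains(ext);
    }
}
```

With the default list, new ExtensionFilter("txt,html,htm") would accept "index.html" and "notes.txt" but reject "logo.gif".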
6. Build Index
Select the "Build Index" option and highlight the collection "Chinese Docs Test".
Click on the "Build" button to build the index. [Screen Snapshot - Build Index]
7. View Indexing Status
A summary report of the current indexing status is shown: [Screen Snapshot - View Index Status Summary Report]
Click the "Detailed" button for a detailed status report: [Screen Snapshot - View Index Status Detailed Report]
8. Set File Aliases
Select "General Configuration" on the upper panel and click on the "Set File Aliases"
option on the left pane.
Suppose you have set up an alias "/big5" in your web server that maps
to the directory "d:\www\big5", and you would like the URLs in your search results to point to
the right place. Enter the mapping by setting "URL Prefix" to "/big5" and "Map to Directory"
to "d:\www\big5". Click the "Add" button to add the alias.
[Screen Snapshot - Set File Aliases]
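The effect of a file alias can be sketched in Java as a prefix substitution: the mapped directory at the start of an indexed file's path is replaced by the URL prefix. The class FileAlias below is hypothetical, and its handling of backslashes and letter case (treated Windows-style, ignoring case) is an assumption for this example rather than documented product behaviour.

```java
// Hypothetical sketch of a "URL Prefix" to "Map to Directory" alias.
// Not the Cybotics implementation; path handling rules here are assumptions.
class FileAlias {

    private final String urlPrefix;
    private final String directory;

    FileAlias(String urlPrefix, String directory) {
        this.urlPrefix = stripTrailingSlash(urlPrefix.replace('\\', '/'));
        this.directory = stripTrailingSlash(directory.replace('\\', '/'));
    }

    private static String stripTrailingSlash(String s) {
        return s.endsWith("/") ? s.substring(0, s.length() - 1) : s;
    }

    // Returns the URL for a file under the mapped directory,
    // or null if the file is not covered by this alias.
    String toUrl(String filePath) {
        String path = filePath.replace('\\', '/');
        String dirWithSlash = directory + "/";
        // Compare the directory prefix ignoring case, as Windows file systems do.
        if (!path.regionMatches(true, 0, dirWithSlash, 0, dirWithSlash.length())) {
            return null;
        }
        return urlPrefix + path.substring(directory.length());
    }
}
```

For the alias in this step, new FileAlias("/big5", "d:\\www\\big5").toUrl("d:\\www\\big5\\news\\index.html") yields "/big5/news/index.html", while a file outside "d:\www\big5" yields null.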
9. Generate Search Page
For simplicity, use the default search page and result page templates.
Choose the appropriate collection. If you want to save the search page,
enter the name in the field "Output File". [Screen Snapshot - Generate Search Page]
Otherwise, simply click on the "Preview" button to try out the search function.
Enter your search query in the text box and click "Search" to submit the query.
[Screen Snapshot - Preview Search Page]
The query result is displayed as: [Screen Snapshot - Search Result]
Copyright © 1997 Cybotics Technologies Ltd. All Rights Reserved.