Search Something
What are you looking for? To help people find
information on your web site quickly, consider adding a search engine.
A good search engine saves everyone a considerable amount of time.
The problem is that you probably don't have time to build one from scratch.
1. List your site in AltaVista, HotBot, or Yahoo
By adding your site to one of the bigger search engines, you'll
receive more traffic.
After being listed, you could use HotBot
or AltaVista to provide a primitive search capability. Just submit a
query to their web server asking that the search results be
limited to a particular host (your host). Look at HotBot for ideas on this. For
AltaVista, you'll need to first take the query from the
user and then resubmit it with "host:yourhost.com" added to the query
string. Take a look at the search engine used at http://www.ucsd.edu to see this in action.
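Here is a minimal sketch in Python of the "take the query, add the host restriction, resubmit" idea. The endpoint URL and the parameter name "q" are placeholders I made up for illustration, not the real interface of any search engine, so check the engine's own search form for the actual values.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: replace with the real search URL of the engine you use.
SEARCH_URL = "http://search.example.com/query"


def restrict_to_host(user_query, host):
    """Return the user's query with a host: restriction appended."""
    return f"{user_query.strip()} host:{host}"


def build_search_url(user_query, host):
    """Build the URL to which the restricted query is resubmitted."""
    # The parameter name "q" is an assumption, not a documented API.
    params = {"q": restrict_to_host(user_query, host)}
    return SEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("cgi scripts", "yourhost.com"))
```

Your own CGI script would then redirect the visitor's browser to the URL it builds, so the results page comes straight from the big search engine.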
2. Write your own
If you have some experience in programming, you know that writing
a program to parse a document such as a web page isn't too hard. Putting
that information into a database that allows quick retrieval
isn't too hard, either. But building a system
that works well will require considerable effort. In return, the
search engine can have unique features and be completely
customized to your particular needs.
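To give a feel for the two pieces described above (parsing pages and storing words for quick retrieval), here is a small sketch in Python. It is only an illustration under simple assumptions: pages are already in memory as a dictionary of URL to HTML, the "database" is an in-memory inverted index, and every name in it (TextExtractor, build_index, search) is my own invention.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of a page, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def page_words(html):
    """Return the set of lowercase words found in a page's text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return set(re.findall(r"[a-z0-9]+", " ".join(parser.chunks).lower()))


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, html in pages.items():
        for word in page_words(html):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query, sorted."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    return sorted(set.intersection(*(index.get(w, set()) for w in words)))
```

A real engine would add ranking, word stemming, and on-disk storage, which is where the considerable effort goes.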
For an example of a "super search" engine, look at
http://www.nasa.gov.
It indexes all of the NASA web sites and even includes the contents
of PDF and other non-HTML files to find information. It's quite powerful.
If I learn more about this, I'll be sure to let you know.
3. Install a search engine on your web server
Many companies want to sell you web search software. If you're
running on Windows NT, I don't know what options are out there. But
users of Unix-based web servers can download ht://Dig, a free search engine. It operates
in two separate steps. First, the robot goes out and crawls
the web sites you specify, compiling a database. Second, that
database is searched whenever a user submits a query. Before installing
it, make sure to read the documentation available for it. Also,
don't use the latest beta version (htdig-3.0.8b2);
I've seen some bugs. Instead, choose from the list of
available versions.
After installing ht://Dig, you'll have the following directories
to deal with:
- bin/ - holds binaries for the search robot
- common/ - has template files used to customize the look of
the search results
- conf/ - holds the configuration file
- db/ - has the database files
Also, in your web server's CGI directory, you'll find htsearch,
the program that takes user input and returns the search
results. The installation also places some graphics in the directory
of your choice.
Before using htsearch, you'll need to run
the script 'rundig -v' in the bin/ directory. This creates
the database by crawling the sites listed in the configuration.
The ht://Dig robot recursively searches your web site, starting
at the location you specify. It only follows links in the .html
files, so it will only find the pages that are accessible from
where it starts.
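To show why unlinked pages never make it into the database, here is a rough Python sketch of this kind of link-following crawl. It is not ht://Dig's code. The names crawl and LinkExtractor are hypothetical, the caller supplies a fetch function that returns a page's HTML (or None), and for simplicity it follows links from every page it fetches rather than checking for a .html extension.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Return {url: html} for every page reachable from start_url.

    fetch(url) is supplied by the caller and returns the page's HTML,
    or None if the page cannot be retrieved.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            target, _fragment = urldefrag(urljoin(url, href))
            # Stay on the starting host and visit each page only once.
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return pages
```

Passing a dictionary's get method as fetch lets you try it on a made-up site without touching the network. A page that nothing links to will simply never appear in the result.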
ht://Dig includes many clever features, so be sure to read the
documentation carefully.
Date: [04/07/98]
Author: Doug Steinwand