Search Something

What are you looking for? To help people find information on your web site quickly, consider adding a search engine. A good search feature saves your visitors considerable amounts of time; the catch is that you probably don't have much time to build one. Here are three ways to go about it.

1. List your site in AltaVista, HotBot, or Yahoo

By adding your site to one of the bigger search engines, you'll receive more traffic.

After being listed, you can use HotBot or AltaVista to provide a primitive search capability. Just submit a query to their web server, asking that the search results be limited to a particular host (your host). Look at HotBot for ideas on this. For AltaVista, you'll need to first take the query from the user and then resubmit it with "host:yourhost.com" added to the query string. Take a look at the search engine used at http://www.ucsd.edu to see this in action.
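The resubmission step can be sketched in a few lines of Python. This is only an illustration: the endpoint `SEARCH_URL` and the parameter name `q` are placeholders I made up, since each search engine has its own URL format, and the helper names are hypothetical too. The only part taken from the description above is appending "host:yourhost.com" to whatever the visitor typed.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: substitute the real search URL of the engine you use.
SEARCH_URL = "http://search.example.com/query"


def restrict_to_host(user_query: str, host: str) -> str:
    """Append a host: restriction to the visitor's query."""
    query = user_query.strip()
    return f"{query} host:{host}" if query else f"host:{host}"


def build_search_url(user_query: str, host: str, param: str = "q") -> str:
    """Build the URL to redirect the visitor to (parameter name is assumed)."""
    restricted = restrict_to_host(user_query, host)
    # urlencode takes care of escaping spaces and the colon.
    return SEARCH_URL + "?" + urlencode({param: restricted})


print(build_search_url("campus map", "yourhost.com"))
```

A small CGI script on your own server would call something like `build_search_url` and then send the visitor's browser to the resulting address.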

2. Write your own

If you have some programming experience, you know that writing a program to parse a document such as a web page isn't too hard. Putting that information into a database that allows quick retrieval isn't too hard, either. But taking the time to build a system that works well will require considerable effort. In return, the search engine will have unique features and be completely customized to your particular needs.
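To give a feel for the "parse, then store for quick retrieval" idea, here is a minimal sketch in Python. It assumes the pages are already in memory as strings (the `pages` dictionary is made-up sample data) and uses a simple in-memory inverted index, which is one common choice rather than the only way to do it. A real system would also need fetching, ranking, and storage on disk.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of a page, skipping script and style blocks."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def words_in(html):
    """Return the set of lowercase words found in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return set(re.findall(r"[a-z0-9]+", " ".join(parser.chunks).lower()))


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, html in pages.items():
        for word in words_in(html):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Made-up sample pages standing in for a real site.
pages = {
    "/index.html": "<html><body><h1>Welcome</h1><p>Search tips and news.</p></body></html>",
    "/faq.html": "<html><body><p>Search engine FAQ</p><script>var search = 1;</script></body></html>",
}
index = build_index(pages)
print(sorted(search(index, "search tips")))
```

Looking a word up in the index is a single dictionary access, which is what makes retrieval quick; the effort goes into building and maintaining the index.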

For an example of a "super search" engine, look at http://www.nasa.gov. It indexes all of the NASA web sites and even includes the contents of PDF and other non-HTML files when looking for information. It's quite powerful. If I learn more about this, I'll be sure to let you know.

3. Install a search engine on your web server

Many companies want to sell you web search software. If you're running on Windows NT, I don't know what options are out there. But users of Unix-based web servers can download ht://Dig, a free search engine. It works in two separate stages. First, the robot goes out and indexes the web sites you specify, compiling a database. Then that database is searched whenever a user submits a query. Before installing it, make sure to read the documentation available for it. Also, don't use the latest beta version (htdig-3.0.8b2)... I've seen some bugs. Instead, look at the list of versions available here.

After installing ht://Dig, you'll have the following directories to deal with:

Also, in your web server's CGI directory, you'll find htsearch, the program that takes user input and returns the search results. The installation also places some graphics in the directory of your choice.

Before using htsearch, you'll need to run the script 'rundig -v' in the bin/ directory. This creates the database by searching the sites listed in the configuration. The ht://Dig robot will recursively search your web site, starting at the location you specify. It only follows links in the .html files, so it will only find the pages that are accessible from where it starts.
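The point about accessibility is easy to see in a small sketch. The Python below is not ht://Dig's code; it is a simplified link-following walk over a made-up, in-memory site (the `SITE` dictionary and its URLs are hypothetical), showing that a page nothing links to is never discovered.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(site, start):
    """Follow links from start; site maps URL -> HTML in place of real fetches."""
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        parser = LinkExtractor()
        parser.feed(site[url])
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            target = urldefrag(urljoin(url, href)).url
            # Stay on our own site and visit each page only once.
            if target in site and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Made-up site: orphan.html exists but no page links to it.
SITE = {
    "http://yourhost.com/index.html": '<a href="about.html">About</a> <a href="http://elsewhere.example/">Out</a>',
    "http://yourhost.com/about.html": '<a href="index.html#top">Home</a> <a href="docs/guide.html">Guide</a>',
    "http://yourhost.com/docs/guide.html": "<p>No links here.</p>",
    "http://yourhost.com/orphan.html": "<p>Nothing links to this page.</p>",
}
found = crawl(SITE, "http://yourhost.com/index.html")
print(sorted(found))
```

If a page on your site never shows up in search results, check whether some other page reachable from the starting location actually links to it.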

ht://Dig includes many clever features, so be sure to read the documentation carefully.

Author: Doug Steinwand
Date: [04/07/98]