
Search Engines Help You Find The Right Needle

Imagine the Library of Congress without a card catalog. Imagine your favorite restaurant without a menu. Each might offer a lot of attractive options, but what good are they if you do not know what they are?

With upwards of 70 million pages already available on the World Wide Web and more data pouring in every day, the Internet would be equally frustrating were it not for search engines. Search engines are the sites on the World Wide Web that use sophisticated, often secret software to help you find just what you are looking for. The more you know about search engines, the more efficient and rewarding your time on the ’Net will be.

The most widely known search engines are Excite, Yahoo!, and Lycos. Other familiar engines include AltaVista Search, HotBot, and Infoseek. Behind the scenes, each of these search engines is constantly building and updating a database of what information is available on the Web and, often, in other nooks and crannies of the Internet such as Usenet news groups or even electronic mail. The front end of each search engine — the part you and I see — is the query form, the place where we tell the search engine what we are looking for.

Suppose the user types the subject “baby boomers.” The search engine looks through its private indexes of Internet content just as you and I might look for the term in the index to a book. When it finds items that seem to match what we are looking for, the search engine displays the results to us, usually in the form of links to the seemingly relevant pages on the World Wide Web.

Search engines index information in two different styles, known as keyword indexing and concept indexing. Keyword indexing is more widely used but less sophisticated. Each search engine has its own method, or algorithm, for indexing the Web pages that it knows about. For example, the popular Lycos search engine (http://www.lycos.com) indexes the title, headings, subtitles, and hyperlinks of each page, along with the first twenty lines of text and the 100 words that occur most frequently. The problem with keyword indexing is that it matches words rather than meaning, so a search for baby boomers sometimes returns information about baby bottles, not the generation born after World War II.
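
For readers who like to see the idea in code, here is a minimal Python sketch of keyword indexing. It is an illustration only, not the actual Lycos algorithm, which is not public. The function names, the two sample pages, and the rule of matching any word in the query are all assumptions made for the example; it indexes only title words plus the most frequent body words.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def index_terms(title, body, top_n=100):
    """Choose the terms to index for one page (hypothetical rule):
    every title word plus the top_n most frequent body words."""
    frequent = [word for word, _ in Counter(tokenize(body)).most_common(top_n)]
    return set(tokenize(title)) | set(frequent)


def build_index(pages, top_n=100):
    """Map each indexed term to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, (title, body) in pages.items():
        for term in index_terms(title, body, top_n):
            index[term].add(page_id)
    return index


def search_any(index, query):
    """Return pages that match ANY word in the query."""
    hits = set()
    for word in tokenize(query):
        hits |= index.get(word, set())
    return hits


# Two made-up pages, used only for illustration.
pages = {
    "boomers": ("Baby Boomers", "The generation born after World War II"),
    "bottles": ("Baby Bottles", "How to sterilize bottles for a newborn"),
}
index = build_index(pages)
print(sorted(search_any(index, "baby boomers")))  # ['boomers', 'bottles']
```

Because the word "baby" appears in both titles, the baby bottles page comes back alongside the page about the generation, which is exactly the stray hit described above.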

If I am not interested in information about baby bottles, how can I limit my search to only baby boomers? An easy starting point would be to use a concept-based search engine like Excite (http://www.excite.com). Concept-based indexing is very sophisticated. This type of search uses artificial intelligence to try to comprehend the gist of what you are looking for, not just specifically what you type. The crawler indexes items by common concepts, rather than just words in a document. Thus a search for baby boomers might return a document that refers to “fortysomethings.”

Another way to home in on what you want, regardless of whether you are using a keyword or concept-based search engine, is to use Boolean logic parameters. Simply separate the words in your search with Boolean logic terms like AND, OR, NOT, or NEAR. If you type “baby AND boomer,” your search will return only items that contain both of those words. Unfortunately, as I discovered when testing this approach, one might also retrieve items about the days when football star Boomer Esiason was very young. To better focus my search, I had a few options. Two were “baby AND boomer NOT esiason” and “baby NEAR boomer.” The former finds documents that have “baby” and “boomer” in the text, but not the word “esiason.” The latter finds documents where the words baby and boomer are in close proximity.
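
The three operators can be sketched in a few lines of Python. This is a simplified illustration, not how any particular engine implements them. The helper names and the two sample documents are made up, and the three-word window for NEAR is an assumption, since each engine sets its own definition of "close."

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def has_all(text, words):
    """AND: every word must appear in the text."""
    tokens = set(tokenize(text))
    return all(word in tokens for word in words)


def has_none(text, words):
    """NOT: none of the words may appear in the text."""
    tokens = set(tokenize(text))
    return not any(word in tokens for word in words)


def near(text, first, second, window=3):
    """NEAR: the two words occur within `window` positions of each other.
    The window size is an assumed value for this sketch."""
    tokens = tokenize(text)
    first_pos = [i for i, token in enumerate(tokens) if token == first]
    second_pos = [i for i, token in enumerate(tokens) if token == second]
    return any(abs(i - j) <= window for i in first_pos for j in second_pos)


# Two made-up documents, used only for illustration.
docs = {
    "generation": "The baby boomer generation was born after World War II",
    "football": "Boomer Esiason played football long after he was a baby",
}

# baby AND boomer
and_hits = [name for name, text in docs.items()
            if has_all(text, ["baby", "boomer"])]

# baby AND boomer NOT esiason
not_hits = [name for name, text in docs.items()
            if has_all(text, ["baby", "boomer"]) and has_none(text, ["esiason"])]

# baby NEAR boomer
near_hits = [name for name, text in docs.items()
             if near(text, "baby", "boomer")]

print(and_hits)   # ['generation', 'football']
print(not_hits)   # ['generation']
print(near_hits)  # ['generation']
```

The plain AND query picks up the football document, while either the NOT or the NEAR version filters it out.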

So which engine should you use? Most Web users have their own personal favorite — and I will disclose mine after an appropriate suspenseful buildup — but, in truth, the choice of search service varies according to what you need and what you know. Here is a synopsis of the leading services.

AltaVista (http://www.altavista.digital.com), a keyword-based search engine, contains 30 million pages of indexed text. A simple search returns page after page of hits from Web sites and Usenet groups, an overwhelming mishmash thrown together in no particular order. However, if you take the trouble to use AltaVista’s advanced search with Boolean language, you can get the search engine to rank hits in order of importance. This often makes AltaVista a powerful choice for skillful users.

Excite, as mentioned earlier, is a concept-based search engine that has received accolades from PC Magazine as the favorite search engine. This engine has over 50 million pages of text indexed, including Web sites, Web site reviews, Usenet news and classifieds, and a news tracker. Excite also lets the user personalize an Excite Live! page, so it can appear with up-to-the-minute news items, sports scores, stock quotes, etc. The simple search in Excite understands Boolean terminology and also categorizes your results. Moreover, Excite offers the new Internet user “New to the ’Net,” a virtual course about the Internet that provides tours of different subjects like arts and entertainment, politics and law, and sports. The results of a search can also be sorted by site, which helps the researcher find which sites are the most relevant.

HotBot (http://www.hotbot.com) allows you to search 54 million indexed documents easily. Searches can be modified by easy-to-follow menus using Boolean logic. One can also limit the results by entering the dates when documents were posted to the Web. The location (geographical or Web site) of the documents can also be constrained by drop-down menus. Limiting the type of media (such as .txt files or JavaScript) is a further option. With all of these available parameter settings, HotBot is a hit when it comes to finding specific information.

Seeking information on Infoseek (http://www.infoseek.com) is basic and quite simple. Unfortunately, the results are the same: basic and simple. This engine indexes only 1.5 million pages, a far cry from the other powerhouse search engines. I was unimpressed when I searched for "larry+elkin" and found only Larry quoted in different publications, without finding his own Web site (http://www.elkin.com). On the other hand, when I searched for myself, I was listed as the designer of the Larry M. Elkin & Co. Web site, a far lesser feat. Though full Boolean logic does not work on this engine, the + and - signs stand for the Boolean AND and NOT. One plus is that email addresses and recently published news are searchable on the Infoseek site.

When using the Lycos search engine, one has access to one of the largest searchable databases in Internetland. Lycos is very user friendly and offers many types of directories to help get you started. The directories include current news, sorting by subject, the top 5% of Internet sites, and more. I find this to be one of the easiest sites to use for basic searches. Nevertheless, when refining your search, look elsewhere. Lycos gives you only the limited options of “Match all words” or “Match any word.” It uses “+” and “-” for Boolean operations, not the “AND” and “NOT” words popular with other engines.
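
To show how the plus-and-minus shorthand lines up with the Boolean words, here is a small Python sketch that rewrites one style as the other. It is a hypothetical helper, not part of any engine. In particular, treating a word with no prefix as required is a simplifying assumption; real engines may handle unprefixed words differently.

```python
def plus_minus_to_boolean(query):
    """Rewrite a '+word -word' query using the words AND and NOT.

    Assumption for this sketch: a word with no prefix is treated
    the same as a '+' word (required).
    """
    required, excluded = [], []
    for term in query.split():
        if term.startswith("-") and len(term) > 1:
            excluded.append(term[1:])
        else:
            word = term.lstrip("+")
            if word:
                required.append(word)
    result = " AND ".join(required)
    for word in excluded:
        result += f" NOT {word}"
    return result.strip()


print(plus_minus_to_boolean("+baby +boomer -esiason"))
# baby AND boomer NOT esiason
```

The translation goes one way only. The shorthand has no equivalent of OR or NEAR, which is part of why these engines are harder to use for refined searches.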

Open Text (http://index.opentext.net) has two search tools, simple search and power search, and both are fairly easy to operate. In the power search, drop-down menus provide Boolean logic operators. Open Text prides itself on having more words indexed than Lycos does while having the same number of pages indexed. However, Open Text indexes every word on every page, including words like “a,” “an,” and “the.” Search utilities include the ability to hunt for current events, email addresses, and newsgroups, and to search in Japanese, Spanish, and Portuguese.

Yahoo! (http://www.yahoo.com) is a directory of the Web. It tries to organize the Internet into many different categories. When you enter a query, Yahoo! searches for a suitable category to direct your search. If it does not find a perfect category, Yahoo! sends the query to the AltaVista search engine for hits and then organizes the resulting sites into Yahoo! categories. Yahoo! works well when the researcher knows the category of information desired but lacks specific preferences. My Yahoo! lets you personalize your Yahoo! screen, much as Excite Live! does. Yahoo! also has a great quotation, business information, and portfolio tracking page at http://quote.yahoo.com.

So what is my favorite? If I had to live with only one search engine, it would be Excite. Excite offers more useful hits with greater ease than the other leading services. The artificial intelligence (concept-based indexing) of Excite tends to weed out many returns that keyword search engines would pick up. Excite is an excellent engine for the new user and the experienced ’Net traveler. Good luck and happy surfing.
