Advanced Search Syntax
Using advanced search syntax requires you to know a considerable amount about the particular search engine you are using, so you will definitely need to read its help file. If you are willing to tackle this level of knowledge and expertise then you are seriously into searching! I would recommend starting with the AltaVista Advanced search, as the AltaVista tutorial is very helpful. AltaVista have also recently introduced a Power Search, which uses a form to substitute for some of the syntax (i.e. you select options rather than typing in syntax words). This is similar to the HotBot advanced search concept.
Operator | Example | Description |
AND | "search engine" AND tutorial | Search words on both sides of this operator must be present in the document to score a hit. |
OR | "search engine" (guide or tutorial) | Search words on either side of this operator are sufficient to score a hit. |
AND NOT | "search engine" (guide or tutorial) AND NOT beginner | Search words after this operator make the search engine exclude the web page from the hits. |
NEAR | (none given) | Search words have to be within a certain number of words of one another in order to score a hit. NB: BEFORE and AFTER are similar to NEAR but also specify the order of the words. |
PARENTHESES (brackets) | "search engine" AND AltaVista AND (guide or tutorial) | Parentheses () are often neglected in discussions of Boolean logic, but they are an integral part of the logic, as they tell the search engine in what order to process the operators. |
Probably one of the most frustrating features of using Boolean logic with search engines is that the search engines themselves (in an effort to be user-friendly) apply the Boolean syntax rather loosely. To a certain extent this makes the whole business a bit pointless, and I would recommend only using Boolean logic if really necessary, as it usually offers little more than search engine arithmetic does. The following two sections discuss Boolean Operators and Parentheses in some detail.
AND |
Search words on both sides of this operator must be present in the document to score a hit. Example: ovary AND cancer will return web pages in which both the word ovary and the word cancer are present. Theoretically, the search should not return web pages in which only one of the words is present. The result is the same as if you had used the + sign as previously described in the section on search engine arithmetic. Example: +ovary +cancer is equivalent to ovary AND cancer |
AND will narrow or focus your search. I mentioned above that the search should not return web pages in which only one of the words is present. In practice most search engines will actually also return web pages in which either word is present (theoretically the result of using the OR operator) but they will rank such pages much lower in the results list. An interesting variation is the search +ovary cancer, which should return all of the web pages containing ovary, and will rank higher any pages which also contain the word cancer. |
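The difference between a strict AND and the +ovary cancer variation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three "pages" are invented, the function names are hypothetical, and no real search engine is claimed to work this way internally.

```python
# Toy documents standing in for web pages (invented data).
PAGES = {
    "page1": "ovary cancer screening guidelines",
    "page2": "anatomy of the human ovary",
    "page3": "lung cancer statistics",
}

def words(text):
    """Split a page into a set of lowercase words."""
    return set(text.lower().split())

def search_and(pages, *terms):
    """Strict AND: keep only pages that contain every term."""
    return sorted(name for name, text in pages.items()
                  if all(t in words(text) for t in terms))

def search_required_plus_optional(pages, required, optional):
    """Model of '+required optional': the required word must be present;
    pages that also contain the optional word are ranked first."""
    hits = [name for name, text in pages.items() if required in words(text)]
    # sorted() is stable, so pages with equal keys keep their original order.
    return sorted(hits, key=lambda name: optional not in words(pages[name]))

print(search_and(PAGES, "ovary", "cancer"))
print(search_required_plus_optional(PAGES, "ovary", "cancer"))
```

The first call keeps only the page containing both words. The second keeps every page containing ovary and simply moves the one that also mentions cancer to the top.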
OR |
Search words on either side of this operator are sufficient to score a hit. Example: ovary OR cancer will return web pages in which either the word ovary or the word cancer is present. It will also return web pages in which both the word ovary and the word cancer are present. Because OR is the default operator for most search engines, the result is the same as if you had entered the words without any operator. Example: ovary cancer is equivalent to ovary OR cancer |
OR will broaden your search. One of the most important uses of OR is to tell the search engine about synonyms. Example: (ovary) AND (cancer OR carcinoma OR neoplasm) will return pages containing the word ovary together with any of the three words in brackets. You may wonder why I have put the word ovary in brackets; this will be discussed later in the section on parentheses.
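The synonym search (ovary) AND (cancer OR carcinoma OR neoplasm) can be modelled as "one required word plus at least one word from a list". The Python sketch below assumes simple whole-word matching on invented page text; the function name is hypothetical.

```python
def words(text):
    """Split a page into a set of lowercase words."""
    return set(text.lower().split())

def matches_synonym_query(text, required, synonyms):
    """True if the page contains `required` AND at least one of `synonyms`."""
    page_words = words(text)
    return required in page_words and any(s in page_words for s in synonyms)

SYNONYMS = ["cancer", "carcinoma", "neoplasm"]

print(matches_synonym_query("carcinoma of the ovary", "ovary", SYNONYMS))
print(matches_synonym_query("anatomy of the ovary", "ovary", SYNONYMS))
print(matches_synonym_query("lung neoplasm staging", "ovary", SYNONYMS))
```

Only the first page matches: the second lacks any of the synonyms, and the third lacks ovary.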
AND NOT |
Search words after this operator make the search engine exclude the web page from the hits. Example: ovary AND NOT cancer will return web pages in which the word ovary is present, but the word cancer is not. The result is the same as if you had used the + and - signs as previously described in the section on search engine arithmetic. Example: +ovary -cancer is equivalent to ovary AND NOT cancer |
Use this operator with caution as it can often exclude a number of otherwise useful hits. For example, if the author has written 'This article is about the human ovary, but does not include any discussion about cancer of the ovary' then this might be exactly what you want. However, the search ovary AND NOT cancer will exclude this web page, because it contains the word cancer.
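The pitfall is easy to reproduce. The Python sketch below applies a literal ovary AND NOT cancer test to the sentence quoted above; the function name is hypothetical and the matching is plain whole-word matching, which is an assumption rather than a description of any particular search engine.

```python
import re

def words(text):
    """Return the set of lowercase words in the text, ignoring punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))

def and_not(text, wanted, unwanted):
    """Literal 'wanted AND NOT unwanted': the test looks only at which
    words occur, not at what the sentence means."""
    page_words = words(text)
    return wanted in page_words and unwanted not in page_words

page = ("This article is about the human ovary, but does not include "
        "any discussion about cancer of the ovary")

print(and_not(page, "ovary", "cancer"))  # False: the page is excluded
```

The page is rejected purely because the word cancer appears in it, even though the sentence says the article does not discuss cancer.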
Parentheses () are often neglected in discussions of Boolean logic, but they are an integral part of the logic, as they tell the search engine in what order to process the operators.
They also significantly affect the order in which the hits are displayed. Consider the three searches below.
ovary AND cancer OR carcinoma
ovary AND (cancer OR carcinoma)
(ovary) AND (cancer OR carcinoma) |
The first string is somewhat ambiguous. The AND operator technically takes precedence over the OR operator (in much the same way that the multiplication sign takes precedence over the addition sign in simple arithmetic, so that 2 x 3 + 3 = 9 and not 12).
The result would be that the search engine would find pages containing ovary and cancer together, or containing carcinoma (with or without ovary).
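Python's own and and or operators follow the same precedence rule (and binds more tightly than or), so the effect of the missing brackets can be checked directly. In this sketch the page is represented simply as a set of words, and the function name is hypothetical.

```python
def evaluate(page_words):
    """Evaluate the search with and without brackets for one page."""
    ovary = "ovary" in page_words
    cancer = "cancer" in page_words
    carcinoma = "carcinoma" in page_words
    # Without brackets, 'and' is applied first: (ovary and cancer) or carcinoma
    unbracketed = ovary and cancer or carcinoma
    # With brackets, the OR group is evaluated as a single item.
    bracketed = ovary and (cancer or carcinoma)
    return unbracketed, bracketed

# A page about skin carcinoma that never mentions the ovary.
print(evaluate({"skin", "carcinoma"}))  # (True, False)

# The arithmetic analogy from the text.
print(2 * 3 + 3, 2 * (3 + 3))  # 9 12
```

The page about skin carcinoma satisfies the unbracketed search but not the bracketed one, which is exactly the unwanted hit described above.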
The second string attempts to correct this problem by instructing the search engine to process cancer OR carcinoma as one item, and then to combine the result with ovary in an AND relationship. However, one unforeseen consequence of this is that because the search engine processes the contents of the parentheses first, it uses the results of the string within the brackets as the most important keyword term when it comes to ranking results. The ranking order is equivalent to a search of the type cancer AND ovary as opposed to ovary AND cancer.
What you might find is that a web page which mentions cancer many times, and ovary infrequently, would be favoured over one which mentions ovary frequently and cancer less often. This might be what you want, but you should remember that the normal ranking convention, in which the leftmost keyword is treated as the most important, is overridden when parentheses are present. The search engine is obliged to follow the rules of nesting when there are brackets: the deepest nest is interpreted first.
The third string, (ovary) AND (cancer OR carcinoma) compels the search engine to interpret ovary first (as the first of two equally nested brackets) and to combine it in an AND relationship with the results of the second set of brackets. The ranking order of results would be equivalent to the ovary AND cancer search string.
The general order of interpretation of parentheses, therefore, follows basic mathematical rules. See if you can follow the logic of the example below.
FOURTH to be (THIRD to be (FIRST to be evaluated) (SECOND to be evaluated) evaluated) evaluated
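The nesting rule can be traced mechanically. The Python sketch below lists the bracketed groups of a query in the order in which they are completed, innermost first, ending with the whole query. It assumes strict textbook nesting rules; the function name and the sample query are invented for illustration.

```python
def evaluation_order(query):
    """Return bracketed groups in the order they would be completed,
    innermost first, followed by the whole query."""
    order = []
    starts = []  # stack of positions of "(" not yet closed
    for i, ch in enumerate(query):
        if ch == "(":
            starts.append(i)
        elif ch == ")":
            if not starts:
                raise ValueError("unbalanced parentheses")
            # The most recently opened bracket is the one being closed.
            order.append(query[starts.pop():i + 1])
    if starts:
        raise ValueError("unbalanced parentheses")
    order.append(query)
    return order

for step, group in enumerate(evaluation_order("a AND ((b OR c) AND (d OR e))"), 1):
    print(step, group)
```

For the sample query the two inner groups come first (left to right), then the bracket that contains them, and finally the query as a whole, matching the FIRST to FOURTH pattern shown above.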
Caution must be taken not to promote the most deeply nested terms into being the most important terms when the results are ranked.
Having read and understood all of the points above, it will disappoint you to know that the search engines often apply Boolean logic rather loosely, effectively trying to make guesses at the logic you really want, as opposed to what is in the search string. Whilst this is helpful to inexperienced searchers, it is rather irritating if you have carefully constructed a search, only to find that the search engine treats it as if the parentheses weren't there!!
Raouf Allim
22 Benjamin Road
High Wycombe
Bucks. HP13 6SR
raouf@allim.tc
2nd August 2000