Continuing our discussion of how to find information on the Internet, we look at ways to make search results more fruitful.
At the heart of any search endeavour, no matter what kind of search tool you are using, there are three areas that can significantly affect your search results:
- Content of search engine
- Search logic or algorithm
- Presentation of search result
Content of search engine
A search engine builds its database in three ways: by accepting listings submitted by websites that want exposure, by sending out its own spiders (please see the earlier discussion), or by simply using the databases of other search engines (as meta search engines do). There are two issues in this process that you, as an information searcher, should be aware of:
- Focus of search engine
- Degree of information collection
There are thousands of search engines, and each has a focus area. A few big ones such as Yahoo!, Alta Vista and Google are universal: they accept information on any subject and from any geographical area, so long as the website satisfies their respective editorial policies. Most others, however, are selective about content. For example, country-specific search engines accept webpages only from or about the country concerned, and subject-specific search engines do not accept webpages on unrelated subjects. Even universal search engines such as Yahoo! and MSN have country-specific versions (e.g. Yahoo! India).
So, if you are looking for information on Australia, look for Australia-specific search engines.
There are many sources on the Net that compile information on search engines. The following are a few for your convenience:
Degree of information collection
Though the actual working of spiders is a closely guarded secret in many cases, it is generally assumed that they start with a historical list of links, such as server lists and lists of the most popular or best sites, and follow the links on these pages to find more links to add to the database. A spider could send back just the title and URL of each page it visits, or parse only some HTML tags, or send back the entire text of each page. The coverage and depth of indexing can have a bearing on the quality of your search results.
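The link-following behaviour described above can be sketched in a few lines of Python. This is only an illustration of the generally assumed approach, not how any particular search engine works: the PAGES dictionary is a hypothetical in-memory stand-in for the Web (a real spider would fetch pages over the network), and the names TitleAndLinks and crawl are invented for this example. The spider here sends back just the title and URL of each page it visits.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML source (assumption for illustration).
PAGES = {
    "http://a.example/": "<html><head><title>Seed list</title></head>"
                         "<body><a href='http://b.example/'>B</a> "
                         "<a href='http://c.example/'>C</a></body></html>",
    "http://b.example/": "<title>Page B</title><a href='http://c.example/'>C</a>",
    "http://c.example/": "<title>Page C</title><a href='http://a.example/'>back</a>",
}


class TitleAndLinks(HTMLParser):
    """Collects the page title and the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl(seeds, pages):
    """Breadth-first crawl from a seed list; returns {url: title}."""
    index = {}
    queue = deque(seeds)
    seen = set(seeds)
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:  # page not available: skip it
            continue
        parser = TitleAndLinks()
        parser.feed(html)
        index[url] = parser.title.strip()
        for link in parser.links:
            if link not in seen:  # follow each link only once
                seen.add(link)
                queue.append(link)
    return index


index = crawl(["http://a.example/"], PAGES)
```

Starting from a single seed page, the sketch discovers the other two pages purely by following links. A page that no indexed page links to would never be found, which is one reason coverage differs between search engines.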
Many search engines use 'fields' to store information collected from various parts of a webpage. The title, the URL, image tags and hypertext links are common fields on a Web page. Field searching allows the searcher to designate where a specific search term must appear. Compared with searching for words anywhere on a Web page, field-specific searching can considerably reduce unwanted or junk information in the search results.
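To make the idea of fields concrete, here is a minimal Python sketch of how a page might be split into fields at indexing time. The field names (title, text, link, image, url) and the helper names FieldExtractor and extract_fields are assumptions made for this illustration; actual search engines do not publish their internal layout.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Splits a page into title, text, link and image fields."""

    def __init__(self):
        super().__init__()
        self.fields = {"title": [], "text": [], "link": [], "image": []}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a" and attrs.get("href"):
            self.fields["link"].append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.fields["image"].append(attrs["src"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Words inside <title> go to the title field, everything else to text.
        words = data.split()
        if words:
            target = "title" if self._in_title else "text"
            self.fields[target].extend(words)


def extract_fields(url, html):
    """Return one record per page, with the URL stored as its own field."""
    parser = FieldExtractor()
    parser.feed(html)
    record = dict(parser.fields)
    record["url"] = url
    return record


sample = ("<html><head><title>Export Guide</title></head><body>"
          "<p>Trade leads</p>"
          "<a href='http://example.com/leads.html'>More leads</a>"
          "<img src='logo.gif'></body></html>")
record = extract_fields("http://example.com/export/", sample)
```

Once each page is stored this way, a search restricted to one field only has to look at that field's entry rather than the whole page.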
For example, in Alta Vista, the following searches work as described:
text:infobanc
Finds pages that contain the specified text (i.e. infobanc) in any part of the page other than an image tag, link, or URL.
title:'The Great Indian Bazaar'
Finds pages that contain the specified phrase 'The Great Indian Bazaar' in the page title (which appears in the title bar of most browsers).
url:text
Finds pages with a specific word or phrase in the URL. For example, url:export will find all pages on all servers that have the word export anywhere in the host name, path, or filename.
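The field queries above can be imitated with a small Python sketch. It is not Alta Vista's implementation: the two sample pages, the function names parse_query and search, and the use of simple case-insensitive substring matching are all assumptions made to show how a field prefix narrows a search.

```python
# Hypothetical indexed pages, each already split into fields.
PAGES = [
    {"url": "http://www.example.com/export/index.html",
     "title": "The Great Indian Bazaar",
     "text": "Trade leads from infobanc"},
    {"url": "http://www.example.org/news.html",
     "title": "Daily News",
     "text": "Export figures for the great Indian bazaar"},
]


def parse_query(query):
    """Split 'field:term' into (field, term); no prefix means the text field."""
    field, sep, term = query.partition(":")
    if not sep:
        return "text", query.strip().lower()
    # Drop surrounding quotes so that title:'some phrase' matches as a phrase.
    return field.strip().lower(), term.strip().strip("'\"").lower()


def search(query, pages):
    """Return the URLs of pages whose named field contains the term."""
    field, term = parse_query(query)
    return [p["url"] for p in pages if term in p.get(field, "").lower()]


text_hits = search("text:infobanc", PAGES)
title_hits = search("title:'The Great Indian Bazaar'", PAGES)
url_hits = search("url:export", PAGES)
body_hits = search("text:export", PAGES)
```

Note that the phrase 'the great Indian bazaar' appears in the text of the second page but not in its title, so the title: search returns only the first page. Likewise url:export and text:export return different pages, which is how field searching cuts down unwanted results.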
More search tips in coming issues