Working Of Search Engines
Search engines do not search the World Wide Web directly. In fact, each one searches a database of the full text of web pages, selected from the billions of pages residing on servers. Web-based search engines such as Google create their listings automatically by "crawling" or "spidering" the web: programs called spiders follow links from one page to the next. If the content of a web page changes, crawler-based search engines eventually find the changes, and that can affect how the page is listed. If a web page is never linked to from any other page, search engine spiders cannot find it. After spiders find pages, they pass them on to another computer program for "indexing." This program identifies the text, links, and other content in each page and stores it in the search engine's database files. When a user comes to the search engine and makes a query, typically by entering keywords, the engine looks up the index and returns a listing of the best-matching web pages according to its criteria, usually with a short summary containing each document's title and sometimes parts of the text.
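The crawl, index, and query steps described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the "web" is a hypothetical in-memory dictionary called PAGES, the URLs are made up, and the function names crawl, build_index, and search are chosen for this example. Real engines fetch pages over the network and use far more elaborate data structures.

```python
import re
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (title, text, outgoing links).
# Nothing here is fetched from the network; all names are illustrative.
PAGES = {
    "a.example": ("Home", "welcome to the search engine guide", ["b.example"]),
    "b.example": ("Crawling", "a spider follows links between pages",
                  ["a.example", "c.example"]),
    "c.example": ("Indexing", "the index maps words to pages", []),
    # No other page links to d.example, so a crawl from a.example misses it.
    "d.example": ("Orphan", "no page links to this orphan page", ["a.example"]),
}


def crawl(seed):
    """Visit pages breadth-first by following links from the seed URL."""
    seen, queue = {seed}, deque([seed])
    while queue:
        url = queue.popleft()
        yield url
        for link in PAGES[url][2]:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)


def build_index(urls):
    """Build an inverted index: each word maps to the set of URLs containing it."""
    index = defaultdict(set)
    for url in urls:
        title, text, _links = PAGES[url]
        for word in re.findall(r"[a-z0-9]+", f"{title} {text}".lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every keyword in the query, sorted by name."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


crawled = list(crawl("a.example"))
index = build_index(crawled)
print(crawled)
print(search(index, "index pages"))
print(search(index, "orphan"))
```

Note that the query is answered from the index alone, never from the pages themselves, and that the unlinked page is absent from the results because the crawl never reached it.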
How a search engine decides which pages are the best matches, and what order the results should be shown in, varies widely from one engine to another. The methods also change over time as Internet usage changes and new techniques evolve.
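To see how two sets of criteria can order the same matches differently, consider the sketch below. Both scoring rules, the sample documents in DOCS, and the link counts in INBOUND_LINKS are assumptions invented for this illustration; neither rule is claimed to be the method of any particular engine.

```python
from collections import Counter

# Hypothetical documents and inbound-link counts, invented for illustration.
DOCS = {
    "page1": "search search search tips",
    "page2": "search engine basics",
    "page3": "engine search engine history",
}
INBOUND_LINKS = {"page1": 1, "page2": 5, "page3": 2}


def rank_by_term_frequency(term):
    """Order matching pages by how often the term occurs in their text."""
    term = term.lower()
    scores = {name: Counter(text.split())[term] for name, text in DOCS.items()}
    matches = [name for name, score in scores.items() if score > 0]
    # Ties are broken alphabetically so the order is deterministic.
    return sorted(matches, key=lambda name: (-scores[name], name))


def rank_by_inbound_links(term):
    """Order matching pages by how many other pages link to them."""
    term = term.lower()
    matches = [name for name, text in DOCS.items() if term in text.split()]
    return sorted(matches, key=lambda name: (-INBOUND_LINKS[name], name))


print(rank_by_term_frequency("search"))
print(rank_by_inbound_links("search"))
```

All three pages match the keyword in both cases, but the page that repeats the word most often comes first under one rule, while the most linked-to page comes first under the other.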
The vast majority of search engines are run by private companies using proprietary algorithms and closed databases, the most popular currently being Google, MSN Search, and Yahoo! Search. However, open-source search engine technology does exist, including ht://Dig, Nutch, Senas, Egothor, OpenFTS, and DataparkSearch, among others.