How does a search engine work?



Have you ever wondered how Google can find answers to your query within a fraction of a second? This is the story not only of Google but of every search engine available today. Using Google as the example, we will try to present the whole picture of how a search engine works. The process can be divided into three steps.

The first one is Crawling and Indexing:

There are roughly 130 trillion individual pages on the World Wide Web, and the number keeps growing. Like most search engines, Google uses automated programs called spiders or crawlers, which follow links from page to page to discover content. The result is similar to the index at the back of a book, where any keyword leads you to the specific pages that contain it. Instead of an index page, Google uses database servers, called index servers, which take all of the data from a crawl and store it in one big database. The index includes text from millions of websites around the world whose owners allow crawling. In the indexing phase, the pages are sorted by their content and other factors.
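To make the book-index comparison concrete, here is a minimal sketch of crawling and indexing. It is not how Google actually does it: the tiny in-memory "web" (PAGES), the example URLs, and the function names crawl and build_index are all made up for illustration, and a real crawler would fetch pages over the network and respect each site's crawling rules.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example": ("search engines crawl the web", ["b.example", "c.example"]),
    "b.example": ("crawlers follow links between pages", ["c.example"]),
    "c.example": ("an index maps words to pages", ["a.example"]),
    "d.example": ("this page is never linked", []),
}

def crawl(seed):
    """Follow links breadth-first from the seed; return URLs in visit order."""
    seen = {seed}
    queue = deque([seed])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in PAGES[url][1]:
            # Skip links we have already queued or that lead nowhere.
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return order

def build_index(urls):
    """Build an inverted index: each word maps to the set of URLs containing it."""
    index = defaultdict(set)
    for url in urls:
        for word in PAGES[url][0].lower().split():
            index[word].add(url)
    return index

crawled = crawl("a.example")
index = build_index(crawled)
print(crawled)
print(sorted(index["pages"]))
```

Notice that d.example never shows up in the index, because no crawled page links to it. A lookup such as index["pages"] then works like flipping to a keyword in the back of a book.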


The second one is Algorithms:

Finding an answer among roughly 130 trillion web pages is like finding a needle in a haystack. Frankly speaking, the needle is the easier problem. A smart person would look for a tool to help find the needle. Google's tools are its algorithms: computer programs that interpret the hints in the query to return the answer the user is looking for. Google uses a trademarked algorithm called PageRank, which assigns each web page a score based on the pages that link to it. The algorithm is named after Larry Page, who developed it with Sergey Brin. In total, Google uses over 200 factors to rank pages, such as site and page quality, freshness, safe search, and user context.
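The idea behind PageRank can be sketched in a few lines. This is the simplified textbook version, not Google's production ranking: the link graph LINKS, the damping value of 0.85, and the fixed 50 iterations are assumptions chosen for illustration, and the sketch assumes every link target also appears as a key in the graph.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores.

    links maps each page to the list of pages it links to.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores
    for _ in range(iterations):
        # Every page gets a small baseline score regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its score over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph: nobody links to "d", many pages link to "c".
LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

ranks = pagerank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, "c" ends up with the highest score because three pages link to it, while "d" ends up with the lowest because nothing links to it. The scores always add up to 1, so each one can be read as a share of the total importance.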

The third one is Fighting spam:

Spam results annoy everyone, so Google tries to put the most relevant results in front of users, free from spam. The majority of spam removal is automatic. Google examines other questionable documents by hand and, if it finds spam, takes manual action. When it takes action, it attempts to notify the website owners so that they can fix their sites.

Courtesy: Google Inc.

YouTube: Learn it from Google



Sayan is a theoretical physicist in the plasma science discipline. He is a front-end developer with an eye for detail and a passion for perfection. He enjoys writing popular science articles and taking part in discussions.