Friday, February 12, 2010

What is the Google Dance?

Approximately once a month, Google updates its index by recalculating the PageRank of each web page that it has crawled. The period during the update is known as the Google Dance.
Because of the nature of PageRank, the calculations need to be performed about 40 times and, because the index is so large, they take several days to complete. During this period the search results fluctuate, sometimes minute by minute. It is because of these fluctuations that the term "Google Dance" was coined. The dance usually takes place sometime during the last third of each month.
Google has two other servers that can be used for searching. The search results on them also change during the monthly update and they are part of the Google dance.
For the rest of the month, fluctuations sometimes occur in the search results, but they should not be confused with the actual dance. They are due to Google's fresh crawl and to what is known as "Everflux".

Google has two other searchable servers apart from www.google.com. They are www2.google.com and www3.google.com. Most of the time, the results on all three servers are the same, but during the dance they differ.
For most of the dance, the rankings that can be seen on www2 and www3 are the new rankings that will transfer to www when the dance is over. Even though the calculations are done about 40 times, the final rankings can be seen from very early on. This is because, within the first few iterations, the calculated figures converge to values close to their final ones. You can see this with the Pagerank Calculator by checking the Data box (top left) and performing some calculations. After the first few iterations the search results on www2 and www3 may still change, but only slightly.
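The early convergence described above can be illustrated with a small sketch. It assumes the commonly published form of the formula, PR(A) = (1 - d) + d * sum(PR(T) / C(T)), a damping factor d of 0.85, and a made-up three-page link graph. None of these details are taken from Google's actual update process, and the function name is only illustrative.

```python
# Hypothetical three-page web: each page maps to the pages it links to.
LINKS = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
}


def pagerank(links, damping=0.85, iterations=40):
    """Run the iterative calculation and return the ranks after every pass."""
    ranks = {page: 1.0 for page in links}  # assumed starting value
    history = []
    for _ in range(iterations):
        new_ranks = {}
        for page in links:
            # Each linking page passes on its rank divided by its outlink count.
            incoming = sum(
                ranks[other] / len(links[other])
                for other in links
                if page in links[other]
            )
            new_ranks[page] = (1 - damping) + damping * incoming
        ranks = new_ranks
        history.append(dict(ranks))
    return history


if __name__ == "__main__":
    history = pagerank(LINKS)
    final = history[-1]
    for n in (1, 5, 10, 40):
        snapshot = history[n - 1]
        gap = max(abs(snapshot[p] - final[p]) for p in LINKS)
        print(f"after {n:2d} iterations, largest gap to final value: {gap:.6f}")
```

Running it prints how far the figures are from their final values after 1, 5, 10 and 40 passes. On this toy graph the gap shrinks quickly, which is the same effect that lets the new rankings show up on www2 and www3 long before the dance ends.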
During the dance, the results from www2 and www3 will sometimes show on the www server, but only briefly. Also, new results on www2 and www3 can disappear for short periods. At the end of the dance, the results on www will match those on www2 and www3.
This Google Dance Tool allows you to check your rankings on www, www2 and www3, and on all of the data centers, simultaneously.

Google currently has 12 data centers, any one of which can provide the Toolbar PageRank of any page. Before the dance begins, they all return the same, current PageRank value for a given page. As the dance progresses, they are updated, one by one, to the new value, so checking each of the centers during the dance reveals the new PageRank values as they gradually spread. If a page's PageRank isn't going to change, the centers show the same value throughout, of course.
Querying the data centers
For this, it is necessary to have the Google Toolbar installed and the PageRank indicator on. Every time a page is received by the browser, the Toolbar requests its PageRank from one of Google's data centers. The information is returned as a one-line text file and stored in the Temporary Internet Files folder.
The Toolbar's request URL includes the URL of the page that it wants the PageRank for (the target page) and a checksum that must match that URL.
A fat URL for a typical Toolbar request (all in one line):-
http://216.239.33.102/search
?client=navclient-auto
&ch=5150615727
&features=Rank:FVN
&q=info:http%3A%2F%2Fwww%2Eexampledomain%2Ecom%2F

If you copy and paste that fat URL into your browser, you will get Google's "forbidden" page back. That's because the target page and checksum don't match - it's just an example of the request URL.
Notice that the target page is in escaped format - some of the characters are represented by hexadecimal codes (e.g. %2F).
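As a rough illustration of how the example request above is put together, here is a minimal sketch. The helper names `escape_like_toolbar` and `build_fat_url` are hypothetical, and the rule "escape every byte that is not an ASCII letter or digit" is only an assumption inferred from the one example shown (where even the dots appear as %2E). The checksum is simply copied from the example; this sketch does not compute it, so the resulting URL is no more valid than the example itself.

```python
from urllib.parse import unquote


def escape_like_toolbar(url):
    """Percent-encode every byte except ASCII letters and digits (assumed rule)."""
    pieces = []
    for byte in url.encode("utf-8"):
        char = chr(byte)
        if char.isascii() and char.isalnum():
            pieces.append(char)
        else:
            pieces.append(f"%{byte:02X}")
    return "".join(pieces)


def build_fat_url(host, checksum, target):
    """Join the parts of the example request into a single line."""
    return (
        f"http://{host}/search"
        "?client=navclient-auto"
        f"&ch={checksum}"
        "&features=Rank:FVN"
        f"&q=info:{escape_like_toolbar(target)}"
    )


if __name__ == "__main__":
    fat_url = build_fat_url(
        "216.239.33.102", "5150615727", "http://www.exampledomain.com/"
    )
    print(fat_url)
    # Decoding the escaped part gives back the readable target page.
    print(unquote(fat_url.split("q=info:", 1)[1]))
```

The standard `urllib.parse.quote` function is not used for the escaping here because it leaves dots unescaped, which would not reproduce the example. `unquote` is enough for going the other way.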
To get the new PageRank for a particular page, you need to make the same request that the Toolbar makes for it. That is, you need the fat URL that the Toolbar uses, and you need to send it to each of Google's data centers. The method is a bit long-winded but it works. Here's how to do it:-

  • Use your browser to browse to the page. This makes sure that the page and the Toolbar's PageRank request are in your Temporary Internet Files folder. You only need to do this once - not every time.

  • Open the index.dat file from the Temporary Internet Files folder in a text editor, and search it for the target page. You'll find the entire fat URL, similar to the one above, for the Toolbar's PageRank request. NOTE: Because the target page is escaped in the fat URL, search only for an unescaped part; e.g. "exampledomain".

  • When you've found the fat URL, copy and paste it into your browser's address box and press Return or click Go. If the page is in Google's directory, the returned line includes the directory path. The last element in the first part of the line is the Toolbar PageRank value for the target page. To see the page's new PageRank spread across the centers during the dance, use the same fat URL, but replace the IP address with that of each data center in turn. This is also a good way to see the progress of the dance in general.
    Data centers

    216.239.33.100 :: www-ex.google.com
    216.239.35.100 :: www-sj.google.com :: currently offline
    216.239.37.100 :: www-va.google.com
    216.239.39.100 :: www-dc.google.com
    216.239.41.100 :: www-fi.google.com
    216.239.51.100 :: www-ab.google.com
    216.239.53.100 :: www-in.google.com
    216.239.55.100 :: www-zu.google.com
    216.239.57.100 :: www-cw.google.com
    216.239.59.100 :: www-gv.google.com
    66.102.11.100 :: www-kr.google.com
    66.102.7.100 :: www-mc.google.com
    TIP: If you want to check the same pages during future dances, save the fat URLs into a text document so that you don't need to go through the process of finding them in the Temporary Internet Files folder each time.
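The manual steps above can also be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the pattern assumes that the request is stored in index.dat as an unbroken run of printable ASCII characters, and nothing here contacts any server. It just finds the fat URL in a block of bytes and produces one copy per data center from the list above.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# IP addresses copied from the list above (www-sj was marked offline there).
DATA_CENTERS = {
    "216.239.33.100": "www-ex.google.com",
    "216.239.35.100": "www-sj.google.com",
    "216.239.37.100": "www-va.google.com",
    "216.239.39.100": "www-dc.google.com",
    "216.239.41.100": "www-fi.google.com",
    "216.239.51.100": "www-ab.google.com",
    "216.239.53.100": "www-in.google.com",
    "216.239.55.100": "www-zu.google.com",
    "216.239.57.100": "www-cw.google.com",
    "216.239.59.100": "www-gv.google.com",
    "66.102.11.100": "www-kr.google.com",
    "66.102.7.100": "www-mc.google.com",
}

# Assumed storage format: printable ASCII containing "features=Rank".
FAT_URL_RE = re.compile(rb"http://[\x21-\x7e]*features=Rank[\x21-\x7e]*")


def find_fat_urls(raw, needle):
    """Return Toolbar-style request URLs found in raw bytes that mention needle."""
    found = []
    for match in FAT_URL_RE.finditer(raw):
        url = match.group().decode("ascii")
        if needle in url:
            found.append(url)
    return found


def with_host(fat_url, host):
    """Return the same request with only the host (IP address) replaced."""
    return urlunsplit(urlsplit(fat_url)._replace(netloc=host))


def urls_for_all_centers(fat_url):
    """Build one request URL per data center, in the order listed above."""
    return [with_host(fat_url, ip) for ip in DATA_CENTERS]


# Example use (the file path is whatever your system uses):
# with open("index.dat", "rb") as handle:
#     for url in find_fat_urls(handle.read(), "exampledomain"):
#         print("\n".join(urls_for_all_centers(url)))
```

Saving the printed list to a text document gives the same result as the TIP above: the fat URLs are ready for the next dance without another search through index.dat.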




