
Internet Marketing &
Business Promotion
The Google Dance finally
explained
By Serge Thibodeau
Just how does Google update its whole index? This is a rather
broad question, but I will attempt to answer it by
explaining each step that Google takes every month to
ensure its database is the most relevant and of the
highest quality.
Quite a good number of people and companies realize
that, in order to obtain the best Google rankings early
in their search engine optimization (SEO) campaigns,
it is important to take all the necessary steps beforehand
and plan ahead carefully. As for non-fee search
engine submissions, Google happens to be one of the
very few left. It's also one of the quickest to include
a new site in its database. As of July 10th, 2003, it is
widely estimated that the worldwide Google database consists
of over 3.4 billion pages! And that is only a fraction
of all available web pages, as some sites are not
open to Google, i.e., those sites are not to be visited
by a search crawler or spider.
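As a rough illustration of how a site can be closed to a
crawler, the sketch below checks a robots.txt file with
Python's standard urllib.robotparser module. The robots.txt
content, the example.com URLs and the is_open_to helper are
all made up for this example; they are assumptions, not
anything published by Google.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot may crawl everything except
# /private/, while every other crawler is shut out completely.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def is_open_to(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network
    # request is needed for this check.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("Googlebot", "http://example.com/index.html"),
        ("Googlebot", "http://example.com/private/a.html"),
        ("OtherBot", "http://example.com/index.html"),
    ]:
        print(agent, url, is_open_to(agent, url))
```

A page that fails this kind of check is one of the pages
that a well-behaved spider is not supposed to visit.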
As is often the case in real life, there are a lot of
risks and potential complications that website owners,
webmasters and search engine optimization (SEO) professionals
need to carefully assess during the initial production
and pre-launch phases of a marketing program. While
most experts agree that Google spiders (crawls) before
and after certain phases, they are not certain at which
exact point in the month it will do its spidering
and finally update its whole database. In this article,
we will attempt to explain in great detail what the
"Google Dance" entails, and when and how to catch
"Googlebot" at exactly the right time. Additionally,
we will tell you what all of this means for your search
engine optimization campaign.
The Famous Google Dance
If you think you need to write an email to get invited
to an annual dance at the Google headquarters, the GooglePlex,
you may want to continue reading this important chapter
on just exactly how Google's database and search robot
technology operates. While the Google monthly update
cycle is fairly well documented, over the past year
this cycle (now affectionately called the "Google
Dance") has become less and less of a pattern and
more of a giant step in the dark for most webmasters
and site owners who anxiously await each monthly update.
Each "dance" begins with Google making a
major, deep crawl. Let's call it Crawl A. It
spiders the whole web, over 3.4 billion pages
at last count. Google uses over 15,000 inexpensive PCs
(actually, conventional desktop computers) spread all
over the world, located in different data centers. It
sends Googlebot (or DeepBot) out to spider the current
sites within its database, as well as to find new websites
that have recently been launched on the web.
Once Google has completed Crawl A, effectively
catching all of these web pages for its next update,
there will be a second update afterwards, roughly two
weeks later.
Google will then update its whole database, showing
the new results on www2.google.com and www3.google.com.
Throughout this update, the results are often
rapidly switched between the primary database and the
second and third databases. As stated earlier, since
Google uses over 15,000 servers, people in different
areas of the world usually see very different
search results until most of the update is finally
completed. The "Google Dance" will continue
for another few days, but usually no longer than a week
(unless there are problems and/or major
algorithm changes made by Google, such as the April 2003
update).
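To make the switching between databases more concrete, here
is a minimal sketch that compares ranked result lists
collected from the three hosts for the same query. How the
lists are gathered is left out on purpose, and the sample
rankings, the example domains and both helper functions are
invented for illustration only.

```python
def position_changes(primary, candidate):
    """Map each URL to (old_position, new_position) where they differ.

    Positions are 1-based; None means the URL is missing from that list.
    """
    old = {url: i for i, url in enumerate(primary, start=1)}
    new = {url: i for i, url in enumerate(candidate, start=1)}
    changes = {}
    # Walk the primary list first, then any URLs only the candidate has.
    for url in list(primary) + [u for u in candidate if u not in old]:
        if old.get(url) != new.get(url):
            changes[url] = (old.get(url), new.get(url))
    return changes


def dance_in_progress(results_by_host):
    """True when the hosts do not all return the same ranked list."""
    return len({tuple(urls) for urls in results_by_host.values()}) > 1


# Hypothetical rankings for one query, one list per host.
SAMPLE = {
    "www.google.com": ["a.example", "b.example", "c.example"],
    "www2.google.com": ["b.example", "a.example", "d.example"],
    "www3.google.com": ["b.example", "a.example", "d.example"],
}

if __name__ == "__main__":
    print(dance_in_progress(SAMPLE))
    print(position_changes(SAMPLE["www.google.com"],
                           SAMPLE["www2.google.com"]))
```

When the lists on all three hosts agree again, that is a
reasonable hint (though not proof) that the update has
settled.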
Both during and directly following each
database update, Google will start another heavy
spidering, which we will call Crawl B, of all the existing
websites in its current database and also newer websites
that have been recently launched on the web and picked
up by its search crawlers. After this spidering by Googlebot,
the cycle returns to the beginning and starts all over
again for the next month.
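The timing described above can be turned into rough calendar
arithmetic. The sketch below assumes the update starts 14
days after Crawl A ("roughly two weeks") and that the dance
lasts at most 7 days; both numbers are readings of the
description above, not official Google figures, and the
estimate_cycle function and its parameter names are
hypothetical.

```python
from datetime import date, timedelta


def estimate_cycle(crawl_a_start, update_delay_days=14, max_dance_days=7):
    """Estimate the key dates of one monthly cycle from the Crawl A date."""
    update_start = crawl_a_start + timedelta(days=update_delay_days)
    dance_end_latest = update_start + timedelta(days=max_dance_days)
    return {
        "crawl_a_start": crawl_a_start,
        # The update (the "dance") begins about two weeks after Crawl A.
        "update_start": update_start,
        # Barring problems, the dance is usually over within a week.
        "dance_end_latest": dance_end_latest,
        # Crawl B runs during and directly after the update, so the
        # earliest it can begin is the day the update begins.
        "crawl_b_earliest": update_start,
    }


if __name__ == "__main__":
    for name, day in estimate_cycle(date(2003, 7, 1)).items():
        print(f"{name}: {day.isoformat()}")
```

Since the real cycle has become less and less regular, the
dates this produces are only a planning aid and should be
checked against what is actually observed each month.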
"Trapping" Googlebot at the perfect time
In order to get any website included in the Google
database, or to have a site's updates reflected
in the database as soon as possible, a good and
experienced webmaster needs to carefully plan ahead
and prepare everything so that he or she can effectively
"grab" Googlebot at precisely the right point
in that particular monthly cycle. Most good SEO experts
know that there is the first, initial Googlebot spidering
at the beginning of the month, as well as a deep crawl
during and directly after the monthly update.
If a webmaster wishes to have a new website included
in the Google database, the question is, will either
of these crawls ensure its inclusion into the database?
Judging from our experience over many monthly updates,
this is not always the case! To be sure, if a website
is spidered in the beginning of the month, chances are
that it will not be included in that month's update.
If the website is spidered during the second crawl of
the month, which is directly following the update, it
is possible (but never guaranteed) that it will be revisited
in the next crawl and then included in the next monthly
update.
On other occasions Google will simply visit a new website
and take only the homepage and the robots.txt file.
Such behaviour is usually a good indication that Googlebot
will come back during the next major crawl and the website
will usually be included in the update following that
second spidering. Looking back, it would seem that for
a new site to be included in the Google database, it
would take two complete visits from Googlebot. In most
cases, this would be true, although exceptions can always
happen.
In order to ensure the most rapid inclusion possible,
there are a few things an experienced webmaster can
do. If the website is spidered for the very first time
by Googlebot during or directly after the update, then
it is well positioned, as it is more than likely it will
be included in the next monthly Google Dance. If that
website is not crawled at that point, but during the
next crawl, the webmaster or site owner will have to
wait even longer for his or her website to be indexed
in Google's database.
In light of all of this, what's a typical webmaster
to do in order to get Googlebot to crawl his or her website
during that very specific time period? He or she can either
pray or hope that it will happen that way, which is certainly
not very scientific, or do the necessary
homework and plan ahead. If webmasters
have other websites that are in the Google database,
they can watch the spidering and update dates and
then carefully plan their new launches accordingly.
Alternatively, if you don't have any websites in the Google
database that you can individually monitor, you could always
watch www.google.com for the updates.
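One way to watch the spidering dates on a site you already
run is to count Googlebot requests per day in the web server
access log. The sketch below assumes the log is in the common
"combined" format used by Apache and that Googlebot identifies
itself in the user-agent field. The sample log lines are
invented, the function name is hypothetical, and a user-agent
string can be faked, so treat the counts as a hint rather than
proof.

```python
import re
from collections import Counter
from datetime import datetime

# One line of a "combined" format access log (an assumption about
# how the server is configured).
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def googlebot_hits_per_day(log_lines):
    """Count requests per calendar day whose user agent names Googlebot."""
    hits = Counter()
    for line in log_lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that are not in the expected format
        if "googlebot" not in match.group("agent").lower():
            continue
        when = datetime.strptime(match.group("time"),
                                 "%d/%b/%Y:%H:%M:%S %z")
        hits[when.date().isoformat()] += 1
    return dict(sorted(hits.items()))


# Invented sample lines, for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [02/Jul/2003:04:12:55 +0000] "GET /robots.txt HTTP/1.0" '
    '200 68 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.10 - - [02/Jul/2003:04:12:58 +0000] "GET / HTTP/1.0" '
    '200 5120 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '198.51.100.7 - - [02/Jul/2003:09:30:01 +0000] "GET / HTTP/1.1" '
    '200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.10 - - [18/Jul/2003:22:41:07 +0000] "GET /products.html HTTP/1.0" '
    '200 7300 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]

if __name__ == "__main__":
    for day, count in googlebot_hits_per_day(SAMPLE_LOG).items():
        print(day, count)
```

Days with a burst of Googlebot requests on an established
site are good candidates for the crawl dates to plan new
launches around.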
However, since in real life there is almost no way
to be 100% certain that any website is ever going to
be crawled, either partly or completely, there are certain
precautionary steps a webmaster can take to "flag"
Googlebot and get the search robot (crawler) to the
designated website. The first step is to exchange
reciprocal links with other websites that have
a high PageRank. Generally, the higher a website's
PageRank, the more often that website will be crawled and
refreshed by Google, which means that
your link (URL) should be picked up more quickly. A
word about relevancy: if a website is about furniture
retail, link to similar companies such as furniture
manufacturers or distributors, etc. Google will rank
you higher that way than if you just link to any site
that is off-topic.
Number two, you can submit your website to Google through
their Add URL section. While this is certainly not a
guaranteed way into the Google database, it should still
be done. Number three, a webmaster can install the Google
Toolbar and then visit his or her own website through
the toolbar. Since mid-2002, there have been countless
reports of a direct correlation between a website's
inclusion into the Google database and a visit through
the Google Toolbar.
At US $299 annually, a listing in the Yahoo directory
is also a good start in getting into Google's database,
and Yahoo does offer rapid inclusion times, usually
within seven days into their directory. Also, a DMOZ
(Open Directory Project or ODP) listing could be a good
way to have your website included in the database, although
this could sometimes take longer periods of time. DMOZ
is not 100% dependable and has had more than its share
of server problems lately.
Wrapping it all up
All the technical information that is available to
webmasters and SEO professionals about
Google's crawling and update patterns can certainly
help a lot in planning and executing any search engine
optimization program. It can also help with our schedules,
as new developments and updates need to be launched
online by a certain date and time to be properly included
in the search engine database. As Google commands a
very high percentage of targeted search engine traffic
referrals, having a better, ballpark idea of when all
of this will probably start can be of immense help.
About the Author
Serge Thibodeau - SEO professional - www.rankforsales.com