Weekly SEO news: 27 November 2007
Welcome to the latest issue of the Search Engine Facts newsletter.

This week, we're taking a look at the new robots.txt commands that are supported by Google. Make sure that Google can parse your robots.txt file.

In the news: Google confirms that hosting a website on a shared IP address is not bad for high rankings, Google doesn't like paid links and more.

We hope that you enjoy this newsletter and that it helps you to get more out of your website. Please pass this newsletter on to your friends.

Best regards,
Andre Voget, Johannes Selbach, Axandra CEO

1. New robots.txt commands: make sure that Google can index your site

It seems that Google is currently experimenting with new robots.txt commands. If your robots.txt file accidentally contains one of the new commands, it might tell Google to stay away from pages that you want indexed.

What is a robots.txt file?

The robots.txt file is a simple text file that must be placed in your root directory (http://www.example.com/robots.txt). It tells the search engine spider which web pages on your website should be indexed and which web pages should be ignored.

You can use a simple text editor to create a robots.txt file. The content of a robots.txt file consists of so-called "records".

A record contains the information for a specific search engine. Each record consists of two fields: the User-agent line and one or more Disallow lines. Here's an example:

User-agent: googlebot
Disallow: /cgi-bin/

This robots.txt file would allow the "googlebot", which is the search engine spider of Google, to retrieve every page from your site except for files from the "cgi-bin" directory. All files in the "cgi-bin" directory will be ignored by googlebot.
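The effect of this record can be tried out locally. The following is a minimal sketch that uses the robots.txt parser from Python's standard library (urllib.robotparser). The helper name is_allowed and the example page URLs are assumptions made for illustration, and the parser only approximates how a real search engine spider reads the file.

```python
from urllib.robotparser import RobotFileParser

# The record from the example above.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow: /cgi-bin/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url according to robots_txt.

    is_allowed is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example URLs are made up for illustration.
print(is_allowed(ROBOTS_TXT, "googlebot", "http://www.example.com/index.html"))       # True
print(is_allowed(ROBOTS_TXT, "googlebot", "http://www.example.com/cgi-bin/form.pl"))  # False
```

Because the record only names googlebot, other spiders are not restricted by it in this sketch.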

Which new commands is Google testing?

Webmasters have found out that Google seems to be experimenting with a Noindex command for the robots.txt file. It basically seems to do the same as the Disallow command, so it's not clear why Google is using this command.

Other commands that might be tested by Google are Noarchive and Nofollow. However, none of these commands is official yet.

How does this affect your rankings on Google?

If your robots.txt file accidentally contains the wrong commands, you might tell Google to go away even though you want it to index your pages.

For that reason, it is important that you check the content of your robots.txt file.

How to check your robots.txt file

Open your web browser and enter www.yourdomain.com/robots.txt to view the contents of your robots.txt file. Here are the most important tips for a correct robots.txt file:

  1. There are only two official commands for the robots.txt file: User-agent and Disallow. Do not use more commands than these.

  2. Don't change the order of the commands. Start with the user-agent line and then add the disallow commands:

    User-agent: *
    Disallow: /cgi-bin/

  3. Don't use more than one directory in a Disallow line. "Disallow: /support /cgi-bin/ /images/" does not work. Use an extra Disallow line for every directory:

    User-agent: *
    Disallow: /support
    Disallow: /cgi-bin/
    Disallow: /images/

  4. Be sure to use the right case. The file names on your server are case sensitive. If the name of your directory is "Support", don't write "support" in the robots.txt file.
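The first three tips can be checked mechanically. The following is a rough sketch of such a check, not an official validator: the function name lint_robots_txt and the wording of the warnings are made up for illustration, and it assumes that a blank line ends a record. Tip 4 is not covered, because checking the case of directory names requires access to the files on your server.

```python
# The two official commands named in tip 1.
OFFICIAL_FIELDS = {"user-agent", "disallow"}


def lint_robots_txt(text):
    """Return a list of warnings for a robots.txt file given as a string.

    lint_robots_txt is a hypothetical helper, not part of any library.
    """
    warnings = []
    in_record = False  # True once a User-agent line has opened a record
    for number, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip():
            in_record = False  # assumption: a blank line ends the record
            continue
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue  # comment-only line
        field, sep, value = line.partition(":")
        field = field.strip().lower()
        value = value.strip()
        if not sep or field not in OFFICIAL_FIELDS:
            # Tip 1: anything besides User-agent and Disallow is flagged.
            warnings.append(f"line {number}: unknown command {line!r}")
            continue
        if field == "user-agent":
            in_record = True
            continue
        # From here on the line is a Disallow line.
        if not in_record:
            # Tip 2: Disallow must follow a User-agent line.
            warnings.append(f"line {number}: Disallow before any User-agent line")
        if len(value.split()) > 1:
            # Tip 3: only one directory per Disallow line.
            warnings.append(f"line {number}: more than one path in a Disallow line")
    return warnings


for warning in lint_robots_txt("User-agent: *\nDisallow: /support /cgi-bin/ /images/\nNoindex: /private/\n"):
    print(warning)
```

With this sketch, experimental commands such as Noindex, Noarchive and Nofollow are reported as unknown commands, so an accidental use is easy to spot.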

You can find user agent names in your log files by checking for requests to robots.txt. Usually, all search engine spiders should be given the same rights. To do that, use User-agent: * in your robots.txt file.
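Finding the user agent names in a log file can be sketched as follows. This example assumes that your server writes the common "combined" access log format, in which the user agent is the last quoted field of each line. Your log format may differ. The function name robots_txt_agents and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: "combined" log format. The request is the first quoted field
# that starts with GET or HEAD, the user agent is the last quoted field.
LINE_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+)[^"]*".*"(?P<agent>[^"]*)"\s*$')


def robots_txt_agents(log_lines):
    """Count the user agents that requested /robots.txt."""
    counts = Counter()
    for line in log_lines:
        match = LINE_RE.search(line)
        if match and match.group("path").split("?")[0] == "/robots.txt":
            counts[match.group("agent")] += 1
    return counts


# Made-up sample lines, for illustration only.
sample = [
    '66.249.66.1 - - [27/Nov/2007:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '10.0.0.5 - - [27/Nov/2007:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]
for agent, hits in robots_txt_agents(sample).items():
    print(hits, agent)
```

To run it against a real log, pass an open file object instead of the sample list.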

What happens if you don't have a robots.txt file?

If your website doesn't have a robots.txt file (you can check this by entering www.yourdomain.com/robots.txt in your web browser) then search engines will automatically index everything they can find on your site.
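The "no rules means everything is allowed" behavior can be illustrated with the robots.txt parser from Python's standard library. This is only a sketch: it models a missing file as an empty set of rules, which is an assumption, and the function name allowed_without_rules and the example URL are hypothetical. Real search engine spiders may handle edge cases differently.

```python
from urllib.robotparser import RobotFileParser


def allowed_without_rules(user_agent, url):
    """Return whether url may be fetched when robots.txt has no records."""
    parser = RobotFileParser()
    parser.parse([])  # no records at all, as if the file were empty
    return parser.can_fetch(user_agent, url)


# Hypothetical example URL.
print(allowed_without_rules("googlebot", "http://www.example.com/cgi-bin/form.pl"))  # True
```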

Checking your robots.txt file is important if you want search engines to index your web pages. However, indexing alone is not enough. You must also make sure that search engines find what they're looking for when they index your pages.

You can make sure that Google indexes your web pages for the right keywords by optimizing your website. If search engine spiders index unoptimized pages, chances are that you won't get high rankings.

2. Search engine news of the week

Google confirms that shared IP addresses aren't bad for high rankings

"Lots of sites are hosted on shared IPs.  If this had a negative effect on ranking, it would harm most of the sites on the web--and that's not good for small webmasters or for our users.  So, understandably, sharing an IP should not have an effect your ability to rank."



Google updates its help pages regarding paid links

"Buying or selling links that pass PageRank is in violation of Google’s webmaster guidelines and can negatively impact a site’s ranking in search results."

Editor's note: Google doesn't like paid links. Better try building organic links with ARELIS.



Creative Aussies bane of Google

"The tendency of Australians to type words into search engines much as they speak and think, with frequent use of colourful language and occasional misspellings, is giving advertisers a way to circumvent the rising cost of paid search engine advertising."

Editor's note: Further information on how to lower your AdWords costs can be found here.



Search engine newslets
  • Google Talk gadget in 20 new languages.
  • Is Google's $10 million Android contest actually slowing developers down?
  • Rule based iGoogle themes.
  • Australian election results live in Google Earth.
  • AOL UK debuts mobile web portal.
  • 5 ways Google might monetize natural search results.
  • Google News indexing captchas.
  • JotSpot to replace Google Pages soon?
  • Google replaces Video link with Products link.

3. Articles of the week

Is Google spinning out of control?

"Google is a company convinced of its own brilliance and its clear vision of the future. Being a hotbed of Mensa members will do that to you. As will stumbling early onto an obscenely lucrative business model. The same thing happened to a company called Microsoft."



Google's Chinese foray depends on local know-how

"Google Inc's inroads into China relies on it linking up with local partners, navigating draconian regulations and understanding Chinese tastes to make it in one of the world's fastest-growing and lucrative Internet markets."



Is Google recession-proof?

"Probably not, and that means web companies would do well to broaden their income base."

Back to table of contents - Visit Axandra.com

4. Recommended resources

"This software rocks!"

"I've used ARELIS to find some great links and since launching my site just a month ago, I've already achieved top 10 rankings on some very good keywords on Google, as well as a few number 1 placements. This software rocks!"
Mohammed Ali, www.lifejewel.com



Get your site mentioned in front of 140,000+ subscribers

We want to hear from you about your successes with IBP or ARELIS. Just write us 2-3 sentences and you might get featured in this newsletter along with your web site address.

 


5. Previous articles
