Traffic Articles Distribution

Facts About Search Engine Bots


In the days before Google (around 3 B.G., in fact), AltaVista was the newest search engine available to Internet users. To demonstrate the superior power of their minicomputers, the AltaVista team at Digital chose to crawl and index the entire Internet. At the time this was something new, and most webmasters did not want these 'robot' programs visiting the pages on their sites because of the increased load on their servers and the associated rise in bandwidth costs. Concerns like these are what the Robots Exclusion Standard addresses. First proposed in 1994 and written up as an Internet draft in 1996, it gives webmasters a way to tell robots which parts of a site to stay out of.
Using a simple text file called robots.txt you can instruct search engines to stay out of certain directories. Here is a very simple robots.txt which disallows all search engines (User-agents) access to the /images directory.
    User-agent: *
    Disallow: /images
By disallowing /images you are also implicitly disallowing all subdirectories under /images, such as /images/logos and any files beginning with /images such as /images.html.
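The prefix behaviour described above can be checked with Python's standard urllib.robotparser module. This is only a minimal sketch: the rules are fed to the parser as a string rather than fetched from a live site, and the bot name "AnyBot" and the example paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt from the article, held in a string.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical paths: two that start with /images and one that does not.
paths = ["/images/logos/acme.png", "/images.html", "/index.html"]

for path in paths:
    # "AnyBot" is an invented crawler name; the * group applies to it.
    verdict = "allowed" if parser.can_fetch("AnyBot", path) else "blocked"
    print(f"{path}: {verdict}")
```

Both paths that begin with /images should be reported as blocked, including /images.html, because the rule is a plain prefix match and not a directory match.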
The first draft of the standard did not include an "Allow" directive. It was added later, but there is no guarantee that it is supported by all search engines. Anything that was not specifically disallowed was considered fair game to web crawlers.
To disallow access to your entire web site use a robots.txt like this:
    User-agent: *
    Disallow: /
If User-agent is * then the following lines apply to all search engine robots. By specifying the signature of a web crawler as the User-agent you can give specific instructions to that robot.
    User-agent: Googlebot
    Disallow: /google-secrets
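To see how a rule aimed at one crawler leaves the others untouched, the same rules can be run through Python's standard urllib.robotparser. This is a sketch under assumptions: "SomeOtherBot" and the page name plan.html are invented for the example, and real crawlers may match User-agent names slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Rules that apply only to the crawler identifying itself as Googlebot.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /google-secrets
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical page inside the disallowed directory.
page = "/google-secrets/plan.html"

# Googlebot has a matching group, so the Disallow line applies to it.
googlebot_ok = parser.can_fetch("Googlebot", page)

# No group names this invented bot and there is no * group,
# so nothing is disallowed for it.
otherbot_ok = parser.can_fetch("SomeOtherBot", page)

print("Googlebot may fetch:", googlebot_ok)
print("SomeOtherBot may fetch:", otherbot_ok)
```

If the intent is to keep every crawler out of the directory, the rule belongs under User-agent: * instead.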
Since the initial specification was issued, some search engines have expanded the protocol. An example of this is to permit the use of wildcards.
    User-agent: Slurp
    Disallow: /*.gif$
This prevents Yahoo! (whose web crawler is called Slurp) from indexing any files on your site that end with ".gif". Keep in mind that wildcard matches are not supported by all search engines so you have to preface these lines with the appropriate User-agent line.
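Python's standard urllib.robotparser does not interpret wildcards, so the sketch below shows one way a crawler might evaluate such a rule by translating it into a regular expression. It assumes the common reading of the extension, where * stands for any run of characters and a trailing $ pins the rule to the end of the path. The helper names rule_to_regex and rule_matches are hypothetical and not part of any library.

```python
import re

def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule into a compiled regex (assumed semantics)."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape the literal pieces and join them with ".*" where a * appeared.
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    if anchored:
        regex += r"\Z"  # the path must end exactly here
    return re.compile(regex)

def rule_matches(rule: str, path: str) -> bool:
    """Return True if the rule covers the path; matching starts at the path's first character."""
    return rule_to_regex(rule).match(path) is not None

# Hypothetical paths tested against the wildcard rule from the article.
for path in ("/img/logo.gif", "/img/logo.gif.html", "/img/logo.png"):
    print(path, rule_matches("/*.gif$", path))
```

Without the trailing $, the same function falls back to ordinary prefix matching, so a rule such as /images still covers /images.html.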
You can combine several of the above techniques in one robots.txt file. Here's a theoretical example.
    User-agent: *
    Disallow: /bar

    User-agent: Googlebot
    Allow: /foo
    Disallow: /bar
    Disallow: /*.gif$
    Disallow: /
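A file like this is easier to reason about when it is tested. The sketch below feeds the combined example to Python's standard urllib.robotparser and asks what two crawlers may fetch. Some assumptions apply: "SomeOtherBot" and the example paths are invented, the helper name verdicts is hypothetical, and this parser treats the wildcard line as literal text. That last point does not change the outcome here because the final Disallow: / already covers everything not explicitly allowed.

```python
from urllib.robotparser import RobotFileParser

# The combined example: one group for everyone, one for Googlebot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /bar

User-agent: Googlebot
Allow: /foo
Disallow: /bar
Disallow: /*.gif$
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical paths chosen to exercise each rule.
PATHS = ["/foo/index.html", "/bar/index.html", "/pic.gif", "/about.html"]

def verdicts(agent: str) -> dict:
    """Map each example path to True (may fetch) or False (blocked) for one crawler."""
    return {path: parser.can_fetch(agent, path) for path in PATHS}

googlebot = verdicts("Googlebot")
otherbot = verdicts("SomeOtherBot")

for path in PATHS:
    print(f"{path:18} Googlebot={googlebot[path]!s:5} SomeOtherBot={otherbot[path]}")
```

Googlebot uses its own group and ignores the * group, so it ends up with access to /foo only, while every other crawler is kept out of /bar alone.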
Computer programs are great at following well-defined instructions. The human brain, however, is less reliable at this, so the best advice is to keep things simple.
Google's webmaster tools include a robots.txt analysis tool that is highly recommended. For more information on the Robots Exclusion Standard, point your browser to www.robotstxt.org.
Today when companies are spending a lot of money to be included in search engine listings, the idea of excluding your content may seem quaint. But from a security perspective there are many valid reasons for limiting what a search engine indexes on your site.

Nick Dalton



Copyright © 2007 ByTheAticles.com
