robots.txt not working.
krisleech
08-04-2003, 11:06 AM
Can anyone tell me why my robots.txt does not work? http://www.phizzie.com/robots.txt
In robots.txt is:
User-agent: *
Disallow: copyright.htm
Disallow: privacy.htm
Disallow: small-business-links.htm
But http://www.phizzie.com/small-business-links.htm has a PageRank, when the spiders should not be visiting it.
Any suggestions as to why? I'm sure the syntax in the robots.txt is correct.
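One thing that may be worth ruling out first: under the original robots.txt convention, each Disallow value is matched as a prefix of the URL path, and URL paths begin with "/". A value such as copyright.htm, with no leading slash, may therefore never match anything. The sketch below uses Python's standard urllib.robotparser to compare the rules as posted with slash-prefixed versions. The is_blocked helper and the WITH_SLASHES list are hypothetical, written only for this illustration, and real crawlers may interpret malformed rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

URL = "http://www.phizzie.com/small-business-links.htm"

# The rules exactly as posted above (no leading slash on the paths).
AS_POSTED = [
    "User-agent: *",
    "Disallow: copyright.htm",
    "Disallow: privacy.htm",
    "Disallow: small-business-links.htm",
]

# Assumed correction: the same rules with a leading slash on each path.
WITH_SLASHES = [
    "User-agent: *",
    "Disallow: /copyright.htm",
    "Disallow: /privacy.htm",
    "Disallow: /small-business-links.htm",
]


def is_blocked(robots_lines, url, agent="*"):
    """Return True if these robots.txt lines disallow `url` for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse in-memory lines, no network access
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    print("as posted, blocked:   ", is_blocked(AS_POSTED, URL))
    print("with slashes, blocked:", is_blocked(WITH_SLASHES, URL))
```

If the two results differ, the missing slash is a likely culprit, though a page blocked from crawling can still show up in an index if other sites link to it.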
MichaelC
08-04-2003, 07:16 PM
First off, many bots will ignore your robots.txt file (for example, every known address-harvesting bot). Others will read it and explicitly go where you told them not to go. :(
To make doubly sure the honorable bots aren't logging your pages to search engines, you might consider using the robots meta tag (http://www.robotstxt.org/wc/meta-user.html) on those pages. The robotstxt.org site has lots of useful information about bots.
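For reference, the tag in question goes in the page's head and looks like <meta name="robots" content="noindex, nofollow">. As a rough way to confirm a page actually carries it, here is a minimal sketch using Python's standard html.parser. The RobotsMetaFinder class, the robots_directives function, and the sample PAGE markup are all hypothetical names made up for this example, not anything from the site above.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser already lowercases tag and attribute names.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                part = part.strip().lower()
                if part:
                    self.directives.append(part)


def robots_directives(html):
    """Return the list of robots meta directives found in `html`."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


# Hypothetical sample page used only for illustration.
PAGE = """<html><head>
<title>Small business links</title>
<meta name="robots" content="noindex, nofollow">
</head><body></body></html>"""

if __name__ == "__main__":
    print(robots_directives(PAGE))
```

Note that a bot has to be allowed to fetch the page in order to see the tag, so the meta tag and a robots.txt block on the same page can work against each other.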
MC
krisleech
08-05-2003, 04:02 AM
Thanks Michael, will take a look and try that out.
I would have thought Google's robots would obey the robots.txt, as I am mainly concerned about Google. I know Google's bots have been on at least one of the pages in the robots.txt.
Cheers, Kris.
frankc
08-05-2003, 10:39 AM
Originally posted by krisleech:
....I know Google's bots have been on at least one of the pages in the robots.txt.
Well, I'd bet many bad 'bots spoof their name, so it may not have been Google.
krisleech
08-05-2003, 11:40 AM
I know the real Googlebot has been, as the pages actually have a PageRank of zero. If Googlebot had not been, the PageRank would be "Google has not ranked this page", or something along those lines.
Jarrod
08-05-2003, 12:22 PM
Originally posted by krisleech:
I know the real Googlebot has been, as the pages actually have a PageRank of zero. If Googlebot had not been, the PageRank would be "Google has not ranked this page", or something along those lines.
I'm not sure about this. Using Google Toolbar V2 I've just had a look at the PageRank of the CNC pages for my web site. They also show up with a PageRank of zero, yet Google (or any bot for that matter) has never been near the pages, as they are all protected by a .htaccess file.
krisleech
08-05-2003, 04:09 PM
Yeah, you're right Jarrod, I've just checked the CNC and it gives 0 out of 10.
I guess the only certain way to know if Googlebot is going to those pages is to search Google for a phrase or word that is on one of those pages and not on any other webpage. If the page comes up as a result, then Googlebot has been on it, since it must have visited the page to put it in Google's database.
Worth a try anyway,
Kris.