robots.txt and the cgi-bin


Cat
11-06-2004, 05:57 PM
I'm trying to block bots from my cgi-bin but they seem to be ignoring my robots.txt and happily listing these pages in the search results.

Since both Google and Yahoo, which normally read robots.txt and follow it, are ignoring it, I'm guessing there must be another problem.

This is what I'm using:

User-agent: *
Disallow: /cgi-bin/

Could it be that since the cgi-bin is outside the www folder there should be a different path? Would that cover all the folders inside the cgi-bin, or do I have to specify each folder separately?
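
One way to sanity-check a rule like the one above is to run it through a parser. This is only a minimal sketch, assuming Python's standard urllib.robotparser module; the bot name and the script paths are hypothetical, made up for illustration. It relies on two things about robots.txt: rules are matched against the URL path a bot requests, not the folder's location on the server's disk, and a Disallow line is a prefix match, so it also covers everything beneath that path.

```python
from urllib.robotparser import RobotFileParser

# The same two rules quoted in the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def is_allowed(url_path, user_agent="ExampleBot"):
    """Return True if a compliant bot may fetch url_path under ROBOTS_TXT.

    "ExampleBot" is a hypothetical user-agent name; the wildcard rule
    applies to any bot that honors robots.txt.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url_path)


if __name__ == "__main__":
    # Hypothetical paths: a script directly in cgi-bin, one in a
    # subfolder, and an ordinary page outside it.
    for path in ("/cgi-bin/search.pl", "/cgi-bin/forum/post.pl", "/index.html"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

If the rules parse as expected here, the syntax itself is probably not the issue. Also note that this only models bots that choose to honor the file.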

Terra
11-07-2004, 02:46 AM
robots.txt provides for advisory control only... Your well behaved bots will read and honor it...

Other bots will ignore it...

Your more nefarious bots will read it and use it as a guide to crawl potentially juicy parts of your site... The whole forbidden fruit tastes the best kinda thing... ;)

--
Terra
--if everyone will look to your right, you will see this city's seedy red light district - just pretend it isn't there--
FutureQuest

Cat
11-07-2004, 05:01 AM
hmm... maybe Yahoo and Google wanted a light fruity snack lol.

Usually they're pretty well behaved bots, that's why I wondered if I wrote it wrong somehow. Seems odd that they both ignored it.