robot help: SpiderMonkey robots.txt Fetcher

Check Your robots.txt File

Every web site should have a robots.txt file in its root directory. The file prevents the numerous 404 errors logged when robots request a missing robots.txt, and it makes the site more "robot-friendly". To manage a robot's crawl of your site, place this simple file (robots.txt) at the top level of your domain (e.g., www.mobrien.com/robots.txt) and use it as an adjunct to properly written META tags.

# EXAMPLE robots.txt
User-agent: * # Enter a specific user-agent, or "*" to match all robots.
Disallow: /cgi-bin/
Disallow: /cgi-win/
Disallow: /tmp/
Disallow: /images/
Disallow: /includes/
Disallow: /public/~specific-user/
- - - - -
# EXAMPLE robots.txt to exclude a single robot
User-agent: Bad-Bot-From-Hades
Disallow: /
- - - - -
# EXAMPLE robots.txt to allow a single robot everything but forbid all other robots specific directories
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /mybirthdaysuitpictures/
Disallow: /reallyspecialimagesIdontwantpilfered/
- - - - -
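
To see how a robot might interpret rules like the ones above, here is a minimal sketch that uses Python's standard `urllib.robotparser` module. The rules are the third example from this page; the robot name `OtherBot` and the `www.example.com` URLs are assumptions made up for illustration, and real robots may interpret rules somewhat differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the third example: Googlebot may fetch anything,
# every other robot is kept out of two directories.
RULES = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /mybirthdaysuitpictures/
Disallow: /reallyspecialimagesIdontwantpilfered/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())  # parse from memory, no network access

# Hypothetical URLs on a placeholder domain.
picture = "https://www.example.com/mybirthdaysuitpictures/a.jpg"
home = "https://www.example.com/index.html"

print(parser.can_fetch("Googlebot", picture))  # True: empty Disallow allows all
print(parser.can_fetch("OtherBot", picture))   # False: caught by the "*" record
print(parser.can_fetch("OtherBot", home))      # True: path is not disallowed
```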

Try SpiderMonkey's robots.txt Generator.


So what does yours look like?

  1. This SpiderMonkey resource will fetch and examine your robots.txt file from your web site, if it has one.
  2. If not, it will let you know.
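
The two steps above can be approximated with a short Python sketch. This is not SpiderMonkey's actual implementation, only an illustration under stated assumptions: the helper names `robots_url` and `fetch_robots` are hypothetical, a missing file is assumed to show up as an HTTP 404 response, and plain `http://` is assumed when no scheme is given.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def robots_url(domain: str) -> str:
    """Build the robots.txt URL for a bare domain or a full URL (hypothetical helper)."""
    if "://" not in domain:
        domain = "http://" + domain  # assumption: default to plain HTTP
    parts = urlsplit(domain)
    # robots.txt always lives at the root, whatever path was entered.
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def fetch_robots(domain: str, opener=urllib.request.urlopen, timeout: float = 10.0):
    """Return the robots.txt text, or None if the site has no such file (HTTP 404).

    `opener` is injectable so the function can be exercised without a network.
    """
    url = robots_url(domain)
    try:
        with opener(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # step 2: the site has no robots.txt
        raise  # other HTTP errors are left for the caller to handle
```

Calling `fetch_robots("www.example.com")` would return the file's text when one exists and `None` when the server answers 404, which a caller could then report to the user.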

Please enter your domain name:

