How Can I Improve My Robots.txt?

When you’ve checked your robots.txt and discovered it isn’t doing what you need it to do, whether that means letting crawlers into the areas you want indexed or keeping them out of the ones you don’t, now is the time to improve it. You may believe you’ve already followed robots.txt best practices, such as naming the file exactly robots.txt, adding rules, uploading the file to your root directory, and testing it, but there is more you can do.

Robots.txt Example

What Is The Best Robots.txt Example?

To understand where you might have gone wrong, take a look at this quick example:

User-agent: *
Disallow:

This allows all bots access to your entire site. If you had added a slash after Disallow: (that is, Disallow: /), it would block them from accessing the whole site instead. So, if you want to block only part of your site rather than all of it, you must add a specific path after the slash. For example, you might have something like this:

User-agent: *
Disallow: /specific URL/folder name/image you want to disallow access to

Specifying Googlebot in the User-agent line would then narrow the rule further: now only Googlebot is told to ignore that URL, folder, or image.
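You can sanity-check rules like these before deploying them. As a minimal sketch, Python’s standard urllib.robotparser can parse a robots.txt and report what a given bot may fetch; the domain and folder names below are placeholders, not real rules from your site:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one folder for Googlebot, allow everything else
rules = """
User-agent: Googlebot
Disallow: /private-folder/

User-agent: *
Disallow:
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Googlebot is blocked from the folder; other bots are not
print(parser.can_fetch("Googlebot", "https://example.com/private-folder/img.png"))  # False
print(parser.can_fetch("OtherBot", "https://example.com/private-folder/img.png"))   # True
```

Note that urllib.robotparser follows the original robots.txt rules (prefix matching), so it is a quick check of basic Disallow logic rather than a full emulation of Google’s crawler.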


Robots.txt Sitemap

Why Add Your XML Sitemap To Robots.txt?

Adding a Sitemap directive to your robots.txt points Google and other crawlers straight to your XML sitemap, helping them discover your pages sooner and crawl them more accurately. It streamlines the crawling process and helps any new content you produce get picked up more quickly. The more content you upload, the more frequently Googlebot will tend to crawl your website, and a sitemap can help with this.
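The Sitemap directive sits on its own line and takes the full sitemap URL; a minimal example, with example.com standing in for your own domain, looks like this:

```
User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml
```

The Sitemap line is independent of any User-agent group, so it can appear anywhere in the file, though placing it at the top or bottom keeps the file easy to read.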

Uncover Issues

You need to uncover the exact issues you are facing using a reliable checker like ours. Some of the most common problems include poor use of wildcards, a missing sitemap URL, and accidentally leaving development sites open to crawlers. You might find your issues lie in blocked scripts and stylesheets, or in a noindex directive in robots.txt, which Google no longer supports. It could even be that your robots.txt file isn’t in the root directory or that the file is incorrectly formatted. Identify the issue, either by having an expert review it or by using a checker tool, and then you can work on fixing it.

Using Wildcards In Your Robots.txt

Using wildcards is a great technique for writing your robots.txt. However, you may find that you aren’t using them as efficiently or as accurately as you could. Rather than listing each URL with parameters on a separate line, use a wildcard that blocks every matching URL in the subfolder you’ve specified.
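For instance, instead of listing every parameterised URL individually, a single wildcard rule can cover them all. In this sketch, /search/ is a placeholder folder: * matches any sequence of characters and ? marks the start of a query string:

```
User-agent: *
# Block every URL under /search/ that carries URL parameters
Disallow: /search/*?
```

Googlebot and most major crawlers support * and $ in rules, but some smaller bots follow only the original prefix-matching standard, so test wildcard rules before relying on them.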

Use New Lines

Any time you want to enter a new directive, you must start it on a new line. If you do not, Google and other search engines will be unable to read the instructions reliably, which could be why your current robots.txt file isn’t doing what you need it to. It should not read as a single line such as user-agent: * Disallow: /directory.
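Correctly formatted, with each directive on its own line, the same rule looks like this (the /directory/ path is just a placeholder):

```
User-agent: *
Disallow: /directory/
```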

Only Use Each User Agent Once

Unlike the other mistakes covered here, using the same user-agent more than once won’t necessarily cause issues, but it does make the file look more complex. And the more complex the file, the higher the chance of bigger, more damaging mistakes creeping in. Therefore, you should aim to reference each user-agent only once, grouping all of its rules together – it keeps things simple and tidy.
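In practice, that means gathering every rule for a bot under a single User-agent line rather than repeating it. A tidy example, with placeholder folder names, looks like this:

```
User-agent: Googlebot
Disallow: /archive/
Disallow: /drafts/

User-agent: *
Disallow:
```

Each group runs from its User-agent line to the next blank line, so keeping one group per bot makes it immediately clear which rules apply to which crawler.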