
Troubleshooting crawl errors

Overview

Cloudflare allows search engine crawlers and bots. If you observe crawl issues or Cloudflare challenges presented to the search engine crawler or bot, contact Cloudflare support with the information you gather when troubleshooting the crawl errors via the methods outlined in this guide.


Disable Anti-bot modules

Search engine crawlers’ requests, when proxied through Cloudflare, can be blocked by anti-bot modules installed on your origin server. Try disabling any anti-bot modules to prevent your origin from blocking these requests.


Adjust Google and Bing crawl rates

Google and Bing assign special crawl rates to websites that use CDN services in order to optimize CDN performance. Special crawl rates do not negatively affect Search Engine Optimization (SEO) or Search Engine Results Pages (SERPs). To change your crawl rates for Google and Bing, follow each search engine's own guide.


Prevent crawl errors

Review the following recommendations to prevent crawler errors:

Confirm an IP address belongs to Google by consulting Google’s documentation on verifying googlebot IP addresses.
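
The check can also be scripted. The sketch below assumes the reverse-then-forward DNS procedure that Google describes for verifying its crawlers; the function name `is_google_crawler` and the suffix list are illustrative and should be checked against Google's current documentation. The forward lookup shown here only handles IPv4 addresses.

```python
import socket

# Reverse DNS suffixes assumed to identify Google crawlers; confirm against
# Google's documentation before relying on this list.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def is_google_crawler(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Return True if ip passes a reverse-then-forward DNS check.

    The resolver functions are parameters so the logic can be exercised
    without network access.
    """
    try:
        hostname = reverse(ip)[0]          # step 1: reverse DNS lookup
    except OSError:
        return False
    if not hostname.rstrip(".").lower().endswith(GOOGLE_SUFFIXES):
        return False                       # step 2: name must be a Google domain
    try:
        addresses = forward(hostname)[2]   # step 3: forward lookup (IPv4 only)
    except OSError:
        return False
    return ip in addresses                 # step 4: must map back to the same IP
```

Call `is_google_crawler()` with an address taken from your origin access logs. A result of False means the request did not come from a verifiable Google crawler, or that DNS resolution failed.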

  • Do not block the United States via custom rules or IP Access rules within the Security app.
  • Do not block search engine crawler User-Agents in your .htaccess, server configuration, robots.txt, or web application.

Google uses a variety of User-Agents to crawl your website. You can test your robots.txt via Google.

  • Do not allow crawling of files in the /cdn-cgi/ directory. This path is used internally by Cloudflare and Google encounters errors when crawling it. Disallow crawls of cdn-cgi via robots.txt:

Disallow: /cdn-cgi/
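
A Disallow line only takes effect inside a User-agent group. As a quick local check, the sketch below parses a sample robots.txt with Python's standard `urllib.robotparser` and confirms that a crawler may not fetch paths under /cdn-cgi/. The `User-agent: *` line, the helper name `crawl_allowed`, and the example URLs are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt; the "User-agent: *" group is an assumption, adjust it
# to match the groups already present in your own file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cdn-cgi/
"""


def crawl_allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs on a placeholder domain.
blocked = crawl_allowed(ROBOTS_TXT, "Googlebot",
                        "https://www.example.com/cdn-cgi/example")
allowed = crawl_allowed(ROBOTS_TXT, "Googlebot",
                        "https://www.example.com/index.html")
print("cdn-cgi allowed:", blocked, "| index allowed:", allowed)
```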


Troubleshoot crawl errors

Troubleshooting steps for the most commonly reported crawl errors are mentioned below.

HTTP 4XX Errors

HTTP 4XX errors are the most common type of crawl error. Cloudflare passes these errors from your web server through to Google. They have various causes, such as a missing page on your web server or a malformed link in your HTML. The solution depends on the problem encountered.

HTTP 5XX Errors

HTTP 5XX errors indicate that either Cloudflare or your origin web server experienced an internal error. To correlate occurrences of crawl errors with site outages, monitor your origin web server’s health. Monitoring your website health both through Cloudflare and directly to your origin web server IPs determines whether errors occurred due to Cloudflare or your origin web server.
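
The comparison described above can be reduced to a small decision rule. The sketch below assumes you already collect two HTTP status codes at roughly the same time, one from a request through Cloudflare and one from a request sent directly to the origin IP. The function name `attribute_5xx` and its return labels are hypothetical, and the result is a first indication rather than a diagnosis.

```python
def attribute_5xx(status_via_cloudflare: int, status_direct_origin: int) -> str:
    """Suggest where a 5XX error arises, given two probe results.

    Both arguments are HTTP status codes gathered by your own monitoring.
    """
    def is_5xx(status: int) -> bool:
        return 500 <= status <= 599

    if is_5xx(status_direct_origin):
        # The origin fails even when Cloudflare is bypassed.
        return "origin"
    if is_5xx(status_via_cloudflare):
        # The origin answers directly, so the error arises on the proxied path.
        return "cloudflare"
    return "none"


# Example: the proxied request fails while the direct request succeeds.
print(attribute_5xx(502, 200))
```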

DNS Errors

Troubleshooting steps vary depending on whether your domain is on Cloudflare via a Full or CNAME setup. To verify which setup your domain uses, open a terminal and execute the following command (replace www.example.com with your Cloudflare domain):

dig +short SOA www.example.com

For domains on a CNAME setup, the response contains cdn.cloudflare.net. For example:

example.com.cdn.cloudflare.net.

For domains on a Full setup, the response lists nameservers in the cloudflare.com domain. For example:

josh.ns.cloudflare.com. dns.cloudflare.com. 2013050901 10000 2400 604800 3600

Once you’ve confirmed how your domain was set up with Cloudflare, proceed with the troubleshooting steps appropriate to your domain setup.
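
If you check many domains, the rule above can be applied to the dig output automatically. This is a minimal sketch that relies only on the two response patterns described in this section; the helper names `run_dig` and `classify_setup` are hypothetical, and `run_dig` assumes the dig utility is installed on your machine.

```python
import subprocess


def run_dig(domain: str) -> str:
    """Run the same dig command shown above and return its output."""
    result = subprocess.run(
        ["dig", "+short", "SOA", domain],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def classify_setup(dig_output: str) -> str:
    """Map dig output to "CNAME", "Full", or "unknown"."""
    text = dig_output.strip().lower()
    if "cdn.cloudflare.net" in text:
        return "CNAME"
    if ".cloudflare.com" in text:
        return "Full"
    # Empty output or nameservers hosted elsewhere.
    return "unknown"


# Classify the two sample responses from this section.
print(classify_setup("example.com.cdn.cloudflare.net."))
print(classify_setup(
    "josh.ns.cloudflare.com. dns.cloudflare.com. 2013050901 10000 2400 604800 3600"
))
```

To check a live domain, pass the result of `run_dig("www.example.com")` to `classify_setup()`.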

CNAME

Contact your hosting provider to investigate DNS errors and provide the date Google encountered DNS errors. Additionally, review the Cloudflare System Status page for any network outages on the date the errors were encountered by Google.

Full

Contact Cloudflare support and provide the date and time that Google observed the errors.

Requesting troubleshooting assistance

If the above troubleshooting steps do not resolve your crawl errors, follow the steps below to export crawler errors as a .csv file from your Google Webmaster Tools Dashboard. Include this .csv file when contacting Cloudflare Support.

  1. Log in to your Google Webmaster Tools account and navigate to the Health section of the affected domain.
  2. Click Crawl Errors in the left-hand navigation.
  3. Click Download to export the list of errors as a .csv file.
  4. Provide the downloaded .csv file to Cloudflare support.

For more information, see Google’s documentation on crawl errors and troubleshooting.