Earlier this week, I discovered that the XML sitemaps generated by several WordPress installations were being served with 404 status headers, making them effectively useless to Google.
After some digging, I found that the culprit was the way I had set up W3 Total Cache on those sites. I had it configured as follows:
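If you want to check your own sitemaps for the same problem, the sketch below is one way to see which status code a URL actually returns. It uses only Python's standard library; the function name `sitemap_status` and the example URL are my own placeholders, not anything taken from the plugins discussed here.

```python
import urllib.error
import urllib.request


def sitemap_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises an exception for 4xx/5xx responses; the status
        # code carried by the exception is exactly what we want to see.
        return err.code


# Example call (hypothetical URL, substitute your own sitemap address):
# print(sitemap_status("https://example.com/sitemap.xml"))
```

A sitemap that displays fine in the browser can still report 404 here, which is why it is worth checking the header rather than only the page content.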
- Under “Page Cache”, I had the “Cache 404 (not found) pages” option unchecked. You should never enable this setting unless you’re doing so temporarily because of a major performance hit. Turning it on causes your site to send 200 (OK) status headers with your 404 pages, so bots (including Google and Bing) never realize that those pages don’t actually exist.
- Under “Browser Cache”, I had “Do not process 404 errors for static objects with WordPress” enabled
- My “404 error exception list” was set at the default
I was also using the “BWP Google XML Sitemaps” plugin to generate my XML sitemaps. Note: if you are using Yoast’s plugin to generate your sitemaps, you shouldn’t encounter this issue, as W3 Total Cache is configured by default to support it. The BWP Google XML Sitemaps plugin, however, uses a different naming convention than Yoast’s plugin does, so W3 Total Cache didn’t recognize its sitemaps as files that should be excluded from the static 404 handling and passed through to WordPress with a proper header.
To fix the issue, I simply added “(.*)\.xml” to the “404 error exception list” under “Browser Cache”, and the sitemaps started working properly.
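To get a feel for what that pattern covers, here is a rough sketch in Python. It assumes the exception list entry is treated as an ordinary regular expression matched against the request path; I haven't verified exactly how W3 Total Cache applies it internally. The paths and the `is_excepted` helper are hypothetical, made up purely for illustration.

```python
import re

# The pattern added to the "404 error exception list" above.
EXCEPTION_PATTERN = re.compile(r"(.*)\.xml")

# Hypothetical request paths, for illustration only.
paths = [
    "/sitemapindex.xml",
    "/post.xml",
    "/wp-content/uploads/logo.png",
    "/style.css",
]


def is_excepted(path: str) -> bool:
    """True if the exception pattern matches anywhere in the path."""
    return EXCEPTION_PATTERN.search(path) is not None


for path in paths:
    label = "excepted" if is_excepted(path) else "not excepted"
    print(f"{path}: {label}")
```

Note that the pattern is deliberately broad: under this assumption it matches any path containing “.xml”, not just sitemaps. That was fine for my sites, but if you serve other XML files as static objects, a narrower pattern may suit you better.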