[SEO Audits] Integrate robots.txt analysis #4356
Comments
Do we want to let the user know if … Should we fail in such a case? How about a robots.txt that is only blocking e.g. …
For consistency with #3182, let's try to avoid distinguishing between crawlers. If the common case is … The alternative is to fail the audit when seeing anything resembling noindex, which seems too strict. I'd also love to see the contents echoed back in the extra info table or similar. Just showing it to users is a sort of manual validation, even if the audit passes. As a secondary benefit, this would be great for data mining later.
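One possible shape for that extra-info table (purely illustrative; the function name and row fields below are assumptions, not Lighthouse's actual details format) is one row per robots.txt line, so the file's contents are visible even when the audit passes:

```js
// Hypothetical helper for the "echo the contents back" idea above:
// turns the fetched robots.txt into table rows for the audit details.
function robotsTxtToTableItems(content) {
  return content.split(/\r?\n/).map((line, index) => ({
    lineNumber: index + 1,
    content: line,
  }));
}
```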
For the record, here is the full set of rules I've put together from various sources and implemented in the robots.txt validator:

Rules
…

Test
I ran my validator against the top 1000 domains and got the following errors for 39 of them: https://gist.github.com/kdzwinel/b791967eb66d0e2925ea22c8ca14233a

Resources
Various docs: …
and online validators: …
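As a minimal sketch of what such a line-based validator could look like (an illustration only, not the actual implementation; the set of known directives is abridged and is an assumption):

```js
// Hypothetical line-based robots.txt validator, sketched from the rules
// discussed above; the directive list and error messages are assumptions.
const KNOWN_DIRECTIVES = new Set([
  'user-agent', 'disallow', 'allow', 'sitemap',
  'crawl-delay', 'host', 'noindex',
]);

function validateRobotsTxt(content) {
  const errors = [];
  content.split(/\r?\n/).forEach((line, index) => {
    // Strip comments; blank lines are always fine.
    const cleaned = line.replace(/#.*$/, '').trim();
    if (!cleaned) return;

    // Every remaining line should look like "directive: value".
    const colonIndex = cleaned.indexOf(':');
    if (colonIndex === -1) {
      errors.push({line: index + 1, message: 'Syntax not understood'});
      return;
    }

    const directive = cleaned.slice(0, colonIndex).trim().toLowerCase();
    if (!KNOWN_DIRECTIVES.has(directive)) {
      errors.push({line: index + 1, message: `Unknown directive "${directive}"`});
    }
  });
  return errors;
}
```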
Using a JS-based robots.txt parser (like this one), validate the file itself and apply existing SEO audits whenever applicable.
This integration has two parts:
robots.txt is valid (new audit)
Audit group: Crawling and indexing
Description: robots.txt is valid
Failure description: robots.txt is not valid
Help text: If your robots.txt file is malformed, crawlers may not be able to understand how you want your website to be crawled or indexed. Learn more.
Success conditions:
all, noindex
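As a rough sketch of how this audit could consume a validator like the one above (the function name, result shape, and pass/fail rule are assumptions, not the shipped implementation):

```js
// Hypothetical wrapper tying the validator sketched earlier to a
// pass/fail audit result; the shapes here are illustrative only.
function auditRobotsTxtIsValid(robotsTxtContent) {
  // Whether a missing robots.txt should fail is discussed above;
  // here we only judge the syntax of a file that was actually fetched.
  if (robotsTxtContent == null) {
    return {score: 1, notApplicable: true};
  }

  const errors = validateRobotsTxt(robotsTxtContent);
  return {
    score: errors.length === 0 ? 1 : 0,
    displayValue: errors.length ? `${errors.length} error(s) found` : '',
    // Echo the offending lines back so users can see what tripped the audit.
    details: errors,
  };
}
```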
Page is not blocked from indexing
Add the following success condition:
Note that directives may be applied to the site as a whole or a specific page. Only fail if the current page is blocked from indexing (directly or indirectly).
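A simplified sketch of that check (an assumption for illustration, not the actual audit code; real matching would also need wildcards and longest-match precedence between Allow and Disallow):

```js
// Hypothetical check for the second part: fail only when a group that
// applies to all crawlers blocks the audited page's path. Grouping and
// precedence rules are simplified for illustration.
function isPageBlockedByRobotsTxt(robotsTxt, pageUrl) {
  const path = new URL(pageUrl).pathname;
  let appliesToAllAgents = false;
  let blocked = false;

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, '').trim();
    if (!line) continue;

    const colonIndex = line.indexOf(':');
    if (colonIndex === -1) continue; // syntax errors are the first audit's job
    const directive = line.slice(0, colonIndex).trim().toLowerCase();
    const value = line.slice(colonIndex + 1).trim();

    switch (directive) {
      case 'user-agent':
        // Per the discussion above, don't distinguish between crawlers:
        // only groups addressed to every agent ("*") are considered.
        appliesToAllAgents = value === '*';
        break;
      case 'disallow':
        // An empty Disallow allows everything; a prefix match blocks the page.
        if (appliesToAllAgents && value && path.startsWith(value)) blocked = true;
        break;
      case 'allow':
        // A matching Allow re-opens the path (simplified, order-based).
        if (appliesToAllAgents && value && path.startsWith(value)) blocked = false;
        break;
    }
  }
  return blocked;
}
```

For example, `isPageBlockedByRobotsTxt('User-agent: *\nDisallow: /private/', 'https://example.com/private/page')` would return true, while the same file would not fail the audit for a page outside `/private/`.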