red_spider

synopsis:	Site validation spider based on `redbot`

Mark Nottingham released redbot, a modern replacement for the classic cacheability tester. I’ve been using it at work to audit website performance before releases, since proper HTTP caching makes an enormous difference in perceived site performance.

redbot is a focused tool: it provides a great deal of detail about a single page and, optionally, its resources. I wanted to widen the scope to an entire site and add content validation, so I created red_spider.py. It spiders an entire site, runs the redbot checks on every page it finds, produces a nice HTML report and, optionally, validates page contents.

--help: Display all available options and full help

--format=REPORT_FORMAT: Generate the report as HTML or text

--report=REPORT_FILE: Save report to a file instead of stdout

--validate-html: Validate HTML using tidylib

--skip-media: Skip media files: <img>, <object>, etc.

--skip-resources: Skip resources: <script>, <link>

--skip-link-re=SKIP_LINK_RE: Skip links whose URL matches the specified regular expression

--save-page-list=PAGE_LIST: Save a list of URLs for HTML pages in the specified file

--save-resource-list=RESOURCE_LIST: Save a list of URLs for page resources in the specified file

--log=LOG_FILE: Write log output to the specified file instead of stderr

-v, --verbosity: Increase the amount of information displayed or logged
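
To make the effect of the skip options more concrete, here is a minimal sketch of how a spider might sort the links on one page into pages, resources and media, and then apply the equivalents of --skip-media, --skip-resources and --skip-link-re. It is an illustration only, not red_spider.py's actual code: the `LinkCollector` class, the `collect_links` helper, the tag tables and the choice of `re.search` (rather than a full match) are all assumptions made for this example.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Tag -> attribute holding the URL. This grouping is an assumption based on
# the option descriptions, not a copy of red_spider.py's own tables.
PAGE_TAGS = {"a": "href"}
RESOURCE_TAGS = {"script": "src", "link": "href"}
MEDIA_TAGS = {"img": "src", "object": "data"}


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect the URLs on one page, grouped by kind."""

    def __init__(self, base_url, skip_media=False, skip_resources=False,
                 skip_link_re=None):
        super().__init__()
        self.base_url = base_url
        self.skip_media = skip_media
        self.skip_resources = skip_resources
        # Assumption: the pattern may match anywhere in the URL (re.search).
        self.skip_link_re = re.compile(skip_link_re) if skip_link_re else None
        self.links = {"pages": [], "resources": [], "media": []}

    def handle_starttag(self, tag, attrs):
        # Decide which category the tag belongs to, honouring the skip flags.
        if tag in PAGE_TAGS:
            category, attr_name = "pages", PAGE_TAGS[tag]
        elif tag in RESOURCE_TAGS and not self.skip_resources:
            category, attr_name = "resources", RESOURCE_TAGS[tag]
        elif tag in MEDIA_TAGS and not self.skip_media:
            category, attr_name = "media", MEDIA_TAGS[tag]
        else:
            return

        url = dict(attrs).get(attr_name)
        if not url:
            return

        # Resolve relative links against the page URL before filtering.
        url = urljoin(self.base_url, url)
        if self.skip_link_re and self.skip_link_re.search(url):
            return
        self.links[category].append(url)


def collect_links(html, base_url, **options):
    """Parse one HTML document and return its links grouped by category."""
    collector = LinkCollector(base_url, **options)
    collector.feed(html)
    collector.close()
    return collector.links
```

Calling `collect_links(page_html, page_url, skip_media=True, skip_link_re=r"/logout$")` would then return only the page and resource URLs that do not end in `/logout`. The lists under `"pages"` and `"resources"` correspond roughly to what --save-page-list and --save-resource-list write out.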