check_site

synopsis: Site validation spider

A site validator which uses webtoolbox.clients.Spider to crawl an entire site, checking for bad links and 404s and, optionally, validating HTML. It generates either text or HTML reports and can also produce lists of site URLs for use with load-testing tools like http_bench or wk_bench.
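To illustrate the kind of work a site spider does on each page, here is a minimal sketch that extracts links from HTML and sorts them into pages, media, and resources, the same categories the --skip-media and --skip-resources options refer to. It uses only the Python standard library and is not the webtoolbox.clients.Spider implementation; the names LINK_TAGS, LinkCollector, and collect_links are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical mapping: tag -> (attribute holding the URL, category).
# The categories mirror the --skip-media and --skip-resources options;
# the real spider may classify links differently.
LINK_TAGS = {
    "a": ("href", "page"),
    "img": ("src", "media"),
    "object": ("data", "media"),
    "script": ("src", "resource"),
    "link": ("href", "resource"),
}


class LinkCollector(HTMLParser):
    """Collect absolute URLs from a page, grouped by category."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = {"page": set(), "media": set(), "resource": set()}

    def handle_starttag(self, tag, attrs):
        if tag not in LINK_TAGS:
            return
        attr_name, category = LINK_TAGS[tag]
        url = dict(attrs).get(attr_name)
        if url:
            # Resolve relative URLs against the page's own URL.
            self.links[category].add(urljoin(self.base_url, url))


def collect_links(base_url, html):
    """Return a dict of category -> set of absolute URLs found in html."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

A crawler built on this idea would fetch each URL in the "page" set, record any failures such as 404 responses, and repeat until no unvisited pages remain.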

--help

Display all available options and full help

-v
--verbosity

Increase the amount of information displayed or logged

--validate-html

Process all HTML using HTML Tidy and report any validation errors

--format=REPORT_FORMAT

Generate the report as HTML or text

--report=REPORT_FILE

Save report to a file instead of stdout

--skip-media

Skip media files: <img>, <object>, etc.

--skip-resources

Skip resources: <script>, <link>

Skip links whose URL matches the specified regular expression

--save-page-list=PAGE_LIST

Save a list of URLs for HTML pages in the specified file for use with a tool like http_bench or wk_bench

--save-resource-list=RESOURCE_LIST

Save a list of URLs for page resources in the specified file

--log=LOG_FILE

Write log output to the specified file instead of stderr

--simultaneous-connections=2

Adjust the number of simultaneous connections which will be opened to the server
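The options above can be combined in a single run. The sketch below assembles a check_site command line from Python, using only flags listed in this document. It rests on several assumptions: that check_site is on the PATH, that the site URL is passed as a trailing positional argument, and that --format accepts a value such as "html"; none of these is confirmed here. The helper build_check_site_command is hypothetical.

```python
import shlex


def build_check_site_command(site_url, report_file=None, report_format=None,
                             validate_html=False, skip_media=False,
                             skip_resources=False, connections=None):
    """Build an argument list for check_site (hypothetical helper).

    Assumes the site URL is a trailing positional argument; only flags
    documented above are emitted.
    """
    cmd = ["check_site"]
    if validate_html:
        cmd.append("--validate-html")
    if skip_media:
        cmd.append("--skip-media")
    if skip_resources:
        cmd.append("--skip-resources")
    if report_format is not None:
        # The accepted values are an assumption; pass through as given.
        cmd.append(f"--format={report_format}")
    if report_file is not None:
        cmd.append(f"--report={report_file}")
    if connections is not None:
        cmd.append(f"--simultaneous-connections={connections}")
    cmd.append(site_url)
    return cmd


if __name__ == "__main__":
    command = build_check_site_command(
        "http://example.org/",
        report_file="report.html",
        report_format="html",  # assumed value for --format
        validate_html=True,
        connections=4,
    )
    # Print a shell-ready command line; pass `command` to
    # subprocess.run() to actually execute it.
    print(shlex.join(command))
```

Returning the command as a list rather than a string keeps it safe to hand to subprocess.run without shell quoting concerns.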