Webrecorder’s Specialized Crawling Services
Although we build open source crawling tools for everyone, Webrecorder also performs web crawling using our Browsertrix tool suite on behalf of our users or customers.
We also host public-facing archives in the public interest, for example https://govarchive.us
This crawling is done with a specialized user-agent, which consists of the standard browser user-agent with “Browsertrix/1.x” appended. For example:
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/144.0.0.0 Safari/537.36 Browsertrix/1.12
This crawling respects robots.txt exclusions and is performed either from designated IPs or with prior agreement from site owners to crawl their websites on their behalf.
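For site owners who want to recognize this traffic in their logs, the user-agent format described above can be checked with a short pattern match. The following is a minimal sketch, not an official Webrecorder tool: the function name browsertrix_version and the regular expression are illustrative assumptions, and the only detail taken from this page is that a “Browsertrix/1.x” token is appended to a standard browser user-agent.

```python
import re

# Assumption: the token has the form "Browsertrix/<major>.<minor>",
# as in the example user-agent on this page.
BROWSERTRIX_TOKEN = re.compile(r"\bBrowsertrix/(\d+)\.(\d+)\b")


def browsertrix_version(user_agent):
    """Return (major, minor) if the user-agent carries a Browsertrix token, else None.

    Hypothetical helper for illustration only.
    """
    match = BROWSERTRIX_TOKEN.search(user_agent)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# The example user-agent string from this page.
example = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/144.0.0.0 Safari/537.36 Browsertrix/1.12"
)

print(browsertrix_version(example))
```

Because any client can send an arbitrary user-agent string, a match like this is best treated as a hint and not as proof of origin.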