Webrecorder Crawling Services

Webrecorder’s Specialized Crawling Services

Although we build open-source crawling tools for everyone, Webrecorder also performs web crawling with our Browsertrix tool suite on behalf of our users and customers.

We also host public-facing archives that serve the public interest; https://govarchive.us is one example.

This crawling is done with a specialized user-agent, which contains the standard browser user-agent with “Browsertrix/1.x” appended. For example:

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/144.0.0.0 Safari/537.36 Browsertrix/1.12
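
Site operators who want to recognize this traffic in their logs could match on the appended token. The following is a minimal Python sketch, not an official Webrecorder tool: the function name `browsertrix_version` is hypothetical, and it assumes the token always has the form "Browsertrix/<major>.<minor>" shown in the example above.

```python
import re
from typing import Optional, Tuple

# Assumption: the appended token looks like "Browsertrix/<major>.<minor>",
# as in the example user-agent above. Other formats would not match.
_BROWSERTRIX_TOKEN = re.compile(r"\bBrowsertrix/(\d+)\.(\d+)\b")


def browsertrix_version(user_agent: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) if the user-agent carries the token, else None."""
    match = _BROWSERTRIX_TOKEN.search(user_agent)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


example = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/144.0.0.0 Safari/537.36 Browsertrix/1.12"
)
print(browsertrix_version(example))
```

A user-agent header can be set by any client, so a match only suggests that a request comes from this crawler; it does not prove it.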

This crawling respects robots.txt exclusions and is performed either from designated IP addresses or with prior agreement from the site owners to crawl their websites on their behalf.
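
As an illustration of how a robots.txt exclusion is evaluated, the sketch below uses Python's standard `urllib.robotparser` module. It rests on assumptions not stated on this page: the robots.txt content and the example.com URLs are made up, the helper `is_allowed` is hypothetical, and using "Browsertrix" as the robots.txt user-agent token is a guess rather than something confirmed here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only. The "Browsertrix" group
# name is an assumption; this page does not say which token is honored.
ROBOTS_TXT = """\
User-agent: Browsertrix
Disallow: /private/
"""


def is_allowed(robots_txt: str, agent_token: str, url: str) -> bool:
    """Check whether robots_txt permits agent_token to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, url)


print(is_allowed(ROBOTS_TXT, "Browsertrix", "https://example.com/private/page"))
print(is_allowed(ROBOTS_TXT, "Browsertrix", "https://example.com/public/page"))
```

Under these assumptions, the first URL falls under the excluded path and the second does not.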

More about Browsertrix
