The Internet Archive discovers and captures web pages through many different web crawls.
At any given time several distinct crawls are running, some for months, and some every day or longer.
View the web archive through the Wayback Machine.
This crawl was run with a Heritrix setting of "maxHops=0" (URLs including their embeds).
Survey 7 is based on a seed list of 339,249,218 URLs, which is all the URLs in the Wayback Machine for which we saw a 200 response code in 2017, based on a query we ran on Feb. 1st, 2018.
The WARC files associated with this crawl are not currently available to the general public.