Web crawler

10 bookmarks
cobalt.tools
save what you love. no ads, trackers, or other creepy bullshit.
·cobalt.tools·
HTTP Archive
The HTTP Archive tracks how the web is built by periodically crawling the top sites on the web and recording detailed information about fetched resources, used web platform APIs and features, and execution traces of each page.
·httparchive.org·
Common Crawl
·commoncrawl.org·
StormCrawler
StormCrawler is a collection of resources for building low-latency, scalable web crawlers on Apache Storm.
·stormcrawler.net·