Mmmmmmm, fresh, hot data! With instructions to boot:
I am very happy to announce that Common Crawl has released 2012 crawl data as well as a number of significant enhancements to our example library and help pages.
…
Along with this release, we’ve published an Amazon Machine Image (AMI) to help both new and experienced users get up and running quickly. The AMI includes a copy of our Common Crawl User Library, our Common Crawl Example Library, and launch scripts to show users how to analyze the Common Crawl corpus using either a local Hadoop cluster or Amazon Elastic MapReduce.