[embed]https://twitter.com/bigdata/statuses/295222087457583105[/embed]
We describe and evaluate methods for learning to forecast forthcoming events of interest from a corpus containing 22 years of news stories. We consider the examples of identifying significant increases in the likelihood of disease outbreaks, deaths, and riots in advance of the occurrence of these events in the world. We provide details of methods and studies, including the automated extraction and generalization of sequences of events from news corpora and multiple web resources. We evaluate the predictive power of the approach on real-world events withheld from the system.
Mining the Web to Predict Future Events (PDF heads-up)
Reads cool, but for two things. First, they’re only mining scraped stories from The New York Times. Not quite The Web in my book. Okay, slight credit for exploiting Linked Data, but still.
Second, scraping The New York Times?! C’mon Redmond. You’re better than that. Ha, ha! Only serious!