Diggin’ in the feed cratez, ran across “From Words to Concepts and Back: Dictionaries for Linking Text, Entities and Ideas,” announced by Google’s Valentin Spitkovsky and Peter Norvig:
Human language is both rich and ambiguous. When we hear or read words, we resolve meanings to mental representations, for example recognizing and linking names to the intended persons, locations or organizations. Bridging words and meaning — from turning search queries into relevant results to suggesting targeted keywords for advertisers — is also Google’s core competency, and important for many other tasks in information retrieval and natural language processing. We are happy to release a resource, spanning 7,560,141 concepts and 175,100,788 unique text strings, that we hope will help everyone working in these areas.
The data is readily accessible, and an accompanying publication captures the details.