I couldn’t extract a really good money quote, but I found Josh Wills’s post, Seismic Data Science: Reflection Seismology and Hadoop, well worth reading. First, Wills delves into the squishy term “data science” and usefully adds some definition. Then he looks at how the company he’s with, Cloudera, built some interesting infrastructure to adapt Hadoop to the seismic data processing domain.
One observation Wills made is really on point: the original core Hadoop infrastructure is starting to look like basic plumbing. Meanwhile, there’s a coming (ongoing?) explosion of domain-specific programming models, tools, and applications being built on top of Hadoop as a platform. Sort of like how Lisp macros enable the proliferation of domain-specific languages.
Except not quite as elegant. But I quibble.