How have I not heard of Pydoop until now?! [embed]https://twitter.com/pypi/status/274179982736121856[/embed]
Welcome to Pydoop. Pydoop is a package that provides a Python API for Hadoop MapReduce and HDFS. Pydoop has several advantages 1 over Hadoop’s built-in solutions for Python programming, i.e., Hadoop Streaming and Jython: being a CPython package, it allows you to access all standard library and third party modules, some of which may not be available for other Python implementations – e.g., NumPy; in addition, Pydoop provides a Python HDFS API which, to the best of our knowledge, is not available in other solutions.