Hilary Mason has some advice:
I get quite a few e-mail messages from very smart people who are looking to get started in data science. Here’s what I usually tell them:
The best way to get started in data science is to DO data science!
First, data scientists do three fundamentally different things: math, code (and engineer systems), and communicate. Figure out which one of these you’re weakest at, and do a project that enhances your capabilities. Then figure out which one of these you’re best at, and pick a project which shows off your abilities.
Good input, although I’m still somewhat partial to Sean Taylor’s “Ask, Answer, Tell” as a description of what data scientists do. I think coding is important, but while engineering systems can be scientific, the practice is much more pragmatic and presents very different challenges from scientific exploration. Probably a minor quibble, but engineering is quite different from science.
Still, getting your hands dirty is better than noodling in the abstract on the sidelines.