Recently I started munging some really large datasets for work and for fun. While doing some basic statistical verification, I learned a lesson that Brendan O’Connor documented: gawk is really useful, reasonably fast, and some versions blaze.
I’ve got to go and check out mawk, even if it is old and slightly busted.