
Hackers vs Action

Machine Learning for Hackers vs Machine Learning in Action to be precise. Two books. One topic. Different languages. John D. Cook compares and contrasts:

Both books are about the same size and many of the same topics. One difference between the two books is choice of programming language: ML for Hackers uses R for its examples, ML in Action uses Python.

I was somewhat interested in ML for Hackers since I’m familiar with and admire Drew Conway’s online writings. The use of Python better aligns ML in Action with my interests though.


Django API Frameworks

Daniel Greenfeld compares and contrasts Django toolkits for creating REST APIs. I’m interested in other perspectives on this topic as I’ve talked about tastypie before and actually put it in practice at work.

The on-site comments for the post and those over at HackerNews are useful as well. I have to agree with a few folks that tastypie is pretty good for fairly standard Django models, but gets a little tricky for non-ORM or search-based resources. In particular, dealing with object dehydration and response URIs was a bit opaque. I may be hallucinating, but when I first got started with tastypie, this documentation node on the request/response cycle didn’t exist. Maybe it’ll clear up my confusion.

I’d still recommend tastypie, but for advanced uses prepare to spend some time digging into module source code and doing a lot of experimentation to get the right results. As you’d expect!


The 94th Minute

Gee, guess I called that one. If I didn’t know better I would have thought Manchester City executed their highwire snatching of defeat from the jaws of victory then victory from the jaws of defeat, just to explicitly taunt their Mancunian neighbors. Of course they might have killed a fan or two of their own in the process. A more fitting cherry on top of the Premiership season would have been harder to script, what with The Citizens going into stoppage time to score two goals and rescue their campaign from ignominy. Time had literally run out on the season before they came back from the dead yet again.

Unfortunately, I didn’t get to see the match live as I was out and about for Mother’s Day. Driving through DC and constantly checking the iPhone for updates is not particularly safe, I can confirm. I’m hoping ESPN goes Instant Classic with the broadcast or there’s an on-demand recording available.


Nova Makers

Link parkin’: the Nova Makers group is right around the corner in Reston, Virginia. Looks like they even have a hacker space, Nova Labs, to support a wide variety of activities.

The NOVA Makers meetup is dedicated to creating and supporting a community of makers in Northern Virginia.


datavisualization.ch Selected Tools

Link parkin’: Datavisualization.ch selected tools

Datavisualization.ch Selected Tools is a collection of tools that we, the people behind Datavisualization.ch, work with on a daily basis and recommend warmly. This is not a list of everything out there, but instead a thoughtfully curated selection of our favourite tools that will make your life easier creating meaningful and beautiful data visualizations.

Hat tip Chris Diehl via Twitter.


Premier’s End

So the end of the 2011-2012 Barclays Premier League campaign is upon us tomorrow. There are pretty much three races. Man City vs Man U for the title. Arsenal, Tottenham, and Newcastle United for the third and fourth positions, with at least one making the next Champions League. And finally Bolton Wanderers and Queens Park Rangers are trying to avoid relegation.

I like how the Premiership schedules their last weekend. Everyone plays and every game starts at the same time. No team sits in their locker room, rooting for an outcome. If you need a result, all you can do is your part.

Still got it in my gut that something wacky is going to happen at the top. Maybe Rangers pull a draw against The Citizens or the Red Devils actually lose to clinch it for Man City anyway. It would only be fitting if all three races were still in doubt going into the second half of the matches.


Apps4VA

Cool! The Commonwealth of Virginia is going to be running an apps competition later this year. Longitudinal data regarding education will be the source fuel. The competition window, about a month starting in early August, leaves enough time for a part time hacker to crank out something interesting, even if they’re not interested in launching a startup.


I Concur

With a bit of a break in the work storm last night, I tapped into two first round NBA Eastern Conference playoff series: ’76ers vs Bulls and Hawks vs Celtics. Both were elimination games so I was hoping for some high drama at the end.

As The Basketball Jones points out, both series ended in horrible thuds.

The Bulls, having been a smart, consistently hard working, high executing team, wasted great defensive effort on bone-headedness. Sorry C. J. Watson, but that was just the wrong play.

The Hawks were just robbed. The officials messed up on both Joe Johnson’s drive and the in bounds play. Then the Hawks went all Hawks on us, acting stupid and choking on the line. They couldn’t even get off of a long range heave for three to try and tie it. Typical Hawks.

High drama indeed.

Ob Wiz. Please Lakers. Knock JaVale McGee out of the playoffs. However, I do admit it might be fun to continue the playoffs with the Lakers out and the Clippers in.


PyCon 2013

PyCon 2013 is going to be right back where it was in 2012, Santa Clara, California. I’m assuming the Santa Clara Convention Center again. Yeah!!

Sign me up! And I promise this time to not get sick, to deny any work requests, and to be a more active participant.


Diggin’ On: The Symphony

On my iPhone, took one of my infrequent trips into the “Random Hip-Hop” playlist for some listening pleasure. Shuffle landed me on Marley Marl’s “The Symphony, Part 1”. The beat is legendary but jeez, Big Daddy Kane brings it as the last Cold Chillin’ rapper with arguably one of the greatest raps of all time.

… And battlin’ me is hazardous to your health
So put a quarter in your ass, cause ya played yourself
Like a game in the arcade. You need a far aid
I'm walkin’ the path that Allah made
I’ll attend and then begin to send a speech to reach and teach
So just say when
So I can let lyrics blast like a bullet
My mouth is the gun; on suckers I pull it
The trigger, ya figure, my pockets gettin’ bigger
Cause when it comes to money, yo, Grant's my nigga! ...

And words on the screen can’t even begin to do justice to Kane’s enunciation and delivery. Classic.

Ob moment of silence for MCA (Adam Yauch). After odes to Heavy D, Guru, and Malcolm McLaren, I’m giving up obits in this space though.


Clarity

“Scarcity brings clarity.” Boy is that ringing true for me with a big crunch at work, a couple of holidays coming up, and my wife soon out of town for a week. When time is scarce you really start to prioritize.

I’ll sleep when I’m dead.


TileMill 0.9.1

I recently tapped into the MapBox blog and they announced TileMill 0.9.1:

We just released TileMill 0.9.1, which adds support for PostGIS 2.0, runs on the latest Node.js 0.6.17 release, and provides packages for the latest Ubuntu Long Term Support (LTS) distribution: 12.04 (Precise Pangolin). TileMill 0.9.1 is the culmination of a several month sprint on stability, with over 80 tickets closed. The full list of fixes and advances for this release can be found in the changelog. Here are a few highlights.

TileMill sounds so cool but I really have no idea what you do with it other than “make maps”. Ah, here we go:

TileMill is an application for making beautiful maps. Whether you’re a journalist, web designer, researcher, or seasoned cartographer, TileMill is the design studio you need to create compelling, interactive maps.

My only question is whether it also eases the effort to serve your maps for web clients? After one has made their maps can you just point a browser at an obvious server and go to town?

Seems like something to learn. Could be another personal project.

Shout out to MapBox as a DC area concern.


Why Postgres 2

Craig Kerstiens continues to catalog reasons to use PostgreSQL. Like the additions, although a quick and dirty test-drive of Multicorn failed miserably for me. Trying to build the sample application actually crashed the postgres db server, which is pretty tough to do. Probably some embedded Python and dynamic library badness, but still. Maybe I just need to go back and be a little more careful about my build.


Panning Out

Well I guess that the next generation MacBook Pro announcement I was hoping for didn’t really pan out. Haven’t heard a peep out of Apple about anything MacBook related recently, even though the Intel Ivy Bridge announcement happened a few weeks ago. Although as MacWorld points out, the ultrabook version of the processor, slated for release later in the year, would be more appropriate for MacBooks. And Apple typically doesn’t pre-announce stuff so the timing would be in line. If they announce it, you can buy it.

Still waiting patiently


Footballin’

So get this. Liverpool has already won the Carling Cup. Chelsea just beat Liverpool for the FA Cup. And The Blues could double through the Champions League final, although I don’t give them much of a chance against Bayern Munich in Munich. Meanwhile, Manchester City is basically one game away, in which they are heavily favored, from beating out Manchester United for the Premier League title.

The weird thing is that Chelsea and Liverpool are definitely also-rans in the Premiership this campaign. Liverpool is in ninth place in the tables as I write this. If Chelsea doesn’t win the UEFA championship, they might not be in at all next year. Man City stunk it up in knockout play, European play, and were left for Premiership dead a month ago. One of their top players acts like a spoiled child and another took an extended golf vacation in the middle of the season.

I don’t quite know what to make of “underachievers” taking home so many trophies, but methinks they might be giving out a bit too much hardware in international football.


That Chicago House Groove

The premise of Michaelangelo Matos’ “How Chicago house got its groove back” might be a bit flawed, but I found it worth a read. Feels like Matos at least did quite a bit of interviewing and background research, including talking to folks like DJ Sneak, Derrick Carter, and Cajmere in depth. If accurate, it fills in some details of mid-90’s House music I wasn’t aware of.

The comments are somewhat illuminating as well, with Carter himself chiming in with some corrections and lamentations. Writing such a piece is always fraught, partially due to the obscurity of what’s trying to be covered making it hard to get the story right, partially because space limitations mean leaving out part of the story, and partially because there are always irate fans who know better.

Ob. disclosure. When Matos mentions Curtis A. Jones forsaking graduate school in Chemical Engineering, I was literally there with Cajmere at UC Berkeley. Part of a small cohort of black engineering students, we met at a College of Engineering function and started hitting the SF scene for parties. There are brushes with greatness, but I can definitely say, “I knew him when…”.

Hat tip, @CajualRecords

P.S. As predicted, DJ Sneak’s Fabric 62 didn’t do a whole lot for me. This is why I have a bit of a problem with the notion that Sneak somehow led a revival in Chicago House Music.


Rendering The World

Interesting post by Young Hahn of MapBox on “Rendering The World”. The problem Hahn discusses is the rendering of map tiles at high zoom levels for the entire world. The obvious and straightforward way quickly becomes unscalable for the zoom levels MapBox wants to achieve due to exponential, recursive explosion.

Turns out the actual space of unique tiles, by content, is orders of magnitude smaller than the number of tiles needed, i.e. there’s a high level of redundancy. For example, many tiles at any zoom level simply represent all blue patches of water. Capturing and exploiting this redundancy is the key to getting scalable performance.
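To make the redundancy idea concrete for myself, here’s a minimal Python sketch. It is not MapBox’s implementation. It assumes the usual quadtree tiling where each zoom level has 4^zoom tiles, and the `TileStore` class, the `tiles_at_zoom` helper, and the fake tile bytes are all hypothetical names and data made up for illustration. The point is just that addressing tiles by a hash of their content stores identical tiles once.

```python
import hashlib


def tiles_at_zoom(zoom):
    """Tile count at one zoom level, assuming a standard quadtree pyramid."""
    return 4 ** zoom


class TileStore:
    """Hypothetical store that keeps each distinct tile body only once."""

    def __init__(self):
        self.blobs = {}  # content hash -> tile bytes
        self.index = {}  # (zoom, x, y) -> content hash

    def put(self, zoom, x, y, data):
        digest = hashlib.sha1(data).hexdigest()
        # Only the first tile with this content actually stores its bytes.
        self.blobs.setdefault(digest, data)
        self.index[(zoom, x, y)] = digest

    def get(self, zoom, x, y):
        return self.blobs[self.index[(zoom, x, y)]]


store = TileStore()
water = b"\x00" * 256  # stand-in for an all-blue ocean tile
zoom = 3
for x in range(2 ** zoom):
    for y in range(2 ** zoom):
        # Pretend only the x == 0 column has distinct "coastline" content.
        data = f"coast-{x}-{y}".encode() if x == 0 else water
        store.put(zoom, x, y, data)

print(tiles_at_zoom(zoom), len(store.index), len(store.blobs))
```

In this toy run all 64 tile addresses at zoom 3 are indexed, but only 9 distinct bodies get stored: 8 coastline tiles plus one shared water tile.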

This page had been sitting in my Chrome tabs for quite some time, but it was well worth the read once I got around to it.


Data Journalism Handbook

Link parkin’: Data Journalism Handbook

This book is intended to be a useful resource for anyone who thinks that they might be interested in becoming a data journalist, or dabbling in data journalism.


Postgres Guide

Link parkin’: Postgres Guide

We here are very big fans of Postgres as a database and believe it is often the best database for the job. For many though, working with and maintaining Postgres involves a steep learning curve. This guide is designed as an aid for beginners and experienced users to find specific tips and explore tools available within Postgres.

Via Craig Kerstiens who outlines a number of reasons why you might actually want to use PostgreSQL. I heartily concur.


Best Run Evah!

Holy smokes! How the ?!?!!*! did it get to be May already?

Looking at my monthly archives on the right over there tells me I’ve probably had my best posting run ever. Pretty much seven (7 wow!) months straight minus a singular brain cramp in early December.

That’s at least one post every day, including weekends and holidays. Through illness, while traveling, while my wife’s traveling, when work is bursting, and even when I think I just don’t have anything to say.

It’s been an interesting and worthwhile challenge to tackle, not to mention helping me keep one of my New Year’s resolutions. And there’s more yet to come. Not declaring victory just yet.

Now if I could only apply the same tenacity to a couple of my other resolutions. But it’ll come. I can feel an extended personal hacking run launching this summer.

Good to check back in on the resolutions. Some progress made, but more to do.


No Spoiling The Derby

After my spoiler adventures with the UEFA Champions League, I learned my lesson and avoided sports and news outlets on my way home from work. Worked like a charm as my DVR recording of The Manchester Derby went off without a hitch.

I knew it was a big match but I was somewhat surprised at the level of hyperbole even for ESPN announcers. “The biggest match in the 20 year history of the Premiership!” Okay. If you say so.

It wasn’t an epic match in terms of play, thanks mostly to Sir Alex Ferguson going all conservative and playing guys like Scholes, Giggs, and Park ??!? How Valencia gets all of 15 minutes and Chicharito doesn’t see the field is beyond me. But the strategy did stifle the Man City creativity, and if it wasn’t for the Kompany header, the Red Devils might be on their way to yet another title.

Since I’m a bit of a Man U hater, I quite enjoyed the Citizens going to the top of the table with their 1-0 victory. Let’s see if they can hang on through these last two games.


Big Dicts

atbr seems like an interesting approach to really large-scale in-memory key/value stores, otherwise known as dictionaries, or dicts, in Python.

…atbr is basically a thin swig-wrapper around Google’s (memory efficient) opensource sparsehash (written in C++). Atbr also supports relatively efficient loading of tsv key value files (tab separated files) since loading mapreduce output data quickly is one of our main use cases.

While the authors seem a little more focused on Hadoop integration, I’ve got another interesting use case. NetworkX is a well developed Python module for graph representation, manipulation, and algorithms. The module uses Python’s built-in dicts as the primary data structure to represent these graphs. In my experience, NetworkX tends to fall over a bit with big graphs. Maybe using atbr as a replacement underneath NetworkX would improve both memory usage and execution speed. Yet another personal hacking project I could adopt.
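For reference, here’s roughly what a dict-based graph looks like, as a minimal pure-Python sketch. It’s a simplification of the dict-of-dicts adjacency idea, not NetworkX’s actual code, and it makes no atbr calls at all since I haven’t looked at that API. The helper names `add_edge`, `neighbors`, and `dict_count` are made up for illustration.

```python
def add_edge(adj, u, v, **attrs):
    """Add an undirected edge; both directions share one attribute dict."""
    adj.setdefault(u, {})[v] = attrs
    adj.setdefault(v, {})[u] = attrs


def neighbors(adj, node):
    """Sorted neighbor names of a node (empty list if the node is unknown)."""
    return sorted(adj.get(node, {}))


def dict_count(adj):
    """Count the dict objects backing the graph.

    One outer dict, one neighbor dict per node, and one attribute dict
    per undirected edge (shared by both directions).
    """
    edge_dicts = {id(attrs) for nbrs in adj.values() for attrs in nbrs.values()}
    return 1 + len(adj) + len(edge_dicts)


adj = {}
add_edge(adj, "a", "b", weight=1)
add_edge(adj, "b", "c", weight=2)

print(neighbors(adj, "b"))   # ['a', 'c']
print(adj["a"]["b"])         # {'weight': 1}
print(dict_count(adj))       # 6
```

Even this three node, two edge toy needs six dicts, so the per-node and per-edge dict overhead is the first place I’d look when a big graph falls over.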

Also of interest was some of the benchmarking that inspired atbr and demonstrated that Python dicts are actually pretty decent.


kanban Analytics

Interesting post by Sean Gorman of GeoIQ, on “Just in Time Analytics” a.k.a. kanban analyses, especially in the context of Big Data:

The presentation was on the concept of how analysis can evolve to better take advantage of real time data streams. The community currently does lots of fascinating analysis of real time data from Twitter, mobiles devices, sensors etc., but it is inevitably a post mortem. By that I mean we do the analysis well after the event itself is over. If we think of a data stream as a living organism that is constantly changing we focus our analysis on the history that has already past.

The post summarizes a presentation (which I need to partake of) at the O’Reilly Where 2012 conference.


DRose Down

And just like that, the Bulls championship hopes swirl down the drain. I thought they couldn’t get past the Heat, but Derrick Rose blowing out his ACL seals the deal.

I wouldn’t be completely surprised if they make the Eastern Conference Finals though.


Miso

Link parkin’: Miso

Miso is an open source toolkit designed to expedite the creation of high-quality interactive storytelling and data visualisation content.

The first release under the Miso Project is Dataset, a JavaScript client-side data management and transformation library.

Via Flowing Data


cliff

Link parkin’: cliff — Command Line Interface Formulation Framework

cliff is a framework for building command line programs. It uses plugins to define sub-commands, output formatters, and other extensions.


Spoilage

So I’m not a radical anti-spoiler type, but I did DVR the second leg of the Chelsea-Barca UEFA tilt this past Tuesday and on the way home was avoiding hearing the final outcome. So of course, being a dolt, I get home, open my feedreader, and click on the Sports folder, only to see at the top:

Fernando Torres Scores In Extra Time, Sends Chelsea To Champions League Final

Then I compounded the problem by actually reading the item from SportsGrid which revealed Chelsea playing with 10 men, Lionel Messi (GREATEST FOOTBALLER EVER ™, not) missing a penalty kick, and Chelsea scoring in extra time of the first half. Great! Why even bother to watch the recording now?

Well I did, and it was definitely worth it. First, Ramires’ goal at the end of the first half was sheer brilliance. Great run, perfectly placed ball. Second, not only did Messi miss a PK, he also hit the post on a near-miss that might have been helped by Petr Cech. Third, apparently no one told SportsGrid that down 2-1, Chelsea still would have advanced, tied on aggregate but with an away goal as the tiebreaker. So there was definite drama as FC Barcelona kept extended possession around the Chelsea box, probing and searching for that third goal. I half expected Barca to flop, errr luck into, another PK. The Torres goal sealed the deal but in no way did it “send Chelsea to Champions League Final”. The brilliant, disciplined, and a bit lucky, Chelsea defense did that.
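The away-goals arithmetic is easy to check with a few lines of Python. This is just a sketch: the `advances` function is a made-up name, and the 1-0 Chelsea win in the first leg at Stamford Bridge is an assumption supplied from the match record, since the scoreline isn’t stated above.

```python
def advances(a_home, b_away, b_home, a_away):
    """Decide a two-leg tie where team A hosts leg one and team B hosts leg two.

    Returns "A" or "B", or None if the tie is level on both aggregate and
    away goals (which would mean extra time or penalties).
    """
    a_total = a_home + a_away
    b_total = b_home + b_away
    if a_total != b_total:
        return "A" if a_total > b_total else "B"
    if a_away != b_away:
        return "A" if a_away > b_away else "B"
    return None


# Team A is Chelsea, team B is Barcelona. Assumed first leg: Chelsea 1-0.
print(advances(1, 0, 2, 1))  # Barca up 2-1 in leg two: level at 2-2, Chelsea on away goals -> A
print(advances(1, 0, 3, 1))  # Barca find that third goal: 3-2 on aggregate -> B
print(advances(1, 0, 2, 2))  # after the Torres goal: 3-2 Chelsea on aggregate -> A
```

So Chelsea were already through before Torres scored, and only a third Barca goal would have changed that.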

So much for spoilers.

Now if only I’d not been listening to sports talk radio before the Bayern Munich-Real Madrid match…


Ernie Will Be Back

Washington Wizards General Manager Ernie Grunfeld received a contract extension today.

Grumble.

Previously in these here parts, I claimed that five people needed to depart the organization: Andray Blatche, Flip Saunders, Nick Young, JaVale McGee, and … Ernie Grunfeld. Grunfeld has managed to at least clean up a little this year by getting rid of the aforementioned four, but I still really fear his draft prowess, or lack thereof.

As a Wizards fan, it’s difficult to see how this was in any sense “earned”. There’s a nine year track record of Ernie’s ability and while there was a nice post-season streak, some spectacularly bad personnel moves provide counterbalance. We’ll have to defer to owner privilege on this one, but I don’t know too many DC folks who aren’t scratching their heads.

I do have precedent for sustained loyalty in the face of silliness like this. After Jordan retired I adopted the Chicago Bulls along with oft embattled GM Jerry Krause. Krause managed to scorch the franchise to the ground (Kornel David era anybody?) and get some interesting pieces (Elton Brand, Tyson Chandler, Eddy Curry, Jay Williams, Ron Artest, Jamal Crawford) that never really panned out. Things didn’t really pick up for the Bulls until Krause was long gone and they lucked into Derrick Rose.

So I’m not gonna’ let Ernie sticking around bum me out. Maybe lightning will strike this year and their top pick will be a superstar (the first since Wes Unseld?). Please though Ted, get Ernie some draft evaluation assistance. And stay away from the European picks!!


That’s A Nice Trick

Tim Bray drops science on tab management in Chrome and Safari. Read and learn grasshopper.


Manchester Derby, It’s On

So the Red Devils slipped up, letting Everton come back from 2 goals down to steal a point. At Old Trafford even. Meanwhile, Man City did their duty and picked up 3 points from Wolves. Next week, The Citizens face Manchester United in the Manchester Derby.

Man U. 3 clear with one big head-to-head to play against Man City. Definitely appointment TV.


D3 and Maps

A discussion on D3.js and mapping libraries, started by Nelson Minar, and with commentary from a few relatively well informed figures. For future reference.


Twitter Mining Recipes

Link parkin’: Matt Russell’s repo for his book “21 Recipes for Mining Twitter”

“This repository contains code for the 21 Recipes for Mining Twitter (O’Reilly, 2011.) As the name of the title suggests, it’s a short cookbook of recipes that’s designed to help you solve common problems when working with Twitter data. Some of the recipes are extracted from content presented in Mining the Social Web while others are completely new additions. In either case, they’re designed to be bite-sized and serve as the jetpack that you can strap onto that great Twitter mining idea you’ve been noodling on — whether it’s as simple as running some disposible scripts to crunch some numbers or as extensive as creating full-blown interactive web application.”


Pull Requesting

Rachel Nabors jumped into the deep end of the open source contributor pool and started making pull requests to fix issues. Of course in GitHub-land, it is not incumbent on the receiving end of the request to accept. Nabors didn’t have much luck. Seems like there are lots of prerequisites (check style guide, visit issue tracker, e-mail maintainers first) before you can realistically start firing fixes.

Part of the reason I’m becoming more interested in GitHub is that I’d someday like to be an open source contributor. But you’ve gotta know the tools. And clearly, as Nabors found out, the culture behind the commit log. Good to know.


Multiplexing

Rafe Colburn is just now getting into terminal multiplexing. All I can add is that I’ve been using GNU screen for a few years and it’s really been a boon to my development.

Usefully though, Colburn has a link detailing why tmux might be better than screen. The arguments are compelling. I might have to give the competition a test drive.


Python Text Munging

Link parkin’: csvfilter “a Python command-line tool for manipulating CSV data”

pyp “Pyp is a linux command line text manipulation tool similar to awk or sed, but which uses standard python string and list methods as well as custom functions evolved to generate fast results in an intense production environment.”

Pyp was even presented at PyCon 2012. Too bad it was when I was completely knocked out by the flu and near fainting in a parking lot. Luckily the video is on YouTube.

I could have used these a few weeks and months ago, when I was going hot and heavy on the Tweet processing. Probably still useful for the occasional text munging task.
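For the occasional munging task, the standard library gets you a fair way on its own. Here’s a minimal sketch of pulling a couple of columns out of CSV data with the `csv` module. It doesn’t use csvfilter’s or pyp’s actual interfaces; the `select_columns` helper and the sample data are made up for illustration.

```python
import csv
import io


def select_columns(lines, columns, delimiter=","):
    """Yield only the named columns from each row of header-led CSV input."""
    reader = csv.DictReader(lines, delimiter=delimiter)
    for row in reader:
        yield [row[name] for name in columns]


# Any iterable of lines works; a file opened with newline="" would too.
sample = io.StringIO("user,lang,text\nalice,en,hello\nbob,fr,salut\n")
rows = list(select_columns(sample, ["user", "text"]))
print(rows)  # [['alice', 'hello'], ['bob', 'salut']]
```

Passing `delimiter="\t"` would handle tab-separated files the same way.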


TIL About CartoDB

Today I learned about CartoDB, thanks to a blog post, admittedly potentially biased, comparing Google’s Fusion Tables with the open source CartoDB. While enticing for some work stuff, the GitHub repo looks like a big inhale with lots of sharp pointy bits. But cartodb.com provides a hosted version possibly worth kicking the tires on.


Whaddup Wigan?

So not only do The ’Letics disrupt the very top of the Premiership table by bumping off Manchester United, they then back it up with a road win, downing The Gunners at The Emirates. This now puts 3rd place in play, bringing Arsenal back within range of Tottenham and Newcastle United. Chelsea remains right behind both, but at 2 more points back they don’t look like contenders for third. Thanks Wigan Athletic for putting some excitement into the end of the Premier League campaign.

Now off to watch my DVR of Bayern Munich and Real Madrid in The Champions League. I really wish there was a good option for getting the Bundesliga on Verizon FiOS.


TIL About dajax

Today I learned about the dajaxproject: django+ajax. Good to know about given the baseline knowledge needed for client side, errr, Front-End developers.

Thanks Kevin Veroneau. Looking forward to the tutorial.


Spike Lee’s Joints

There’s an inspired programmer over at HBO who’s been scheduling a lot of Spike Lee’s oeuvre, especially some of the earlier studio films. Good to see lesser known works like Crooklyn, Clockers, and Jungle Fever get some airtime, along with the American classic, Do the Right Thing.


Check-Ins Dead?

Jon Mitchell of ReadWriteWeb makes a reasonable argument that the “check-in” as a location based concept is “dead”. Doesn’t look like consumers are adopting check-in applications at a rate that can lead to large scale commercial businesses. Fodder for thought, especially in contrast with Anil Dash’s effusive praise for Foursquare which is not all that stale.

I tend to concur with the latter part of the article that speculates that this failure is just part of an experimental, evolutionary process to find a model for continuous location sensing. What we’ve just learned is that continuous location announcement into our social circles isn’t particularly useful or attractive.

Now personalized capture and derived insights could be a winner. However, that’s a model hard to scale in a big way fast, which is what VCs these days are looking for.

© 2008-2024 C. Ross Jam. Built using Pelican. Theme based upon Giulio Fidente’s original svbhack, and slightly modified by crossjam.