Foursquare Execution

Link parkin’: Foursquare Today’s Best Executing Startup

“I’m all for it.”

Tampa Bay Buccaneers (National Football League) Head Coach John McKay on his team’s offensive execution.

In The Next Data Challenge I mentioned that there are some interesting data collection experiments to be run against newer social media systems like Foursquare. Conveniently, Anil Dash did a deep dive into why he thinks the NYC-based startup is at the top of the tech game. I don’t use Foursquare at all, so I can’t speak to the piece’s veracity, but it’s a good read. It also points to why I think Foursquare is worth studying closely.


For The Birds

I’d held out for quite a while against the Angry Birds world domination. My wife succumbed last Christmas when I got her an iPad. For whatever reason, Angry Birds Rio was one of the few games she downloaded. Of course My Little Guy (™) took to it like a fish, even though he wasn’t all that proficient. He’s got fine motor skill issues, but he’s young and highly enthusiastic.

So now I’m trying to turn my dormant iPod Touch into My Little Guy’s (™) iOS device. That was the disease vector leading to my infection. I just had to try the game out.

And my old game obsession tendencies from way back in the undergrad days came back to life. I basically started playing Angry Birds Rio late last Wednesday, Dec 28th. I was done with all the levels by Sunday afternoon, January 1st. I struggled with a few screens, had to go to the Interwebs to cheat on one (it was 3AM and I crumbled in a moment of weakness), and surfaced a gaming intensity I hadn’t seen in a long time.

Angry Birds Rio is addictive in that it’s a series of rich micro-challenges. Just complex enough to put the braincells to work, but completion is always tantalizingly close. “Just one more stage,” becomes the zombielike mantra. With a healthy dose of, “I’ll finish it in one shot!” Plus the overall game design, mechanics, and user experience are brilliant from a cognitive perspective. While subtly addictive, the game is really fun to play.

I will be staying very far away from the original Angry Birds.


The Next Data Challenge

Now that I’ve honed my Twitter data collection skills a bit, a couple of new ones are coming to mind. Interestingly, I like starting them as home hacking projects and then transferring the experience to work as needed.

First, collecting a million Tweets per day using the Streaming API doesn’t seem completely unreasonable. Now I don’t have enough home storage to handle that amount of data, but I do have Amazon S3. I was getting hung up on having continuous query and analysis capabilities available. That would require an expensive VPS in the cloud or another machine to worry about in the basement. But simply storing a small window of the data on a cheap VPS, pushing the data into S3, and then batch processing with Elastic MapReduce is eminently feasible. Probably good for the resume too. And with a little automation this can operate while I’m sleeping and run for days at a time. That quickly means tens to hundreds of millions of tweets. You’re talking real Big Data at that point.
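A minimal sketch of the "small window on a cheap VPS" idea, assuming capture files are named so that they sort chronologically. The `rotate_window` name and the `upload` callable are hypothetical; the callable stands in for whatever actually pushes a file to S3.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def rotate_window(files: Iterable[Path], keep: int,
                  upload: Callable[[Path], None]) -> List[Path]:
    """Push files that have aged out of the local window, then delete them.

    Returns the files that were uploaded and removed, oldest first.
    """
    # Assumption: file names embed a timestamp, so name order is time order.
    ordered = sorted(files, key=lambda p: p.name)
    expired = ordered[:-keep] if keep > 0 else ordered
    for path in expired:
        upload(path)                   # hypothetical hook: copy to S3 first
        path.unlink(missing_ok=True)   # then free the local disk
    return expired
```

Run from cron every hour or so, something like this keeps only the `keep` newest files on the VPS while everything older lands in S3 for later batch processing.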

Second, I’m still not seeing any interesting data collection experiments from systems like Instagram or Foursquare. Maybe I’m looking in the wrong places, but it seems like an opportunity to me.

Third, adaptive query specification for the Streaming API. Currently all my collection just sets up a bunch of geo boundaries and leaves them alone. Two issues here. Dynamic determination of the queries and dynamic update of the query spec. The latter isn’t too hard but the former is open territory.
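A rough sketch of the easier half, the dynamic update of the query spec. It assumes the Streaming API takes geo boundaries as a single `locations` string of comma-separated longitude,latitude pairs with the southwest corner first, and that changing the spec means reconnecting with the new string. The `BBox` alias and both function names are made up for illustration.

```python
from typing import Iterable, Tuple

# (west, south, east, north) in decimal degrees
BBox = Tuple[float, float, float, float]


def locations_param(boxes: Iterable[BBox]) -> str:
    """Flatten bounding boxes into one comma-separated string.

    Each box contributes its southwest corner first, longitude before latitude.
    """
    return ",".join(str(value) for box in boxes for value in box)


def spec_changed(current: Iterable[BBox], proposed: Iterable[BBox]) -> bool:
    """True when the proposed set of boxes differs, ignoring order."""
    return set(current) != set(proposed)
```

A collector could call `spec_changed` whenever a new set of boxes is proposed and only tear down the connection when it returns true. How to come up with the proposed boxes in the first place is the open part.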


Be It Resolved

I’m generally not big on announcing resolutions, although I have done so in the past. Let’s give it a shot for 2012.

  • Improve Physical Health. The holidays definitely have me backsliding but at least I’ve got the basics of exercise and weight loss going. Main goals are to lose about 20 pounds and play some pickup ultimate without embarrassing myself. No injuries. Stretch goal is to play organized ultimate like a league or hat tournament. Also, improve the nutritional intake. More veggies.

  • Maintain Financial Health. A little over 3 years ago, I was barely above water on a condo I wasn’t living in. I also had a pretty big pile of personal debt. That dispiriting conversation with a parent about a loan was looming. Luckily I closed on a sale just before John McCain announced he was suspending his campaign and wrecked the economy. Fiscal management took over at my household and now I’m in pretty good shape. No going back in 2012. Avoid lifestyle creep. Pay everything on time, no fees or penalties. 10% more of take home pay stashed into retirement instruments. Double my charitable contributions.

  • Shine The Skillz. The Mad Data Scientist exercises at home and work have exercised old programming muscles. In 2012, I’m going to work hard on regular hacking activity and getting some more professional interaction and training. I’m looking to attend three programmer events where I get in a workshop, class, or coding sprint.

  • Expand The Network. Once a month either get back in touch with an old professional contact or make a new one. Special focus on people local to the DC area. Join the MIT and UC Berkeley alumni clubs of DC and attend 6 events total. Update resume. Flesh out LinkedIn profile.

  • Get Rid of Stuff. I’ve got stuff in a storage site that I haven’t visited in over 3 years. It can’t be that important so time to go. Also, the wardrobe needs a major flush. Clearing half of that space is the target.

  • Build A Tribe. Over the last few years I’ve been leaning back and trying to be a good teammate. This past year I started to see technical challenges and opportunities that really excite me. But they all involve pulling people together to build something bigger than any one individual. This actually goes against my general nature but it’s the next meaningful life step. A high functioning team of three to five people would be an achievement.

  • Less Watch, More Do. Cut down on the number of weekend days I completely lose lying on the couch watching sports. More days with more hacking. More time spent doing real activities with My Little Guy (™).

And of course keep on posting!!


Tapbots Love

Upon John Gruber’s recommendation and subsequent eulogy for the Twitter client, I switched over to Tweetbot on my iPhone. Definitely enjoying it and wondering what took me so long.

Since I’m also trying to shed a few pounds, I took a peek at Weightbot and liked what I saw. Added to the iPhone as well. Tapbots appears to be a high quality outfit.


Riak Tradeoffs

What with the piles of data I have to process at work, I keep an eye on the various storage, indexing, and query technologies. One product, Riak, looks good but hasn’t quite fit my use cases. There’s a nice overview on InfoQ, with Andy Gross and Mark Phillips of Basho Technologies, on the tradeoffs that Riak provides.

The big downside for me has been the need for relatively sophisticated ad hoc querying. The Basho team points out that Riak isn’t particularly good for that, being more of a building block towards that capability. The high availability, horizontal scalability, and good performance on greater than main memory working sets are attractive features though.

May have to run some experiments at work just to baseline the Riak potential.


Backlog

Since I’m not a deep or prolific blogger, I have to work to maintain continuous output. I don’t know how other bloggers do it, but I’m learning that maintaining a backlog really comes in handy to get over those spots where you can’t squeeze out enough time to compose a post. With a good backlog, a great blogging CMS with a decent mobile interface or app, and a smartphone, one can keep that streak alive.

The only issue I’m having with the WordPress iOS App is getting the publication time right. There needs to be a “set time to now and publish” button.


Deep Gibson

The interview of William Gibson, in The Paris Review, was much deeper about his personal life than I expected. There’s a lot of depth on his time in Wytheville, Virginia and how it influenced his conceptions of science fiction and writing. Also his draft-era Vietnam angst and transition to Canada. Didn’t know he had a wife and son.

In addition, I’ve always thought of the nominal Bigend Trilogy (Pattern Recognition, Spook Country, Zero History) as a science fiction series, but the books are really contemporary thrillers. As Gibson puts it, there’s just enough to make them work like science fiction.

And the story of his first paid publication is priceless. A long read but well worth the time invested.

With the start of a new year, and the anniversary of my Kindle ownership, I’m thinking of rereading the Bigend trilogy in its entirety. I’d like to do it on the Kindle, but the prices of the electronic editions are mildly daunting.

Apropos of nothing, according to Wikipedia, Bigend was born the same year I was.


Ghosts and Shadows

As promised, I managed to take a little bit of holiday time and catch a few movies.

Sherlock Holmes: A Game of Shadows was pretty much as anticipated. A fun little romp, historically set in Victorian Europe. The big delta over the first Sherlock Holmes is the full introduction of Professor Moriarty, a worthwhile adversary and quality villain. I found the plot engaging and it moved along at a quick pace, never dragging. Occasionally the film gets a little too caught up in bullet time slow motion, but it’s not a major flaw.

An upside of seeing it in the theater is that, as opposed to HDTV at home, Robert Downey Jr. doesn’t come across like he’s mumbling. Although he’s soft spoken, you can clearly make out what Holmes is saying. More Kelly Reilly please.

The major downside is that I think I got jacked by a 3D bulb on a 2D projection. Roger Ebert clearly outlines the issues but the dang film looked like it was completely shot at midnight. I like the Cobb 12 Theater, but if it happens again I’ll have to ask for my money back.

On a brighter note, Mission: Impossible Ghost Protocol showed at my nearby IMAX (real IMAX) theater. Memo to self, get to the theater earlier so you don’t have to sit in the second row, leading to a crick in the neck. Other than that, paying some extra shekels for the mega-screen was worth it. There are some scenes involving the Burj Khalifa where my heart literally leapt into my throat. Could have done without the sandstorm chase scene, but that was made up for by the beautifully gigantic Paula Patton.

This Mission: Impossible was noticeably lighter than the previous two and more human. In somewhat of a return to the television roots, the film relies much less on gadgetry, and more on social engineering. Although I still love Philip Seymour Hoffman’s villain from the last edition. Heck, the iconic rubber masks from the first film are pretty much put out to pasture. And a huge dose of Simon Pegg added plenty of comedic touches.

Bonus. Since my IMAX is real IMAX, we didn’t get any trailers. But that was made up for by six minutes of The Dark Knight Rises prologue. Said prologue deserves a post of its own, but suffice it to say I’m now really looking forward to this film next year.

I can unreservedly recommend both Sherlock Holmes and Mission: Impossible for at least a matinee screening. If you have to pay full fare, you won’t feel ripped off, although you may have a bit of buyer’s remorse for overpaying. They’ll both make great movies on the home theater.


Apple and Brother Printers

Maybe this will get indexed by Google and save someone else some time.

Today my Brother Printer started giving me an annoying pop-up and refused to print:

Some of the software for the printer is missing.

Gee thanks.

Now my Brother HL-5250DN printer had been working perfectly fine, although I hadn’t used it for a while. This new message alerted me to the fact that there were some new software drivers. However, the Apple downloader inside of System Preferences, for whatever reason, just couldn’t get the job done. Tried Apple Software Updates but it didn’t have any new printer updates. Went to the Brother site and downloaded the latest drivers. Installed them. Didn’t fix the situation.

Finally went straight to the source, Apple. In the support section, there was a recent release on December 13th for Brother Printer Drivers 2.8. Slurped down all 156 MB in a few minutes, it’s good to be on FiOS, installed, and problem solved.

Hope that helps!


Pattern and Waffles

Link parkin’: waffles, because I’m now in love with both CLIs and ML.

Waffles seeks to be the world’s most comprehensive collection of command-line tools for machine learning and data mining. Our native tools have minimal dependencies (no interpreter, VM, or runtime environment is necessary), and build cross-platform. If you have a useful data mining tool that meets these criteria, we want it in Waffles.

pattern, because I’m still in love with Python, ML, and information visualization.

Pattern is a web mining module for the Python programming language. It bundles tools for data retrieval (Google + Twitter + Wikipedia API, web spider, HTML DOM parser), text analysis (rule-based shallow parser, WordNet interface, syntactical + semantical n-gram search algorithm, tf-idf + cosine similarity + LSA metrics) and data visualization (graph networks).


Satisfaction

Merry Christmas and happy holidays folks!!

Once you get a moderately complex cron entry and set of bash scripts working there’s a nice sense of satisfaction. You gotta like collecting, archiving, ingesting, and indexing 100’s of megabytes of data in your sleep. On a holiday. If it keeps happening night after night, so much the better.

I’m particularly patting myself on the back because it only took one debug cycle to get things working. Surprised myself there.


mongo-hadoop

Link parkin’: mongo-hadoop. Probably so alpha it hurts, but curious to see what integrating mongodb and Hadoop actually means. Would be very convenient to use mongo as a data source.

Via Russell Jurney


Pycon 2012 Talks

Still planning on attending PyCon 2012. Just gotta work out the financing and job schedule.

The conference talks were just announced. Too much good stuff. I’m most attracted to the data swizzling, geospatial processing, and systems hacking themes of the conference.


dangerousmeta 12 years old

Gotta give a shout out to Garret P Vreeland’s dangerousmeta weblog, which recently turned 12 years old. As a longtime follower (I go back to the EditThisPage days) I have to say Vreeland is one of the few firehose weblogs I’ve managed to tolerate over the years. That dude can spew! And if I could type 120 WPM, I might crank out as many posts as Vreeland does on a daily basis, although I’m more impressed that he actually reads all the stuff he links to (no retweets here) before posting about it.

I don’t follow through on 10% of what he points to, but I’ve genuinely enjoyed his eclectic mix of culture, politics, technology, and New Mexico. Hopefully he doesn’t mind me cribbing his logo for some visual spice. Shoot me an e-mail if there’s a problem. And here’s hoping for another 12 good blogging years.


Guilt No More

Mission Impossible Udvar Hazy Snap I previously called Mission: Impossible — Ghost Protocol a guilty pleasure. Well now I don’t have to feel so guilty about wanting to see another Tom Cruise mega-action sequel.

First you’ll note the red circle in the screen capture on the right. With The Dark Knight Rises now on the horizon, (witness the latest trailer link to QuickTime), getting 6 minutes of Chris Nolan’s upcoming vision can make just about any movie worthwhile. My guess is that the prologue goes back to some confluence of beginnings for Batman, Ra’s Al Ghul, and Bane. I’ll be interested to see if there are any influences from Frank Miller’s The Dark Knight Returns in The Dark Knight Rises or whether that’s grist for a fourth film.

Even better, that’s a capture from the Udvar-Hazy Theater website. That’s the IMAX (real IMAX) venue right around the corner from me. I’m betting on another fabulous experience seeing a movie there.


Emacs 24

What with the nice coverage that Mastering Emacs is giving to the imminent GNU Emacs 24 release, I’m starting to think it might be time to kick the tires on the new ride. Thankfully, emacsformacosx has me covered with binary builds.


Emacs, DEL, and delete

I’m mildly miffed that the following issue stumped me for a couple of hours this afternoon. I’ve been using Emacs forever and the solution should have been a no-brainer. Warning, major nerdage ahead.

So I’ve been amping up my Python hacking at work. The last time I was in such a mode, I was using Xemacs. In Python code, when I was at the first character of an indented line and hit delete, it would outdent one level. Very handy and natural for Python editing.

For my current run, I’ve switched back to Emacs 23. For a while, it wasn’t really bugging me, but the outdent didn’t work. A delete would just erase one character backwards.

Finally I got fed up, and the issue turned out to be the following. On modern keyboards, you often get two keys marked with delete. The one in the standard QWERTY position should effectively be a backspace and erase. The other one should be a rightward erase. Emacs is smart enough to figure out the difference, so you can individually bind DEL and delete. It’s all documented right in the Emacs manual. Unfortunately, I could never get the right combination of Google keywords to find a web page to explain this. I had to resort to gasp, Reading The F*! Manual.

The problem is that the latest and greatest Emacs python mode only binds delete to the really useful py-electric-backspace which Does The Right Thing (™). Meanwhile, DEL is left to its normal backwards character erase. Massive irritation for this particular user.

So here’s the fix:

    (add-hook 'python-mode-hook
              (lambda ()
                (define-key py-mode-map "\d" 'py-electric-backspace)))

May this post save another soul some time and effort.


Mushroom Jazz Bundle

If I was a Mark Farina completist, I’d be really tempted by this new Mushroom Jazz Bundle from OM Records.

Oh, wait, I am a Mark Farina completist. So I am tempted. Even though I’ve got every CD in that bundle. Even though I’ve got the tee shirt. Well, I don’t have the poster. So it would boil down to $70 for a signed poster. I guess I could EBay the rest of the bundle to try to break even, but somehow that seems sacrilegious.

So I guess I’ll have to pass. But more like that please OM Records.


Building from CLI

Link parkin’: Luke Wroblewski describes how Bagcheck practices progressive enhancement by developing the product starting from the web API, then constructing command line interface (CLI) tools, and thence to client side interfaces.

From a UNIX perspective, I’ve noticed a CLI is really helpful because you can smoothly tie into standard UNIX scripting tools. This allows for large scale automation which you need more often than not.

Bonus links: The Bagcheck technology stack and Wroblewski’s writings which often include comprehensive notes on interesting conference sessions.


PostgreSQL, Tabs, and COPY

Learned this the hard way, so maybe posting it will help someone else out.

PostgreSQL supports the SQL COPY statement which is a good way to bulk load a lot of data fast. Think a million tweets in JSON. Each row pulls out some of the key tweet fields into columns and the tweet JSON is also stored in a field just in case. The tricky part is that the bulk load input data, that’s not already dumped from a Postgres DB, has to be in a text format that’s akin to CSV.

This isn’t a big deal if your table fields are relatively simple, but as soon as they become arbitrary strings things get hairy. Escaping special characters and string encoding quickly bite you in the butt.

I was generating a large number of bulk data files, in tab separated format, from a Python script. Thinking the task to be straightforward, I started to handle escaping the record separator using Python’s str.replace. Tab escaping sort of worked, but then I had UTF-8 encoded strings. These strings are a pain to use str.join on and then write to an output file. Pretty much every data file generated would have some kind of import error within the first 100 records.

Too bad I didn’t have a magic tool that knew how to write CSV files and deal with all the escaping issues.

Oh wait. I’m using Python. Batteries Included.

Busted out the csv module. Just picked apart my tweet data structure into a tuple, along with the tweet JSON source text. The Python documentation has a convenient example of how to handle the UTF-8 encoding. The ancient, Python 2.3 born, csv module magically handles all the separator and terminator escaping. The writer object’s writerow method had no problems writing out a mix of ASCII and Unicode data. And Postgres happily slurped in every data file generated. I’m well on my way to ingesting multiple millions of tweet instances into a table that has 10+ fields.

The moral of the story? If you want to fast bulkload data into Postgres, find a csv compliant library and have it write your rows for you.
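For the record, a minimal sketch of the approach in present-day Python 3, where the csv module handles Unicode strings directly and the recoding example from the documentation is no longer needed. The `write_rows` name and the tweet fields (`id`, `user`, `text`) are illustrative assumptions, not the actual script or schema.

```python
import csv
import io
import json


def write_rows(tweets, out):
    """Write one tab-separated row per tweet, letting csv do the escaping."""
    writer = csv.writer(out, delimiter="\t", quoting=csv.QUOTE_MINIMAL)
    for tweet in tweets:
        writer.writerow([
            tweet["id"],                            # key fields as columns
            tweet["user"],
            tweet["text"],                          # may hold tabs, quotes, newlines
            json.dumps(tweet, ensure_ascii=False),  # full JSON, just in case
        ])


# When writing to disk, open the file the way the csv docs recommend:
#   open("tweets.tsv", "w", newline="", encoding="utf-8")
# A file written this way should load with something like (table name hypothetical):
#   COPY tweets FROM '/path/to/tweets.tsv' WITH (FORMAT csv, DELIMITER E'\t');
```

Fields containing the delimiter, a quote, or a line break get quoted automatically, which is exactly the bookkeeping that goes wrong when done by hand with str.replace.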

Bonus hint: If you’ve got geospatial data as well, maybe you’re using PostGIS, grab the shapely module and get familiar with the wkt attribute of the shape objects. While the PostGIS documentation says WKT is an acceptable load form, I found that EWKT, essentially adding an SRID, was the only way to get a shape loaded. Assuming you have an established SRID this is a piece of cake.
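As a small illustration of the EWKT point: EWKT is just the WKT text with an `SRID=<n>;` prefix, so a helper along these lines would do. The `to_ewkt` name is made up, and the WKT string would normally come from a shapely object’s `wkt` attribute rather than being typed by hand.

```python
def to_ewkt(wkt: str, srid: int) -> str:
    """Prefix a WKT geometry string with its SRID to form EWKT."""
    return f"SRID={srid};{wkt}"


# Example with a hand-written WKT string; 4326 is the usual WGS 84 lon/lat SRID.
print(to_ewkt("POINT (-77.03 38.89)", 4326))
```

The result can go straight into a geometry column of the bulk load file alongside the other fields.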


Premature Guilt

Have to say I’m somewhat looking forward to the upcoming releases of Mission Impossible: Ghost Protocol and Sherlock Holmes: A Game of Shadows. Both are clearly guilty pleasures. For Mission Impossible, while I like the action, Paula Patton is an extremely fetching attraction. As to Sherlock Holmes, while the incessant mumbling of Robert Downey Jr. can be irritating, I like the comic chemistry he has with Jude Law.

Now I can see how well the local cinema holds up during holiday season!


PyCon 2012 Tutorials

I have a major 2012 resolution to attend PyCon 2012, on my own dime. I need to get out and about in a developer community, independent of the work context. Plus, since PyCon is in Santa Clara, CA, I’ve got a few peeps I can visit and crash with on the cheap.

The tutorials were announced recently. What a juicy slate! I’m definitely doing the Data Analytics I track. After that Data Analytics II looks tempting, but I’m also interested in some of the web/DB development à la carte offerings.

Choices, choices.


Plexus Rangers Chronicles: Week 14

So that thud you heard was the final collapse of my poor fantasy football squad. This past weekend encapsulated my many frustrations over this season. To wit:

  • The injury bug strikes yet again, mid-game this time. Poor DeMarco Murray goes down in the first quarter with a broken ankle. Felix Jones soaks up the resulting points in a track meet. Meanwhile, Jason Witten gets shut out or shut down, take your pick.
  • Conversely, I play the Cowboy defense which gets smoked by the Giants. Minus two points on the ledger.
  • The opponent went off to the tune of 139 points. Guys like Shonn Greene and Rob Gronkowski had peak weeks of 25+ points.
  • And the guy I was chasing won anyway. His opponent, down in the lower bracket like me, didn’t quite mail it in, but every player in his lineup underperformed their projection.

Just not meant to be this year.

Can’t really get excited about the fact that our league has a consolation playoff bracket. I know you don’t care about my fantasy team, so there’ll only be one more of these posts. I want to go back and compare the hope of the draft versus the season’s results.


RSS Feed Synching

A while ago there was a lot of consternation when Google Reader changed its user interface. Along with that Google axed a number of projects. Seeing as it’s not obvious if or how Google Reader generates revenue, people were duly concerned.

The big issue is that Google Reader seems to have become the de facto infrastructure for synching reading lists across devices. Longtime (but now ex) developer of NetNewsWire, Brent Simmons clearly chronicles the issue with Google Reader and synching. Based upon his experience he drills down into some of the key RSS synching technical challenges.

As a computer scientist, I understand the issues. But it seems to me that this is such a classic distributed systems problem, there has to be a clean solution already. To my eye distributed version control systems, like git, have most of the answer. Many differences between distributed files are easily handled, and conflicts have to be resolved by a human. Now obviously that would be a pain for an RSS reader but maybe a few simple resolution policies, designated by a human, could do the trick.
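To make the "few simple resolution policies" idea concrete, here is a toy sketch. It assumes each device keeps a map from feed item id to a read flag plus a change timestamp. All the names (`ItemState`, `read_wins`, `last_writer_wins`, `merge`) are invented for illustration and have nothing to do with how NetNewsWire or Google Reader actually sync.

```python
from typing import Callable, Dict, NamedTuple


class ItemState(NamedTuple):
    read: bool
    changed_at: float  # seconds since epoch, per the device's own clock


Policy = Callable[[ItemState, ItemState], ItemState]


def read_wins(a: ItemState, b: ItemState) -> ItemState:
    """Once any device has read an item, it stays read everywhere."""
    return ItemState(a.read or b.read, max(a.changed_at, b.changed_at))


def last_writer_wins(a: ItemState, b: ItemState) -> ItemState:
    """Trust whichever device touched the item most recently."""
    return a if a.changed_at >= b.changed_at else b


def merge(local: Dict[str, ItemState], remote: Dict[str, ItemState],
          policy: Policy) -> Dict[str, ItemState]:
    """Union two device states; the policy only runs on real conflicts."""
    merged = dict(local)
    for item_id, theirs in remote.items():
        mine = merged.get(item_id)
        if mine is None or mine == theirs:
            merged[item_id] = theirs
        else:
            merged[item_id] = policy(mine, theirs)
    return merged
```

As with git, most differences merge trivially because only one side changed. The human picks the policy once instead of resolving every conflict by hand.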

Just thinking out loud, because I heart RSS and feed reading.


Titan Theme

When I converted this blog to WordPress, I looked at a number of themes, finally settling on Blogum. I was never really happy with it though. The title banner was never quite right. While a nice and clean design, the main body font didn’t suit me.

I ran across Mickey Petersen’s survey of new features in Emacs 24. The theme he was using, Titan from The Theme Foundry, looked good to me. So I decided to give it a whirl here. That’s one of the really nice features of WordPress, the ease with which new themes can be installed and experimented with.

Titan looks good to me, so I’ll be keeping it around. I’ll probably even ante up for pro support.


Twitter Favorites Feed

Twitter is great for following the link streams of knowledgeable folks. Snagging interesting tweets for later perusal is also easy, as you can mark a tweet in your timeline as a favorite.

However, I now try to do all of my information aggregation in NetNewsWire. So going to a Twitter client to see my favorites is a bit of a pain.

Enter Twitter RSS feeds: http://twitter.com/favorites/crossjam.rss

Now I can subscribe to the above link and see my favorites stream along with my inflow. The only downside is that NetNewsWire doesn’t auto link the URLs in the tweets. Google Reader is smart enough to do this, but I’m not using GReader on a daily basis. The quick and dirty solution is to pop to the tweet in a browser tab and then follow the, typically shortened, link from there.
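The missing auto-linking is simple enough to sketch. Assuming the tweet text arrives as plain text, something like the following would wrap bare URLs in anchor tags. The regex is deliberately crude (it will swallow trailing punctuation), and `autolink` is a made-up name, not anything NetNewsWire or Google Reader exposes.

```python
import html
import re

# Crude on purpose: anything from http(s):// up to whitespace, <, > or a quote.
URL_RE = re.compile(r"https?://[^\s<>\"]+")


def autolink(text: str) -> str:
    """Return HTML with bare URLs wrapped in <a> tags and the rest escaped."""
    parts = []
    last = 0
    for match in URL_RE.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        url = match.group(0)
        parts.append('<a href="%s">%s</a>' % (html.escape(url), html.escape(url)))
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```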


Realize

So in my continued possession by Evol Intent’s Us Against the World mix CD, I got captivated by the following, perceived sample:

Did you realize no one can see inside your view? / Did you realize the one inside belongs to you?

You can always distinguish the ethereal Beth Gibbons’ vocals, especially from a great Portishead track like Strangers. But I wanted to be sure I had the correct lyrics, since it’s a little tricky to make out. All I kept finding though was this:

Did you realize no one can see inside your view? / Did you realize for why this sight belongs to you?

spread across a bunch of sites designed to sell ringtones. I wasn’t hearing any of that for why… stuff though which made me suspicious. Turns out that’s from the Portishead Roseland NYC Live version of Strangers.

However, going back to the original Dummy CD, I heard the following:

Did you realize no one can see inside your view? / Did you realize the world inside belongs to you?

With some confirmation from the web, mystery solved. You’re welcome America!


Plexus Rangers Chronicles: Week 13

Back to back, Jack!! Two wins in a row. My only win streak of the year. A nice solid victory, although I had to sweat a little on Monday due to extended garbage time.

I have to give credit to an officemate. Even though we’re competing for the final playoff spot in our league, he tipped me off to playing Percy Harvin. You can roll like that when there’s no money on the line.

Even so, I was this close to benching Harvin on Sunday morning. Missing practice for an “illness” is fantasy-speak for “game time inactive” and zero fantasy points. But I rolled the dice and won big with 33.5 on the ledger.

Add in Aaron Rodgers 37.5, along with another inspired gamble on Roy Helu for 20.2, and that’s 90 points in the till. Everybody else on my team underperformed (I’m looking at you DeMarco Murray) but totaled enough to get me to 118.

Per usual my opponent chalked up over 100. I had a 36 point lead going into the Monday game and was feeling confident but slightly nervous. Maurice Jones-Drew was the last player left in the tilt. He hadn’t scored over 20 points all season. Jacksonville has been awful this year. No way he goes for 30+.

Still, it’s been one of those seasons.

Jones-Drew was over 25 by the end of the third quarter. The Chargers were so far ahead, I had visions of them laying down like dogs for a 60 yard touchdown run or something like that. Thankfully, Blaine Gabbert got a lot of throws, Jones-Drew got some rest, and the clock moved quickly.

The playoffs start for me this week, even if I’m not in the playoffs. I win and my buddy loses, and I’m in. We both win or both lose and I need to outscore him by 60 points. Not gonna happen. He wins, I lose, and it’s over. So losing isn’t much of an option.

Here. We. Go.


plv8

Given I’m aware of Python as an embedded procedural language in PostgreSQL, I should have anticipated that someone would stuff JavaScript into PGSQL. Enter plv8:

plv8 is shared library that provides a PostgreSQL procedual language powered by V8 JavaScript Engine. With this program you can write in JavaScript your function that is callable from SQL.

In the context of PostgreSQL, this means you have a surprisingly useful durable document store you didn’t know you had. The previous link focuses on XML in Postgres, but with plv8 there are plenty of JSON tricks you can do inside a sophisticated relational data management system.

Don’t know how mature plv8 is, but I have a few big piles of Tweet data in JSON format that might be subject to the extension’s charms.

Hat tip to Hacker News


Full Metal Jacket

Just dropped dead in the middle of Stanley Kubrick’s Full Metal Jacket (unfortunately on IFC). Despite the commercials, forgot how damn good, and twisted, a film it is. Apocalypse Now Redux best captures what I know (being of the generation after) of the fucked-upness of the Vietnam war. And it’s just a better movie.

But Full Metal Jacket is a straight up mind fuck. War is hell. Must be that Jungian thing.


wikistream

Link parkin’: wikistream, an experiment in real-time display of Wikipedia edits using node.js and redis. Neat interface within the browser page.

Mainly stashing to note that Wikimedia recent changes are streamed using IRC. I always thought monitoring Wikipedia would be a great sensor for various goings on in the world. Should do a literature search for any uses of the real time stream and then see about the potential for future advances.


PGSQL FDW

Link parkin’: PostgreSQL Foreign Data Wrappers.

When this link first started kicking around, I thought it was just a gimmicky way to pull data into Postgres (PGSQL), destined to be slow and finicky. Boy was I wrong! Turns out Foreign Data Wrappers (FDW) are an SQL standard that the latest version of Postgres, 9.1, heavily supports. Turns out there are all sorts of interesting uses for FDWs. Turns out that Multicorn makes writing the wrappers in Python relatively straightforward.
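To give a flavor of what a Python wrapper looks like, here is a hedged sketch based on my understanding of the Multicorn interface: subclass `ForeignDataWrapper`, accept `options` and `columns` in the constructor, and yield one dict per row from `execute`. The `TweetFileWrapper` class, its `path` option, and the one-JSON-document-per-line input file are all assumptions for illustration. The fallback base class exists only so the sketch runs outside PostgreSQL.

```python
import json

try:
    from multicorn import ForeignDataWrapper
except ImportError:
    # Stand-in so the sketch can be exercised without Multicorn installed.
    class ForeignDataWrapper(object):
        def __init__(self, options, columns):
            pass


class TweetFileWrapper(ForeignDataWrapper):
    """Expose a file of line-delimited tweet JSON as a foreign table."""

    def __init__(self, options, columns):
        super().__init__(options, columns)
        self.path = options["path"]   # hypothetical option set on the foreign table
        self.columns = columns

    def execute(self, quals, columns):
        # quals (the WHERE conditions) are ignored here; Postgres rechecks them.
        with open(self.path, encoding="utf-8") as fh:
            for line in fh:
                if not line.strip():
                    continue
                tweet = json.loads(line)
                yield {name: tweet.get(name) for name in columns}
```

Whether this beats a plain COPY for hairy ingest jobs is something only an experiment would settle.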

At work, I’ve got some hairy data ingest challenges for Postgres. Maybe FDWs can help solve them.

Hat tip, Ben Lorica’s Big Data Twitter stream


MapReduce Workshop

A long time ago, I called Google’s MapReduce distributed programming model a force multiplier. Some, admittedly self-interested, parties are projecting a billion a year Hadoop industry. Some are just projecting that 2012 will be a big year for Hadoop. Given that Hadoop is an open source implementation of MapReduce, I might actually be on target with that prediction.

But enough self-congratulation. I found myself recently wondering what’s next on the MapReduce frontier. Should have known there’s an academic MapReduce workshop for that. Third edition no less.


Streaking

I’m definitely down with meta is murder, but small doses aren’t fatal.

Stupidly forgot to post yesterday, ending my 60+ day streak. Simply had a lot of stuff going on at work and home and lost track of time. Adam Fast recently posted on why he was blogging every day and I greatly sympathize. Which is why I take keeping posting streaks going seriously.

No big deal though, just get back on the horse and start a new streak!


Plexus Ranger Chronicles: Week 12

Small Victory! By the thinnest of margins no less, 0.23 fantasy points. I’m still barely alive for a playoff spot. Maybe The Fantasy Gods are starting to smile on me.

I say that because most of my guys underperformed. DeMarco Murray went over by a point or so. Meanwhile, Rob Bironas, my kicker, exceeded expectations by 5 points. It’s a sad day when your kicker is essential to a win.

My opponent’s team was equally bad, except for one player, Jimmy Graham of New Orleans. Going into Monday night, when the Saints played, my lead was a tick over 23 points. Graham’s been having a good year, in a high powered offense, so 23+ was not out of the question. Given the year I’ve been having, I anticipated this happening. By the 4th quarter, Graham had 22.9 points, I assumed a loss, and I went to bed early. Graham almost had one more catch but got busted up on the play.

And I woke up the next morning with the closest fantasy victory I’ve ever had. Keep hope alive!


ESPN’S Downfall?

Previously, I had pondered what could bring down the virtual programming hegemony of ESPN, the self-proclaimed Worldwide Leader in Sports. My creativity wasn’t too far off, as Comcast is retooling the Comcast Sports Network plus Versus into the NBC Sports Network. But it remains to be seen if this can really be a viable alternative.

My thoughts didn’t stray to that most American of disruptors: scandal. For the longest, I’ve wondered how the sports media industry in general, and ESPN in particular, could never “break” the Major League Baseball steroids story. Given the amount of coverage of baseball, and the porous revolving door between “journalistic” organizations and the baseball franchises, some intrepid reporter should have been able to find at least one smoking gun. It’s odd that it took Jose Canseco going rogue in a bitter snit to bring that house of cards crashing down.

Enter the Bernie Fine fiasco at Syracuse and ESPN dubiously spiking its own foray into the story as ably chronicled by Sports By Brooks. Makes me start to ponder what else the news side of ESPN has decided wasn’t newsworthy or verifiable over the years. Considering all the insider connections that a lot of the on-air talent brings to the table, how is it that ESPN can’t confirm anything that goes on in the sports world?

Right now, since the Bernie Fine/ESPN affair is mostly percolating in the blogosphere, it’s really only a “smoldering” gun. But chain a few of these incidents together, continue the theme of particularly odious crimes, sprinkle in a few higher profile reporters, and cracks could start to show in the Worldwide Leader’s foundation.

Who knows, maybe this theme will catch the eye of another (jealous?) major news organization that runs with it. However, we know there’s at least one such crew that’s right out.

Hat tip to The LaVar Arrington and Chad Dukes show (warning: overdone Flash site).

Proper curly quotes courtesy of admonishment from Brent Simmons and Tim Bray.


Discogs API Redux

Way back in January of 2009, I noted that Discogs.com had a REST API. It’s been a while (at least since I last looked) and Discogs has updated their API for the modern era. Bonus! They now have monthly data dumps!!

Hat tip Paul Lamere


Sans Laptop

Since noonish last Tuesday until 5PM today, I’ve been on the road for the Thanksgiving holiday. The wife’s side of the family is all from Chicago, and I lived and worked there for close to 9 years. Unfortunately, my mother-in-law is a bit technophobic so doesn’t have Internet at her house. Heck, she just upgraded to cable this summer and I was lucky enough to finally have a full slate of turkey day sports for once.

This of course makes regular blogging a challenge. More after the break. Really!

So I decided to give the WordPress for iOS app a serious test drive on my iPhone. I cheated a little, but I’m giving it a thumbs up. With relatively minimal extra effort, I managed to successfully post the last 5 days, without using my laptop. In fact, I only once had to drag myself to a Starbucks and crack the laptop simply to check the time and numbers for a potential Monday telecom. Come to think of it, I probably could have gotten away with a text to deal with that situation.

How’d I cheat? I built up a backlog of mostly cooked posts that needed minimal editing. These were then posted to my WordPress site with draft status. Then the WordPress iOS app was used for some light editing and clean up, before adjusting the publish date and switching the status to published.

I did, however, do one full post completely in the iOS app. This wasn’t too bad, although the on-screen keyboard is obviously quite a bit slower than a real keyboard. So longer posts seem somewhat prohibitive. My main observation is that collecting and adding links definitely has high friction. On the desktop, you can multitask and fast switch between the browser, to look up stuff, and your favorite blog post editor, MarsEdit for me. Doesn’t work quite so well on the iPhone, although maybe that’s something I need to work on. I’ll also start thinking about the types of posts that make sense for a combo of an iPhone and WordPress. For example, impromptu photo posting feels like it would be quite natural, as opposed to my longer form text entries.


Infochimps Geocoding

Link Parkin’: infochimps now provides a geocoding API to translate textual geographical references into lat/long coordinates. Also includes a confidence score.

© 2008-2025 C. Ross Jam. Built using Pelican. Theme based upon Giulio Fidente’s original svbhack, and slightly modified by crossjam.