
Plexus Rangers Chronicles: Week 10

Back on an even week, and got a win. Broke two streaks. First, a three game losing skid, and then the opponent scoring 100+ fantasy points every week so far.

However, the injury bug strikes yet again. This week Julio Jones goes down early to a hammy.

Otherwise, this was a laugher. DeMarco Murray delivered 26 big fantasy points, along with Larry Fitzgerald finally going nuclear for 30+. The supporting cast included contributions from Steven Jackson and Rob Bironas? Gotta love it when your kicker chips in 13 points. Raspberry though for the KC DEF, who only scored 1 point against Team Tebow.

With those four guys, I beat my opponent. He made the unfortunate choice of starting both Darren Sproles and Mark Ingram for a combined total of 4.2 points. A final opposing total of 76 points was much appreciated.

Oh! Almost forgot about the 34 points that Aaron Rodgers threw in for me. It’s starting to feel like 25+ is routine for him.

Let’s see if I can get back-to-back wins and lurch back into the playoff hunt.


Python Reads

Link parkin’: Jesse Noller’s Good to Great Python Reads. Does what it says on the tin.


SpiderDuck Deep Dive

Once upon a time, I was developing a small scale RSS feed aggregator for hundreds to thousands of feeds. Man, the intricacies of repeated HTTP fetching were hairy. Getting HTTP headers right, obeying robots.txt, throttling domain access, dealing with domain resolution, avoiding spider traps, scheduling revisits. Not to mention that if you surmounted all of those hurdles, you still had to deal with people’s often ill-formed RSS. Good times!

While not a one-to-one mapping of concerns, I still found quite interesting this deep dive into the technical details of SpiderDuck, Twitter’s scalable, real-time URL fetcher. They’re not really crawling sites, just trying to fetch URLs as quickly as they flow through the Twittersphere, so Twitter doesn’t have to worry about recursive fetching. Probably don’t have to deal with revisits, but still have to throttle request rates to particular domains. I guarantee there are some twisted souls out there trying to suck their fetching into infinite URL traps.

A lot has changed since I was doing my half-baked tinkering. For example, the whole Hadoop/HDFS/Cassandra/Memcached scalable computing infrastructure flat out didn’t exist. At the same time, a lot of the same issues are still there. The Web is still The Web, warts and all.

Hat tip: Nelson Minar’s Pinboard


SQL The Hard Way

Link parkin’: Zed Shaw is bringing his “The Hard Way” style (worked well for Python) to SQL:

This book will teach you the 80% of SQL you probably need to use it effectively, and will mix in concepts in data modeling at the same time. If you’ve been fumbling around building web, desktop, or mobile applications because you don’t know SQL, then this book is for you. It is written for people with no prior database, programming, or SQL knowledge, but knowing at least one programming language will help.

Learn SQL The Hard Way. It’s only in Alpha though, so I wouldn’t rush out and start working through it, unless you want to file bug reports.


Continued Evaluation

Regarding some recent purchases, DJ Marky’s FabricLive.55 and Marcus Intalex’s FabricLive.35 are both definitely keepers and getting repeated play. Both discs of Pete Tong and Felix Da Housecat on All Gone Ibiza are solid but don’t feel like go-to listens. Noisia’s FabricLive.40 was better than anticipated on second listen, may have to reevaluate. And finally, I thought I would have listened to DJ Heather’s Tangerine a little more, but I’ve really been groovin’ on the DnB purchases.


Enjoying Some Bits

On its recent 10th birthday, I’m acknowledging Nelson Minar’s Some Bits weblog as another site I routinely enjoy. As Minar says:

My weblog is an old school blog, a public diary of things that personally interest me. I mostly write as a way to summarize what I’m learning about something new like living in Paris or flying airplanes. … Go ahead, write something, it’s not hard! Even if no one but yourself ever reads it it’s worth your time.

I’m not so much into the foodie or GA posts, but I usually find the tech nuggets quite good. For example, I was tipped off, by Some Bits, to Adam Fast’s Neogeography blog, which looks top notch.


Batteries Included

At work, I was thinking up a Python script to stream through a bunch of compressed text files. I was starting to devise the logic to run through a list of filenames and present it as one input stream, when I thought “maybe There’s One Obvious Way To Do It already”.

A little Googling and voila! Python’s fileinput module. Does exactly what I need it to do, right down to being able to detect and decompress gzip compressed files. Even better, Doug Hellmann has done a Python Module of the Week (PyMOTW) entry for fileinput, meaning there are clear usage examples on top of the excellent documentation.

I had a script cleanly zipping through 100+ MB of compressed data in under an hour. Python, it’s a beautiful thing.
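
For the record, the core of such a script might look like the sketch below. It’s a minimal illustration, not the actual work script: the iter_lines name is made up, and the bytes check is there on the assumption that older Python 3 releases return bytes from compressed files while newer ones return text.

```python
import fileinput


def iter_lines(paths):
    """Yield text lines from plain, .gz, or .bz2 files as one continuous stream.

    Hypothetical helper for illustration; only the fileinput calls are stdlib.
    """
    # hook_compressed chooses gzip, bz2, or plain open() from the file extension.
    with fileinput.FileInput(files=paths, openhook=fileinput.hook_compressed) as stream:
        for line in stream:
            # Assumption: some Python versions yield bytes for compressed input.
            if isinstance(line, bytes):
                line = line.decode("utf-8")
            yield line.rstrip("\n")
```

Feed it a list of filenames and loop over the result as if it were a single file. The script never needs to care which inputs were compressed.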


Plexus Rangers Chronicles: Week 9

While the even week win trend was busted last week, the odd week losing skid continued. Brutality abounded in multiple ways:

  • The opponent, in the league’s last position, was 1 and 7 entering play.
  • I had a great matchup with Aaron Rodgers and the Packers going against the San Diego Chargers. Problem was the other side had Philip Rivers and Vincent Jackson. Rodgers gave me 30 points, but Rivers, despite two pick-sixes, and Jackson teamed up for 66+.
  • Miles Austin pulled a hamstring in the first half. Thanks for the 6 points buddy.
  • For the second time this season, I played a defense that went for negative points. 7 players versus 8 is a tough get.
  • I knew I had lost by the end of the 4 PM games. I was down 7 points, and the opponent still had two players left to go. So I lost 7 versus 6.
  • The final killer, I had Julio Jones and DeMarco Murray on the bench leading to a net of about 28 points wasted. I didn’t start Murray because I didn’t want to go with 3 Cowboys on my roster. But I didn’t really consider benching Austin for Jones and playing Murray. Duh!

I’m in 7th out of 8 teams, with 5 games to play. Every opponent this year has scored 100+ fantasy points against me. I lead the league on Points Against by 130+. A player in my starting lineup has gotten injured each week. This just might not be my year.


Occupy and V


Back in 2006, I caught the movie V for Vendetta at a matinee. As a middling comic collector in the ‘90s I had the foresight to get in on the limited series at the ground floor. So seeing the movie rendition, even though Alan Moore disowned it, was mandatory.

When I walked out, I was pretty satisfied but a little disappointed. The film was a solid adaptation, taking modern liberties where needed, yet keeping most of the spirit of the graphic novel. For the time it was released, 2006, the movie spoke deeply about government, and society writ large, falling apart. The US administration of the time mirrored some aspects of V’s Britain, and I believed a really great film could have sparked some change.

Turns out the times have caught up with the film. John Scalzi called it in the moment, that V for Vendetta could be a hidden gem of the new millennium’s first decade. Of course he’s correspondingly patting himself on the back, but I also enjoyed how he put the Occupy movement’s appropriation of V imagery in context. The Guy Fawkes Mask now carries symbolism in the US as well as Britain. But, depending on your perspective, it might be a good thing that most Occupiers don’t “remember, remember the 5th of November.”

Makes me wanna do an Amazon Instant or iTunes rental this weekend.


Social, Graph, Neither

I’ve long enjoyed Maciej Ceglowski’s well thought out, if rambling, writings. He’s got that wry tone of Olin Shivers at his best. Recently Ceglowski unloaded The Social Graph is Neither. Seems to have struck a nerve in the blogosphere. Choice quote:

Imagine the U.S. Census as conducted by direct marketers - that’s the social graph.

Summarizations and armchair analysis don’t do it justice. Check out his essay in full. Well worth the time.


The Overweight Lover

The Overweight Lover is no longer in the house. Heavy D. passed away on Tuesday in LA. As a child of 80’s and 90’s hip-hop this stings akin to the passing of Guru.

One could argue Heavy D. was of middling importance in the history of rap. Not the greatest (but still pretty good) lyricist. Never really connected with any innovative style. In no way edgy. Cuddly and friendly on purpose.

You could argue that, but you’d be wrong. In concordance with his girth he had outsized impact. He and The Boyz cranked out a decent sized body of work, and a ton of appearances. Three platinum albums are no joke. Heavy D. was always good for a quality guest spot on an up and comer’s track. The theme song for In Living Color helped put real hip-hop on prime time TV. His rap for Janet Jackson’s Alright added a little street to one of her biggest hits, along with spicing up a great video. He helped expand the boundaries of hip-hop by teaming up with Teddy Riley and blending rap into the broadly accepted New Jack Swing style. This was a precursor to today’s hip-hop infused (dominated?) top of the pops.

Godspeed, kind sir.

On a personal note, Heavy D. was exactly 2 days older than me. That’s it. As Morpheus says Death can come for us at any time, in any place. Granted I’m due for a mid-life crisis, but it makes one stop, evaluate, and reconsider the arc of one’s life.


GeoIQ Streaming

Through work, I’ve had the pleasure of meeting Matt Madigan and Sean Gorman (along with a few others) at GeoIQ, nee FortiusOne. Love their browser based mapping products, and even snuck a few maps into some projects. Failed to ever ignite a larger project, but there’s still hope.

Thanks to an Andy Hickl retweet, I got wind of GeoIQ’s streaming data features. The capability for real-time ingest of data and updates to maps was something I was really looking for in my work projects. Congrats to the GeoIQ team!

Now that I’ve got a pile of geo-tagged data, I might try and test drive the features on GeoCommons as a hobby project. Chris Helm also does a great job of going a layer down and highlighting the tech GeoIQ is using to implement this capability. Hmmm, a vote on the positive side of the ledger for MongoDB.


Gibson in The Paris Review

Link parkin’: Speaking of Torkington’s 4 short links, comes a link to an in-depth interview with William Gibson in The Paris Review.


Slowing the MongoDB Roll

I liked what I’ve read so far about MongoDB, but that doesn’t mean the actual experience will match in practice. Just to temper my enthusiasm, I’ve been keeping an eye out for critical commentary on the NoSQL database.

The comical animated short above takes down some of the common fanboisms at a high level. Meanwhile, Michael Susens-Schurter has a deeper critique of MongoDB with more technical detail from down in the trenches.

I still think MongoDB might be the best fit for my Twitter data noodling, mainly because the streaming data comes out in JSON format. Even if MongoDB fails on a few core DB capabilities, it seems so tuned to storing and querying JSON the reward might be worth the risk. And besides, I’m not doing anything mission critical or at Web Scale.


System D

Link parkin’: Charlie Stross comments on System D:

System D is the planetary unregulated black market, concentrated in the developing world. Excluding traditional criminal activities (robbery, illegal narcotics, extortion) but including small-scale entrepreneurial activities that don’t bother with red tape or taxes or safety regulations, it employs up to 50% of the planet’s work force (1.8 billion workers) and is estimated to be worth $10 trillion a year.

Over and above an apparently important emerging trend, the term System D just has a cool origin (check the link).

Also, another shoutout to a blogger (and author) whose work I really enjoy.


Shoutout to @gnat

Just wanted to take a moment to show my appreciation for Nat Torkington’s work at O’Reilly Radar. The eclectic “Four short links” typically makes my day and has an impressively high hit rate of links I like to stash for further reading.


1Password Banishment

Well, only banished from inside Chrome. Turning on the Chrome process tracker revealed that the 1Password extension would blow up to over 100 MB of virtual memory. Then when I would kill that specific process, Chrome would wholeheartedly lock up. A force quit was needed to take out the crashing browser. My laptop was showing a load of over 35!!

AgileBits doesn’t seem close to solving the problem, so I’m just going to stick to using the 1Password extension in Firefox. There it seems much more well behaved. Meanwhile I just completely disabled the extension within Chrome. Now my laptop is a much happier camper.


Plexus Rangers Chronicles: Week 8

Even week! Must be a win, right? Wrong.

I go into the Sunday night game with over 85 points, a 7 point lead, and 2 players left: Miles Austin and Jason Witten. Going against the Eagles no less. Sure, my opponent had LeSean McCoy who I knew could bust out, but 2 prime players against 1 should be advantage Rangers.

Sadly, I would only go on to add 9 more points (jeez I hate the Cowboys) while McCoy hit for 33. The Plexus Rangers lost by 17 points. And just to rub the salt in my wounds, I didn’t start the Rams’ Steven Jackson. 33 points sitting on my bench.

We’ve had one go round through the league plus one. Every game I’ve had 100 points scored against me. Makes for a 3 and 5 record, 7th place in an 8 team league, and in bad playoff position. At least I’m only 2 games out in the loss column.


MongoDB and Python

No apologies for going back to back on Python posts. I’m slowly easing back into exercising and spent a 1/2 hour on the stationary bike today. Great opportunity to put the Kindle to use, and I did so knocking out about half of Niall O’Higgins’ MongoDB & Python. Very productive use of my time.

At 50 printed pages, it’s a slim tome, but O’Higgins so far has done two things really well. First, in about two or three pages he succinctly summarizes the differences between MongoDB and typical RDBMS systems. Frankly I found 10Gen’s documentation to be pretty poor in this regard. Second, some time is spent on the MongoDB operators that can atomically mutate embedded documents. This is somewhat the equivalent of Redis’ atomic data structures, although not quite as clean in my opinion. Take that with a boulder of salt as I’ve used neither MongoDB nor Redis for even minor experiments much less in production. The key point is that I learned MongoDB is somewhat closer to Redis than I thought.

At $19.99 MongoDB & Python seems a bit pricey for the page count, but I bought the e-book version, which is cheaper, and got a half off deal. So I only wound up paying $8, which is quite reasonable.

Apropos of nothing. My Kindle is showing some cracks in the casing. It’s less than a year old! Then again, My Little Guy(™) has gotten a hold of it a few times.


Python and AWS


Link parkin’: Mitch Garnaat, the author of the boto toolkit, has finished an O’Reilly book Python & AWS Cookbook. Given my tweet collection hacking, looks like it might be useful with its focus on EC2 and S3.

I recently bought a bunch of O’Reilly e-books on Python and MongoDB. What’s one more on the stack?

Garnaat’s blog is probably also a good catch and worth a subscription. For example, he’s got a tutorial on multi-part S3 uploads, which I was having trouble with previously. So I wound up using Cyberduck. Now I can see how to do it right.


On Precision

There’s no sense in being precise when you don’t even know what you’re talking about. — John Von Neumann

I liked Ben Horowitz’s post on Lead Bullets. But I really enjoyed the above quote that was attached.


Recent Purchases

Finally treated myself to some new music purchases. Just a quick listing along with some initial reactions:

  • DJ Marky, FabricLive.55. An interesting mix of DnB and other stuff from a Brazilian prodigy. Probably worth repeated listening.
  • Noisia, FabricLive.40. On first listen, not danceable enough for me. Every time the beat gets going, the mix turns all bleep, bloopy. Probably not due for heavy rotation.
  • Marcus Intalex, FabricLive.35. DnB with a somber bent. Likely to get more listens.
  • Mastercuts, Classic Salsoul, Volume 1. An oldie, but expensive, goodie that I just sprung for used. Worth it just for First Choice’s Let No Man Put Asunder.
  • Pete Tong and Felix Da Housecat, All Gone Ibiza, 11. Only listened to the Felix Da Housecat disc, which had a solid, old time Chicago House feel.
  • DJ Heather, Tangerine. On deck. Can’t wait to hear one of my favorite dj’s earliest commercial releases.

I bought half of these as used CDs from the Amazon Marketplace since there was no digital media purchase option. Worked out pretty well. Despite having to pay standard shipping, since these aren’t covered by my Amazon Prime, they were all still pretty cheap. And it all arrived within 5 business days of my order.


Chrome and 1Password

Okay, my kvetching about Chrome causing machine slowdowns might have been slightly misguided. Disabling Flash has clearly improved things, but I’m still noticing bouts of poor machine behavior. In some way, shape, or form the 1Password Chrome extension is part of the problem. Oftentimes when I go to it to fill in a password it’ll take minutes (yes minutes) to pop up the selection dialog. And the Chrome Task Manager shows the 1Password extension process taking up a couple hundred MBs of memory.

Note that other folks are having similar issues. At least AgileBits is aware of the issue. But if this keeps up I’ll have to push all of my 1Password usage to Firefox. Not horrible, but a bit of a pain.


2MM Lines

On the tweet collection front, the file in which I’m stashing my real-time notifications from Twitter has reached over 2 million lines. The requested notification format is length delimited, a line with the tweet length followed by the tweet text. So by my best estimation, I probably have over 1 million tweets harvested.
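
A validation pass over that format might start from something like the sketch below. It assumes each record is a decimal byte count on its own line, followed immediately by that many bytes of tweet, with the odd blank keep-alive line mixed in. The function name is made up for illustration.

```python
def read_length_delimited(stream):
    """Yield decoded payloads from a binary stream of '<byte count>\\n<payload>' records.

    Hypothetical helper; the record layout is an assumption, not a checked spec.
    """
    while True:
        header = stream.readline()
        if not header:
            return  # end of file
        header = header.strip()
        if not header:
            continue  # blank keep-alive line, nothing to read
        size = int(header.decode("ascii"))  # raises ValueError on a non-numeric header
        payload = stream.read(size)
        if len(payload) < size:
            raise ValueError("truncated record: wanted %d bytes, got %d"
                             % (size, len(payload)))
        yield payload.decode("utf-8").strip()
```

Counting what comes out gives the real tweet total rather than a guess from the line count. A record that was cut off mid-stream should usually surface as a ValueError, either from a short read or from a header that isn’t a number.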

I actually need to do some validation processing, but I think I’m pretty close to declaring victory on this front. Once again, I’m impressed at how little handholding was necessary. My only worry is that it looks like my script restarted its curl process a couple of more times (yeah), but I know I didn’t add any code to cleanly mark those restart points (boo). So there may be some lurking data corruption.

I’ll let it run overnight just to provide a little safety margin, then get to some data crunching. A 2.5 GB file is a heaping plate. 100 10K tweet files sounds like a job for Elastic Map Reduce!


ReAbandoningArt

I took the time to finally pick up Matthew Pearson’s Generative Art. Since I’ve got the honest-to-gosh physical book, I’m trying to actually read all of it end to end. Have to admit, the long preface and first chapter were a bit of a slog. But then I didn’t really need any convincing that generative art is art and is worth exploring. Nice color plates though. Now I’m working my way through chapter 2 even though it’s terminally remedial for me.

Pearson’s AbandonedArt.org was an interesting find though, given my 100Hours project that I eventually abandoned. Pearson had a little more focus and determination, completing his goal of 100 Processing sketches.

He also had the foresight to put most of his code under a Creative Commons license. I had a notion that taking each piece, shoving it into a GitHub repository, and tweaking from there would be a worthwhile effort. A combination of working on generative art skills and DVCS skills. Plus you could see various artistic paths preserved for others to fork.

Sounds like a plan.


StrongSteam

Link parkin’: StrongSteam, an app store of Artificial Intelligence APIs. Buzzword compliant but interesting concept of making AI a service.


RIP JMC

My first taste of writing code was Fortran or BASIC, can’t remember which led, but I’d seen both before graduating high school. Either way I can’t say I’d really done any programming.

I got a brutal, 6 week introduction to C (not Mark Randolph’s fault, just a tough intro language (inadvertent DMR reference)) in a summer prep program before matriculating at MIT. Still, not really a programmer.

Then I had the great fortune of having 6.001 with Jerry Sussman. The whole course was taught in Scheme, a highly elegant, beautifully designed, descendant of John McCarthy’s original Lisp. For whatever the reason, I took to the language like a fish and I’ve been programming, sometimes in dribs and drabs, ever since. I got Lisp well enough to do a UROP under Steve Ward, hacking on Lisp machines. My One True Editor, Emacs, is inseparable from Lisp. The key concept of my dissertation was stuffing Lisp, as an extension language, in various platforms like a Web browser.

Paul Graham has one of the better descriptions of the essence of Lisp, but it’s safe to say that John McCarthy had a pretty profound, indirect influence on my life. Sad to see the end of McCarthy’s life arrive.

Oh and Lisp was just a sidebar for all of the AI work he really wanted to do.

Godspeed, kind sir.


Plexus Rangers Chronicles: Week 7

No need to wait until tomorrow to report on this. Both sides have no players in tonight’s tilt.

Odd week? Must be a loss for the Plexus Rangers. 79 fantasy points for. 146 against. And I thought Man U had it bad this weekend.

I picked up Earnest Graham, who continued the injury streak by going down with a season ending knee/ankle injury after 16 yards gained. At least I dumped Santana Moss for Graham. Moss is now out 4-5 weeks with a broken hand. And here I was seriously considering picking up DeMarco Murray and his 251 yard performance.

Miles Austin chipped in a lousy 2.5 points. Rob Bironas scored a single lonely extra point. Larry Fitzgerald failed to make it to the end zone. At least Aaron Rodgers hit for his regular 25+ points.

My opponent must be pissed he wasted such a great outing on such a feeble opponent. He played both Drew Brees for 40 fantasy points, and Arian Foster for 43 more. That’s right. Two of his players outscored my whole entire lineup.

Onward to next week.


Derby Day

I’m glad the Barclays Premier League returned last weekend from a round of international friendlies and qualifying. Since the live matches shown here in the states are on Saturday and Sunday morning, it gives me something to watch instead of the US football pre-game yakfests.

And jeez did Manchester City deliver an interesting display this morning, opening a can of whoop-ass on Manchester United (yeah that Manchester United) to the tune of a 6-1 beating. This was administered on the Man U home pitch of Old Trafford. By the end it sounded like the Red Devils were just mailing it in. Now Man City is five points clear of the pack.

Then of course, my horse in the race, Chelsea FC, spit the bit, losing to Queens Park Rangers. The Blues could have jumped over Man U to take second in the tables. I feel like Chelsea has a solid team, with a lot of quality, but not quite the character to take the trophy.

Still, there’s a long way to go. By my calculation the league hasn’t even reached the quarter pole in terms of games played, and there’s 7 months left in the campaign.


Mugs

Nefeli Caffe

Being in Loudoun County Virginia, Leesburg nestles in one of the wealthiest parts of the country. But I wouldn’t exactly call it cosmopolitan. Not the boonies, but definitely exurb.

Which makes the notion of building a relationship with a good homey coffeeshop a bit tricky. One’s lucky to find a “cafe” a notch above Starbucks, and then it’s usually placed in a scenic mall setting with a stunning view of a parking lot.

When I first got to Leesburg, there was a franchise of the regional Greenberry’s chain that exactly fit that description. Not Nefeli Caffe, but it did the trick. In fact, the still standing Greenberry’s in McLean actually has a good cafe feel. Then the Leesburg one changed ownership a couple of times, ditched the Greenberry’s affiliation (name change to Dolce Coffee House), and just developed an overall incompatible vibe with my tastes. Couldn’t put my finger on it, but some different commute patterns over the summer, and general indifference led to a steep dropoff in visits. (Unfortunately this led to a lot of Starbucks visits, but that’s for another post.)

Checked back in to Dolce twice this weekend and things feel a bit different. I think they might have changed ownership again, which may or may not be a good thing for the baked products. But the coffee’s still good. The same youthful staff is there on all of my recent visits. They’ve got a little of that urban flair. They may eventually get around to fixing the floorboards, but at this point it’s sort of a cute affectation.

But best of all, they’ll give you a hot drink in a ceramic mug. If I had to put my finger on the singular difference between Starbucks, along with its clones, and real cafes, that’s the one. Mugs are for folks who are going to sit for a leisurely bit, relax and enjoy the atmosphere, and maybe commiserate with like minded customers. Paper cups are for people who are leaving soon.

Bravo Dolce!


Processing Memo

Memo to self, check back in with processing. Looks like there are some big upcoming changes. Time to give up on the “processing in Python” dream and learn to love the best possible tool. I pulled the trigger on Matt Pearson’s Generative Art which should kickstart the process.


Speaking of Data Collection

So Gowalla and Instagram have real-time streaming APIs. Not quite as easy as Twitter since those services use PubSubHubbub to stream their updates. It also means running a notification receiver that accepts incoming HTTP requests on my Linode. So for my next trick, I’d like to see if I can’t get a million updates out of each of those services.
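
A receiver along those lines might look like the sketch below. It assumes the hub verifies a subscription with a GET carrying a hub.challenge parameter that has to be echoed back, and then delivers updates as POSTs, which is how the PubSubHubbub spec describes it. Whether either service follows the spec to the letter is an open question here, and the handler and function names are made up for illustration.

```python
import threading
import urllib.parse
from http.server import BaseHTTPRequestHandler, HTTPServer

RECEIVED = []  # raw notification bodies, in arrival order


class CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Assumption: subscription verification arrives as a GET with hub.challenge.
        query = urllib.parse.parse_qs(urllib.parse.urlparse(self.path).query)
        challenge = query.get("hub.challenge", [""])[0]
        if not challenge:
            self.send_error(404)
            return
        body = challenge.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)  # echo the challenge back to confirm

    def do_POST(self):
        # Assumption: each update arrives as a POST; stash the body and ack fast.
        length = int(self.headers.get("Content-Length", 0))
        RECEIVED.append(self.rfile.read(length))
        self.send_response(204)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def start_receiver(host="127.0.0.1", port=0):
    """Serve on a background thread; port 0 asks the OS for any free port."""
    server = HTTPServer((host, port), CallbackHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A real deployment would bind to a public address, check the other hub.* parameters, and append to a file instead of a list, but the request and response shapes are the interesting part.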


The 500K+ Mark

Passed the 500K mark in tweets collected. Halfway home. I’m not so much amazed that I’ve gotten this far, but that there haven’t been more bumps. Basically, I’ve had a single script running for well over a week. That script has only needed to launch two child processes and only recover once from a child failure. The same script is a paltry 51 lines of code including empty and comment lines. Meanwhile the data file is now well over 1GB.

That’s a testament to 1) Python’s concision, 2) Linux/UNIX’s simple clean design, 3) Twitter’s clean streaming API and 4) curl’s robustness. I’ve just glued a bunch of really solid parts together to good effect.

Now that I’m on the downhill, I’m starting to think of datastores to hold, index, and query the data with. MongoDB seems like it might be the best fit of the NoSQL camp since the Twitter streaming API pumps data out in JSON. MongoDB also seems to have a nice query language, to have indexing for time along with geography, and to be production ready at scale.


Plexus Rangers Chronicles: Week 6

Let’s see, even numbered week. Oh that must mean VICTORY! It was big scoring, for a bye week, but not a big victory. Then again, I was essentially a man down, what with Minnesota as my DEF delivering a -1 on the scorebook. Yes, I would have been better off with an empty spot.

Still won with a comfortable 9 point spread. I’ll give myself a GM pat on the back for picking up the Giants’ Ahmad Bradshaw. Loving his three TDs. Aaron Rodgers met expectations while Buffalo’s Fred Jackson doubled up his projection. After that it was a bunch of mediocre performances.

My opponent must be wondering how he lost. He had the second highest score in the league this week. And the opposing team spotted him a point, and then effectively played 7 guys versus his 8. We were both expecting a track meet between Dallas and New England, but he was hurt worse when a defensive struggle broke out. On my side Miles Austin and Jason Witten were solid, but on the other side Tom Brady and Wes Welker were significantly down. But enough of him, I’m enjoying a win.

Holding down the playoffs in fourth position. Now can I put together back to back victories?


Gibson’s Setup

Obmention: William Gibson’s Setup. Not all that interesting to be honest.


Big Data DC

Attended the Big Data DC meetup this evening. The meeting was surprisingly conveniently located for me and I enjoyed both speakers, even though I only had surface knowledge of the night’s topic. The presentations were focused on the Cassandra NoSQL data store. We got an overview of the fresh off the press Cassandra 1.0 release, and some insights into the CQL query language for Cassandra.

Didn’t mind the dive into infrastructure, but I’m more of a straight-up data and analytics guy. Looking forward to more Big Data DC meetups though.


250K+ More Tweets

Actually, I’ve got over 335K tweets and counting captured on this collection. There were two keys to making progress over my last effort. First, I moved my collector to a personal Linode virtual private server (VPS). As opposed to my ancient home Linux workstation, the VPS can stay up and on the network for days and weeks at a time. No outages due to brief power shutdowns or arbitrary Verizon glitches. Second, my script actually recovered from a network hiccup that killed its child curl process. Picked up the right error code, forked off a new curl, and kept on trucking.
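
The recovery logic boils down to a small supervise-and-relaunch loop. Here’s a minimal sketch of the idea, not the actual script: the function name and parameters are made up, and treating exit code 0 as “done” is a simplification that a never-ending stream collector might not want.

```python
import subprocess
import time


def run_with_restarts(cmd, out_path, max_restarts=3, pause=1.0):
    """Run cmd with stdout appended to out_path, relaunching on non-zero exit.

    Hypothetical helper. Returns the list of exit codes seen, oldest first.
    """
    exit_codes = []
    while True:
        # Append mode, so data collected before a failure is never clobbered.
        with open(out_path, "ab") as out:
            code = subprocess.call(cmd, stdout=out)
        exit_codes.append(code)
        if code == 0 or len(exit_codes) > max_restarts:
            return exit_codes
        time.sleep(pause)  # brief back-off before forking a fresh child
```

In use it would be something like run_with_restarts(["curl", "-s", STREAM_URL], "tweets.txt"), with STREAM_URL standing in for whatever endpoint is being tapped. The returned exit codes double as a cheap record of how many restarts happened.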

The back of the envelope rate of collection is about 60K+ tweets per 24 hours. At this rate, it’ll take me about another week and a half to hit the million mark. I’m not holding my breath, but glad to see a significant advance over the last milestone.

P.S. After a few months, can thoroughly recommend Linode’s services.


Chrome Click To Play

Belay that MacBook Air lust. My poor little white MacBook seemed to hit a severe thrashing wall. (Load avg 23!!) A little inspection and it turns out a bunch of Chrome processes were taking up all the CPU time. I could have gone tab killing, but I had a hunch about another culprit.

A little less than a year ago, John Gruber dumped Flash. I’ve always sympathized with that approach. My laptop’s fan starting up seems to correlate with playing Flash content. Now I suspected Flash was the real villain in my MacBook slowdown. What to do?

Turns out you can configure Chrome to not autoload Flash content. Subsequently you click on demand for playback. Works wonders.

Still got the same number of tabs open in Chrome, but now my machine is qualitatively much more useful.

Hat tip to Chris Kasten


SimpleGeo

Fingers crossed, someday I’ll meet my target of collecting 1 million geotagged tweets. So then what? Well I’m not sure, but I’m guessing some geospatial based analysis might be fun.

Enter SimpleGeo. I don’t quite know what their business model is, but I like their products/APIs:

  • Context: gives you contextual information about a place.
  • Places: allows you to search for further Points of Interest (POIs) near a given location
  • Storage: lets you store, index, and query geospatially tagged data

Plus SimpleGeo’s pricing seems pretty reasonable for a hobbyist tinkerer like me. Not to mention they make the PolyMaps library, which seems mighty handy.


Python Command Line

Intermittently over the past few years, I’ve been writing various command line apps in Python. Things like split, but with a little more complexity and in a higher level language.

I’ve often felt I wasn’t quite writing these in a Pythonic fashion. I’d stashed away Guido van Rossum’s BDFL blessed idiom for main, but never really put it to use.

Steve Lott has a much better and simpler style of writing Python main functions that seems easier to ingest and adopt. He even includes a nice example of how to tie option parsing into logging.

Now it would be great if someone could come up with a cookbook for using the argparse module.
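
In the meantime, here’s a minimal sketch of how the pieces might fit together: a main(argv=None) so the function can be called from tests or the REPL, argparse for the options, and a repeatable -v flag wired into logging. The line-counting job and all of the names are placeholders made up for illustration, not Lott’s or van Rossum’s code.

```python
import argparse
import logging

# Index by the number of -v flags given; anything past -vv stays at DEBUG.
LEVELS = [logging.WARNING, logging.INFO, logging.DEBUG]


def build_parser():
    parser = argparse.ArgumentParser(description="Count lines in text files.")
    parser.add_argument("paths", nargs="+", help="files to read")
    parser.add_argument("-v", "--verbose", action="count", default=0,
                        help="repeat for more logging detail")
    return parser


def count_lines(paths):
    """The actual work, kept free of any option parsing."""
    total = 0
    for path in paths:
        with open(path) as handle:
            count = sum(1 for _ in handle)
        logging.info("%s: %d lines", path, count)
        total += count
    return total


def main(argv=None):
    # argv=None makes argparse fall back to sys.argv[1:].
    args = build_parser().parse_args(argv)
    logging.basicConfig(level=LEVELS[min(args.verbose, len(LEVELS) - 1)])
    print(count_lines(args.paths))
    return 0
```

To run it as a script, finish the file with the usual guard: if __name__ == "__main__": sys.exit(main()), plus an import of sys. Keeping the work in count_lines and the parsing in build_parser means each piece can be exercised on its own.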

© 2008-2024 C. Ross Jam. Built using Pelican. Theme based upon Giulio Fidente’s original svbhack, and slightly modified by crossjam.