
Pure GeoIndexing

LazyWeb I beseech you.

I could use a pure geographic bounding box library or server that does for geoqueries what Sphinx does for full text search. Maybe a bit prickly, idiosyncratically extensible, horizontally scalable, and fast as hell by default. Throw in uniquely identified polygons, take queries for all intersections or all contains, return ids. That is all. No transformations or manipulations, just insert, delete, index and answer queries.

PostGIS is great, but the performance of the geographic indexing seems opaque and hard to optimize, at least to this simple soul.
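To make the wish list concrete, here is a minimal sketch of the interface I have in mind: insert, delete, and two queries that return ids. Everything in it is an assumption for illustration. The names `BBox` and `BBoxIndex` are made up, shapes are reduced to their bounding boxes, and queries are a linear scan rather than a real spatial index, so it shows the contract and nothing about the speed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box from (min_x, min_y) to (max_x, max_y)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (
            other.min_x > self.max_x or other.max_x < self.min_x
            or other.min_y > self.max_y or other.max_y < self.min_y
        )

    def contains(self, other: "BBox") -> bool:
        # True when `other` lies completely inside this box (edges may touch).
        return (
            self.min_x <= other.min_x and self.max_x >= other.max_x
            and self.min_y <= other.min_y and self.max_y >= other.max_y
        )


class BBoxIndex:
    """Hypothetical toy index: a dict that is scanned linearly on every query."""

    def __init__(self) -> None:
        self._boxes = {}

    def insert(self, shape_id: str, box: BBox) -> None:
        self._boxes[shape_id] = box

    def delete(self, shape_id: str) -> None:
        self._boxes.pop(shape_id, None)

    def intersecting(self, query: BBox) -> list:
        # Ids of every stored box that overlaps the query window.
        return sorted(i for i, b in self._boxes.items() if b.intersects(query))

    def contained_in(self, query: BBox) -> list:
        # Ids of every stored box that lies fully inside the query window.
        return sorted(i for i, b in self._boxes.items() if query.contains(b))


# Made-up shapes, purely for illustration.
index = BBoxIndex()
index.insert("park", BBox(0, 0, 2, 2))
index.insert("lake", BBox(1, 1, 5, 5))
index.insert("farm", BBox(10, 10, 12, 12))

window = BBox(0, 0, 3, 3)
print(index.intersecting(window))
print(index.contained_in(window))
```

A real version would swap the dict scan for something like an R-tree and shard by region, but the surface area would stay about this small.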


Good Flow

For whatever reason, the items in my various knowledge serendipity flows (a.k.a. webfeeds, although not exclusively RSS) have had a burst of high quality recently. Obviously I want to save some items for further blogging, but a number of them are looking like this quality post on “A Year with MongoDB” found via Hacker News. Obviously this is one anecdotal experience, but it somewhat confirms my experience with MongoDB. Interesting to start, painful in practice.

On a side note, I’m looking at some of the data storage numbers the HNers are mentioning in the comments and feeling a sense of pride that a project at work has a dataset easily comparable.

Yeehah! I get to hang out with the cool kids.


Greg Still Geeking

I’ve been a fan of Greg Linden for a loooong time. Heck, he even tipped me off to MapReduce back when it was just a moderately interesting distributed programming model and well before it was a hip ecosystem. Although now he’s much more intermittent, I still enjoy his posts which are much more link dumps, such as this, these days.

I suspect he’s adding more depth over at his G+ abode, but that’s One Social Network Too Many © for me.


Peak Opacity

Jamais Cascio does his usual great job of mentally exploring the future, contradicting the notion that “data is the new oil”, precisely because data is massively increasing in quantity, especially in the face of corporate interests singularly focused on collecting as much personally identifying information as possible. Instead he projects that the ability to obscure inspection of oneself, a.k.a. opacity, is rapidly becoming scarce. Opacity is even more analogous to oil in that it can be hazardous, a pollutant, and difficult to extract.

Cascio proposes three potential opacity regimes emerging in concert and conflict:

  • Top down regulation: slow moving and hard to get right
  • Bottom up protections: individually powerful but clearly against the interest of large powerful concerns: corporations and governments
  • Emergent disruption through pollution: hard to stop, hard to reverse, and chock full of unintended consequences

An interesting thought experiment.


The EPL Plot Thickens

So just when I was about to check out on the English Premier League, what with ManU up by 8 points with 6 matches left, the Red Devils spit the bit. In an interesting mid-week set of matches, Wigan Athletic beat Manchester United, 1-0. Meanwhile, Manchester City rolled on West Bromwich, 4-0, putting The Citizens 5 points back with 5 to play. Plus, there’s still another leg of the Manchester Derby at the Etihad, where City could conceivably pick up 3 points. Definitely appointment viewing for that match. One more toestub by ManU and things get really interesting.

Arsenal also continued their incredible revival, squashing Wolves 3-0. The Gunners are close to locking up 3rd and Wolverhampton close to locking up relegation.


NexGen MacBook Pros

ArsTechnica, which is fairly reputable, reports that 15” MacBook Pros seem to be in short supply:

“15” MacBook Pros are starting to become scarce among popular resellers, suggesting an Ivy Bridge update could be coming as soon as the end of April. Users hoping for updated 13” and possibly 17” models will likely have to wait until at least June, however.”

Touting an April 29th announcement for new 15-inchers is getting my hopes up. However, I’m not sure “slimmer … sans optical drive” definitely means MacBook Air slim, which is what I’m really fiending for.

And 8GB RAM.

And 512GB SSD for a reasonable price.

A man can dream, can’t he?


Stupefyingly Bad

No, not the Washington Wizards. Yes, your 2011-2012 Charlotte Bobcats.

Tonight’s game against the Wizards was the first I’d seen of the Bobcats on an “extended basis”. In front of an empty Monday night house in Charlotte, the Wizards had a 30 point lead well into the fourth quarter. And it didn’t even feel that close.

There’s nothing I can really put my finger on in terms of why the Bobcats suck so bad. They don’t really have a bona-fide star other than Kemba Walker. Then again, the Wizards really only have John Wall.

But enough of those losers. Yes I know the Wizards aren’t really winning. The level of play has taken an uptick though. They’re not getting mocked on SportsCenter on a daily basis. Guys like Kevin Seraphin, James Singleton, and Cartier Martin are proving surprisingly serviceable, although on a good team they’d be on the end of the bench.

And at the end of the day, Jan Vesely is showing real signs he might pan out, which would be a coup for Ernie Grunfeld (still needs to go).


Loud Leadership

Ryan Tomayko’s take on his management style as Director of Engineering at GitHub is something to aspire to. Check it out and read it all. Choice quote for me:

“I actually don’t show people how to make decisions and ship product in any real direct way. There’s no How To Ship Product training class or anything like that. Instead, I just do work.”


Hooking GitHub

Tarek Ziadé notes a burgeoning trend based on distributed version control that I think is quite important:

“There’s a trend these days on Github-based online services. That is — point me your Github repo and I’ll do something with it everytime you push a change.”

DVCSs, among other things, fundamentally make completely explicit the act of creating a delta on a repository. The explicit commit presents all sorts of opportunities for automation over a code repository. As Tarek points out, it’s not exactly a new trend, but seems to be accelerating with the popularity of services such as GitHub.

His point about dashboards is intriguing. I’d extend it to whole ecosystems of repositories. Betcha GitHub has all sorts of interesting dashboards for their internal QA monitoring.
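As a rough sketch of the “do something every time you push” idea, the snippet below takes the JSON body a push notification might carry and hands a summary to whatever actions are registered. The helper names (`summarize_push`, `on_push`, `handle_request_body`) are hypothetical, the sample payload is invented, and it only assumes the general shape of GitHub’s push event (a `ref` plus a list of `commits` with `id`, `added`, `modified`, and `removed`). The HTTP plumbing is left out on purpose.

```python
import json

ACTIONS = []  # callables to run for every push (hypothetical registry)


def on_push(func):
    """Decorator that registers a function to run on each push."""
    ACTIONS.append(func)
    return func


def summarize_push(payload: dict) -> dict:
    """Pull out the bits of a push notification an automation would act on."""
    branch = payload.get("ref", "").removeprefix("refs/heads/")
    commits = payload.get("commits", [])
    touched = set()
    for commit in commits:
        for key in ("added", "modified", "removed"):
            touched.update(commit.get(key, []))
    return {
        "branch": branch,
        "commit_ids": [c["id"] for c in commits],
        "files": sorted(touched),
    }


@on_push
def queue_tests(summary: dict) -> str:
    # Stand-in for kicking off a test run; here it only reports what it would do.
    return f"test {summary['branch']} at {summary['commit_ids'][-1]}"


@on_push
def update_dashboard(summary: dict) -> str:
    # Stand-in for refreshing a dashboard with the files that changed.
    return f"dashboard: {len(summary['files'])} file(s) changed"


def handle_request_body(body: bytes) -> list:
    """Decode a webhook body and run every registered action against it."""
    summary = summarize_push(json.loads(body))
    return [action(summary) for action in ACTIONS]


# Invented payload, shaped like a push notification.
sample = json.dumps({
    "ref": "refs/heads/master",
    "commits": [
        {"id": "a1b2c3", "added": ["setup.py"], "modified": [], "removed": []},
        {"id": "d4e5f6", "added": [], "modified": ["setup.py", "README"], "removed": []},
    ],
}).encode()

for line in handle_request_body(sample):
    print(line)
```

The explicit commit is what makes this cheap: the payload already says which delta landed where, so each action is just a function of that summary.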


DJ Sneak : Fabric 62

Amazon’s e-mail notices finally brought me something useful. An alert that DJ Sneak’s Fabric 62 had been released. With digital download immediately available to boot.

Sneak occupies an odd place in my house music affections. I love, Love, LOVE his cut Show Me the Way, off of The Polyester E.P. It may be one of my favorite anthems ever. On the other hand, I’ve never really fallen for any of his mix CDs. They’ve all been passable, and I got Fabric 62 just in case there’s a breakthrough, but none of them have ever been on continuous repeat for me. That’s been reserved for folks like Lil’ Louie Vega and Evol Intent.

After first listen, seems like Fabric 62 won’t hit heavy rotation. Lost my headphones and haven’t been able to give it a repeat, but we’ll give it a chance to breathe.


pandas PyCon Tutorial

Link parkin’: Video recording of Wes McKinney’s tutorial on pandas at PyCon 2012. Hard to get a better source than the project’s lead developer.

Serendipitously picked up the full catalog of NextDayVideo’s PyCon 2012 recordings.


A Top Notch Sports Week

Was talking with a colleague at work and noted that April’s first week, Monday to Monday, might be the best week in sports:

  • NCAA Men’s Basketball Championship: Okay Kentucky was impressively the best team in the land, but I’m still waiting for this one to be vacated.
  • NCAA Women’s Basketball Championship: I don’t care who you are, 40 games in a row is spectacular. But Baylor also beat two #1 seeds in the Final Four including Notre Dame for the second time in the season. They also beat perennial blue bloods Tennessee (twice also!) and UConn, along with last year’s national champion Texas A&M (twice also!). Definitely not a creampuff schedule.
  • Major League Baseball Opening Day: We’ll ignore whatever the hell that was over in Japan. Frankly, I’m not sure baseball has opened until the Cincinnati Red Stockings have played.
  • The Masters: A tradition unlike any other. Can’t say I’m a huge golf fan (yet) but Tiger got me hooked on watching, especially on Sunday.
  • NHL Regular Season Finale: At least here in Washington, given the Caps situation this year, the NHL playoffs have already started.

The only other week that’s comparable is Saturday to Saturday of the last week of March, which includes the first day of the Final Four, but leaves out the last day of The Masters.


Milan v Barca 2

DVRed the second leg of the AC Milan versus FC Barcelona UEFA Champions League quarterfinals match. Can’t say I was captivated but did see Messi’s brilliance. That dude can accelerate with the ball like nobody’s business. Two PKs weren’t thrilling but at least Milan put in a little spice by scaring Barca with a leveling goal.

Like I said though, bet on Barca.


PostGIS 2.0

After well over 2 years of development, there’s a new release, 2.0, of PostGIS. The old graybeard general wisdom was that one never took an x.0 release seriously as there were bound to be bugs. Still, I’m mildly intrigued as at work I have multiple millions of geolocated objects stored in PostGIS. There are a few queries that could use any help they can get. Then again, I really need to sit down, do some analysis and benchmarking, and really understand the distribution of my data.

Still, maybe a squeaky new PostGIS can help in the performance arena. Hopefully, the query selectivity has been improved at least.


AdoptedArt

Speaking of the remix project, I finally figured out the perfect name:

The AdoptedArt Project

TA DAH!! Adopted being the antonym of abandoned.

I’ve even gone ahead and snagged the domain name AbandonedArt.org to eventually provide a Web home for the effort. Nothing to see there as of yet though. Move along.


Slowly Getting Git

I’ve been trying to use git, or more importantly absorb its ethos, off and on for a while now. It’s one thing to read about basic branching and merging in a book, and another to internalize an intuitive feel for how to put the facility to use.

Recently, between work and an initial start on the proposed AbandonedArt remix project, I’ve been getting a heavier dose of git usage. “Practice makes perfect,” and that’s definitely happening here. I’ve finally internalized that branches are coding excursions, that you have to check out the branch you want to merge into, and that you then name the branch you want to merge in.

Now if I could only get a sustainable working model of remote repositories. I’m functional, but there are definitely useful bits I’ve missed and I still get hung up on a jagged edge here or there.

And I think resolving merge conflicts is an area where most git coverage could use some extended attention. I’m guessing conflicts are supposed to be rare but they pop up enough, and are tricky enough, that more detail would be helpful to this git apprentice.
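For the record, here is the branch-excursion-then-merge sequence from above as a small script that drives the `git` command line in a throwaway directory. It is only a sketch: the branch names `trunk` and `excursion`, the file names, and the `git` helper are all made up, and it assumes a `git` executable is on the PATH. No conflicts arise here, so it says nothing about the tricky part.

```python
import pathlib
import shutil
import subprocess
import tempfile


def git(repo, *args):
    """Run one git command inside `repo`, failing loudly on errors."""
    cmd = ["git", "-c", "user.name=Example",
           "-c", "user.email=example@example.com", *args]
    return subprocess.run(cmd, cwd=repo, check=True,
                          capture_output=True, text=True).stdout


def merge_demo():
    with tempfile.TemporaryDirectory() as tmp:
        repo = pathlib.Path(tmp)
        git(repo, "init")
        # Name the starting branch explicitly so the demo does not
        # depend on the local default branch name.
        git(repo, "symbolic-ref", "HEAD", "refs/heads/trunk")
        (repo / "notes.txt").write_text("first line\n")
        git(repo, "add", "notes.txt")
        git(repo, "commit", "-m", "initial commit")

        # The coding excursion: branch off and commit there.
        git(repo, "checkout", "-b", "excursion")
        (repo / "sketch.txt").write_text("an experiment\n")
        git(repo, "add", "sketch.txt")
        git(repo, "commit", "-m", "try something on a branch")

        # Check out the branch to merge INTO, then name the branch to merge IN.
        git(repo, "checkout", "trunk")
        git(repo, "merge", "excursion")

        # List the files now visible on trunk.
        return sorted(p.name for p in repo.iterdir() if p.name != ".git")


if shutil.which("git"):
    print(merge_demo())
```

After the merge, the file committed on the excursion branch shows up on the branch that was checked out, which is the whole point of the order of operations.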


ProGit

Link parkin’: Scott Chacon’s ProGit. A handy reference on the Web for using git. Help him out and buy a copy.


Tab Killin’, Python Edition

Flushing out some of the many tabs I’ve collected in Chrome:


Twitter Utility

I’ve actually been on Twitter since 2007, but really haven’t had much use for it. I was connected to a few friends, but wasn’t tweeting much useful or seeing many useful tweets.

Recently I added a few data science folks, and began slowly expanding my follows. Now I’m getting a lot of interesting links, even hitting Tweetbot multiple times in a day. Props to Ben Lorica, @bigdata, Jimmy Lin, @lintool and Joe Hellerstein, @joe_hellerstein for bringing the good bits. Hilary Mason @hmason and Andy Hickl @andyhickl provide a little personality and I’ve got an emerging cluster of Python folks including David Beazley, @dabeaz, Wes McKinney, @wesmckinn, and Adam Klein, @atomklein.


Champions Dud

Well, I got all excited about the AC Milan v FC Barcelona Champions League match yesterday for nothing. A 0-0 draw makes for a dud. Even worse, the big players just missed plays in the final third. And I’ve never seen topflight defenders diddle around in the box so much. Maybe it’s a strategic approach I’m not sophisticated enough to get, but seemed like they were constantly playing with fire.

Hopefully the other half of the tie, next week at the Camp Nou, will be a lot better. Winner take all, bet on Barca.


Yet Another Project

I’ve noodled around with turning the Retrosheet data into SQL for MySQL and PostgreSQL, but never did anything further. Now that I’ve dug into Tastypie a bit, it might be fun and easy to wrap such a database with a RESTful Web API. Desktop visualizations could be easily created, but even better you could do neat browser based renderings given the direction of new toolkits like D3.


Ivy Bridge MacBooks

Marco Arment, who stays way more up to date on this stuff than I do, is making some reasoned projections on changes to Apple’s MacBook lineup. The dream of the 15” Screen/8GB RAM/250GB SSD seems a bit more promising. I’m actually seeing more and more reports of the current line of MacBook Air processors being quite comfortable for computationally intensive activities, which is where I would have been willing to compromise.


bbfun and point total pool

Here’s another project, slightly more complicated than remixing AbandonedArt (although getting pyprocessing working on OS X Lion is a bit challenging). This is sparked by the fact that my long time March Madness bracket league collapsed under the weight of its own success. Last year the amount of prize money in the pool raised the ire of PayPal and the organizers barely managed to get the purse out of them. This year the organizers came to their senses, realized they have better things to do with their lives, and shut the whole thing down. I wasn’t too put out, since I never won any money anyway, and picking brackets is starting to feel stale.

But there’s still a little competitive juice flowing during March Madness.

Back in my late days of undergrad, early years of grad school, there were a couple of fun contests run on USENET, (yes USENET), around college basketball. The first was bbfun, which was essentially a confidence pool over the week of Big 10 men’s basketball games. You picked the winners at the beginning of the week, ranked them in order of confidence, and the scoring was weighted by your rankings.
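I don’t remember the exact bbfun rules, so treat this as a guess at the usual confidence pool scheme: with N games, your most confident pick is worth N points, the next N-1, and so on, and you only collect for picks that win. The function name and the teams are placeholders.

```python
def confidence_score(picks, winners):
    """Score one week's entry.

    picks   -- predicted winners, ordered most confident first
    winners -- set of teams that actually won
    """
    n = len(picks)
    # The pick at position `rank` (0-based) is worth n - rank points if it won.
    return sum(n - rank for rank, team in enumerate(picks) if team in winners)


# Placeholder teams and results, not real games.
picks = ["Team A", "Team B", "Team C", "Team D"]
winners = {"Team A", "Team C", "Team X", "Team Y"}

score = confidence_score(picks, winners)
print(score)  # Team A is worth 4 and Team C is worth 2
```

The weighting is what makes it interesting: getting your top pick wrong costs far more than whiffing on the coin flip at the bottom of the list.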

The second game, Matthew Merzbacher’s point total pool, had you select 8 NCAA tournament teams. You collected contest points for each real-world point your teams scored throughout the tournament. Later on Merzbacher spiced it up by adding bonus points for having lower seeded teams in your slate. In addition to the serious attempts, which could actually involve a fair bit of analysis, plenty of people entered fun theme or joke slates.
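The point total pool is even simpler to sketch: add up every real-world point your teams scored. How the seed bonus was actually computed isn’t something I recall, so the `bonus_per_seed` rule below (a flat bonus per team, proportional to its seed number) is purely a made-up stand-in, as are the team names and scores. A real slate had 8 teams; the example uses fewer to stay short.

```python
def slate_score(slate, game_points, seeds=None, bonus_per_seed=0):
    """Total contest points for one entry.

    slate          -- the teams an entrant picked
    game_points    -- team -> list of points scored in each tournament game
    seeds          -- team -> seed number (optional)
    bonus_per_seed -- hypothetical bonus multiplier per seed number
    """
    total = 0
    for team in slate:
        # One contest point for each real-world point the team scored.
        total += sum(game_points.get(team, []))
        # Hypothetical bonus: higher seed numbers (underdogs) earn more.
        if seeds is not None:
            total += bonus_per_seed * seeds[team]
    return total


# Invented teams and scores for illustration only.
game_points = {
    "Alpha State": [80, 70],  # won once, then lost
    "Beta Tech": [65],        # one and done
    "Gamma U": [90, 85, 77],
}
seeds = {"Alpha State": 12, "Beta Tech": 3, "Gamma U": 1}

plain = slate_score(["Alpha State", "Gamma U"], game_points)
spiced = slate_score(["Alpha State", "Gamma U"], game_points, seeds, bonus_per_seed=2)
print(plain, spiced)
```

The tension in the game comes straight out of the arithmetic: favorites play more games and pile up points, while the bonus tempts you toward teams that may be gone after one.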

Both of these games seem eminently implementable using modern Web frameworks and toolkits. I actually wouldn’t be surprised if either or both had already been done but it would be a fun “reinventing the wheel” project. If executed smartly, probably wouldn’t be too taxing in terms of compute resources, and a small fee from a decent sized participant pool would cover your infrastructure costs. If one could navigate the purse versus gambling issue, prize money could even be incorporated.


DC vs Detroit

At the other end of the spectrum, I can’t believe I’m watching one of the worst NBA matchups this season. The Detroit Pistons against the hometown Washington Wizards. Both teams have lost at least 2 out of every 3 games this season. The Wizards have the second worst record in the league. The Pistons are in a bad bunch with the New Jersey Nets, the Toronto Raptors, and the Cleveland Cavaliers.

I know why the Wizards are awful. What I don’t understand is how the Pistons can be so bad. When you look at their roster, they’ve got two guys with championship rings in Tayshaun Prince and Ben Wallace. Ben Gordon is a proven scorer and has playoff experience. So there’s veteran experience. Greg Monroe and Brandon Knight look like promising young players. Will Bynum, Jason Maxiell, Rodney Stuckey, and Charlie Villanueva have proven NBA talent.

It’s a shame a once proud franchise has fallen so far with no relief in sight.

As for the Wizards, with Andray Blatche on the sideline for “conditioning”, that’s 4 out of 5 that need to be gone. Although the way Jordan Crawford’s shot selection is going, he might make the list soon. Dude, ease up on the pound-the-dribble-for-10-seconds-then-shoot possessions. The Wizards show flashes of quality but not well strung together and the team really doesn’t know how to finish. But it beats the crap they had on the floor before.


Champions League Quarters

So somehow Manchester United, Arsenal, and Manchester City are out of the UEFA Champions League and Chelsea is still alive in the quarterfinals. Something’s unjust in the world. Can’t say as I’m too excited about the Blues v Benfica tie, but might be worth watching given what’s at stake. Maybe Chelsea’s vets will get up off the deck and show some pride after AVB got summarily dismissed.

Now FC Barcelona versus AC Milan? That’s a battle of titans to be admired. Messi against Ibrahimovic is just the starting point. Might have to fire up the DVR for those two matches.


The Lantern Logo

The modern update of the Green Lantern logo has to be at the top of the list of superhero iconography. I’ve seen the logo in a number of pop culture appearances. I might argue that it’s surpassed the Batman and Superman logos for social currency. Makes for a damn nice t-shirt. And as my OS X logon image, the operating system automatically adds some nice shiny highlights that make it look even better.

Too bad it’s tied to one of the dopiest characters in the DC universe. I mean c’mon, a guy who creates randomly convenient physical manifestations sheerly by willpower? You’ve got to be joking. Oh yeah, doesn’t work against the color yellow.


OpenBastion Conferences

While somewhat opaque, The Open Bastion looks like it’s running an interesting series of Python conferences. DjangoCon US, Sep 3-8 in Washington, DC is a no-brainer for me, although I’m a bit tempted by Open Django, June 8-9, in Chicago.


Behrens’ PyCon

Shannon -jj Behrens is doing a great job summarizing a number of the PyCon talks.

I especially like his summary of Ned Batchelder’s Pragmatic Unicode talk. I agree with him that it was one of the best talks of the conference. Despite my limited attendance, we had quite a bit of overlap in session attendance. Maybe that’s an indicator I have good Python taste ;-/


Fiendin’

I’m actually really liking the notion of remixing AbandonedArt (AbandonedRemixed?) and taking it on as a project. But I had to catch myself prematurely optimizing and fiending for new hardware.

New project? Of course I need a new MacBook Air so I can work on it during all my idle moments. Like that worked out so well before.

Okay, it would be nice to have a new low end, or used laptop, just for this project. Ugh. Can’t get much for a measly $250. Besides, I have more important things to spend money on at the moment.

Hey! Let’s get an inexpensive desktop box just for the sake of this effort. Wait, we’re trying to cut back on stuff this year, not add more.

How about this. Get started using The Trusty Ole’ MacBook. IF performance really is an issue then deal with it. If not and the mythical 15”/8 GB RAM surfaces, reward yourself.

Right answer.


PyProcessing and Abandoned Art

Now this might be a feasible 100 Hours project for yours truly. Download the processing sketches at AbandonedArt. Upload sketches to github. Clone each sketch, see how well it comes out in pyprocessing, fixup as needed.

Can be worked on in small discrete chunks. Public visibility would be nice and maybe others would join. And it would provide good feedback for the pyprocessing project, maybe leading to contributions.

Processing logo retrieved from Wikimedia Commons, through a Creative Commons License

N.b. Adjusted publish date to real release date. WP just doesn’t handle dates on draft posts correctly to my mind.


Circus Process Watcher

Link parkin’: “Circus is a program that will let you run and watch multiple processes.”

Might be a little bleeding edge for what I’m doing at work, but looks very attractive.


Moo PyCon Results? Thumbs Up!!

PyCon, the conference experience that keeps on giving the gift of blogging material.

So I actually did take advantage of the promotion that moo did with PyCon, providing a 50 business card sample for the cost of shipping. I picked out the whizzy design you see attached, placed the order, chose the eco-friendly card stock, and waited to see the results.

The pickup experience couldn’t have been easier. Just asked at the registration desk and boom, right in my hands. The unboxing was pretty cool too. moo has nice wrapping. And ultimately the cards were pretty sweet as well. I think the design I chose probably needs stock that can support a bit higher ink resolution and some gloss for best effect. But for what I paid, $4, I can’t really complain.

Also should have sprung for a cardholder.

So consider me a happy new customer of moo. The PyCon set was mostly whimsical, although I now think Mad Data Scientist LLC (or Inc) might make for a pretty good company name. And having “Mad Data Scientist” on my badge seemed to tickle a few folks. Have to hand out a few more cards and check out some more reactions.


Health Datapalooza

The third Health Data Initiative Forum is holding The Health Datapalooza right here in Washington, D.C., June 5-6, 2012, right downtown. I do some healthcare technology related thinking for work, so this seems like a no-brainer. Not sure I like the pure app contest + keynotes format, but I’ll give it a shot.


Mason at Dropbox

Link parkin’: Hilary Mason presenting a techtalk at Dropbox.

I’ve often found Mason’s talks a little fluffy, but this one looks like it has a little more meat about bit.ly engineering behind the scenes.


Pandas Timeseries

Link Parkin’: Adam Klein on pandas’ upcoming timeseries capabilities.

pandas got a lot of love at PyCon 2012. Looks like it might be the next center of gravity for Python and high performance numerical computing.


Down Goes Duke!

There have been upsets in the NCAA Men’s Tournament before, but nothing like Lehigh (15) beating Duke (2).

Two observations. First, Lehigh outplayed Duke in both halves of the game. Duke was a bit lucky in getting a first half lead.

Second, Lehigh looked as good as, if not better than, any ACC team that has played Duke this year (besides North Carolina who basically pwned the Blue Devils twice). Lehigh’s defense was eminently capable of guarding Duke for extended periods of time and Duke never looked to have any sort of physical advantage. I mean C. J. McCollum got ‘em for 30 points! Used to be a time when only folks at UNC, Maryland, UNLV, Michigan, Kentucky, et al. could do that.

The margin between the rest of the country and the just-below-top tier of the big conferences is non-existent.


3 Out Of 5

Yeehaw! It is a wonderful day here in Wizardsville!

Ernie Grunfeld started to evoke my worst fears with NBA trade deadline moves. Thankfully, he actually achieved two of my higher priority desires for the Wizards. JaVale McGee and Nick Young are no longer with the team.

Let me say that again. NICK YOUNG AND JAVALE MCGEE ARE NO LONGER WASHINGTON WIZARDS!! With Flip Saunders fired, that makes for 3 of my 5 wishes. Even so, Young pulled a dick move, rejecting a trade to Denver and forcing a landing in Los Angeles. Good luck getting signed next year bub.

Now Ernie did saddle the franchise with the contract of NeNe, a guy who has difficulty staying healthy, which has 4 years at $13 million per left to go. Still, just getting rid of two clowns who constantly wind up on viral YouTube videos for the wrong reason, and get openly mocked and laughed at on SportsCenter, is a good thing.

Also, I didn’t know Ernie’s contract expires this summer. Depending on whether that’s before or after the NBA draft, Ted Leonsis can just wait him out and gracefully bring in someone new. While I thank Ernie for today’s service, he’s still a central part of the problem. His removal is part 4 of my wishes.

Too bad he couldn’t move Andray Blatche, the fifth and final piece of the puzzle, but hey, baby steps. And (mouth watering) there’s always that veteran amnesty slot.


One Last Thing About PyCon

Unless I find something else interesting to report.

I pretty much went with the notion that the talks were going to be the centerpiece of my PyCon. But it’s pretty clear that, while the talks are solid, they should not be what one goes to the conference for. It should really be about interacting with the larger community of Python enthusiasts. Let me give you an example.

Despite my miserable condition, I did make it to a few talks. One, by Frank Wiles, was on the uses of plpython, Python embedded in the PostgreSQL database. I have a little knowledge of the subject from a project at work.

Frank did a great job, but I thought he left out one key potential use case. PostgreSQL supports the notion of “table functions”, which allows a procedural function to return multiple, multi-attribute values. Said results are treated much like an SQL table, and can be used anywhere a table expression can. This means you can do fun things like joins against the results of a procedure call. The utility is awesome.

After Frank’s done, I feel it’s my public duty to let others know about this awesomeness. So I shamble up to a mic, grunt out a few hoarse sentences that include the phrase “table functions” and nod my head at the response.

After that, a couple of other seasoned PyConners, Charlie Clark (all the way over from Europe!) and Catherine Devlin, spend some time chatting me up. If I was feeling better I would have tried to arrange to meet one or both for a drink later. Charlie was kind enough to almost instantaneously send me a couple of interesting papers on sophisticated PostgreSQL usage.

There was a lot of stuff like that at PyCon. A lot of opportunity to do stuff (besides watch a presentation) with others: tutorials, parties, expo hall, open spaces, poster sessions, programming sprints, programming contests, a 5K run, the hallway track.

To maximize your time at PyCon all you have to do is make the effort to speak up a little. It can even be as simple as saying “Hi!” to someone you’ve never met before.


Bon Voyage Posterous

Even though I sang its praises quite early, I never really got into the Posterous product in a big way. Something about self-hosting my blogging space is just a better fit. However, I did manage to use Posterous for an extended period of narrating my 100Hours project.

Well, Posterous has recently been acquired by Twitter. Looks like the service will run for a while, but these events always are uncertainty generators. Probably a good idea to export my data for “posterity”, yuk, yuk. Congrats to the team for a successful exit.


Speaking of Health

Before bowing out to illness, I did manage to catch Paul Graham’s PyCon keynote. For someone with such a flat affect, he was surprisingly funny, probably because flat means an effective deadpan for delivering zingers.

In any event, he basically read one of his latest essays, Frighteningly Ambitious Startup Ideas. They’re all thought provoking, but I found idea number 7, Ongoing Diagnosis, intriguing:

“One of my tricks for generating startup ideas is to imagine the ways in which we’ll seem backward to future generations. And I’m pretty sure that to people 50 or 100 years in the future, it will seem barbaric that people in our era waited till they had symptoms to be diagnosed with conditions like heart disease and cancer.”

I find the “Quantified Self” movement somewhat pretentious because the community of people who want to continually look at charts about themselves is vanishingly small. But if there were positive outcomes of the kind Graham describes, and you didn’t actually have to look at the data, then I could see it happening.

© 2008-2024 C. Ross Jam. Built using Pelican. Theme based upon Giulio Fidente’s original svbhack, and slightly modified by crossjam.