
Mark Farina SoundCloud

Feeling like it’s time to change up my listening habits on the iPhone. Right now, I only listen to about 4-5 playlists on a regular basis. There might be another 10 (and that’s being generous) that I listen to occasionally. Finding good mixes on the Amazon MP3 or iTunes Music stores is getting more difficult. If I can get up to speed on iTunes’ Smart Playlists, I’ll dump many of my regular playlists off of the iPhone. Then I’d really like to dial into some house, DnB, and other electronic music podcasts and downloads. Thanks to Tweetbot, I’m seeing a number of podcast retweets in my Twitter stream.

For example, DJ Mark Farina recently wiped out his SoundCloud page and then uploaded a boat load of material. I’ve probably downloaded a bunch in previous incarnations but maybe these are cleaner or higher resolution. Almost definitely a few I haven’t downloaded.

Time to get some inflow from the magnificently wider world.


NetNewsWire Updates

I know it’s not fair, but I’m sort of pining for a NetNewsWire update. Black Pixel just had a release back in October. But when the product changed hands, Brent Simmons and Daniel Pasco got me all hyped up. Unfortunately, the Black Pixel folks aren’t nearly as prolific at blogging as Brent Simmons.

Update. That’s what I get for posting late when tired. Title changed to NetNewsWire from MarsEdit, another fine application which has definitely been regularly updated by its publisher.


PostGIS Index Selectivity

Maybe I’m doing something wrong, but I find I’m fighting PostGIS’s query planner way too often. The problem is that I have a table with a lot of records, the planner incorrectly thinks a geospatial filter will eliminate a lot of the rows, and then decides to do an index scan across the entire table. This happens even when I pair the geo filter with a highly selective date range query. Seems like a known problem with GiST indexes.

Reworking the query to use a subselect, wrapping the geo select around the date filtered select, seems to do the trick. But when you’re working against the system either you’re doing something the system wasn’t designed to do, or you don’t understand the system. Or both.
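Here’s a minimal sketch of the shape of that rewrite, wrapped in Python since that’s where the query would get issued. This is not the actual production query. The table and column names (observations, geom, observed_at), the bounding box, and the psycopg2-style named parameters are all hypothetical. The OFFSET 0 line is an extra assumption: it’s a commonly used PostgreSQL trick to keep the planner from flattening the subselect back into one query, and it may not be needed in every case.

```python
def build_date_first_query(table="observations", geom_col="geom", ts_col="observed_at"):
    """Build SQL that filters by date in a subselect, then applies the geo filter.

    All identifiers are hypothetical placeholders for illustration.
    """
    return f"""
SELECT recent.*
FROM (
    SELECT *
    FROM {table}
    WHERE {ts_col} >= %(start)s AND {ts_col} < %(end)s
    OFFSET 0  -- assumed optimization fence so the subselect is evaluated on its own
) AS recent
WHERE ST_Intersects(
    recent.{geom_col},
    ST_MakeEnvelope(%(xmin)s, %(ymin)s, %(xmax)s, %(ymax)s, 4326)
);
""".strip()


QUERY = build_date_first_query()

# Example parameters: a one-week window and a rough bounding box (made-up values).
PARAMS = {
    "start": "2012-02-01",
    "end": "2012-02-08",
    "xmin": -77.2, "ymin": 38.8, "xmax": -76.9, "ymax": 39.0,
}

# With a DB-API driver such as psycopg2 this would be run as:
#     cursor.execute(QUERY, PARAMS)
```

The point is only the ordering: the selective date filter runs first, and the geospatial test is applied to whatever small set of rows survives.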

More investigation needed.


Interesting Analogies

Even though I’m pulling a big quote, David Galbraith’s I Used to Love Trains, is worth reading in its entirety.

So why isn’t the same true of the road network? That requires massive infrastructure and in France the road network is possibly the most free-market driven infrastructure in the world with private toll freeways operated by companies in places like Dubai. The French road network is ironically far more capitalist than America’s. But it isn’t the nature of the ownership of the infrastructure but the nature of the network itself that determines its character. Railways can only pack a very few number of trains on at one time and there are few exits or stops. Roads have many vehicles like little packets of transport. The road network is like a packet switched one, like the Internet whereas the rail Network is like the legacy, fixed line, monopolistic telephone system.

Galbraith is someone else I’ve been following for quite a while. He’s done quite a bit of interesting stuff in the past. Some of the projects have been quite successful so he speaks with some authority on technology. Galbraith doesn’t post all that frequently. So each one should be savored.


PyCon: Still Locked, Partially Unloaded

Foo. An exceedingly high profile work event has popped up. Luckily it’s located in Menlo Park, CA but of course it’s scheduled for March 7th and 8th. That’s right, dead on top of the 4 tutorials I planned to attend at PyCon 2012.

The downside is that I can’t mainline my PyCon experience like I wanted to. And I was really looking forward to catching up with Allen Downey, a contemporary of mine at UC Berkeley.

The upside is that my flight should now be reimbursed, I get a couple of nights in a hotel, and I can probably expense a rental car. Plus two vacation days I was planning to take go back in the bank. Yay me!

Maybe I’ll make up for it at SciPy 2012. There are some fine folks I’d like to visit in Austin, TX.

Kudos to the PyCon arrangements team though. I was refunded my tutorial payments with no hassle and the reversal cleared my credit card a day or two ago.


Hotel Connectivity

I had the pleasure this past weekend of attending the 2012 BEYA STEM Global Competitiveness Conference in Philadelphia. The event was a much bigger deal than I expected, punctuated for me by conversations with some really high achievers. I also really enjoy visiting The City of Brotherly Love. My accommodations at the Philadelphia Downtown Marriott were quite acceptable, except for one crucial technology element.

Internet access (wired or wireless, everywhere, even in the lobby) cost $12.95 per day. WTF!!

I scoff at you Philadelphia Downtown Marriott and your outrageously overpriced Internet access. Thanks to Starbucks, Cosi, and 3G on my iPhone, I partook of none of your price gouging product. It was a touch inconvenient, but that’s a few less shekels in your coffers. Shame on you!

Just more incentive to get a pay as you go MiFi.


One Laptop

Just a few random thoughts on my iPhone 4 even though it too is getting a little long in the tooth. First, the dang thing is pretty doggone beautiful. Probably like many folks, I got suckered into buying an ugly case before I walked out of the AT&T store, on the pretense of protecting it from a fall. Recently I just took the dang case off and decided to live with the consequences. The iPhone 4 is a sweet looking device and deserves to go unadorned.

Second, the device has gotten me down to typically transporting one laptop and often no laptop. For a while I was lugging my personal MacBook along with my work machine. I was under the delusion that there was enough personal stuff needing doing that I had to have my own machine at all times. Turns out all I really needed to do was 1) check e-mail and occasionally respond, 2) check bank accounts, and 3) keep up with my feeds in NetNewsWire. Well the iPhone takes care of that, gives me the web, provides my GPS navigation, and even lets me keep up with my posting.

People bitch about not getting jetpacks but hell, you got a ParcTab. Beats flying around looking like a dork.


Death By Mountain Lion

Despite my extolling the endurance of my late 2008 White MacBook, it looks like Apple is going to deny it an update to the next version of OS X, dubbed Mountain Lion. Ars Technica is reporting the list of supported machines, and my Trusty Old MacBook doesn’t make the cut.

C’est la vie. Hopefully they’ll provide a few updates for Lion and I can keep productively using the old laptop. At least until I get my hands on my dream 15 inch Air-like PowerBook.


MOOCon

Just for the heck of it, I was considering getting some cards from MOO to hand out when I attended PyCon. I’ve got spiffy cards from work but this is a personal trip. Looks like a few other folks had the same idea, MOO caught wind of the trend, and now they’re running a PyCon special on MOOcards. Nice touch delivering directly to the conference for pickup.


Very Tastypie

Previously I had posted about tire kicking Tastypie, a toolkit for creating RESTful APIs in Django. Well after a few kicks I think I can recommend it as a good solution for exposing data repositories as Web services. I’m currently using it strictly for read-only access to a relatively simple, but large, database, and getting going was insanely easy. And Tastypie turns your Django models into full on REST resources, not a poor RPC over HTTP knockoff.

What really won me over is that I ran into a problem where the default behavior led to a SELECT COUNT(*) from a very large table. That’s a no-no in PostgreSQL. Overriding one method to do some raw SQL into a summary table solved the issue.
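The summary table idea, sketched generically: keep a tiny table of row counts up to date and read from it instead of counting the big table. This is not the actual Tastypie override. It uses Python’s built-in sqlite3 as a stand-in for PostgreSQL, and the events and row_counts tables, the triggers that maintain the count, and the fast_count helper are all hypothetical names for illustration.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a data table and a row-count summary table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT);
        CREATE TABLE row_counts (table_name TEXT PRIMARY KEY, n INTEGER NOT NULL);
        INSERT INTO row_counts VALUES ('events', 0);

        -- Assumed maintenance strategy: triggers keep the summary row current.
        CREATE TRIGGER events_ins AFTER INSERT ON events
        BEGIN
            UPDATE row_counts SET n = n + 1 WHERE table_name = 'events';
        END;
        CREATE TRIGGER events_del AFTER DELETE ON events
        BEGIN
            UPDATE row_counts SET n = n - 1 WHERE table_name = 'events';
        END;
    """)
    return conn


def fast_count(conn, table_name):
    """Read the precomputed count: a single-row lookup instead of a full COUNT(*)."""
    row = conn.execute(
        "SELECT n FROM row_counts WHERE table_name = ?", (table_name,)
    ).fetchone()
    return row[0] if row else 0


conn = make_demo_db()
conn.executemany("INSERT INTO events (payload) VALUES (?)", [("a",), ("b",), ("c",)])
count_after_insert = fast_count(conn, "events")

conn.execute("DELETE FROM events WHERE payload = 'a'")
count_after_delete = fast_count(conn, "events")
```

In the real setup the method being overridden would just run a lookup like fast_count and hand the number back to the paginator.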

Bonus, works very nicely with GeoDjango’s geospatial extensions, which also surprised me with its ease of use. Helps that the geobjects have “convert to JSON” methods that just do the right thing. Worked straight out of the box.


Dancing Deconstruction

Link parkin‘: Neato prints by Niege Borges that deconstruct dances from various popular movies and television shows. Includes illustrations for one of my guilty pleasures, Chicago, and one of my all time favorites (top 5 no less), Pulp Fiction.

Seriously considering buying one or two to put in my office as conversation starters.


On Data Science Teams

I’m anointing myself as a “Mad Data Scientist, Apprentice 3rd Class, Autodidact” based upon hacking activities at work and at home. One of the things that’s starting to interest me is constructing teams of people to do investigations into massive data sets. Herewith a couple of interesting links related to the topic:

Kurt Schrader, Building a Data Science Team at a Startup - An Engineering Perspective

“A data science team seems to end up having people from a much more diverse set of backgrounds than a normal engineering team does, and because of that you’re going to need to figure out how to fit the data science team into your organization. One thing that we’ve eventually come around to is to throw our normal engineering practices out of the window and to start working towards having the data science team work with whatever tools they need in order to move as quickly as possible in their own way.”

DJ Patil, Building Data Science Teams

“All the top data scientists share an innate sense of curiosity. Their curiosity is broad, and extends well beyond their day-to-day activities. They are interested in understanding many different areas of the company, business, industry, and technology. As a result, they are often able to bring disparate areas together in a novel way. For example, I’ve seen data scientists look at sales processes and realize that by using data in new ways they can make the sales team far more efficient. I’ve seen data scientists apply novel DNA sequencing techniques to find patterns of fraud.”


Redesigned MacBook Pros

Really, I had no idea about this MacBook Pro redesign rumor when I posted about my lust for a 15” MacBook Air. Unclear what the processor, RAM, and storage specs would be but I have to imagine anything with “Pro” in the name should be able to sport 8 GB of RAM.

I’m guessing that as a first iteration on blending Air and Pro this’ll wind up being a premium product. Hopefully it only costs an arm or a leg, not both.


iTunes Long Playlists Redux

For the longest time I’ve been really irritated by an old iTunes on iOS misfeature. If a playlist or track name was really long, there was no way to scroll it and see the cut off portion.

Imagine my surprise this past Friday, when I just happen to press on a playlist name for an extended period of time and up pops a black balloon with the full name. Don’t know when Apple fixed it, but irritant removed. Good on ya, Cupertino.


RESTing Django

Previously I had posted about Sleepy.Mongoose, a module for making “RESTful” services out of MongoDB. Turns out Sleepy.Mongoose is really more RPC over HTTP than REST. Plus it seems to be a bit neglected. I gave it a test drive at work and wasn’t feeling very comfortable. So I’m giving Sleepy.Mongoose a pass.

I’m planning to use Django for some backstage Web administration tools, so thought I’d see if there’s anything good for that framework. Ran across Tastypie and looks like it has potential. Especially since there’s a clear example of how to wrap a non-Django ORM data source. Along with continued maintenance and an active community. Kicking the tires this weekend.

Link parkin‘ Piston just in case Tastypie falls through.


White MacBook EOL

Ars Technica is reporting that Apple’s venerable white MacBook has reached the end of the line. They’ve informed their educational channels there aren’t any more to be had. It’s not even clear if Apple had been manufacturing the product at all recently.

Longtime followers (follower?) will note that the vast majority of this blog has been written on one of those MacBooks, from the second post up to this very one you’re reading. The laptop has been one of the better product purchases I’ve ever made. In fact, if you have pretty basic computing demands, it’s still a good value now, three and a half years later on. I’d honestly recommend picking up one on eBay if it’s in good condition.

Thanks to This Old MacBook, I can wait patiently for the 15” MacBook Air sporting 8 GB of RAM and an affordable 500 GB SSD. Right along with a Unicorn!


PyCon Locked and Loaded

PyCon 2012 Logo Finished up my PyCon 2012 registration by adding a heap of tutorials:

I would have preferred Raymond Hettinger’s Advanced Python 1 instead of Social Network Analysis, just to break up the data intensive run, but it was sold out. Maybe sometime in the future.

That’s gonna make for a pretty packed and intense schedule when you factor in the talks on Friday, Saturday, and Sunday. Not to mention I’m essentially taking red-eye flights on each end. But the goal is to mainline the Python experience and come back firing on all hacking cylinders. We’ll see if I survive.


Wizards Worst? Noes!

Figures of course. Flip Saunders dismissed. Andray Blatche on the disabled list. The Wizards play better. Not great, just better.

But they are no longer threatening for the worst win percentage in league history. What’s that about small victories?

Of course now they’re just down to three stooges. Nick Young is on a one year contract so I’m not too worried about him. I won’t stop holding my breath about Ernie Grunfeld until after the trade deadline. Sure I’d like to see him move Blatche and/or Young, but I’m scared of what might come back. Another Rashard Lewis? Tyrus Thomas? Saw his act when he was first drafted by the Bulls. Hey, Mike Bibby isn’t getting any run in New York! That worked out well last time.

And I was just about to say I could live with Javale McGee until the world was regaled with this lovely reminder of his talents:


Hardware Upgrades

Looking to clean up my desktop, I recently purchased some new MacBook related hardware.

  • A Henge Dock to park my old MacBook. They’ve got great marketing materials which sold me, but at the end of the day this feels a bit like a glorified cable organizer. Took a little more effort than I expected to get my machine integrated, and the Ethernet port on my laptop seems misaligned. However, the Henge Dock’s vertical orientation of the computer did help me clear some desk space. Still need to live with it for a bit. I wouldn’t discourage anyone from buying one, just know what you’re getting into. Having said all that, I don’t intend to send it back.
  • An Apple Magic Trackpad. As expected, very elegantly designed and “just works”. Need to get the hang of clicking, selecting, and dragging with it though.
  • A Microsoft Wireless Desktop 2000 kit. The various Apple keyboards just didn’t feel right. I really wanted to go wireless so the full sized USB Apple version was right out. The Apple wireless keyboard is small and light, but cramped. So I went cheap just to get started, and picked up this Microsoft bundle. Seems to be working out fine and the included software correctly remaps the keys to make them Apple compatible.

Super Apathy

Super Bowl XLVI is today. I have to say I’ve become increasingly uninterested in the event. The actual game is appointment television, but the two week buildup, the pregame, the partying, the halftime show, etc. etc. do nothing for me. I’ve taken to tuning in as close to kickoff as possible, even missing up to the first five minutes of the game. Only got burned on Devin Hester’s opening kickoff return, natch.

I’d like to think it’s age and maturity. Could just be familiarity and cynicism.

Okay, with the right people I can be interested in partying, but I’m just sayin’.


Conference Dates of Note

Still working to schedule my three large scale tech events this year. PyCon is close to set, but now I need conferences two and three.

Don’t know how I missed this, but O’Reilly and Cloudera combined Hadoop World and Strata. It’s going to be in New York, in October. Perfect topic, timing, and location. We’ll have to see what the registration costs are and of course New York isn’t exactly the cheapest city to visit.

Meanwhile, SciPy 2012 will be in July in Austin. Downsides? Another Python conference and Texas in July. Upsides? Another Python conference and it’s Austin, Texas. With friends in the area, it’s very tempting.

And I’d still like to squeeze in a David Beazley Chicago Python course.


RESTing MongoDB

Link parkin’: Kristina Chodorow’s Sleepy.Mongoose Python module turns a MongoDB server into a Web based REST server. Might make for a really convenient, fast, lightweight object store.


Meet Magit

Jeremy Zawodny on the Meet Magit screencast:

“Having seen the video, I find that I most often refer to the Magit Cheatsheet and the Magit User Manual. If you’re an emacs and git user, you probably owe it to yourself to spend 15-20 minutes watching the video and trying out magit. I think you’ll quickly realize how useful it is.”

That would be me sir. I’ve been using magit regularly but in the most naive of fashions. So I will be following your suggestion posthaste.


1 Billion Tweets

Once upon a time, I viewed capturing 1 million tweets as a challenge. Now that that’s in the rearview mirror, I’m pondering new achievements. This one is really audacious.

1 BILLION TWEETS! (curling lip à la Dr. Evil and chuckling diabolically)

That would actually be something outstanding for a single individual. Consider that to do it in a calendar year, you’d have to average just a little south of 2.75 million tweets per day, 114K per hour, a bit under 32 per second. I’m not even sure that’s possible with Twitter’s streaming API. And then there’s the bandwidth and storage issues. Not to mention maintaining pretty high annual uptime to stay on target.
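For anyone who wants to check the back-of-the-envelope math, here’s a quick sketch. It assumes a 365 day year and a perfectly even capture rate with no downtime, which is optimistic.

```python
TARGET_TWEETS = 1_000_000_000  # the audacious goal
DAYS_PER_YEAR = 365            # assumption: non-leap year, zero downtime

per_day = TARGET_TWEETS / DAYS_PER_YEAR
per_hour = per_day / 24
per_second = per_hour / 3600

print(f"{per_day:,.0f} per day, {per_hour:,.0f} per hour, {per_second:.1f} per second")
# prints: 2,739,726 per day, 114,155 per hour, 31.7 per second
```

Any downtime pushes the required sustained rate above those averages, which is why uptime matters so much here.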

Now that would just be the capture. What the heck would you then do with all that data? Imagine the possibilities.

I think a reasonably well funded individual could pull it off with careful exploitation of Amazon Web Services, but it would definitely be non-trivial. I can dream can’t I!


pyCLI Redux

Previously, I had posted about the pyCLI package, mostly just stashing the link away. Recently I’ve had occasion to put the module to use for real and started digging into the docs.

Better. Than. Remembered.

The argument parsing is really easy and convenient. I am now salivating about the logging and daemonizing features. pyCLI doesn’t do anything spectacular, but it makes a couple of good-to-do things easy to do.


Spotlight Keywords

At work, I moved to a brand spanking new MacBook Pro. Yeehaw! But in the migration of my old data, Spotlight search over my Thunderbird e-mails no longer worked. This was a big hindrance as I archive e-mail by stashing it in Thunderbird local folders. Thunderbird search is getting better, but I’ve got LaunchBar muscle memory for shooting off Spotlight searches. In effect, I underwent partial institutional memory loss.

After an epic hunt across the Interwebs, I finally managed to find a solution to enable Thunderbird Spotlight indexing thanks to GetSatisfaction. Suffice it to say it involves gnarly file permission tweaking.

Then I got to thinking, wouldn’t it be great to limit Spotlight searches to just e-mail? Gotta be a way to do that right? Turns out Spotlight supports a number of keywords to match various file types. kind:email gets both Thunderbird and Outlook messages. Handy!


Enough Technology

I was diggin’ through the notes crates, looking for something good to post about, when I ran across this oldie but goodie from Paul Buchheit of Gmail fame:

“I don’t believe that’s true though. There is an optimistic way of understanding my first point, and that’s my second point: Even if you aren’t the smartest person around, and your product is kind of ugly and broken, you can still be very successful, if you just build the right product. YouTube and MySpace are both fine examples of this.”

…“When Google acquired YouTube, many people inside the company were flabbergasted, ‘But they have no technology!?’ They didn’t understand that you only need enough technology to make the product work.”

I’ve been trying to bring a more entrepreneurial mentality to work. Unfortunately, the tendency is towards trying to game the customer, rather than building “products” they want. Now we’re in science and technology applied research for the global security market, so product is somewhat ill-defined. But I know one thing, more slideware doesn’t cut it these days. Unfortunately, here’s the pervasive tendency within my org. 1) Think up some cool idea, 2) put it in PowerPoint, 3) shop it to program managers, 4) PROFIT (or not).

And I should really know what Buchheit is saying, given what I’ve seen recently, but it’s always a struggle. Enough technology, or data, or analysis, can make the product, our intellectual services, work.


I Heart Sphinx

It’s ornery and has sharp pointy teeth, but I’m coming to appreciate the Sphinx full text indexing and search engine. Might not have the greatest documentation or APIs but damn does it index like a bat out of hell.

I’ve personally seen it rip through approximately 4 GB of data on a 5 year old server with only 8 GB of RAM, said data on a suboptimal Linux ext3 filesystem, on top of an untuned kernel, and with no thought given to the IO and HD subsystems. Grand total of 21 minutes.

That is a nice capability to have.

I know Lucene with Solr on top is sort of the default open source choice for full text indexing, but if you’re in the market Sphinx is worth a tire kick.


Hadoop World 2011 Presentations

Link parkin’: A fairly comprehensive collection of slides and videos from the Hadoop World 2011 program presentations has been posted directly to the Hadoop World site. Also available as a straightforward list of materials on Cloudera’s site, with registration required.

Hadoop World 2012 in the Fall is fairly attractive.

Hat tip: Jon Zaunich


Another Earlybird

Link parkin’: Earlybird: Real-Time Search at Twitter (PDF warning)

Rapidly updated text indexing from folks who really have to care about it.


Lockout Upside

The NBA Lockout has resulted in a highly compressed schedule. This has the nice upside that between TNT, ESPN, and NBA-TV, practically every night there’s a game on the flat panel. Many of these are interesting match ups.

Of course the downside of the lockout and compressed schedule is that often these games wind up really sucking. Teams had no real training camp, have no real practice time, and are often playing tired. I’m not sure I’ve ever seen this many 20+ point blowouts. And it’s only the first month or so of the season.

Speaking of which, at this very moment the Orlando Magic are beating the Boston Celtics by 21 points, 58 to 37, at the half. A few days ago the Magic only scored 56 points in an entire game against the Celtics.

Makes for good blog writing background noise at least.

Update. Don’t know if this confirms or disconfirms my assertion, but the Celtics came back from 27 points down to win by 8. Watching Orlando collapse was sort of painful. If not good basketball at least it was entertaining.


Seismic Data Science

I couldn’t extract a really good money quote, but found Josh Wills’s post on Seismic Data Science: Reflection Seismology and Hadoop well worth reading. First Wills delves into the squishy term “data science” and usefully adds some definition. Then he looks at how the company he’s with, Cloudera, built some interesting infrastructure to adapt Hadoop to the seismic data processing domain.

One observation Wills made is really on point. The original core Hadoop infrastructure is starting to look like basic plumbing. Meanwhile there’s a coming (ongoing?) explosion of domain specific programming models, tools, and applications being built on top of Hadoop as a platform. Sort of like how Lisp macros enable the proliferation of domain specific languages.

Except not quite as elegant. But I quibble.


One Down, Four to Go

The easiest piece of the Wizards’ horrible Jenga puzzle got yanked today. Flip Saunders was relieved of his coaching duties and probably restored to his sanity. Godspeed kind sir. You deserved better than the dreck you were served.

The scary thought is that this move could buy Ernie Grunfeld more time to damage the team. As Jason Reid points out, he needs to go quickly. Especially because there are some player personnel moves that need to be quickly executed in the hopes that John Wall and others don’t permanently catch the losing bug.

Basketball Prospectus had the scariest headline I’ve seen today: “What if the Wizards Don’t Get Better?”.


Earlybirdin’

I’m locked and loaded for PyCon 2012. Took advantage of the early bird rate to save a few shekels and I’m at least guaranteed a spot for the talks. Still need to sign up for a couple of tutorials, but I will be back in Da Bay for the first time in well over a year, probably starting late on March 6th.

I definitely want the promised cookie. If on the off chance there are any other Pythonistas reading this blog planning to attend, feel free to shoot b m d at crossjam dot net an e-mail ahead of the proceedings.


Diggin’ On: The Basketball Jones

Have to say recently I’ve been quite enjoying The Basketball Jones in general, and Trey Kerby in particular. Mostly because Kerby’s been entertainingly skewering the hometown team, a.k.a. the Washington Wizards. But the overall NBA coverage is a hoot as well.

Seriously, I actually laugh out loud on a daily basis thanks to TBJ!


Arsenal v ManU

While I haven’t been watching a lot of English Premier League matches recently, I have been keeping an eye on the tables. The usual suspects, Manchester City, Manchester United, Tottenham, Chelsea, and Arsenal, are at the top, with the Blues a bit disappointing, the Gunners surprisingly resilient, and the Spurs a welcome new face. After the 8-2 beatdown ManU gave Arsenal earlier in the year, and the acrimony heaped on Arsene Wenger, the Gunners are within striking distance of Chelsea, the fourth spot, and Champions League next year.

We’ll see how far they’ve really come though, what with a visit from Manchester United this afternoon. Add to that Tottenham traveling to Manchester City and it’s a pretty important day in the Premier League.

Good timing for a great match day with the NFL Conference Championships late in the day and no games between ranked teams in NCAA Men’s Basketball.

Update. Both matches were quite entertaining, although I found the Man City / Tottenham game much more compelling. Gotta give the Spurs credit for coming back from two goals down to level. Thought the Gunners were gonna cave, but they fought back too. Not a top move by Wenger though, bringing in Arshavin.


Social Futurism

No not futurism done socially, but futurism that focuses on social developments. Noted futurist Jamais Cascio discoursed on how futurism has been pretty good in predicting technological developments, but pretty poor in foreseeing the grand social changes of our times:

And on and on. If futurists have become almost too good at technological foresight, we remain woefully primitive in our abilities to examine and forecast changes to cultural, political, and social dynamics.

Enjoyed the piece and am quite sympathetic. Definitely worth your time. Being employed by the American military-industrial complex I’ve sort of been at the frontline of this overriding preoccupation with technology and inability to deal with large scale social dynamics. Increasingly I’m coming to the conclusion that understanding and shaping social dynamics are central to solving the world’s problems. Technological means will be a part, maybe even the key catalyst, but they’re not a sufficient condition unto themselves.

We’ve demonstrated we can come up with a tech doodad to solve many complex issues and systematic processes for generating those doodads. But thanks to globalization and hyper-connectivity, maximizing the impact is always a challenge of collective action bordering on a wicked problem.

Maybe this is an indicator of why I enjoy William Gibson so much.

P. S. Even though this is my first link, Cascio’s Open The Future is a good thought provoking read on a continuous basis.


Bob Wyman and FrackIM

A long time ago there was an interesting little startup named PubSub that provided prospective search. Back in my past life, I definitely had a thing for PubSub. And I especially enjoyed Bob Wyman’s blogging before he finally tailed off after PubSub crashed.

I had heard that he’d migrated to Google, but wasn’t really sure what he was up to. Can’t say as I do know now, but Wyman popped up recently in my feeds, experimenting of course with prospective search. Apparently Google’s AppEngine now supports a Prospective Search Service. Wyman has glued in some external real-time streaming services as sources, the Prospective Search Service for matching/filtering (which he knows a lot about), and instant messaging through XMPP to deliver matches to clients. The result is FrackIM.

Neat! And the possibility of following Wyman might be enough to get me to actually use Google+.

A lot has changed technically on the Web since PubSub went off the air. Wonder if they were just ahead of their time, which means the core ideas could make a comeback. Then again, there never turned out to be much of a business model for these services.


Maui and Wikipedia Miner

Link parkin’:

maui:

Maui automatically identifies main topics in text documents. Depending on the task, topics are tags, keywords, keyphrases, vocabulary terms, descriptors, index terms or titles of Wikipedia articles.

Wikipedia Miner:

WikipediaMiner is a toolkit for tapping the rich semantics encoded within Wikipedia.

It makes it easy to integrate Wikipedia’s knowledge into your own applications, by:

  • providing simplified, object-oriented access to Wikipedia’s structure and content.
  • measuring how terms and concepts in Wikipedia are connected to each other.
  • detecting and disambiguating Wikipedia topics when they are mentioned in documents.

Bonus, Chromium Compact Language Detector:

Wonderfully, Google has open-sourced most of Chrome’s source code, including the embedded CLD (Compact Language Detector) library that’s used to detect the language of any UTF-8 encoded content. It looks like CLD was extracted from the language detection library used in Google’s toolbar.

It turns out the CLD part of the Chromium source tree is nicely standalone, so I pulled it out into a new separate Google code project, making it possible to use CLD directly from any C++ code.


SOPA Dope

As someone who’s been on the Internet for closing in on 25 years, I am remiss in not seriously tracking the Stop Online Piracy Act, or SOPA (and PIPA), drama. Especially so since I’ve seen and personally benefited from the explosive expansion of the global network of networks. Unfortunately, I just didn’t have time to contemplate blacking out Mass Programming Resistance.

So as penance, I plan to call my Congress-critters’ offices and give ‘em 2 cents of opposition to SOPA. Senator Warner and Congressman Wolf seem to be on the right side of things, but a little reinforcement is a good thing. Don’t know about Senator Webb though. Given the amount of Internet and Web infrastructure and business based in Northern Virginia, I’d think they’d all be against something that has very clear potential to kill jobs in Virginia.

I like Andy Baio’s explanation of how SOPA would damage his longstanding, and highly entertaining, site. Hat tip also to Baio for this Khan Academy link explaining the downsides of SOPA/PIPA for the more visually inclined. I also like Tim O’Reilly’s reasoning against SOPA as captured by Colleen Taylor.

© 2008-2024 C. Ross Jam. Built using Pelican. Theme based upon Giulio Fidente’s original svbhack, and slightly modified by crossjam.