
8 Years of S3

It feels like more than eight years though, maybe because I’ve been all over it since the beginning:

We launched Amazon S3 on March 14, 2006 with a press release and a simple blog post. We knew that the developer community was interested in and hungry for powerful, scalable, and useful web services and we were eager to see how they would respond.

Of course, I was dead wrong in my analysis. “S3 is not a gamechanger.” What was I thinking? Too much focus on the storage economics and not enough on the business model inflection point.


Python’s wheels

Packaging has always been a bit of a sore spot for Python modules. Maybe wheels are going in the right direction. Armin Ronacher has written a nice overview of how to put wheels into actual useful practice:

Wheels currently seem to have more traction than eggs. The development is more active, PyPI started to add support for them and because all the tools start to work for them it seems to be the better solution. Eggs currently only work if you use easy_install instead of pip which seems to be something very few people still do.

So there you have it. Python on wheels. It’s there, it kinda works, and it’s probably worth your time.
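
For the record, the minimal recipe is pretty short. Here's a sketch, assuming a setuptools project; the package and module names are made up:

# setup.py -- a minimal sketch; "example-pkg" and example.py are hypothetical
from setuptools import setup

setup(
    name="example-pkg",
    version="0.1.0",
    py_modules=["example"],
)

# With the wheel package installed, build and install the wheel:
#   $ pip install wheel
#   $ python setup.py bdist_wheel
#   $ pip install dist/example_pkg-0.1.0-py2-none-any.whl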


Pandas, payrolls, and pay stubs

Brandon Rhodes penned a nice, light, practical introduction to Pandas while using “small” data:

I will admit it: I only thought to pull out Pandas when my Python script was nearly complete, because running print on a Pandas data frame would save me the trouble of formatting 12 rows of data by hand.

This post is a brief tour of the final script, written up as an IPython notebook and organized around five basic lessons that I learned about Pandas by applying it to this problem.
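
In that spirit, here's a minimal sketch of the payoff, with entirely made-up payroll numbers (nothing from Brandon's actual data):

import pandas as pd

# Twelve months of hypothetical payroll figures
payroll = pd.DataFrame({
    "gross":   [5000.00] * 12,
    "federal": [750.00] * 12,
    "state":   [250.00] * 12,
}, index=pd.date_range("2014-01-31", periods=12, freq="M"))

payroll["net"] = payroll["gross"] - payroll["federal"] - payroll["state"]

# One print call, twelve nicely formatted rows
print(payroll)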


Missing Strata West…

Grumble. Not that I’ve ever been to a Strata Conference, but the Twitter feeds of @bigdata and @joe_hellerstein are taunting me.


Pandas 0.13.1

ICYMI, there’s a new Pandas out with a lot of goodies. Python + tabular data processing + high performance == yum!


Diggin’ On Avro

After some initial trepidation, I’m starting to enjoy working with Apache Avro. The schema language and options (avdl, avsc, avpr) are a bit obtuse, but the cross-language interop seems to work as advertised. Which is a good thing.
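
For a flavor of the interop, here's a rough round-trip sketch using the 2014-era Python avro package. The User schema is a toy, but the same .avsc JSON would drive a Java or C client:

import json

import avro.schema
from avro.datafile import DataFileReader, DataFileWriter
from avro.io import DatumReader, DatumWriter

# An .avsc schema is plain JSON, shared verbatim across languages
schema = avro.schema.parse(json.dumps({
    "namespace": "example.avro",
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "favorite_number", "type": ["int", "null"]},
    ],
}))

# Write a container file any Avro implementation can read back
writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
writer.append({"name": "Alyssa", "favorite_number": 256})
writer.close()

reader = DataFileReader(open("users.avro", "rb"), DatumReader())
for user in reader:
    print(user)
reader.close()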


Spark Summit 2014

This looks like it will be bad timing for me, but as an AMPCamp 2013 and Spark Summit 2013 attendee, I can vouch for the event quality:

We are proud to announce that the 2014 Spark Summit will be held in San Francisco on June 30 – July 2 at the Westin St. Francis. Tickets are on sale now and can be purchased here.

For 2014, the Spark Summit has grown to a 3-day event. We’ll have two days of keynotes and presentations followed by one day of hands-on training. Attendees of the summit can choose between a 2-day conference-only pass or a 3-day conference and training pass.

If you can’t or didn’t get to Strata West 2014, this will be your next best opportunity to get a deep dive into the Spark ecosystem.


Data Community DC

I don’t know if it’s the best or the biggest, but DC has one damn well-organized community of data enthusiasts:

Data Community DC (DC2) is an organization formed in mid-2012 to connect and promote the work of data professionals in the National Capital Region. We foster education, opportunity, and professional development through high-quality, community-driven events, content, resources, products and services. Our goal is to create a truly open and welcoming community of people who produce, consume, analyze, and work with data — data scientists, analysts, economists, programmers, researchers, and statisticians, regardless of industry, sector, or technology. As of January 2014, we are currently over 5,000 members strong from diverse industries and from a large variety of backgrounds.

But that’s what we do here in the DMV, build bureaucratic organizational structures. Ha, ha! Only serious.


Trifacta Launch

https://twitter.com/joe_hellerstein/status/430722901596061696

Glad to see Trifacta ship their first product. I had a bit of an insider seat on the Lockheed Martin collaboration. They’ve iterated like crazy since I saw a very primitive version in June. Good luck to Dr. Hellerstein and the team, and of course Go Bears!


Apache Spark 0.9.0

Good to see a release of Apache Spark with GraphX included, even if the graph package is only in alpha:

We are happy to announce the availability of Spark 0.9.0! Spark 0.9.0 is a major release and Spark’s largest release ever, with contributions from 83 developers. This release expands Spark’s standard libraries, introducing a new graph computation package (GraphX) and adding several new features to the machine learning and stream-processing packages. It also makes major improvements to the core engine, including external aggregations, a simplified H/A mode for long lived applications, and hardened YARN support.

Spark is an open source project on the move. Previously, in-memory distributed computation was the big selling point. Now it’s the unification of disparate computational models, cleanly embedded within the Hadoop ecosystem.
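
For a taste of the original selling point, here's a PySpark word count sketch against the 0.9-era API; the HDFS path and app name are invented:

from pyspark import SparkContext

sc = SparkContext("local", "WordCount")

lines = sc.textFile("hdfs:///data/corpus.txt")  # hypothetical path
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))

counts.cache()          # pin the RDD in memory for reuse
print(counts.take(10))  # the first action triggers the computation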


Yet Another MapReduce DSL

Apache Crunch has been around for a while, but the recent addition of support for Apache Spark and a Scala REPL caught my eye:

Running on top of Hadoop MapReduce and Apache Spark, the Apache Crunch™ library is a simple Java API for tasks like joining and data aggregation that are tedious to implement on plain MapReduce. The APIs are especially useful when processing data that does not fit naturally into the relational model, such as time series, serialized object formats like protocol buffers or Avro records, and HBase rows and columns. For Scala users, there is the Scrunch API, which is built on top of the Java APIs and includes a REPL (read-eval-print loop) for creating MapReduce pipelines.


AWS Tips

Link parkin’

AWS is one of the most popular cloud computing platforms. It provides object storage (S3), elastically provisioned servers (EC2), databases as a service (RDS), payment processing (DevPay), virtualized networking (VPC and AWS Direct Connect), content delivery networks (CDN), monitoring (CloudWatch), queueing (SQS), and a whole lot more.

In this post I’ll be going over some tips, tricks, and general advice for getting started with Amazon Web Services (AWS). The majority of these are lessons we’ve learned in deploying and running our cloud SaaS product, JackDB, which runs entirely on AWS.
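
If you're driving AWS from Python, boto is the workhorse library of the moment. A small S3 sketch; the bucket and file names are invented:

import boto

# Credentials come from environment variables or ~/.boto
s3 = boto.connect_s3()

bucket = s3.create_bucket("example-backup-bucket")
key = bucket.new_key("backups/db-2014-03-14.sql.gz")
key.set_contents_from_filename("/tmp/db-2014-03-14.sql.gz")

# And back down again
key.get_contents_to_filename("/tmp/restore.sql.gz")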


Linden Still Linkin’

Greg Linden’s got a new batch of interesting links. That’s worth coming out of posting hibernation (n.b. not retirement).


So Far, So Good

[Screenshot: iTerm session showing 591 days of uptime]

In the past few days I’ve:

  1. Transitioned my feedreading experience post-GReader
  2. Upgraded my WordPress installation and a couple of plug-ins
  3. Gone from Ubuntu 11.10 (pictured above, 591 days uptime wow!) to Ubuntu 12.10

Everything so far has been pretty painless, other than one lingering bug in feedbin that only seems to affect the feed for Tim Bray’s Ongoing. Unfortunately, this is one of my favorite feeds. It seems a bit suspect that the issue lingers: Bray and his feed have been around forever, and a good feed library should process his, of all people’s, correctly. But I’ll chalk it up to feedbin’s growing pains.

And ReadKit is passable, but I wouldn’t exactly call it … zippy … on Ye Olde MacBook.


The Final Call

Give or take a few hours due to potential timezone adjustments, in 6 hours Google Reader will go dark. Once again, shout out to all GReader staff past and present for delivering a ton of value for nothing out of my pocket. No heapings of scorn from this quarter. Execs made a business decision and I wasn’t exactly a paying customer. It was a good run while it lasted. Special kudos to Mihai Parparita for whipping together the eminently useful readerisdead toolkit on short notice. Somehow it successfully slurped down multiple gigabytes of Reader data for me across multiple accounts.

Moving onward! I’ve decided to go with feedbin.me since it’s approved for use with Mr. Reader. I realized that despite my affection for NetNewsWire, I now do the vast majority of my feedreading on my iPad, either on the couch or interstitially. So tilting towards my favorite reader there means the least dislocation. Meanwhile, Marco Arment somewhat put a stake in the prospects of NetNewsWire. To compensate on the desktop, I’m adding ReadKit to the mix.

However, like dangerousmeta, I waited until the last minute to make up my mind. I’m reserving the right to radically change my mind as I see fit.


And We’re Back

A month or so off felt pretty good. Lots of great stuff out there in the RSSsphere, despite the coming apocalypse. This choice nugget from Cal Newport on building a great career really hit home:

The courage culture paints a tempting picture of how people end up with remarkable lives. It tells a story where you’re the main character, fighting evil forces, and ultimately triumphing after a brief but intense battle.

The reality is decidedly less exciting. Remarkable careers require that you become remarkably good. This takes time. But not necessarily a string of defiant rejections of some mysterious status quo.

As for feed reading, I’ve stashed my feed list, and may just go Brent Simmons and live with the next NetNewsWire.


Slowing Down

This post is of a pair with 540. And due to Hokey Smokes, there’s even a little more spice. I finally get a link and then I’m going to put on the brakes!

Since today is my birthday, I try and reflect on things I can readily change up to stay out of unhealthy ruts or just to keep myself fresh. 540 days in a row is more than enough to prove that I can keep a posting streak alive. The conjunction of birthday, nice round number, and national holiday seems more than auspicious timing to give up that streak. Plus, I’ve been at this blogging thing off and on for well over 10 years. (Remember when it was all about “social software”?)

Even though I disagree a bit with the whole post, Greg Linden recently captured a bit of where I’m at:

I find my blogging here to be too useful to me to stop doing it. I have also embraced microblogging in its many forms. Yet I am left wondering if there is something we are all missing, something shorter than blogging and longer than tweets and different than both, that would encourage thoughtful, useful, relevant mass communication.

We are still far from ideal. A few years ago, it used to be that millions of blog and press articles flew past, some of which might pile up in an RSS reader, a few of which might get read. Now, millions of tweets, thousands of Facebook posts, and millions of articles fly past, some of which might be seen in an app, a few of which might get read. Attention is random; being seen is luck of the draw. We are far from ideal.

I don’t think blogging is dead. I’m not sure blogging was always about journalism. And I personally haven’t embraced microblogging, although Twitter makes for a great link stream. But blogging is too useful (and fun!) for me to stop cold. I will, however, be slowing down a bit. Might be a couple of posts a week but probably no less than once per. I will, however, feel no obligation to any given frequency. So if you’ve been using this blog as your daily hit of excitement, I thank you for your attention, but encourage you to add another source or two as a replacement.

And even though the posting streak wasn’t particularly onerous in terms of time, I’ll be trying to turn the same habits of mind to side projects involving coding and data analysis. This also means my content should trend to more technical topics, but we’ll see. With this year’s #3 pick in the NBA draft, the Washington Wizards’ luck is looking up, so they might be even more interesting to talk about in 2013-14. I also have this half-assed idea to do a series of 10 REM posts, reminiscing on 10 years of blogging, by trawling through the archives of Mass Programming Resistance, New Media Hack, and out into the wider web.

Be seeing you!

P.S. Feels like Greg has another start-up thread within him even though there’s already a clear direction for Geeky Ventures!


540

Check this out:

Enthought Python Distribution -- www.enthought.com
Version: 7.3-2 (32-bit)

Python 2.7.3 |EPD 7.3-2 (32-bit)| (default, Apr 12 2012, 11:28:34) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "credits", "demo" or "enthought" for more information.
>>> import datetime
>>> datetime.date(2013,05,26) - datetime.date(2011,12,3)
datetime.timedelta(540)
>>>

That’s my way of saying I’ve posted for 540 days straight. Also on the order of 599 out of 600. Yay me!

In fall of 2011, just on a lark and as an experiment in behavior modification, I set a goal of posting for 365 days straight. I slipped up, got back on the horse and never looked back. Mission accomplished.

Got a few other things to say, so more after the break:

It’s fairly amazing some of the things I’ve managed to push past:

  • Holidays and holiday travel
  • “Vacations”
  • Crunch times at work
  • Work travel
  • Long days of work + kid + social
  • Personal illness
  • Oral surgery
  • When my wife was in the emergency room after a car accident
  • When I was in the emergency room after passing out from dehydration
  • Days when I thought I had nothing to say
  • Days when I really didn’t want to say anything
  • Days when the world had more important things going on

I learned a lot on this road. Every day I tried to push out some content at least of interest to me, and maybe to someone else. If you’re not in it for the money or the fame, that’s the best you can do. Getting worked up about some fictitious “audience” doesn’t do you any good.

A wide, but not overwhelming, variety of source feeds is necessary. Particular tools don’t make much difference, although posting from the iPhone came in handy a few times. I occasionally did the prepared in advance post thing, but 90% of them were fresh baked that day. Queueing ’em up always seemed a bit like cheating.

One final, interesting observation about a time-oriented goal like this one is that you can’t make it go any slower or faster. No way to speed it up and get it over with quicker. No way to put it on pause just because you’ve got an issue in your life. It’s all about grinding out the clock. But in a twist, you start to crave the daily achievement. You wake up thinking about what today’s post should be. And you get a bit fidgety late in the evening if you haven’t yet met deadline. It becomes an addiction, but in this case a good one.


Hokey Smokes!

Yowsa! I actually got a link-out from dangerousmeta! I’m showing my blogging age here, but I’ve noted in the past my admiration for the site. Meanwhile, I don’t do any audience tracking or visit analytics at all for MPR. Pretty much have no idea who’s actually reading this stuff, if anyone. So it’s one of those old time, early ’oughts (yup, I go back that far) thrills to see the site title pop up in another feed.


Essential Formulae

Evan Miller’s statistical material for programmers might come in handy:

As my modest contribution to developer-kind, I’ve collected together the statistical formulas that I find to be most useful; this page presents them all in one place, a sort of statistical cheat-sheet for the practicing programmer.

Most of these formulas can be found in Wikipedia, but others are buried in journal articles or in professors’ web pages. They are all classical (not Bayesian), and to motivate them I have added concise commentary. I’ve also added links and references, so that even if you’re unfamiliar with the underlying concepts, you can go out and learn more. Wearing a red cape is optional.
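
As a taste of the genre, here's one classical staple rendered in Python, a normal-approximation 95% confidence interval for a sample mean. My sketch, not Miller's code:

import math

def mean_confidence_interval(xs, z=1.96):
    """95% CI for the mean, via the normal approximation."""
    n = len(xs)
    mean = sum(xs) / float(n)
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)  # sample variance
    se = math.sqrt(var / n)                           # standard error
    return mean - z * se, mean + z * se

print(mean_confidence_interval([12.1, 11.8, 12.4, 12.0, 11.9]))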


Deep Into Partitions

Network partitions, that is, and their implications for some common, popular, open source datastores. Kyle Kingsbury has cooked up “Call Me Maybe”:

This article is part of Jepsen, a series on network partitions. We’re going to learn about distributed consensus, discuss the CAP theorem’s implications, and demonstrate how different databases behave under partition.

In-depth technical content on the Web. Who knew! You have been warned.


CommaFeed As Backup Plan

I was all set to put CommaFeed on the list of potential GReader replacements after seeing a mention coming across the MetaFilter feed. Then I started reading the MeFi comments and this one from Rhaomi really hit home:

It’s not just the interface and UI, which is pretty easy to clone. It’s the staggering infrastructure that powers it — the sophisticated search crawlers scouring the web and delivering near-real-time updates, the industrial-scale server farms that store untold petabytes of searchable text and images relevant to you (much of it from long-vanished sources), the ubiquitous Google name that makes the service a popular platform for innumerable third-party apps, scripts, and extensions.

It’s possible to code up something that looks and feels a lot like Reader in three months, with the same view types and shortcuts. But to replicate its core functionality — fast updates, archive search, stability, universal access, wide interoperability — takes Google-scale engineering I doubt anybody short of Microsoft/Yahoo can emulate. It was very nearly a public service, and it’s going to be frustrating trying to downsize expectations for such a core web service to what a startup — even a subscription-backed one — can accomplish.

Not to mention the current CommaFeed landing page annoyingly doesn’t have any type of “About” page, just a forced funnel to registration. Hey, I like to at least be sweet-talked a little before wasting a password!


NewsBlur Recco

Rafe Colburn is bullish on NewsBlur as a replacement for Google Reader, especially after the recent redesign:

The upside of NewsBlur has been that it works really well at its core purpose, fetching feeds and displaying them for the user. It also has solid native mobile clients, enabling you to keep read status in sync across devices.

That’s a good enough endorsement for me. With the clock ticking on the GReader shutdown, I’ll give NewsBlur the first crack at filling the void for me.


Cal Berkeley GraphX

C’mon Bears, cut it out. It’s getting embarrassing how much Spark-related output there has been recently. In a good way!

From social networks to targeted advertising, big graphs capture the structure in data and are central to recent advances in machine learning and data mining. Unfortunately, directly applying existing data-parallel tools to graph computation tasks can be cumbersome and inefficient. The need for intuitive, scalable tools for graph computation has led to the development of new graph-parallel systems (e.g. Pregel, PowerGraph) which are designed to efficiently execute graph algorithms. Unfortunately, these new graph-parallel systems do not address the challenges of graph construction and transformation which are often just as problematic as the subsequent computation. Furthermore, existing graph-parallel systems provide limited fault-tolerance and support for interactive data mining.

We introduce GraphX, which combines the advantages of both data-parallel and graph-parallel systems by efficiently expressing graph computation within the Spark data-parallel framework. We leverage new ideas in distributed graph representation to efficiently distribute graphs as tabular data-structures. Similarly, we leverage advances in data-flow systems to exploit in-memory computation and fault-tolerance. We provide powerful new operations to simplify graph construction and transformation. Using these primitives we implement the PowerGraph and Pregel abstractions in less than 20 lines of code. Finally, by exploiting the Scala foundation of Spark, we enable users to interactively load, transform, and compute on massive graphs.

Need to drill in to see how GraphX stacks up to the current spate of “big data” graph toolkits, especially GraphLab. Ben Lorica reports that GraphX is oriented more towards programmer productivity than raw performance:

GraphX is a new, fault-tolerant framework that runs within Spark. Its core data structure is an immutable graph (Resilient Distributed Graph, or RDG), and GraphX programs are a sequence of transformations on RDGs (with each transformation yielding a new RDG). Transformations on RDGs can affect nodes, edges, or both (depending on the state of neighboring edges and nodes). GraphX greatly enhances productivity by simplifying a range of tasks (graph loading, construction, transformation, and computations). But it does so at the expense of performance: early prototype algorithms written in GraphX were slower than those written in GraphLab/PowerGraph.


Ricon East Talks

Wow! Basho’s Ricon East conference was a little more diverse and wide-ranging than I anticipated. This was evidenced by Anders Pearson’s summary of the talks he attended. For example, this lede on ZooKeeper for the Skeptical Architect by Camille Fournier, VP of Technical Architecture, Rent the Runway:

Camille presented ZooKeeper from the perspective of an architect who is a ZooKeeper committer, has done large deployments of it at her previous employer (Goldman Sachs), left to start her own company, and that company doesn’t use ZooKeeper. In other words, taking a very balanced engineering view of what ZooKeeper is appropriate for and where you might not want to use it.

Of the talks Pearson summarized, only two were by Basho employees while the rest were by some pretty serious distributed systems folks such as Margo Seltzer and Theo Schlossnagle. Plus there was a healthy dose of industry war story experience at scale.

Good on Basho!

Via John Daily


HDFS Gets Snakebitten

https://twitter.com/pypi/status/335412456396558336

Another good find from the PyPI Twitter stream. Had to do a quick Google search to get the real details on snakebite, a pure Python library for interacting with Hadoop’s HDFS:

Another annoyance we had with Hadoop (and in particular HDFS) is that interacting with it is quite slow. For example, when you run hadoop fs -ls /, a Java virtual machine is started, a lot of Hadoop JARs are loaded and the communication with the NameNode is done, before displaying the result. This takes at least a couple of seconds and can become slightly annoying. This gets even worse when you do a lot of existence checks on HDFS; something we do a lot with luigi, to see if the output of a job exists.

So, to circumvent slow interaction with HDFS and having a native solution for Python, we’ve created Snakebite, a pure Python HDFS client that only uses Protocol Buffers to communicate with HDFS. And since this might be interesting for others, we decided to Open Source it at http://github.com/spotify/snakebite.

Roger that on the annoyingly slow response of hadoop fs. Thanks Spotify.
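
The client API is pleasantly small. A quick sketch, with an invented NameNode address:

from snakebite.client import Client

client = Client("namenode.example.com", 8020)

# Straight protobuf RPC to the NameNode -- no JVM startup tax
for entry in client.ls(["/"]):
    print(entry["path"])

# The cheap existence checks luigi leans on
print(client.test("/user/hive/warehouse", exists=True))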


Jepp, CPython and Java

TIL about Jepp:

Jepp embeds CPython in Java. It is safe to use in a heavily threaded environment, it is quite fast and its stability is a main feature and goal.

Could be handy for cutting down performance overhead at some points in the Hadoop stack where Python and Java come together. I’m looking at you Hadoop Streaming. Also for helping Python out with the myriad of serialization formats that Java does oh so well.

Via Morten Petersen


Praising Data Engineering

Metamarkets’ M. E. Driscoll gives a shout out to those mucking about with the bits:

A stark but recurring reality in the business world is this: when it comes to working with data, statistics and mathematics are rarely the rate-limiting elements in moving the needle of value. Most firms’ unwashed masses of data sit far lower on Maslow’s hierarchy at the level of basic nurture and shelter. What is needed for this data isn’t philosophy, religion, or science — what’s needed is basic, scalable infrastructure.

The more data analysis I do, the more critical plain ol’ wrestling with the data becomes. And figuring out the plumbing and tools to make that happen becomes more interesting.

Via Rafe Colburn


EC2 Instance Primer

Amazon EC2 is a great service, but sometimes it’s hard to keep track of all the virtual machine types on offer. Jeff Barr put together a handy, comprehensive backgrounder on Amazon EC2 instance families and types:

Over the past six or seven years I have had the opportunity to see customers of all sizes use Amazon EC2 to power their applications, including high traffic web sites, Genome analysis platforms, and SAP applications. I have learned that the developers of the most successful applications and services use a rigorous performance testing and optimization process to choose the right instance type(s) for their application.

In order to help you to do this for your own applications, I’d like to review some important EC2 concepts and then take a look at each of the instance types that make up the EC2 instance family.

Even better, he covers the intended use cases for each family and their designed performance tradeoffs. Keep it in your back pocket if you’re an EC2 hacker.


GraphLab Inc.

https://twitter.com/bigdata/status/334341075068137473

I’ve mentioned GraphLab and have been toying with it since before its 1.0 release. Now the stakes have been raised with a de-cloaking and a heap of venture capital. Good luck to Professor Guestrin and crew.


Truer Words

https://twitter.com/UnlikelyWorlds/status/334384901547757568

Truer words were never spoken of your humble narrator. Would that he could get his outer pedant under control.


Python XML Processing

The Discogs.com data is in some humongous XML files, which is a little unruly for many data hacking tasks. Python has some great XML processing modules, but it’s always good to have a little guidance. Enter this oldie but goodie from Eli Bendersky on Processing XML in Python with ElementTree:

As I mentioned in the beginning of this article, XML documents tend to get huge and libraries that read them wholly into memory may have a problem when parsing such documents is required. This is one of the reasons to use the SAX API as an alternative to DOM.

We’ve just learned how to use ET to easily read XML into an in-memory tree and manipulate it. But doesn’t it suffer from the same memory hogging problem as DOM when parsing huge documents? Yes, it does. This is why the package provides a special tool for SAX-like, on the fly parsing of XML. This tool is iterparse.

I will now use a complete example to demonstrate both how iterparse may be used, and also measure how it fares against standard tree parsing.

If I were going to update Bendersky’s post, I wouldn’t change much, other than to mention lxml and lxml.etree, which provide high-performance streaming XML processing.
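
For the Discogs case specifically, the iterparse pattern would look something like this; I'm assuming a dump full of <release> elements, and the file name is illustrative:

import xml.etree.cElementTree as ET

count = 0
for event, elem in ET.iterparse("discogs_releases.xml", events=("end",)):
    if elem.tag == "release":
        count += 1
        # ... pull out whatever fields you need here ...
        elem.clear()  # drop the element so memory stays flat

print("releases seen: %d" % count)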


More Git Material

Haven’t finished working through them, but these git intros feel pretty useful. Slideshare alert if you’re allergic.

Introduction to git is by the venerable Randal Schwartz. Got a little dust. If up to typical Schwartz standards, still well worth reading.

Lemi Orhan Ergin’s Git branching Model might be overly stylish, but looks like it goes into detail on merging in addition to branching.

Via Rajiv Pant.


SourceTree

Link parkin’: SourceTree, Atlassian’s desktop GUI DVCS client:

Full-powered DVCS

Say goodbye to the command line – use the full capability of Git and Mercurial in the SourceTree desktop app. Manage all your repositories, hosted or local, through SourceTree’s simple interface.


Mission Accomplished

Still checking for consistency, but it looks like I’ve nearly completed my mission of grabbing all the currently available Discogs.com data dumps. I have one more to grab and checksum-verify, and then I should be good to go. 45+ GB (compressed) to romp through.

Oddly, it looks like we’re only getting releases updated for the month of May. Curious.


Emacs Temp Files in Their Place

Really handy tip from Emacs Redux:

Auto-backup is triggered when you save a file - it will keep the old version of the file around, adding a ~ to its name. So if you saved the file foo, you’d get foo~ as well.

auto-save-mode auto-saves a file every few seconds or every few characters …

Even though I’ve never actually had any use of those backups, I still think it’s a bad idea to disable them (most backups are eventually useful). I find it much more prudent to simply get them out of sight by storing them in the OS’s tmp directory instead.

I find the biggest pain with autosave files is getting git to ignore their existence. Yeah, I can fiddle around with .gitignore files, but that never quite seems to be universally applied correctly for me. Keeping Emacs temp files out of project directories entirely makes the whole issue go away.


GLA38

https://twitter.com/djmarkfarina/status/330332464927080450

Go get your latest Mark Farina podcast, NOW!


Continuous Partial Insanity

Playing off of continuous partial attention, a particularly bad patch of TV convinced me it’s just a medium for “continuous partial insanity”. Between The News, “reality shows”, the fictional programming, and the advertising, the only intent is to keep you in a state of intense emotional elation or despair. Mostly despair, since fear drives sales.

Criminy! Sports is a relative island of rationality, structure, and order.

Interestingly, a Google search for “continuous partial insanity” currently only brings up a long abandoned blog, parked on it as a tagline. Seems like an opportunity.


A Week of Google Glass

Luke Wroblewski approaches interface design and user experience with real seriousness. So his Google Glass experience was the first commentary I took seriously:

Almost a week ago I picked up my Glass explorer edition on Google’s campus in Mountain View. Since then I’ve put it into real-world use in a variety of places. I wore the device in three different airports, busy city streets, several restaurants, a secure federal building, and even a casino floor in Las Vegas. My goal was to try out Glass in as many different situations as possible to see how I would or could use the device.

During that time, Scott Jenson’s concise mandate of user experience came to mind a lot. As Scott puts it, “value must be greater than pain.” That is, in order for someone to use a product, it must be more valuable to them than the effort required to use it. Create enough value and pain can be high. But if you don’t create a lot of value, the pain of using something has to be really low. It’s through this lens that I can best describe Google Glass in its current state.

Definitely worth a full read, especially for the punch line.


Tell Us How You Really Feel

Like I said, I enjoy a good curmudgeonly rant. Stephen Few has not been having a good couple of months with publishers.

When I fell in love with words as a young man, I developed a respect for publishers that was born mostly of fantasy. I imagined venerable institutions filled with people of great intellect, integrity, and respect for ideas. I’m sure many people who fit this description still work for publishers, but my personal experience has mostly involved those who couldn’t think their way out of a wet paper bag and apparently have no desire to try.

Said most recent experience involves a bait-and-switch by Taylor & Francis (the publisher) on rights to some material Few was providing to an academic journal. Guy goes out of his way to put something together, I’m sure of high quality, and they want to reserve the right to modify his work, after they agreed in principle to his terms.

Something similar happened to Danah Boyd, and I notice a pattern. A well-intentioned journal editor from academia agrees to reasonable terms from a fellow academic. The publisher waits until the last minute to pull the okee-doke: “Well, we can’t really do that. If you don’t agree to our onerous terms we’ll have to pull your article.” If these guys didn’t have their hooks so tightly intertwined with the tenure process, this behavior would be so over.
