
Markdown Fieldguide

Like me, are you looking to become a more sophisticated Markdown user? You might want to check out MacSparky’s Markdown Fieldguide. Only 10 bucks!

Markdown started as a clever way to write for the web but has become so much more. This book demystifies Markdown, making it easy for anybody to learn. This book includes 130 pages and 27 screencasts totaling more than one and a half hours of video. There is also an additional hour of audio interviews. This book will take you from zero knowledge of Markdown to being a Markdown pro and change the way you write for the better.

I’ll be checking out the PDF version since I’m a Kindle Store sucke… err, fan.

Update: Decided to go with the iBookstore version just for a little variety.

Via Daring Fireball


Emacs 24.3 and Python

Tweet from @wesmckinn: https://twitter.com/wesmckinn/status/314928646068531201

I’m with you, Wes. Python mode in Emacs 24.3 doesn’t feel very polished. May have to fall back to your solution as well.

Tweet from @wesmckinn: https://twitter.com/wesmckinn/status/314932693752242177


Wiz Lakers

C’mon @DidTheWizWin! You can do better than that. The Wiz came from double digits down, at the Staples Center, against a fully loaded Lakers with Pau and Kobe. Plus the Wizards went without Bradley Beal, Emeka Okafor, and AJ Price. Only nine guys got in the box score.

And John Wall showed something with 24 points and 16 assists in 44 minutes under the bright lights.

Weird box score observation. The Wiz totaled only 236 minutes, but by my math every regulation game should have 240 (five players on the floor for 48 minutes). Guessing it’s due to rounding down of fractional minutes.
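A quick sanity check on that guess, as a sketch. The playing times below are made up for illustration (they are not the real box score); the point is only that truncating each player's time to whole minutes can easily lose four minutes across nine players.

```python
# Expected total: five players on the floor for a 48-minute regulation game.
PLAYERS_ON_COURT = 5
REGULATION_MINUTES = 48
expected_total = PLAYERS_ON_COURT * REGULATION_MINUTES

# Hypothetical playing time in seconds for nine players (NOT the real box score).
seconds_played = [2676, 2304, 2130, 1818, 1722, 1470, 1116, 732, 432]

# True team total, converted to minutes.
true_total = sum(seconds_played) // 60

# What the box score shows if each player's time is truncated to whole minutes.
listed_total = sum(seconds // 60 for seconds in seconds_played)

print(expected_total, true_total, listed_total)  # 240 240 236
```

With nine players in the box score, truncation can drop anywhere from zero to just under nine minutes, so a gap of four is well within range.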


Spotify’s Luigi

Been looking around for a flexibly engineered batch scheduling tool that’s Hadoop friendly. Spotify open sourced its Luigi framework which looks like it might fit the bill:

Luigi is a Python package that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more.

The purpose of Luigi is to address all the plumbing typically associated with long-running batch processes. You want to chain many tasks, automate them, and failures will happen. These tasks can be anything, but typically long running things like Hadoop jobs, dumping data to/from databases, running machine learning algorithms, or anything else.
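To make "dependency resolution" concrete, here is a minimal sketch in plain Python. It is not Luigi's API; it uses the standard library's `graphlib` (Python 3.9+), and the job names and the `run_pipeline` helper are hypothetical. It only shows the core idea of running each job after the jobs it depends on, and skipping finished work when a failed run is retried.

```python
from graphlib import TopologicalSorter

# Hypothetical batch jobs, each mapped to the set of jobs it depends on.
PIPELINE = {
    "dump_database": set(),
    "hadoop_aggregate": {"dump_database"},
    "train_model": {"hadoop_aggregate"},
    "publish_report": {"hadoop_aggregate", "train_model"},
}


def run_pipeline(pipeline, run_job, done=frozenset()):
    """Run every job after its dependencies, skipping jobs already done.

    Returns the list of jobs run in this call, in execution order.
    """
    completed = []
    for job in TopologicalSorter(pipeline).static_order():
        if job in done:
            continue  # finished in an earlier run, nothing to redo
        run_job(job)
        completed.append(job)
    return completed


# A stand-in runner that does nothing; a real one would launch the job.
order = run_pipeline(PIPELINE, run_job=lambda job: None)

# Retrying after a failure: pass in what already succeeded.
retry = run_pipeline(
    PIPELINE,
    run_job=lambda job: None,
    done={"dump_database", "hadoop_aggregate"},
)
```

A real scheduler layers a lot on top of this (persistent completion markers, parallelism, visualization), which is exactly the plumbing the quote is talking about.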

There’s also an attendant post on other uses of Python at Spotify.


Iterate Like a Superhero

I enjoyed reading Ned Batchelder’s presentation on idiomatic iteration in Python, even though I already knew everything he presented. Made me feel like I actually knew something about Python! It’s good baseline knowledge any Pythonista should comprehend and internalize.
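For flavor, a few of the usual idioms, as my own sketch rather than anything lifted from the slides. The sample data and the `evens` helper are made up for illustration.

```python
colors = ["red", "green", "blue"]
codes = ["#f00", "#0f0", "#00f"]

# Iterate over the items directly instead of indexing with range(len(...)).
upper = [color.upper() for color in colors]

# enumerate() supplies the index on the occasions it is genuinely needed.
numbered = [f"{i}: {color}" for i, color in enumerate(colors, start=1)]

# zip() walks several sequences in lockstep.
pairs = dict(zip(colors, codes))


# A generator function packages a custom traversal as a plain iterable.
def evens(limit):
    n = 0
    while n < limit:
        yield n
        n += 2


first_evens = list(evens(10))
```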

He also ingeniously made it available in multiple formats.

This is a presentation I gave at PyCon 2013. You can read the slides and text on this page, or open the actual presentation in your browser (use right and left arrows to advance the slides), or watch the video:


Bracketless

Thank goodness I’ve retired from filling out NCAA Men’s Basketball Tournament brackets. Georgetown, New Mexico, and Wisconsin would have given me headaches, and if the Hoyas don’t come back, whatever bracket I had would be busted.

Now I can just kick back, root for Cal, and keep my blood pressure low.


rvm

Link parkin’: rvm, it’s like virtualenv for Ruby.

RVM is a command-line tool which allows you to easily install, manage, and work with multiple ruby environments from interpreters to sets of gems.


Stupid Feed Tricks

I was amused by Brent Simmons’ republishing of Brian Reischl’s Stupid Feed Tricks:

Brian is intimately acquainted with the different ways feeds can be screwed up. So he posted Stupid Feed Tricks on Google Docs.

I quote the entire thing below for people like me who don’t have Google accounts. The below is all by Brian:

… Putting a tiny number of posts in the feed (sometimes just one). These types then usually publish 10 articles in the space of two minutes, and wonder why you’re missing 9 of them. …

Now ponder scaling these issues up to (tens? hundreds?) of millions of feeds and users and you can see the challenge facing anyone providing a Google-scale feed synch ecosystem.

Also, don’t miss Brent’s favorite way to screw up a feed. Really, hotel connectivity needs to die a horrible death at the hands of 4G and LTE. Conference provisioning is “only” way overpriced. Lodger oriented is overpriced and busted.

Ahhh the wonders of the Web. Thankfully, it’s the least worst, global, open, networked, hypermedia system we have.


Who I Was Waiting to Hear From

With the GReader Apocalypse upon us, I was waiting to hear what the vendors of my two key feedreading tools were going to do. Got an answer now:

First, Black Pixel and The Return of NetNewsWire:

First, we intend to bring sync to future versions of NetNewsWire. It’s too soon to go into details about this, but you should know that we recognize how extremely important it is and that it is a top priority for us.

Second, even though we’ve been quiet about it, we have been working on new versions of NetNewsWire for Mac, iPhone, and iPad. We have some great new features and a modern design that we can’t wait to show you.

Next up, Oliver Fürniß, publisher of Mr. Reader, and his thoughts:

In my opinion all of those Google Reader alternatives should support this API, instead of going the easy way and creating their own. This would be a big win for millions of Google Reader users. Those services will have native clients within a short time-frame across many different platforms and devices. It’s not a big deal for the client developers to adjust the API endpoint. I’ve already changed my code (not uploaded yet) so that the API endpoint can be edited like suggested by Marco (Instapaper developer). Yes, I already hear your feedback that some alternatives provide many additional and awesome features. But do we really need features like ‘folders in folders in folders’ like those implemented in, for example, Tiny Tiny RSS?

Basically, uncertainty. I want to believe Black Pixel, but it’s been so long since a NetNewsWire update, I can’t have complete confidence. But I’ll keep on using it until it’s completely busted.

I really do think it would be great if Yahoo! jumped in and provided GReader compatible infrastructure. Microsoft could probably pull it off as well and maybe such an ecosystem could help out Bing.

A little over three months until D-Day. I’ll be looking to see how it all shakes out.


Dangerous Paywalls

Man, if Garret can’t hack it in the slowly diminishing, open news ecosystem, it’s a really bad sign:

So, to sum up: I soon may not be able to get the information I want in a form optimized for linkblogging. I can’t offer ‘added value’ because the most-worthy ‘valuables’ are hidden behind paywalls. Trying to overcome these two issues to maintain quality is already taking more time than I can spare. The few items I can succeed with to my satisfaction, are not enough to attract and keep an audience. Quality and volume were a hallmark of my blogging style. The internet is now actively working against me on those two points. I can’t go back to having long lists of links on my site — I just can’t. That was so time-consuming as to be idiotic. It was only the excitement of the early metacosm that kept me going in that fashion.

I’ll also toss in the pernicious creep of embed culture, a kissing cousin of animated GIFs, as another factor, even though I’m a judicious offender myself. Cedes a lot of control back to the content provider and breaks “fixity”.


vbox

Link parkin’: vbox, “Yet another Python library of Python bindings for Virtual Box CLI (Command Line Interface).”

Tweet from @pypi: https://twitter.com/pypi/status/313271372153491456

If I ever get super dissatisfied with vagrant, and it does have a wart here or there, I may have to take another look at vbox.


Continuum.io’s Anaconda

Currently kicking the tires on Anaconda, Continuum.io’s Python distribution focused on high performance compute processing:

Completely free enterprise-ready Python distribution for large-scale data processing, predictive analytics, and scientific computing

Looks really nice and well put together although it doesn’t seem to play well with virtualenv and has its own environment model and tool, conda:

… Our users need to work with different versions of Python, NumPy, SciPy, and a variety of other packages. Moreover, they must be able to easily share live, runnable versions of their work, including all supporting packages, to their colleagues or the general public.

We created the conda package and environment management system to solve these problems. It allows users to install multiple versions of binary packages (and any required libraries) appropriate for their platform and easily switch between them, as well as easily download updates from an upstream repository. …

Bit of a shame as virtualenv is really a core part of the Python ecosystem and personally makes my life a lot easier. But I can understand the tradeoff.

Anyway, Continuum provides the baseline Anaconda, which feels pretty competitive with the Enthought Python Distribution, for free and with a reasonable (YMMV), proprietary friendly license.


TIL HTSQL

Via Catherine Devlin, and her PyCon lightning talk, I just learned of HTSQL:

HTSQL is a comprehensive navigational query language for relational databases.

HTSQL is designed for data analysts and other accidental programmers who have complex business inquiries to solve and need a productive tool to write and share database queries.


Not A Bad Thought

Maybe Yahoo! could pick up the ball that Google Reader will be dropping:

But Yahoo now has a huge opportunity to make itself relevant again to the types of consumers who wrote the company off long ago. Now that Yahoo is referring to itself as a tech company rather than a media firm, it can bolster its techie cred with a product that provides utility that Google won’t match.

I say it’s not a bad thought because Yahoo! probably still has enough engineering talent and infrastructure to execute on this at scale. Mashable (really Todd Wasserman) misses an even bigger opportunity in that a small team could clone the unofficial GReader API and provide an alternative feed synching ecosystem. That would really be hot for mobile.


Fabric or Capistrano?

So at this point in my multi-vm explorations, it’s pretty clear I’ll need some configuration automation beyond provisioning. Sometimes you just need to shut down and restart services across a bunch of machines in a certain order. SSHing into each one and doing it by hand is now sufficiently painful to consider how the pros do it.

Capistrano has the advantage of getting me deeper into Ruby. Fabric, based on Python, means an easier learning curve. Choices, choices.
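Before picking either tool, the by-hand version is easy enough to sketch in plain Python. This is not Fabric or Capistrano code, just the ordered-restart idea they automate. The hostnames, service names, `vagrant` user, and both helper functions are hypothetical, and it assumes the remote boxes restart services with `sudo service <name> restart`.

```python
import shlex

# Hypothetical hosts and services, listed in the order they must restart.
RESTART_PLAN = [
    ("master.local", "hadoop-namenode"),
    ("worker1.local", "hadoop-datanode"),
    ("worker2.local", "hadoop-datanode"),
]


def restart_commands(plan, user="vagrant"):
    """Build one ssh command per (host, service) pair, preserving plan order."""
    commands = []
    for host, service in plan:
        remote = f"sudo service {shlex.quote(service)} restart"
        commands.append(["ssh", f"{user}@{host}", remote])
    return commands


def run_plan(plan, runner, user="vagrant"):
    """Run the commands one at a time; an exception from runner stops the rest."""
    for command in restart_commands(plan, user):
        runner(command)


# Dry run: collect the commands instead of executing them.
issued = []
run_plan(RESTART_PLAN, runner=issued.append)
```

For a real run, something like `functools.partial(subprocess.run, check=True)` could be passed as `runner`, so a failed restart halts the sequence instead of plowing ahead.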


TIL dnsmasq

Today I Learned about dnsmasq:

Dnsmasq is a lightweight, easy to configure DNS forwarder and DHCP server. It is designed to provide DNS and, optionally, DHCP, to a small network. It can serve the names of local machines which are not in the global DNS. The DHCP server integrates with the DNS server and allows machines with DHCP-allocated addresses to appear in the DNS with names configured either in each host or in a central configuration file. Dnsmasq supports static and dynamic DHCP leases and BOOTP/TFTP/PXE for network booting of diskless machines.

Dnsmasq is targeted at home networks using NAT and connected to the internet via a modem, cable-modem or ADSL connection but would be a good choice for any smallish network (up to 1000 clients is known to work) where low resource use and ease of configuration are important.

So far dnsmasq has come in real handy for multivm, host-only networks of virtual machines. It seems to cleanly handle reverse DNS lookups and is nicely packaged for most distros.


GReader’s Real Value

Chuck Shotton gets to the heart of why Google Reader shutting down will be painful:

Here is the incredibly powerful thing that Google Reader provides that will leave a huge, gaping hole in my daily RSS reading:

Synchronization.

Google Reader was at best an average RSS reader. But it excelled at keeping all of my other 3rd party RSS reader apps in sync. By providing a set of APIs that allowed remote readers to mark/unmark individual articles as read, it let me start reading news on my phone with Feeddler, continue on my desktop with Google Reader, and switch to Flipboard on my tablet later in the day without having to wade through the same news articles twice. What was marked as read on my phone never showed up as unread on my tablet. It also gave me centralized management of all my RSS feeds. When I nuked an entire feed on my desktop computer, it disappeared from my mobile devices.

I wouldn’t hold my breath, Chuck, on someone filling the void. Matching the scale, speed, availability, and cross-client API support of GReader will be tough. Then again, scalable data processing techniques and infrastructure have advanced far faster than the RSS reading population has. So maybe a small team can take this on as a side project, or even a lifestyle business hustle.


The Apocalypse Is Nigh

Google Reader, dead July 1, 2013. Victim of another Google Spring Cleaning.

We launched Google Reader in 2005 in an effort to make it easy for people to discover and keep tabs on their favorite websites. While the product has a loyal following, over the years usage has declined. So, on July 1, 2013, we will retire Google Reader. Users and developers interested in RSS alternatives can export their data, including their subscriptions, with Google Takeout over the course of the next four months.

As a feedaholic from the earliest days, this’ll be the end of an era. But maybe it’s the extinction event that will spur a new round of innovation with RSS. I’d do a Munch-like Scream, but everyone knew this was coming.

I would like to heartily thank the GReader team for a boatload of utility over the years.


On Timestamps

What he said. ISO 8601 everywhere please.
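For the record, getting ISO 8601 out of Python takes one method call. A small sketch using only the standard library; the dates are arbitrary examples of mine.

```python
from datetime import datetime, timezone

# Arbitrary example moments, stored as timezone-aware UTC datetimes.
moments = [
    datetime(2013, 7, 1, 0, 0, 0, tzinfo=timezone.utc),
    datetime(2013, 3, 13, 17, 30, 0, tzinfo=timezone.utc),
]

# isoformat() emits ISO 8601, including the UTC offset.
stamps = [moment.isoformat() for moment in moments]

# fromisoformat() (Python 3.7+) parses those strings back into datetimes.
round_tripped = [datetime.fromisoformat(stamp) for stamp in stamps]

# With a consistent offset, plain string sorting is chronological sorting.
sorted_stamps = sorted(stamps)
```

That last property, strings that sort the same way the moments do, is a big part of why the format is worth insisting on.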


OneTab For Chrome

As a tabaholic, maybe the OneTab plug-in can help with my addiction. It might also get me back on Chrome more regularly although really my main issue is the inconsistent behavior of the 1Password extension.

Whenever you find yourself with too many tabs, click the OneTab icon to convert all of your tabs into a list. When you need to access the tabs again, you can either restore them individually or all at once.

When your tabs are in the OneTab list, you will save up to 95% of memory because you will have reduced the number of tabs open in Google Chrome.


Emacs 24.3

Looks like Emacs 24.3 is an official release now. Mickey at Mastering Emacs has an overview of some of the changes:

A new version of python.el, which provides several new features, including: per-buffer shells, better indentation, Python 3 support, and improved shell-interaction compatible with iPython (and virtually any other text based shell).

I blogged about a new python mode a long time ago and it seems it’s made it into trunk. That’s probably good news for most Python users, but as I haven’t yet explored the new python.el mode yet, I will cover this in much greater detail in another post.

Better iPython integration would be a boon, although I’ll admit the Emacs IPython Notebook is pretty saucy. Definitely a must test drive.


Sparkin’ EMR

Link parkin’. How to layer Spark within an Amazon Elastic MapReduce cluster. Basic idea is to use a bootstrap script to deploy the toolkit.

In this article, we’ll explain how to install Shark and Spark on a cluster managed by Amazon EMR. By combining these technologies, you’ll be able to enjoy the speed enhancements of the Shark data warehouse as well as the operational and financial advantages of running your cluster on Amazon EMR.


Bad Ball

The Sports Gods had a bad night last night. Of the seven NBA games on Saturday, March 9, six had a victory margin of 10 or more points. That’s the league’s definition of a blowout, I believe. The Knicks-Utah game was over in 6 minutes. The Wizards even got a laugher over the Bobcats, who look horribly awful.

Plus the Caps got blown out by the Islanders. Georgetown crushed Syracuse. North Carolina stunk it up, on their home court, on Senior Night, against Duke. Man City won 5-1 in their FA cup match. Tiger Woods is starting to pull away in the PGA Tour stop.

Sunday was looking sort of weak as well, with Man United going up 2-0 early in the first half against Chelsea. At least The Blues got their act together, turned it on in the second half, and put a lot of exciting pressure on the Red Devils.

Thankful for a taste of quality this weekend. Maybe they’re saving up for NCAA Conference tournaments and March Madness.


NetflixGraph

Interesting. If you have relatively small and relatively static graph data, you can easily ship it around a distributed processing platform thanks to NetflixGraph

NetflixGraph is a compact in-memory data structure used to represent directed graph data. You can use NetflixGraph to vastly reduce the size of your application’s memory footprint, potentially by an order of magnitude or more. If your application is I/O bound, you may be able to remove that bottleneck by holding your entire dataset in RAM. This may be possible with NetflixGraph; you’ll likely be very surprised by how little memory is actually required to represent your data.

NetflixGraph provides an API to translate your data into a graph format, compress that data in memory, then serialize the compressed in-memory representation of the data so that it may be easily transported across your infrastructure.


Beware The Ides of Data

An interesting interview with Kate Crawford:

Kate Crawford: I’m currently researching how big data practices are affecting different industries, from news to crisis recovery to urban design. This talk was based on that upcoming work, touching on questions of smartphones as sensors, on dealing with disasters (like Hurricane Sandy), and new epistemologies — or ways we understand knowledge — in an era of big data.

When “Six Provocations for Big Data” came out in 2011, we were critiquing the very early stages of big data and social media. In the two years since, the issues we raised are even more prominent.

I’m now looking beyond social media to a range of other areas where big data is raising questions of social justice and privacy. I’m also editing a special issue on critiques of big data, which will be coming out later this year in the International Journal of Communications.

Bonus: Nassim Nicholas Taleb’s take on Big Data.


Only In Berkeley

Nefeli Caffe

One Turing Award winner, Karp, reading and reviewing another, Valiant, at a friendly little neighborhood cafe. Of course, I have a soft spot for Nefeli, since I used to work there. Wondering if I’m the only Computer Science Division student to have that privilege. I’ll have to ask Nasos the next time I’m in town.

Hellerstein is no slouch either.


Reinout’s vagrant setup

More vagrant-fu from Reinout van Rees:

I said in using vagrant for developing on OSX: why? that I chose vagrant for setting up my development environment. Now it’s time for some specifics.

I’ll also point out that vagrant’s pretty good at multivm specification, provisioning, and booting. After you get the hang of it, it’s not too bad setting up a virtualized Hadoop cluster.


basebox

I’ve managed to use veewee to successfully build baseboxes for vagrant, but didn’t exactly find the experience pleasant. Mainly because the veewee post-install scripts are totally isolated within the vm. I really wanted to replace the default ssh keys for the default vagrant account, which are hardwired to be downloaded from a public url on the net. Feels like a recipe for disaster to me, but I could only come up with a kludgy fixup script, run after the basebox build, to solve the problem.

Maybe the Python based basebox can make this a little cleaner:

Basebox is a small Python library for building and interacting with Vagrant boxes using Fabric. Its goals are somewhat similar to the veewee project, but is specifically geared toward developing and testing Fabric deployments.


Definitely On The DL

Unlike this guy, I’m not seriously considering dumping the NFL, but I’m definitely creeping with The Premiership. And The Champions League. La Liga every now and then. Plus macking on the FA Cup once in a while.

(6) A team owned by an insane Russian oligarch, who considers money no object to pursuing success (I think this is Chelsea, though it might be Man City. In any case the other one is owned by a sheik who consider money no object etc.)

Hey, at least I know Man City is owned by the Sheik who considers money no object to pursuing success.

The best part of world football is that the dang games are done in two hours, guaranteed. Plus the time zone shift means the slate is essentially done by 2 PM Eastern, so you really can’t blow an entire day lying on the couch watching live game action.

Now if I could only get the Bundesliga’s phone number.


Strata Trip Reports

Michael Malak does yeoman’s work writing up his observations (Days 0 & 1, Day 2, Day 3) of the Strata Conference Santa Clara. First knock on Storm I’ve heard of:

Spark streaming, according to the presented, beats its competitor Storm at calculating metrics on the fly as they come off a queue like Kafka or Flume because Spark has fault-tolerance through node redundancy and because Spark avoids Storm’s problem of double-counting events by maintaining full historical data in memory for the specified desired window (e.g. 10 minutes). He said there is a layer over Storm that can prevent double-counting, but it achieves it by wrapping each individual event in its own transaction, and most users just abandon that solution for being non-performant.

In a barely audible aside during the presentation, they confirmed the weakness of Storm that was stated during the previous day’s Spark Streaming presentation, which is that the layer on top of Storm, Trident, that prevents double-counting is not performant.


Vagrant DSTK

Pete Warden built a Data Science Toolkit Vagrant basebox:

I have fallen in love with Vagrant over the last year, it turns an entire logical computer as a single unit of software. In simple terms, you can easily set up, run, and maintain a virtual machine image with all the frameworks and data dependencies pre-installed. You can wipe it, copy it to a different system, branch it to run experimental changes, keep multiple versions around, easily share it with other people, and quickly deploy multiple copies when you need to scale up. It’s as revolutionary as the introduction of distributed source control systems, you’re suddenly free to innovate because mistakes can be painlessly rolled back, and you can collaborate other people without worrying that anything will be overwritten.

Before I discovered Vagrant, I’d attempted to do something similar with my Data Science Toolkit package, distributing a VMware image of a full linux system with all the software and data it required pre-installed. It was a large download, and a lot of people used it, but the setup took more work than I liked. Vagrant solved a lot of the usability problems around downloading VMs, so I’ve been eager to create a compatible version of the DSTK image. I finally had a chance to get that working over the weekend, so you can create your own local geocoding server just by running:

vagrant box add dstk http://static.datasciencetoolkit.org/dstk_0.41.box

vagrant init

Cool! I’m becoming more of a fan of vagrant as well. This may have to be the first basebox I try out on Ye Olde MacBook. I was thinking a CartoDB 2.0 basebox build would be fun to do, but someone already beat me to it.


Harlem Shake-Off

Video: http://www.youtube.com/watch?v=Ir2TdfSwH8g

Speaking of the Miami Heat, I’m usually immune to Internet memes, but I got sucked in by the Miami Heat’s version of the Harlem Shake. My Little Guy (™) had to watch it 20 times in a row. Where’s Andy Baio when we need him?

That’s a lot of high-priced talent having a big old goof. LeBron James seems to really get into it. But what I really want to know is who’s wearing the championship belt? My guess is Joel Anthony, but more investigation is needed.

However, I’m actually somewhat partial to the Kansas Jayhawks edition, which I suspect might have inspired the Heat (via Mario Chalmers?). Could be a skoosh longer but has the upside of a Bill Self appearance.


Strata Observations

More Ben Lorica with some observations coming out of the recent O’Reilly Strata Conference:

Here are a few observations based on conversations I had during the just concluded Strata Santa Clara conference.

Spark is attracting attention

I’ve written numerous times about components of the Berkeley Data Analytics Stack (Spark, Shark, MLbase). Two Spark-related sessions at Strata were packed (slides here and here) and I talked to many people who were itching to try the BDAS stack. Being able to combine batch, real-time, and interactive analytics in a framework that uses a simple programming model is very attractive. The release of version 0.7 adds a Python API to Spark’s native Scala interface and Java API.

I’m already in the tank for Spark, but Lorica’s got a couple of other interesting observations to add.


Top N Best Ever

Too often I hear pro sports commentators cut loose with “Flavor of the Month is one of the top n best ever…”. Drives me batty.

Jeff Van Gundy, in today’s Heat-Knicks broadcast, let fly with “LeBron James is one of the 10 best NBA players ever.” Oh really, Jeff? Here are ten names of NBA greats, in no particular order:

  • Bill Russell
  • Wilt Chamberlain
  • Kareem Abdul-Jabbar
  • Shaquille O’Neal
  • Michael Jordan
  • Larry Bird
  • Magic Johnson
  • Jerry West
  • Oscar Robertson
  • Kobe Bryant
  • Honorable mention to: Tim Duncan, David Robinson, Isiah Thomas, Elgin Baylor, Hakeem Olajuwon, Karl Malone, Charles Barkley, and Pete Maravich

I love James’ physical gifts and talents, but right now, which one of those names are you replacing with LeBron James?

Oh, and could some destitute NBA team hire Van Gundy so I don’t have to listen to him on the TV anymore?


BDAS Tutorial

This tutorial, the first of a two-part series, will provide an introduction to BDAS, the Berkeley Data Analytics Stack. BDAS is an open source, next-generation data analytics stack under development at the UC Berkeley AMPLab whose current components include Spark, Shark and Mesos. We will start by covering Spark, a high-speed cluster computing system compatible with Hadoop that can outperform it by up to 100x thanks to its ability to perform computations in memory. Spark provides concise, high-level APIs in both Scala and Java, and is in use at Foursquare, Conviva, Klout, Quantifind, and other companies. We will provide an overview of the Spark architecture, typical data analytics workflows (e.g., loading data from HDFS into memory and interactively querying it), and how users are applying Spark. In addition, we will also introduce Shark, a port of Apache Hive onto Spark that is compatible with existing Hive warehouses and queries. Shark can answer HiveQL queries up to 100x faster than Hive without modification to the data and queries, and is also open source as part of BDAS.

Tutorial Part 1 (with PowerPoint slides) and Part 2.


AtBat ’13

Erica Ogg at GigaOM did a quick review of MLB’s AtBat ’13 mobile app:

Opening Day of the Major League Baseball season is still about a month away. But the best app for following all the games is here already.

I agree and have already anted up for my season pass. It was well worth it last year. By far the best and most useful of the professional sports apps.


API Monetizing

Jacob Perkins describes turning a side hacking project into something “profitable”:

Mashape was just what I needed to monetize the text-processing API, and it’s improved tremendously since I started using it. They handle all the necessary details, plus a lot more, like usage charts, latency & uptime measurements, and automatic client library generation. This last is one of my favorite features, because the client libraries are generated using your API documentation, which provides a great incentive to accurately document the ins & outs of your API. Once you’ve documented your API, downloadable libraries in 5 different programming languages are immediately available, making it that much easier for new users to consume your API. As of this writing, those languages are Java, PHP, Python, Ruby, and Objective C.


Spark and Amazon EMR

Howto on deploying Spark into Amazon’s elastic environment:

A common business scenario is the need to store and query large data sets. You can do this by running a data warehouse on a cluster of computers. By distributing the data over many computers, you return results quickly because the computers share the load of processing the query. One limitation on the speed at which queries can be returned, however, is the time it takes to retrieve the data from disk.

You can increase the speed of queries returned from a data warehouse by using the Shark data warehouse system. Shark runs on top of Spark, an open-source cluster computing system optimized for speed. Spark speeds up data analytics by loading data into memory, providing much faster performance than a disk-based system like Hadoop. For more information on Spark, see http://spark-project.org/.


Spark and Python

Confirmed. PySpark and Streaming Spark in the same release? I may have died and gone to heaven. Go Bears!

Updated link to point to Spark release notes


LibShortText

Looks like this might be a useful library for processing social media:

LibShortText is an open source tool for short-text classification and analysis. It can handle the classification of, for example, titles, questions, sentences, and short messages.

© 2008-2024 C. Ross Jam. Built using Pelican. Theme based upon Giulio Fidente’s original svbhack, and slightly modified by crossjam.