Previously, I had picked up on Mitchell Hashimoto using
the phrase “harness engineering”. That term might be catching on.
Quoting OpenAI’s recent post, entitled “Harness
engineering: leveraging Codex in an agent-first world”:
Over the past five months, our team has been running an experiment: building
and shipping an internal beta of a software product with 0 lines of
manually-written code.
The product has internal daily users and external alpha testers. It ships,
deploys, breaks, and gets fixed. What’s different is that every line of
code—application logic, tests, CI configuration, documentation, observability,
and internal tooling—has been written by Codex. We estimate that we built this
in about 1/10th the time it would have taken to write the code by hand.
Birgitta Böckeler, a Distinguished Engineer at Thoughtworks, follows
up on the impacts of harness
engineering.
It was very interesting to read OpenAI’s recent write-up on “Harness
engineering” which describes how a team used “no manually typed code at all”
as a forcing function to build a harness for maintaining a large application
with AI agents. After 5 months, they’ve built a real product that’s now over 1
million lines of code.
The article is titled “Harness engineering: leveraging Codex in an agent-first
world”, but only mentions “harness” once in the text. Maybe the term was an
afterthought inspired by Mitchell Hashimoto’s recent blog post. Either way, I
like “harness” as a word to describe the tooling and practices we can use to
keep AI agents in check.
The OpenAI team’s harness components mix deterministic and LLM-based
approaches across 3 categories (grouping based on my interpretation):
- Context engineering: Continuously enhanced knowledge base in the codebase,
plus agent access to dynamic context like observability data and browser navigation
- Architectural constraints: Monitored not only by the LLM-based agents, but
also deterministic custom linters and structural tests
- “Garbage collection”: Agents that run periodically to find inconsistencies
in documentation or violations of architectural constraints, fighting
entropy and decay
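To make the second category concrete: a deterministic “structural test” could be as small as a CI script that parses the codebase and fails the build when a layering rule is broken. Here’s a minimal sketch in Python; the directory names and the rule itself are hypothetical illustrations, not taken from OpenAI’s post:

```python
# Hypothetical structural test: modules under app/handlers/ must not
# import the app.db layer directly (they should go through a service layer).
import ast
import pathlib

FORBIDDEN_PREFIX = "app.db"                  # hypothetical internal layer
CHECKED_DIR = pathlib.Path("app/handlers")   # hypothetical source tree

def forbidden_imports(source: str) -> list[str]:
    """Return imported module names that violate the layering rule."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            hits += [a.name for a in node.names if a.name.startswith(FORBIDDEN_PREFIX)]
        elif isinstance(node, ast.ImportFrom) and node.module:
            if node.module.startswith(FORBIDDEN_PREFIX):
                hits.append(node.module)
    return hits

def main() -> int:
    failures = []
    for path in CHECKED_DIR.rglob("*.py"):
        for name in forbidden_imports(path.read_text()):
            failures.append(f"{path}: imports {name}, bypassing the service layer")
    for line in failures:
        print(line)
    return 1 if failures else 0  # nonzero exit fails CI deterministically

if __name__ == "__main__":
    raise SystemExit(main())
```

I imagine the “garbage collection” category wrapping checks like this one, plus LLM-based checks for fuzzier properties like documentation drift, in a periodic job rather than a per-commit gate.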
Working from the assumption that OpenAI is reporting in good faith on the
app, which is currently an internal beta, Böckeler draws out some implications
for production software development. Will harnesses become the new service
templates? Does the AI need more constraints on its autonomy, not fewer, to
deliver robust software? Is it even possible to apply this approach to pre-AI
codebases?
She closes out with:
And finally, for once, I like a term in this space. Though it’s only 2 weeks
old — I can probably hold my metaphorical breath until somebody calls their
one-prompt, LLM-based code review agent a harness…
Personally, I had been thinking of “the harness” from the perspective
of an individual developer. What’s the GUI-based IDE or TUI they use
day to day to interact with a codebase and drive agentic
coding? What are its affordances, and what constraints does it place
on a model’s code generation?
Zooming out a bit to the project or organizational level makes a lot of sense,
though. As described by OpenAI, that’s not a single system or tool but a set of
processes they discovered on the fly for this effort. With repetition, patterns
will emerge, become codified, and eventually be reified into new software tools.
Then you could have a nice intersection between the project’s harness and a
developer’s harness. Sort of like how git is a narrow waist that joins CI/CD
approaches and developer environments, based upon a particular perspective on
source code version control.
Hold on to your knickers! Prepare for turbulence.