
On OpenClaw

So the Clawdbot, er … Moltbot, errr … OpenClaw project snuck in under my radar, then exploded into my awareness of the AI space. The only comp I can come up with quickly is Netscape bursting onto the Web scene.

Can’t say OpenClaw is really for me at the moment, however. I’m an adventurous lad, but it seems like a lot of work is needed to secure the AI-native, instant-messaging-based personal assistant. When folks are buying Mac minis as prophylactics, I’m okay being part of a second wave. Additionally, it looks like there’s no token subsidy, so the expedition can get expensive quickly. Plus, the gold-rush, Kool‑Aid vibes are starting to come on strong from the wider ecosystem.

Also, I don’t know that much about the project 🫤.

But Federico Viticci, of MacStories, has been living with OpenClaw for a bit and has quite a lot to say:

For the past week or so, I’ve been working with a digital assistant that knows my name, my preferences for my morning routine, how I like to use Notion and Todoist, but which also knows how to control Spotify and my Sonos speaker, my Philips Hue lights, as well as my Gmail. It runs on Anthropic’s Claude Opus 4.5 model, but I chat with it using Telegram. I called the assistant Navi (inspired by the fairy companion of Ocarina of Time, not the besieged alien race in James Cameron’s sci-fi film saga), and Navi can even receive audio messages from me and respond with other audio messages generated with the latest ElevenLabs text-to-speech model. Oh, and did I mention that Navi can improve itself with new features and that it’s running on my own M4 Mac mini server?

If this intro just gave you whiplash, imagine my reaction when I first started playing around with Clawdbot, the incredible open-source project by Peter Steinberger (a name that should be familiar to longtime MacStories readers) that’s become very popular in certain AI communities over the past few weeks. I kept seeing Clawdbot being mentioned by people I follow; eventually, I gave in to peer pressure, followed the instructions provided by the funny crustacean mascot on the app’s website, installed Clawdbot on my new M4 Mac mini (which is not my main production machine), and connected it to Telegram.

To say that Clawdbot has fundamentally altered my perspective of what it means to have an intelligent, personal AI assistant in 2026 would be an understatement. I’ve been playing around with Clawdbot so much, I’ve burned through 180 million tokens on the Anthropic API (yikes), and I’ve had fewer and fewer conversations with the “regular” Claude and ChatGPT apps in the process. Don’t get me wrong: Clawdbot is a nerdy project, a tinkerer’s laboratory that is not poised to overtake the popularity of consumer LLMs any time soon. Still, Clawdbot points at a fascinating future for digital assistants, and it’s exactly the kind of bleeding-edge project that MacStories readers will appreciate.

Once I finish up Viticci’s article, I should be a much better-informed critic. Meanwhile, best of luck to the OpenClaw project.


Modern CLIs

Just discovered Derick Schaefer’s book CLI: A Practical Guide to Creating Modern Command-Line Interfaces.

Once the backbone of early computing, the command-line interface (CLI) nearly disappeared in the shadow of graphical user interfaces. But today, it’s experiencing a powerful resurgence—driven by DevOps, automation, cloud-native infrastructure, and the rise of generative AI. While its roots trace back to the 1960s, the CLI has evolved into a modern development essential: fast, scriptable, cross-platform, and precise.

This book is a modern guide to command-line development, written for engineers, architects, and toolmakers building the next generation of CLI applications. It offers clear explanations, battle-tested patterns, and real-world examples written in Go—an ideal language for high-performance, cross-platform development. Readers will also find Spotlights on widely adopted tools like Git, WP-CLI, and Warp Terminal, revealing the design thinking behind some of today’s most influential CLIs.

The book is interesting to me on multiple fronts. I’d like to think I’m a connoisseur of command-line interface (CLI) tools because of my long-time UNIX use. The history angle of the book is a quick hook.

When starting a new development project, I usually default to building a CLI. I’ve had it in the back of my head to do some research on best practices. Discovering CLI: … is really timely since it offers pragmatic design advice for CLI tools.

Also, Schaefer only publishes physical versions of his books.

I was alerted to the book via an excellent Software Engineering Radio podcast episode interview with Schaefer.

In this episode, Derick Schaefer, author of CLI: A Practical Guide to Creating Modern Command-Line Interfaces, talks with host Robert Blumen about command-line interfaces old and new. Starting with a short review of the origin of commands in the early unix systems, they trace the evolution of commands into modern CLIs. Following the historic rise, fall, and re-emergence of CLIs, they consider innovative examples such as git, github, WordPress, and warp. Schaefer clarifies whether commands are the same as CLIs and then discusses a range of topics, including implementation languages, packages in the golang ecosystem for CLI development, CLIs and APIs, CLIs and AIs, AI tooling versus MCP, the object-command pattern, command flags, API authentication, whether CLIs should be stateless, and output formats – json, rich text.

The interview was a rich conversation with real technical depth. No fluff. I probably don’t listen to SE Radio episodes often enough, but this one was a sterling example of the podcast’s quality. Also, the content has a Creative Commons license attached, making it handy for testing and demos.


Ai2, Open Coding Agents

Ai2, the Allen Institute for AI, is admirable in that it releases fully open large models: data, code, and weights. This week they announced Open Coding Agents:

Over the past year, coding agents have transformed how developers write, test, and maintain software. These systems can debug, refactor, and even submit pull requests—fundamentally changing what software development looks like. Yet despite this progress, most coding agents share the same constraints: they’re closed, expensive to train, and difficult to study or adapt to private codebases.

Ai2 Open Coding Agents change that. Today we’re releasing not just a collection of strong open coding models, but a training method that makes building your own coding agent for any codebase – for example, your personal codebase or an internal codebase at your organization – remarkably accessible for tasks including code generation, code review, debugging, maintenance, and code explanation.

The first release in our Open Coding Agents family is SERA (Soft-verified Efficient Repository Agents). The strongest – SERA-32B – solves 54.2% of SWE-Bench Verified problems, surpassing prior open-source state-of-the-art coding models of comparable sizes and context lengths while requiring only 40 GPU days (or fewer) to train on a cluster of 2 NVIDIA Hopper GPUs or NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. SERA models are optimized and compatible with Claude Code out of the box. With our fine-tuning method, you can specialize them to your own codebase including your full engineering stack and conventions quickly and at low cost.

These releases are great building blocks for further open-source development. To be clear, SERA is a fine-tuned version of Qwen 3, so it doesn’t provide complete end-to-end transparency.

They’re also useful for independent learning. A genuinely benchmark-competitive model with freely available training data and code is a nice starter kit for a class. Plus, these models present a realistic application of modern post-training approaches.

With the crazy functionality and robustness of current TUI frameworks, it’s not inconceivable that a small-to-medium-sized business could completely own its agentic coding stack.


GitHub Copilot SDK

GitHub Copilot now has an SDK!

From the announcement blog post:

Building agentic workflows from scratch is hard.

You have to manage context across turns, orchestrate tools and commands, route between models, integrate MCP servers, and think through permissions, safety boundaries, and failure modes. Even before you reach your actual product logic, you’ve already built a small platform.

GitHub Copilot SDK (now in technical preview) removes that burden. It allows you to take the same Copilot agentic core that powers GitHub Copilot CLI and embed it in any application.

This gives you programmatic access to the same production-tested execution loop that powers GitHub Copilot CLI. That means instead of wiring your own planner, tool loop, and runtime, you can embed that agentic loop directly into your application and build on top of it for any use case.

You also get Copilot CLI’s support for multiple AI models, custom tool definitions, MCP server integration, GitHub authentication, and real-time streaming.

I’ll have more to say in an upcoming post, but the fact that platforms like Claude, GitHub Copilot, et al. have SDKs explodes the optionality in agentic coding. You can embed these frameworks into bespoke tooling along with their built-in extension mechanisms, like skills and MCP. In a past life, I used to make the argument that “scripting” languages were high leverage because they intentionally supported both embedding and extending.

As an old Lisp weenie, I definitely comprehend that there are limits on the utility of extreme environment customization. For solo developers, it’s a win. For teams and organizations that need consistent, shared development practices, it’s a challenge. Letting a thousand flowers bloom is wonderful until you have to transfer standards and expertise between developers.

One angle of deep interest, though, is the rapid creation of domain-specific platforms for niche, high-expertise audiences.


Tansu

Link parkin’: Tansu

Diskless Kafka on top of PostgreSQL, S3 or SQLite

Tansu is an Open Source, Apache Kafka®-compatible messaging broker. Super simple. Single binary. Built-in schema validation, open table format support (Iceberg, Delta). Built to be stateless and easy to use with plug-and-play storage.

I’ve been interested in messaging systems, especially Apache Kafka, since Kafka was first announced to the world by Jay Kreps in a blog post called “The Log: …”. Has it really been 12 years? I haven’t needed to use Kafka over the past two to three years, but my ears always perk up when I hear of a new entrant into the messaging space.

Redpanda has been around for a while as a low-operations version of Kafka. AutoMQ has existed for a bit as Kafka on S3. Update: I may have been thinking of WarpStream as the originator of Kafka on S3.

Tansu, according to its docs, would appear to be a bit of a generalization in both directions:

Tansu is a drop-in replacement for Apache Kafka with PostgreSQL, SQLite, S3 or memory storage engines. Without the cost of broker replicated storage for durability. Licensed under the Apache License. Written in 100% safe 🦺 async 🚀 Rust 🦀.

Similarly, support for the Apache Iceberg or Delta Lake open table formats can be enabled …

A validating schema registry is also baked in. At least when I last engaged with the Kafka ecosystem, that component was an add-on maintained by Confluent.
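
If the drop-in claim holds, a stock Kafka client should just work against Tansu. Here’s a minimal sketch using kafka-python, assuming a local Tansu broker listening on the standard Kafka port; the topic name and port are my assumptions, not from the Tansu docs:

from kafka import KafkaConsumer, KafkaProducer

BROKER = "localhost:9092"  # assumes Tansu listens on the standard Kafka port

# Produce a couple of test messages
producer = KafkaProducer(bootstrap_servers=BROKER)
producer.send("tansu-test", b"hello from a plain Kafka client")
producer.send("tansu-test", b"no broker-replicated storage required")
producer.flush()

# Read them back
consumer = KafkaConsumer(
    "tansu-test",
    bootstrap_servers=BROKER,
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop polling after 5s of silence
)
for message in consumer:
    print(message.topic, message.offset, message.value)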

Something to explore.


Mona Lisa Overdrive

So according to my own blog, the last time I completed William Gibson’s Mona Lisa Overdrive was back in October of 2009 😱, over 15 years ago! Not sure that’s really the case, but I did finish the book again this past week. I revisited Count Zero at the end of last year and figured I’d change things up and not go direct to Neuromancer.

As that past missive indicated, Overdrive isn’t my favorite of Gibson’s oeuvre. Not really sure why. I will note that unlike most of his other books, it has four overlapping narrative threads. Typically it’s three, occasionally two, if I have it right (now there’s a research project). The extra thread leaves a bit less room for developing characters and themes.

There’s also the hazard of wrapping a trilogy, which has confounded many an author.

Other books, especially from the Blue Ant trilogy, and in particular Spook Country, have grown on me over time. I’m now restarting that journey, except going backwards from Zero History. I’m curious to see if there are any new subtleties that come to the surface.

On that research project note, it’s obvious that Gibson’s chapter titles are derived from text in the actual chapter, usually verbatim. Here’s a research todo list:

  • Scrounge some electronic copies of the works and execute the information extraction task for collecting the titles. Then automate the calculation of the verbatim percentage (see the sketch after this list). There’s always one or two chapter titles per book that confound me. Could a website based on this idea be considered fair use?
  • Investigate what motivated this part of Gibson’s style
  • Identify other authors that follow the same practice
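
The verbatim calculation from the first item is really just string containment. A rough sketch, assuming the chapters have already been extracted as (title, text) pairs from an EPUB or plain-text copy (the extraction helper is hypothetical and not implemented here):

def verbatim_percentage(chapters: list[tuple[str, str]]) -> float:
    """Fraction of chapter titles that appear word-for-word in their chapter text."""
    hits = 0
    for title, text in chapters:
        # Case-insensitive containment is the crudest possible notion of "verbatim"
        if title.lower() in text.lower():
            hits += 1
    return 100.0 * hits / len(chapters)

# Hypothetical usage once the chapters have been extracted:
# chapters = extract_chapters("mona_lisa_overdrive.epub")  # not implemented here
# print(f"{verbatim_percentage(chapters):.1f}% of titles appear verbatim")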

Apropos Elle Driver, love that word oeuvre. So rarely get to use it in a sentence.


Markdown Link Names, Who Knew?

I’ve been using Markdown for well over a decade now. How did I only recently learn that link reference names weren’t limited to numbers 🙄?

From the official markdown syntax documentation:

Link definition names may consist of letters, numbers, spaces, and punctuation — but they are not case sensitive.

And even better, a convenient shortcut:

The implicit link name shortcut allows you to omit the name of the link, in which case the link text itself is used as the name. Just use an empty set of square brackets

The Markdown processor I use goes one step further. You can leave out the empty square brackets as well.
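
To make that concrete, here’s the same link written three ways: a numbered reference (how I’d always done it), a named reference, and the implicit shortcut. The URL is the official syntax page mentioned above:

Numbered, as I’d always written them: see the [syntax docs][1].
Named references read better: see the [syntax docs][markdown syntax].
The implicit shortcut drops the name: see the [syntax docs][].

[1]: https://daringfireball.net/projects/markdown/syntax
[markdown syntax]: https://daringfireball.net/projects/markdown/syntax
[syntax docs]: https://daringfireball.net/projects/markdown/syntax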

Boy, does that improve readability!


Gradient Flow Tool Recommendations

This is the first time that I’ve seen, or more likely noticed, Ben Lorica, author of the Gradient Flow newsletter, making a coding recommendation.

When reading about AI coding tools, the names that often get mentioned are Claude Code, Cursor, and Google Antigravity. I’d like to put forth another option that I’ve come to enjoy using: the combination of OpenCode and OpenRouter. While I’m not really an early adopter and put off trying the OpenCode Desktop App for a while, I finally jumped in several weeks ago and have to say I’ve really enjoyed using it. This combination has really hit the sweet spot for me — when you pair OpenCode with OpenRouter’s easy access to all the leading models for coding, it becomes an incredible toolset for your projects or for developing tutorials and courses.

I’ve been aware of both OpenCode and OpenRouter but haven’t put either through its paces, much less paired them together like Lorica.

P.S. Ben’s a data and AI insider you can take seriously. Gradient Flow is a newsletter worth supporting with a subscription.


chonkie

TIL: the chonkie repo:

🦛 Chonkie ✨

The lightweight ingestion library for fast, efficient and robust RAG pipelines

Ever found yourself making a RAG pipeline yet again (your 2,342,148th one), only to realize you’re stuck having to write your ingestion logic with bloated software library X or the painfully feature-less library Y?

WHY CAN’T THIS JUST BE SIMPLE, UGH?

Well, look no further than Chonkie! (chonkie boi is a gud boi 🦛)

To be honest, although I’m making a lot of progress with retrocast, it will incorporate my first-ever RAG pipeline. Figured there must be something easier than LangChain or LlamaIndex for tokenizing and chunking text ahead of RAG indexing. Sure enough, I was right.

What are Chonkie’s core values?

Chonkie is a very opinionated library, and it all stems from innate human mortality. We are all going to die one day, and we have no reason to waste time figuring out how to chunk documents. Just use Chonkie.

Chonkie needs to be and always adheres to be:

  • Simple: We care about how simple it is to use Chonkie. No brainer.
  • Fast: We care about your latency. No time to waste.
  • Lightweight: We care about your memory. No space to waste.
  • Flexible: We care about your customization needs. Hassle free.

Chonkie just works. It’s that simple.
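
Basic chunking looks roughly like the following, as best I recall from the project README; treat the class name, defaults, and chunk attributes as assumptions to verify against the current docs:

# Sketch based on my reading of the Chonkie README; verify names against current docs.
from chonkie import TokenChunker

chunker = TokenChunker()  # defaults; tokenizer, chunk size, and overlap are configurable per the docs
chunks = chunker.chunk("Long transcript or document text goes here...")

for chunk in chunks:
    # Each chunk carries its text plus token accounting
    print(chunk.token_count, chunk.text[:60])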


A Sandbox Field Guide

An excellent deep dive into sandboxes for AI agents by Luis Cardoso.

In the rest of this post, I’ll give you a simple mental model for evaluating sandboxes, then walk through the boundaries that show up in real AI execution systems: containers, gVisor, microVMs, and runtime sandboxes.

Cardoso goes into detail on multiple approaches (with figures!) and then clearly lays out the tradeoffs. The focus is on containers or container-alikes (e.g., microVMs). Also, it’s server- and cloud-oriented, rather than about code your coding agent runs locally on your laptop. There is good additional material on that topic, though.

I also enjoyed his “three-question model” of 1) boundary, 2) policy, and 3) lifecycle for evaluating sandboxes.

This includes a well-organized and clearly presented discussion of the underlying Linux kernel mechanisms that enable isolation. And their fundamental limitations.

A lot of “agent sandbox” failures aren’t kernel escapes. They’re policy failures.

If your sandbox can read the repo and has outbound network access, the agent can leak the repo. If it can read ~/.aws or mount host volumes, it can leak credentials. If it can reach internal services, it can become a lateral-movement tool.

This is why sandbox design for agents is often more about explicit capability design than about “strongest boundary available.” Boundary matters, but policy is how you control the blast radius when the model does something dumb or malicious prompts steer it.

Again, good times in Systems Land.


Toad, AI In The Terminal

Out of the ashes of Textualize, Will McGugan has created toad: “A unified interface for AI in your terminal”.

From McGugan’s release announcement:

My startup for terminals wrapped up mid-2025 when the funding ran dry. So I don’t have money, but what I do have are a very particular set of skills. Skills I have acquired over a very long career convincing terminals they are actually GUIs.

Skills which I have used to create a terminal app that offers a more pleasant experience for agentic coding. Toad (a play on Textual Code) is a front-end for AI tools such as OpenHands, Claude Code, Gemini CLI, and many more. All of which run seamlessly under a single terminal UI, thanks to the ACP protocol.

At the time of writing, Toad supports 12 agent CLIs, and I expect many more to come online soon.

I discovered toad via Rui Carmo, who has been working on a sandbox for various agents, duly named agentbox:

Agentbox - Coding Agent Sandbox

There’s no perfect way to sandbox agents (yet), but at least we can try limiting the damage using containers.

Agentbox is a simple Docker-based coding agent sandbox, originally inspired by running Batrachian Toad as a general-purpose coding assistant TUI and now generalized to more tools.

Whatever agent you prefer, Agentbox aims to provide a reliable and isolated environment which will help you bootstrap pretty much any development environment.

Motivation

I found myself wanting to quickly spin up isolated coding environments for AI agents, without having to deal with complex orchestration tools or heavy VMs, and also wanting to limit CPU usage from Batrachian Toad itself.

With LLMs and agentic coding igniting a renaissance in CLIs and textual interfaces, old UNIX-heads like me are having a grand old time.


apple-music-python, MusicKit Authentication, and Music Assistant

It was a bit of a circuitous route, but I seem to have discovered a means to inspect a user’s Apple Music Library with Python.

Previously, when I was exploring Apple Music APIs, I landed on the apple-music-python package and its repository. Recently, I did the legwork to reactivate an old Apple Developer account and get the credentials for searching with the package.

Reading one of the repo’s issues surfaced Apple’s User Authentication for MusicKit documentation and Web flow. I started pondering if an agentic coder could do the heavy lifting involved in porting this to Python. But more research was in order. Surely another Python package had already dealt with this?

Then I learned about Music Assistant.

Music Assistant is a music library manager for your offline and online music sources which can easily stream your favourite music to a wide range of supported players and be combined with the power of Home Assistant!

Music Assistant Server

The Music Assistant server is a free, opensource Media library manager that connects to your streaming services and a wide range of connected speakers. The server is the beating heart, the core of Music Assistant and it keeps track of your music sources. It must run on an always-on device like a Raspberry Pi, a NAS or an Intel NUC or alike. The server can access multiple music providers and stream to multiple player types.

One of the output providers is Sonos 😲 💥 🎉! My fave!! And a supported streaming service is, guess what, Apple Music, including a nice auth flow according to this pull request.

So yes indeed, someone else had dealt with this issue. The solution looked a bit hairy though, so I decided to revisit apple-music-python one last time. Jonathan Jacobson had submitted a yet-to-be-accepted PR that extended apple-music-python under the assumption that the Music User Token was already available, something that can be handled via Music Assistant.

Bringing it all together, in the short term, I’m just going to install Music Assistant and experiment with Apple Music support. Can’t imagine it’ll be too hard to pry out a Music User Token, assuming I can actually authenticate. Then I’ll just experiment with Jacobson’s package repo that upgrades apple-music-python.

The ultimate goal is to augment scrobbledb with the ability to enrich the data with information from my favorite DJ mix platform.


TIL orjson

TIL: orjson

orjson is a fast, correct JSON library for Python. It benchmarks as the fastest Python library for JSON and is more correct than the standard json library or other third-party libraries. It serializes dataclass, datetime, numpy, and UUID instances natively.

orjson.dumps() is something like 10x as fast as json, serializes common types and subtypes, has a default parameter for the caller to specify how to serialize arbitrary types, and has a number of flags controlling output.

orjson.loads() is something like 2x as fast as json, and is strictly compliant with UTF-8 and RFC 8259 (“The JavaScript Object Notation (JSON) Data Interchange Format”).
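
A quick sketch of what that buys you in practice; the dataclass and field names here are made up for illustration:

import orjson
from dataclasses import dataclass
from datetime import datetime, timezone
from uuid import UUID, uuid4

@dataclass
class Scrobble:
    id: UUID
    track: str
    played_at: datetime

s = Scrobble(uuid4(), "Liquid Funk Mix 01", datetime.now(timezone.utc))

# dataclass, UUID, and datetime all serialize natively; output is bytes
payload = orjson.dumps(s, option=orjson.OPT_INDENT_2)
print(payload.decode())

# round trip back to plain dicts and strings
data = orjson.loads(payload)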

Via a rambling TalkPython Podcast episode about the diskcache module.


Feedbin, Still Highly Recommended

I’ve been using feedbin since its domain was feedbin.me. Searching my archives, I realized I hadn’t mentioned it in a while. So here goes.

Feedbin’s pitch:

A nice place to read on the web.

Follow your passions with RSS, email newsletters, podcasts, and YouTube.

That pretty much sums it up.

Of course there are all sorts of other goodies, like a well-designed browser UI, sync integration with other desktop and mobile readers, and a nice API. It’s also available at a great price point: $60 per year.

I’m confident that over the past five years I’ve used feedbin Every. F’in. Day.

When folks on the web lament Google Reader — of which I was a fan — I just roll my eyes. I understand the market influence of GReader, but there were plenty of excellent alternative readers when GReader shuffled off this coil. There were many other reasons why the “RSS Industry” didn’t flourish. C’mon, people. Move on.

If you’re looking for an RSS reader, give feedbin a shot. Even if you’ve already got one you like, try it anyway; maybe you’ll see some features that win you over.


Confirmation Bias

Various public developers are documenting how they work with agentic coding. I’m seeing many bits and pieces of their approach align with how I’ve been engaging with this development style. At the same time, I learn a lot about new techniques that could be applicable.

Do keep in mind that we’re all figuring this out as we go along. Foundational principles are few and far between, and things change so fast that all assumptions could be upended in a few months.

Let’s dig in a bit…

read more ...


A Ruler for Agentic Coding

Ruler feels like something I can definitely add to the toolbox.

Ruler

Centralise your AI coding assistant instructions. Manage rules for all agents centrally, and distribute them to their agent-specific location as needed.

Teams, individuals, and open-source projects often rely on multiple AI coding agents working in the same project. Valuable rules, encoding norms and conventions, documentation, and helpful hints for the AI, are maintained in separate locations and formats for each agent.

Ruler makes it easy to manage the rules for all agents centrally, and distributing them to their agent-specific location as needed.

Why Ruler?

Managing instructions across multiple AI coding tools becomes complex as your team grows. Different agents (GitHub Copilot, Claude, Cursor, Aider, etc.) require their own configuration files, leading to:

  • Inconsistent guidance across AI tools
  • Duplicated effort maintaining multiple config files
  • Context drift as project requirements evolve
  • Onboarding friction for new AI tools

Ruler solves this by providing a single source of truth for all your AI agent instructions, automatically distributing them to the right configuration files.

I discovered Ruler by following links to Eleanor Berger, who was part of the excellent webinar “Effective AI-Assisted Coding with Eleanor Berger and Isaac Flath”, hosted by Hugo Bowne-Anderson of Vanishing Gradients.

Eleanor and Isaac’s course Elite AI Assisted Coding looks like a winner.


trackio

A lighter-weight experiment tracking framework with an SQLite mentality could be quite useful as an alternative to MLflow.

Enter trackio from Hugging Face.

trackio is a lightweight, free experiment tracking Python library built by Hugging Face 🤗.

Trackio is designed to be lightweight (the core codebase is <5,000 lines of Python code), not fully-featured. It is designed in an extensible way and written entirely in Python so that developers can easily fork the repository and add functionality that they care about.

And from the Hugging Face team’s blog post:

TL;DR: Trackio is a new, open-source, and free experiment tracking Python library that provides a local dashboard and seamless integration with Hugging Face Spaces for easy sharing and collaboration. Since trackio is a drop-in replacement for wandb, you can get started with the syntax you already know!

Background

If you have trained your own machine learning model, you know how important it is to be able to track metrics, parameters, and hyperparameters during training and visualize them afterwards to better understand your training run.

Most machine learning researchers use specific experiment tracking libraries to do this. However, these libraries can be paid, require complex setup, or lack the flexibility needed for rapid experimentation and sharing.

They note some further reasons they switched to trackio internally, including:

  • Easy sharing and embedding
  • Standardization and transparency
  • Data accessibility
  • Flexibility for experimentation

I think MLflow is a nice piece of kit, although it has been succumbing to scope creep and API sprawl in recent versions. That said, its wide integration with the popular ML toolkits is definitely convenient. Haven’t really had a chance to use Weights and Biases, since it doesn’t seem particularly friendly to self-hosting. I’ll be giving trackio a test drive to see how it compares to both and whether it lives up to its claims.
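
Given the wandb-compatible surface the announcement describes, a minimal run presumably looks something like this; the project, config, and metric names are mine, not from the trackio docs:

import trackio as wandb  # the announcement pitches trackio as a drop-in for the wandb syntax

wandb.init(project="toy-experiment", config={"lr": 3e-4, "epochs": 3})

for epoch in range(3):
    # stand-in numbers; a real run would log actual training metrics
    wandb.log({"epoch": epoch, "loss": 1.0 / (epoch + 1)})

wandb.finish()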


harbor

Link parkin’: the harbor framework.

From the docs:

Motivation

Why we built Harbor

Harbor is a framework for evaluating and optimizing agents and models in container environments.

When we released Terminal-Bench in May, we were surprised to see it used in unexpected ways like building custom evals, optimizing prompts, running RL, generating SFT traces, and CI/CD agent testing.

We also learned that defining and managing containerized tasks at scale is hard. We built Harbor to make it easy.

Harbor provides:

  • Simple, modular interfaces for environments, agents, and tasks
  • All popular CLI agents pre-integrated
  • A registry of popular benchmarks and datasets
  • Integrations with cloud sandbox providers like Daytona, Modal, and E2B for horizontal scaling
  • Integrations with frameworks like SkyRL and GEPA for optimizing agents

Related: Simon Willison on a new product, Sprites.dev from Fly.io:

New from Fly.io today: Sprites.dev. Here’s their blog post and YouTube demo. It’s an interesting new product that’s quite difficult to explain—Fly call it “Stateful sandbox environments with checkpoint & restore” but I see it as hitting two of my current favorite problems: a safe development environment for running coding agents and an API for running untrusted code in a secure sandbox.

And directly from Kurt Mackey, the horse’s mouth:

The state of the art in agent isolation is a read-only sandbox. At Fly.io, we’ve been selling that story for years, and we’re calling it: ephemeral sandboxes are obsolete. Stop killing your sandboxes every time you use them.

We have a lot to say about how Sprites work. They’re related to Fly Machines but sharply different in important ways. They have an entirely new storage stack. They’re orchestrated differently. No Dockerfiles.

But for now, I just want you to think about what I’m saying here. Whether or not you ever boot a Sprite, ask: if you could run a coding agent anywhere, would you want it to look more like a read-only sandbox in a K8s cluster in the cloud, or like an entire EC2 instance you could summon in the snap of a finger?

I think the answer is obvious. The age of sandboxes is over. The time of the disposable computer has come.

And here I’m only mentioning a small slice of what’s going on in the space. Willison’s post covers multiple other offerings and that’s not even close to comprehensive. Innovation in isolated execution, including containers, is getting a bump due to agentic coding.


Pydantic Gateway

The folks at Pydantic are building up quite the AI stack. Their (new to me) AI Gateway rivals LiteLLM, with a second-mover advantage.

Pydantic AI Gateway is a unified interface for accessing multiple AI providers with a single key. Features include built-in OpenTelemetry observability, real-time cost monitoring, failover management, and native integration with the other tools in the Pydantic stack.

If you haven’t gotten a taste of Pydantic’s leader, Samuel Colvin, give this episode of the AI Engineering Podcast a listen. Colvin’s a hoot; he often calls out the engineering quality of other frameworks. Here’s a skoosh of what he’s cooking from the Pydantic AI Gateway announcement blog post:

Why another Gateway?

  • We could see it was a pain point for our customers.
  • We knew we could build something with higher engineering quality and better chosen abstractions.
  • We are uniquely positioned to offer a better developer experience via integrations with the existing Pydantic Stack (specifically Pydantic AI and Logfire).

Most “AI gateways” are the wrong kind of abstraction.

They try to wrap every provider in a single “universal schema” that slows you down. Every time a model adds a feature: tool calling, image input, JSON mode - you wait weeks for the gateway to catch up.

PAIG takes a different approach: one key, zero translation.

I am a big fan of Pydantic’s validation approach and often find their Python libraries to be well designed. Let’s see if it works for infrastructure.


Gudu Mix Series

While crate diggin’ on Apple Music, I landed on the highly listenable Gudu Mix Series:

GUDU

Founded by Peggy Gou in 2019, Gudu Records releases cutting-edge contemporary electronic music from a global roster of artists such as Mogwaa, Maurice Fulton, DMX Krew, Dea, and Peggy herself.

This page collects the archive to Gudu’s popular DJ mix series, with contributions from label family and friends alike including Salamanda, Kim Ann Foxman, Chae, fka.m4a, and more.

The few mixes I’ve listened to in full lean toward ’90s and early 2000s Chicago and New York tech house. I need to sample a few more mixes to get a sense of the overall oeuvre. Looking promising though.


GitHub Copilot CLI

Link parkin’: GitHub Copilot CLI.

Agent-powered, GitHub-native

Execute coding tasks with an agent that knows your repositories, issues, and pull requests—all natively in your terminal.

It’s really a knockoff of Claude Code but, as might be expected, incorporates nice integration with GitHub features like issues and PRs. Interestingly, the default model is Claude Sonnet 4.5.

Using GitHub Copilot locally is what grabs me.

The command-line interface (CLI) for GitHub Copilot allows you to use Copilot directly from your terminal. You can use it to answer questions, write and debug code, and interact with GitHub.com. For example, you can ask Copilot to make some changes to a project and create a pull request.

GitHub Copilot CLI gives you quick access to a powerful AI agent, without having to leave your terminal. It can help you complete tasks more quickly by working on your behalf, and you can work iteratively with GitHub Copilot CLI to build the code you need.

But then you can actually delegate work back to a cloud agent:

Delegate tasks to Copilot coding agent

The delegate command lets you push your current session to Copilot coding agent on GitHub. This lets you hand off work while preserving all the context Copilot needs to complete your task.

You can delegate a task using the slash command, followed by a prompt:

/delegate complete the API integration tests and fix any failing edge cases

Copilot will ask to commit any of your unstaged changes as a checkpoint in a new branch it creates. Copilot coding agent will open a draft pull request, make changes in the background, and request a review from you.

Copilot will provide a link to the pull request and agent session on GitHub once the session begins.

So far, the best cloud agent experience for me has been with GitHub Copilot’s agents, so a TUI specifically for GitHub Copilot is a welcome addition.


OpenAI PLANS.md

I’ve had this blog post on exec plans from the OpenAI Cookbook sitting in a tab for a while. Mainly, I was stuck pondering how Codex could be induced to run for multiple hours. To date, my agentic coding has pretty much peaked at keeping an agent engaged for ten to twenty minutes.

Codex and the gpt-5.2-codex model (recommended) can be used to implement complex tasks that take significant time to research, design, and implement. The approach described here is one way to prompt the model to implement these tasks and to steer it towards successful completion of a project.

These plans are thorough design documents, and “living documents”. As a user of Codex, you can use these documents to verify the approach that Codex will take before it begins a long implementation process. The particular PLANS.md included below is very similar to one that has enabled Codex to work for more than seven hours from a single prompt.

We enable Codex to use these documents by first updating AGENTS.md to describe when to use PLANS.md, and then of course, to add the PLANS.md file to our repository.

Then I took a closer look at the source for PLANS.md and noticed how it aligned with much of my current process for agentic coding, but on steroids. For example, I routinely ask my agents to create task checklists when generating plans and then track progress during implementation by checking items off.

It’s definitely not an exact match (I encourage agents to apply timestamps within plans as much as possible) but then again I’m not OpenAI in terms of scale and velocity. Meanwhile, I’ve already learned quite a bit about steering agents from some initial reading. Otherwise, this looks like a nice component to add to my agentic-project-cookiecutter.


Tiled.art

TIL Tiled.art. From the About page:

Site developer Rick Mohr says: “Like so many others I was captivated early on by the art of M.C. Escher. Decades later when investigating the world of tessellations to interest my son I was stunned by the capabilities of Kevin Lee’s TesselManiac, and fascinated by the explanations and art of Alain Nicolas.”

With further explorations the ideas for this site began to emerge:

  • Introduce visitors to the inspired work of tessellation artists from around the world.

  • Use the now-mature capabilities of web graphics to help people understand tessellation art through animations.

  • Give clear non-mathematical explanations of the symmetries and classification systems used in tessellation art.

  • Offer a full tessellation editor, with the option of adding detail using Inkscape.

  • Make everything work on a smartphone, to open the world of tessellation art to the millions of users without a desktop computer.

Fast-forward three years, hundreds of artworks, and tens of thousands of lines of code — welcome to Tiled.art!

Since I’ve been making some slight progress on my peyote project, I’ve always got an eye out for what feels like generative art. Seems like tessellations should fit into the Processing mindset, but even if they don’t, they’re pretty cool!

Via Jason Kottke


SQLit

Speaking of lazygit, there’s now the SQLit TUI for interacting with SQLite databases.

The lazygit of SQL databases. Connect to Postgres, MySQL, SQL Server, SQLite, Supabase, Turso, and more from your terminal in seconds.

A lightweight TUI for people who just want to run some queries fast.

I’m tempted to see how well it embeds into or wraps the personal projects of mine that happen to have an SQLite DB hiding within them.


pi coding agent

Thanks to Armin Ronacher, I learned about the pi coding agent. Mario Zechner’s blog post on what he learned building pi outlines his underlying principles in creating an agentic coding framework:

In the past three years, I’ve been using LLMs for assisted coding. If you read this, you probably went through the same evolution: from copying and pasting code into ChatGPT, to Copilot auto-completions (which never worked for me), to Cursor, and finally the new breed of coding agent harnesses like Claude Code, Codex, Amp, Droid, and opencode that became our daily drivers in 2025.

So what’s an old guy yelling at Claudes going to do? He’s going to write his own coding agent harness and give it a name that’s entirely un-Google-able, so there will never be any users. Which means there will also never be any issues on the GitHub issue tracker. How hard can it be?

To make this work, I needed to build:

  • pi-ai: A unified LLM API with multi-provider support (Anthropic, OpenAI, Google, xAI, Groq, Cerebras, OpenRouter, and any OpenAI-compatible endpoint), streaming, tool calling with TypeBox schemas, thinking/reasoning support, seamless cross-provider context handoffs, and token and cost tracking.
  • pi-agent-core: An agent loop that handles tool execution, validation, and event streaming.
  • pi-tui: A minimal terminal UI framework with differential rendering, synchronized output for (almost) flicker-free updates, and components like editors with autocomplete and markdown rendering.
  • pi-coding-agent: The actual CLI that wires it all together with session management, custom tools, themes, and project context files.

My philosophy in all of this was: if I don’t need it, it won’t be built. And I don’t need a lot of things.

One of the things about this new era of agentic coding that excites me is the Cambrian explosion of interaction paradigms. I’m just using a few of the popular TUIs and cloud agents as an individual developer. Within that regime, the space of available tools and refinement mechanisms (MCP, skills, hooks, agent-specific CLI tools, extensions, and version-control integration) can seem overwhelming.

pi looks like a well-designed, highly opinionated, modular approach. In 2026, I’m going to invest some time seeing if I can be productive with pi. Once that succeeds, I’ll go a level deeper and see if I can’t customize pi into a tool that fits my hand well.


Happy New Year!

Happy New Year! Here’s a beautiful image from the James Webb Space Telescope, taken at the end of 2025, to kick off the year 2026.



A humble-cli Tip

I decided to see if I could download all of my Humble Bundle content to a local NAS. I’ve mentioned the humble-cli tool as working really well for interacting with Humble Bundle purchases.

A recent version of humble-cli has a bulk-download subcommand, which isn’t very well documented. It takes a CSV file with columns labeled key and name. That file is easy to generate with humble-cli itself. Assuming you’ve already authed, here’s what the steps look like:

$ humble-cli list --field key --field name > bundles.csv
...
$ humble-cli bulk-download bundles.csv
...

That’s it and off go your humble downloads.

Run in the current directory, bulk-download iterates over each bundle in the CSV file. For each bundle, it creates a folder named after each item in the bundle, then downloads the corresponding media into that folder. The result is a nice, two-level archive of all the bundles in the CSV.

Bonus: the download process attempts to be idempotent. If you run a bulk-download again and the media files are already present, it won’t re-download them.

I haven’t seen this anywhere on the web, so maybe it’ll get indexed and help someone out in the future.


claude-code-transcripts

Simon Willison has released a new tool: claude-code-transcripts.

A new way to extract detailed transcripts from Claude Code

I’ve released claude-code-transcripts, a new Python CLI tool for converting Claude Code transcripts to detailed HTML pages that provide a better interface for understanding what Claude Code has done than even Claude Code itself. The resulting transcripts are also designed to be shared, using any static HTML hosting or even via GitHub Gists.

This is great! AI session transcripts make excellent fodder for blog posts. I wish the desktop AI agents provided consistent and convenient ways to extract chat transcripts.


TIL: Lazygit

I’d seen the Lazygit name kicking around, so I figured it was time to find out what all the fuss was about.

Here’s the elevator pitch:

Rant time: You’ve heard it before, git is powerful, but what good is that power when everything is so damn hard to do? Interactive rebasing requires you to edit a goddamn TODO file in your editor? Are you kidding me? To stage part of a file you need to use a command line program to step through each hunk and if a hunk can’t be split down any further but contains code you don’t want to stage, you have to edit an arcane patch file by hand? Are you KIDDING me?! Sometimes you get asked to stash your changes when switching branches only to realise that after you switch and unstash that there weren’t even any conflicts and it would have been fine to just checkout the branch directly? YOU HAVE GOT TO BE KIDDING ME!

If you’re a mere mortal like me and you’re tired of hearing how powerful git is when in your daily life it’s a powerful pain in your ass, lazygit might be for you.

I live in magit for git work, but something handy for the occasional terminal session seems useful.


mdformat

I’ve created a handy tool for copyediting my blog posts. Unfortunately, the models tend to extend my text into long lines and don’t properly word-wrap.

Enter mdformat:

Mdformat is an opinionated Markdown formatter that can be used to enforce a consistent style in Markdown files. Mdformat is a Unix-style command-line tool as well as a Python library.

The features/opinions of the formatter include:

  • Consistent indentation and whitespace across the board
  • Always use ATX style headings
  • Move all link references to the bottom of the document (sorted by label)
  • Reformat indented code blocks as fenced code blocks
  • Use 1. as the ordered list marker if possible, also for noninitial list items

Mdformat will not change word wrapping by default. The rationale for this is to support Semantic Line Breaks.

These two plugins, mdformat-footnote and mdformat-front-matters, have been useful. A long time ago, I picked up the Pandoc footnote style in Markdown. This year, I started using YAML front matter to align with my use of Quarto on another blog.
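
Wiring this into my copyediting tool looks roughly like the following; the wrap width is my choice, and the extension names assume the two plugins above are installed and registered under those keys:

import mdformat

raw = open("draft-post.md", encoding="utf-8").read()

# Hard-wrap long lines and enable the footnote + front matter plugins
formatted = mdformat.text(
    raw,
    options={"wrap": 79},
    extensions={"footnote", "frontmatter"},
)

with open("draft-post.md", "w", encoding="utf-8") as fh:
    fh.write(formatted)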

I actually started implementing this feature by hand without assistance from an agentic coder. Then I asked GitHub Copilot to review the PR. It found a few issues, including a couple of clear brainos by the human, and made some suggestions that were easy to adopt. The two of us delivered a better product.

GitHub Copilot feels like a vastly underappreciated entry into the agentic coding space.

P.S. This post was copyedited with my own copyediting tool.


Claude Plan Mode

Armin Ronacher dug into Claude Code’s plan mode as an interested bystander.

However today I had two interesting conversations with people who really like plan mode. As a non-user of plan mode, I wanted to understand how it works. So I specifically looked at the Claude Code implementation to understand what it does, how it prompts the agent, and how it steers the client. I wanted to use the tool loop just to get a better understanding of what I’m missing out on.

This post is basically just what I found out about how it works, and maybe it’s useful to someone who also does not use plan mode and wants to know what it actually does.

This one took me a few days to get around to, but it turned out to be an interesting reflection on how I work with Claude. In essence, like Ronacher, I’ve unknowingly recreated many key aspects of plan mode!

Over the last month, I’ve started most agentic coding sessions with a prompt along the lines of: “generate a plan, in markdown format, include a timestamp, with a task checklist, and write it into a plans folder.” Then I review the plan, typically only making minor changes. I commit these plans to the working repository.

With a plan in place, I’ll kick the agent into implementation action. If the plan is fairly complex, with multiple phases, and might hit usage limits, I often ask the agent to complete only the first couple of phases, then pause so I can review.

Here’s more from Armin:

Plan mode as it exists in Claude has this sort of weirdness in my mind where it doesn’t come quite natural to me. It might come natural to others! But why can I not just ask the model to plan with me? Why do I have to switch the user interface into a different mode? Plan mode is just one of many examples where I think that because we are already so used to writing or talking to machines, bringing in more complexity in the user interface takes away some of the magic. I always want to look into whether just working with the model can accomplish something similar enough that I don’t actually need to have another user interaction or a user interface that replicates something that natural language could potentially do.

This is particularly true because my workflow involves wanting to double check what these plans are, to edit them, and to manipulate them. I feel like I’m more in control of that experience if I have a file on disk somewhere that I can see, that I can read, that I can review, that I can edit before actually acting on it. The Claude integrated user experience is just a little bit too far away from me to feel natural. I understand that other people might have different opinions on this, but for me that experience really was triggered by the thought that if people have such a great experience with plan mode, I want to understand what I’m missing out on.

And now I know: I’m mostly a custom prompt to give it structure, and some system reminders and a handful of examples.

Emphasis mine on that last line.

The other nice thing about making plans into explicit documentation is that:

  1. Work can be handed off to another agentic system.
  2. You can try the same plan with multiple different agents.
  3. Resets and restarts are easy.

YMMV.


Merry Christmas 2025

The previously mentioned festivitas makes for a great holiday desktop. This screen capture isn’t doing it justice, but the snowflakes are actually a nice update.

A screen capture of a macOS desktop. The menubar and sidebar have
festive seasonal ornaments from the festivitas app. The background is
a spectacular space shot of a galaxy captured by the James Webb Telescope

Merry Christmas, y’all! 🎄


Python Supply Chain Security

Michael Kennedy does the Python community a service and explains how to integrate pip-audit into package development to help secure the dependency supply chain:

pip-audit is great because you can just run it on the command line. It will check against PyPA’s official list of vulnerabilities and tell you if anything in your virtual environment or requirements files is known to be malicious.

You could even set up a GitHub Action to do so, and I wouldn’t recommend against that at all. But it’s also valuable to make this check happen on developers’ machines. It’s a simple two-step process to do so …

Here’s the précis on pip-audit:

pip-audit is a tool for scanning Python environments for packages with known vulnerabilities. It uses the Python Packaging Advisory Database (https://github.com/pypa/advisory-database) via the PyPI JSON API as a source of vulnerability reports.

Kennedy illustrates the basic installation and usage of pip-audit from the command line. He also incorporates it into a pytest test. Personally, I think I’d rather add it as a Poe The Poet task and then roll it into a qa meta task. That approach already fits into my GitHub Actions workflow.

Also, TIL about uv’s dependency cooldowns:

Dependency cooldowns

uv also supports dependency “cooldowns” in which resolution will ignore packages newer than a duration. This is a good way to improve security posture by delaying package updates until the community has had the opportunity to vet new versions of packages.

This feature is available via the exclude-newer option and shares the same semantics.

Define a dependency cooldown by specifying a duration instead of an absolute value. Either a “friendly” duration (e.g., 24 hours, 1 week, 30 days) or an ISO 8601 duration (e.g., PT24H, P7D, P30D) can be used.

Even better, this can be specified in the tool.uv section of a pyproject.toml file.
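
Both of those would land in pyproject.toml. Here’s a sketch of what I have in mind; the task names and the one-week cooldown are my choices, not anything from Kennedy or the uv docs:

[tool.poe.tasks]
lint = "ruff check ."
test = "pytest"
audit = "pip-audit"
qa = ["lint", "test", "audit"]   # meta task: run the whole QA gauntlet in sequence

[tool.uv]
# Dependency cooldown: ignore releases newer than a week during resolution
exclude-newer = "1 week"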


TIL Gemmaverse

TIL: Gemmaverse

Strictly speaking, last night I learned about the compendium of open, smaller models made available by Google. Thanks to the exceedingly generous Hugo Bowne-Anderson (do check out the Vanishing Gradients podcast) for hosting a pop-up seminar with Ravin Kumar of Google DeepMind.

Gemma Models Overview

Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is named after the Latin gemma, meaning precious stone. The Gemma model weights are supported by developer tools that promote innovation, collaboration, and the responsible use of artificial intelligence (AI). You can get multiple variations of Gemma for general and specific use cases:

  • Gemma 3: Solve a wide variety of generative AI tasks with text and image input, support for over 140 languages, and long 128K context window.

  • CodeGemma: Complete programming tasks with this lightweight, coding-focused generative model.

  • PaliGemma 2: Build visual data processing AI solutions with a model that’s built to be fine-tuned for your image data processing applications and available in multiple resolutions.

  • ShieldGemma 2: Evaluate the safety of generative AI models’ input and output against defined policies.

Many more Gemma variants are available from Google and our AI developer community. Check them out on Kaggle Models and Hugging Face. Get inspired by what our community members have built with Gemma in the Gemmaverse.

I’d heard of a few of the Gemma models but didn’t realize how diverse the collection had become.

Bonus: Kumar dropped a Colab Notebook on finetuning the FunctionGemma 270M model.


mlx-whisper

Link parkin’: mlx-whisper

Speech recognition with Whisper in MLX. Whisper is a set of open source speech recognition models from OpenAI, ranging from 39 million to 1.5 billion parameters.

I had been experimenting with transcription and diarization using WhisperX. This turned out to be pretty slow on an M2 MacBook. mlx-whisper is pretty honking fast, although it only does transcription. I think diarization can be addressed by complementary application of pyannote.audio.

pyannote.audio is an open-source toolkit written in Python for speaker diarization. Based on PyTorch machine learning framework, it comes with state-of-the-art pretrained models and pipelines, that can be further finetuned to your own data for even better performance.

I thought I’d mentioned mlx-whisper ahead of parakeet-mlx. In any event, I’ve actually put it to the test a little bit for retrocast. The processing rate is quite acceptable for high-quality transcription. However, this needs some serious benchmarking to confirm. moonshine is also in the mix.
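
For reference, the basic call is about as small as it gets. A sketch, assuming the mlx-community large-v3 weights and a Whisper-style result dict; the file name and model repo are my assumptions:

import mlx_whisper

# Transcribe a podcast episode; path_or_hf_repo selects the MLX-converted Whisper weights
result = mlx_whisper.transcribe(
    "episode-042.mp3",
    path_or_hf_repo="mlx-community/whisper-large-v3-mlx",
)

print(result["text"][:500])
for segment in result["segments"]:
    print(segment["start"], segment["end"], segment["text"])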


Two New Side Projects

I’ve been putting in the reps with agentic coding, gaining patterns and confidence. So much so that I’m committing to launching two projects here at the end of the year.

Humble Librarian

Let’s start with Humble Librarian. I’m a bit of a Humble Book Bundle addict.

BOOKS BUNDLES

Looking for your next great read? Immerse yourself in adventurous comics, discover spectacular fantasy worlds, whip up your soon-to-be favorite recipe, run a new tabletop RPG system, and more with our curated book bundles. Whether you’re a voracious reader or want to explore new genres, our exclusive bundles can help you find your next literary love. Plus, a portion of the bundle proceeds go to charity!

Over the last few years, I’ve plunked down for 80+ book bundles. I tend toward tech book and comics bundles, with dashes of science fiction and self-help. I’ve also purchased a number of eBooks directly from Manning, Pragmatic Programmers, No Starch, eBooks.com, and self-publishers. Each bundle typically includes 10 or more books.

Bottom line, I have a metric crap-ton of EPUBs, PDFs, CBZs, and MOBIs to deal with. Half the time when I’m looking at a new Humble Book Bundle, I’m not sure if I already have the books on offer.

Let’s see how far we can get using agentic coding to build a file-system crawler, indexer, and a chat UX for managing my personal library.
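
A first pass at the crawler half is barely more than a walk of the filesystem. Here’s a sketch; the NAS mount point and the idea of keying on file hashes are my own starting assumptions:

import hashlib
from pathlib import Path

EBOOK_SUFFIXES = {".epub", ".pdf", ".cbz", ".mobi"}
LIBRARY_ROOT = Path("/Volumes/nas/humble")  # hypothetical mount point

def crawl(root: Path):
    """Yield (path, size, sha256) for every ebook-ish file under root."""
    for path in root.rglob("*"):
        if path.suffix.lower() not in EBOOK_SUFFIXES or not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        yield path, path.stat().st_size, digest

if __name__ == "__main__":
    for path, size, digest in crawl(LIBRARY_ROOT):
        print(f"{digest[:12]}  {size:>10}  {path}")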

Peyote

Over fifteen years ago I made some headway on a project named peyote. The idea was to use Python, Pygame, and PyOpenGL to create generative art on the order of Processing. I managed to get one piece completed and then ran out of steam.

Below is a portion of a plan that I asked Claude Code to create for a screensaver, hextrail, that recently caught my eye.

HexTrail Screensaver - Code Summary and Python Port Plan

Original Code Summary

HexTrail is an XScreenSaver module written in C with OpenGL that creates a mesmerizing animated pattern of colored lines growing across a hexagonal grid.

Core Concept

The screensaver creates a honeycomb grid of hexagonal cells. Animated “arms” (lines) grow from the centers of hexagons toward their edges, then continue into neighboring hexagons, creating a branching network of colorful trails across the hexagonal substrate.

Key Data Structures

  1. arm (hextrail.c:39-43)

    • State: EMPTY, IN, WAIT, OUT, or DONE
    • ratio: Growth progress (0.0 to 1.0)
    • speed: Animation speed
  2. hexagon (hextrail.c:45-53)

    • Position (XYZ coordinates)
    • 6 neighbors (pointers to adjacent hexagons)
    • 6 arms (one for each edge)
    • Color index
    • Border state and animation ratio
  3. hextrail_configuration (hextrail.c:55-70)

    • Grid dimensions and hexagon array
    • Color palette (8 colors)
    • OpenGL context and rotation state
    • Animation state (FIRST, DRAW, FADE)

That’s just an initial assessment from the XScreensaver C source code for hextrail. I’m feeling confident I can work with Claude and Codex to make a passable start on a featureful, modernized peyote app with hextrail and my old sketch, substrate, as starters.
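
As a taste of where a Python port could start, the two core structures map naturally onto dataclasses. This is my own rough sketch (the field defaults are guesses), not part of the Claude-generated plan:

from dataclasses import dataclass, field
from enum import Enum, auto

class ArmState(Enum):
    EMPTY = auto()
    IN = auto()
    WAIT = auto()
    OUT = auto()
    DONE = auto()

@dataclass
class Arm:
    state: ArmState = ArmState.EMPTY
    ratio: float = 0.0   # growth progress, 0.0 to 1.0
    speed: float = 1.0   # animation speed

@dataclass
class Hexagon:
    # Position plus per-edge arms and neighbor links, mirroring the C struct
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    neighbors: list["Hexagon | None"] = field(default_factory=lambda: [None] * 6)
    arms: list[Arm] = field(default_factory=lambda: [Arm() for _ in range(6)])
    color_index: int = 0
    border_ratio: float = 0.0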

Bottom Line

As many others diving into agentic coding point out, the process might not be more productive for straightforward code creation. But these tools lower the barrier for kicking off new projects and getting them over the viability threshold.

Forza!


Mirascope and Lilypad

Link parkin’: Mirascope for programming with LLMs

Mirascope is a powerful, flexible, and user-friendly library that simplifies the process of working with LLMs through a unified interface that works across various supported providers, including OpenAI, Anthropic, Mistral, Google (Gemini/Vertex), Groq, Cohere, LiteLLM, Azure AI, and Bedrock.

Whether you’re generating text, extracting structured information, or developing complex AI-driven agent systems, Mirascope provides the tools you need to streamline your development process and create powerful, robust applications.

And Lilypad for observability and context engineering:

Context engineering refers to structuring everything an LLM sees so it provides the right response. This involves curating and sequencing the information that’s sent to the model, a task that goes beyond just writing prompt instructions.

That’s why we built Lilypad, a context engineering framework that versions, traces, and evaluates everything influencing an LLM’s output, not just the prompt, allowing you to reproduce, compare, and improve every input, parameter, and piece of context systematically, rather than through trial and error.

I’m eyeing Lilypad as a light(ish)weight option for getting started with AI evals on a small scale. The Docker Compose deployment to launch the necessary services doesn’t look too bad, only kicking off Zookeeper, Kafka, Postgres, and Lilypad containers. And with a modernized version of Kafka you could probably kill the Zookeeper process to lighten the load a bit.

The TalkPython course on LLM Building Blocks for Python has some coverage on Mirascope:

Dive into LLM Building Blocks for Python, a concise 1.2-hour video course that equips you with everything you need to integrate large language models into your Python applications. You’ll learn to move beyond “text in → text out” by turning your prompts into structured data, orchestrating chat-style workflows, and building interactive prototypes. From rapid-fire notebook experiments to production-ready async pipelines and caching, this course gives you practical, code-first techniques for real-world LLM development.

I actually went through the course when it first came out in the middle of the summer (shout-out again to Vincent Warmerdam), but it didn’t really sink in. Now, with the benefit of more experience with AI API client frameworks, it makes more sense. The decorator and “prompt function” styles feel attractive. Also, Vincent didn’t cover Lilypad, which I’m hoping can provide a starter kit deployment for prototyping processes around LLM product evals.
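
The decorator style, as I remember it from the Mirascope docs; treat the exact imports and decorator arguments as assumptions to verify:

# My recollection of the Mirascope decorator style; check the current docs for exact names.
from mirascope.core import openai

@openai.call("gpt-4o-mini")
def recommend_track(genre: str) -> str:
    # The function body is the prompt template
    return f"Recommend a classic {genre} track and say why it matters."

response = recommend_track("liquid funk")
print(response.content)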


parakeet-mlx

Link parkin’: parakeet-mlx

Parakeet MLX

An implementation of the Parakeet models - Nvidia’s ASR (Automatic Speech Recognition) models - for Apple Silicon using MLX.

Regarding the models, here’s info from a June NVIDIA blog post:

NVIDIA Parakeet TDT 0.6B v2 is a 600-million-parameter automatic speech recognition (ASR) model designed for high-quality English transcription. It is currently ranked #1 on the Hugging Face ASR leaderboard, alongside four other top-ranking NVIDIA Parakeet models. NVIDIA NeMo Canary models have also made their mark on the Hugging Face ASR leaderboard.

This post explores how these and other cutting-edge NVIDIA speech AI models are setting new benchmarks for accuracy, speed, and versatility in automatic speech recognition (ASR). We will review model highlights, leaderboard performance, and practical deployment options so you can leverage these state-of-the-art models for real-world applications.

Another possible transcription backend for retrocast.


Diggin’ On PAUZA

Listened to the EDC Mexico set from PAUZA on Apple Music and really enjoyed it. It’s very much in the same vein as Mestiza. The set was so authentic that it even included a train wreck or two 😆.

I was working with Claude to confirm the name and get some background.

You were right! PAUZA is indeed a DJ duo—specifically Cuba’s first female electronic music duo, formed by Paula Fernández and Zahira Sánchez from Havana.

The name PAUZA is actually a combination of their names: “PAU” (from Paula) and “ZA” (from Zahira), not directly a translation from Spanish or Slavic languages.

They’re known for blending electronic music with traditional Afro-Cuban rhythms and have over 25 million Spotify streams.

I apologize for my initial assumption about the name’s origin!

From PAUZA’s Resident Advisor bio:

PAUZA is a Cuban DJ duo conformed by Paula Fernández and Zahira Sánchez, known for their distinctive fusion of electronic music and traditional / folkloric Afro Cuban rhythms. Since their formation, they have stood out on the international music scene with their unique and energetic style.

Latin-themed mix sets have been providing good vibes recently.


TIL: Liquid Funk

Quoting from Wikipedia on Liquid funk:

Liquid funk, liquid drum & bass, liquid DnB, melodic drum & bass, or sometimes just liquid is a subgenre of drum and bass. While it uses similar basslines and bar layouts to other styles, it contains fewer bar-oriented samples and more instrumental layers (both synthesized and natural), harmonies, melodies and ambiance, producing a sentimental atmosphere directed at home listeners as well as nightclub and rave audiences. Music genres such as jazz, soul and sometimes blues have a pivotal influence on liquid funk.

Previously, I had personally referred to this style of music as “Soulful DnB”. I can’t remember exactly what triggered a deeper investigation (probably liner notes on an Apple Music mix) but I poked ChatGPT to investigate:

Is “liquid dnb” a well recognized term? Check your work, provide citations and an explanation if true

What “Liquid DnB” Means

  • Liquid DnB is commonly used as shorthand for liquid drum & bass — a subgenre of drum & bass (DnB) music.
  • It’s characterized by lush melodies, smoother rhythms, soulful harmonies, and atmospheric textures, often incorporating influences from jazz, soul, funk, and ambient music.
  • This style typically contrasts with harder and darker DnB subgenres like neurofunk or techstep by focusing on musicality and emotion. 

Confidence was bolstered when artists I recognized appeared in the response: DJ Marky, Calibre, London Elektricity, High Contrast, etc. I additionally cross-checked with Claude, which also returned the Wikipedia link. Oddly, there were no follow-up questions, unlike ChatGPT. +1 to OpenAI.

This all worked out well: I learned more about a genre of music that I really love. I had no idea that this style has been around for over 25 years! Bonus: a few more artists to look into.

© 2008-2025 C. Ross Jam. Licensed under CC BY-NC-SA 4.0. Built using Pelican. Theme based upon Giulio Fidente’s original svbhack, and slightly modified by crossjam.