
Compact Speech Recognition

Link parkin’: whisper.cpp

High-performance inference of OpenAI’s Whisper automatic speech recognition (ASR) model:

Having such a lightweight implementation of the model allows to easily integrate it in different platforms and applications. As an example, here is a video of running the model on an iPhone 13 device - fully offline, on-device:

Really compact C++ version of a production speech-to-text model. If I can get it to build, I’ll try it against some podcasts to see how things come out. If it’s halfway decent, it could become one piece of a comprehensive personal knowledge extraction memex.
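A rough sketch of what that podcast test might look like, in Python. It rests on several assumptions: that the build produces a `./main` binary accepting `-m` (model) and `-f` (audio file) as in the project's README examples, that a model has been downloaded to `models/ggml-base.en.bin`, and that `ffmpeg` is on the path to convert episodes to the 16 kHz mono WAV input the tool expects. The function names and default paths are mine, not anything whisper.cpp provides.

```python
import subprocess
import sys
from pathlib import Path


def ffmpeg_command(episode: Path, wav: Path) -> list[str]:
    """Build the ffmpeg call that converts an episode to 16 kHz mono 16-bit WAV."""
    return [
        "ffmpeg", "-y",          # overwrite the WAV if it already exists
        "-i", str(episode),
        "-ar", "16000",          # 16 kHz sample rate
        "-ac", "1",              # mono
        "-c:a", "pcm_s16le",     # 16-bit PCM
        str(wav),
    ]


def whisper_command(
    wav: Path,
    binary: str = "./main",                    # assumed binary name/location
    model: str = "models/ggml-base.en.bin",    # assumed model path
) -> list[str]:
    """Build the whisper.cpp call for one WAV file."""
    return [binary, "-m", model, "-f", str(wav)]


def transcribe(episode: Path) -> str:
    """Convert one episode and return whatever the binary prints to stdout."""
    wav = episode.with_suffix(".wav")
    subprocess.run(ffmpeg_command(episode, wav), check=True)
    result = subprocess.run(
        whisper_command(wav), check=True, capture_output=True, text=True
    )
    return result.stdout


if __name__ == "__main__":
    # Usage: python transcribe.py episode1.mp3 episode2.mp3 ...
    for name in sys.argv[1:]:
        episode = Path(name)
        transcript = transcribe(episode)
        episode.with_suffix(".txt").write_text(transcript)
```

Keeping the command construction separate from the `subprocess` calls makes it easy to adjust the binary name or model path if the build lays things out differently.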

© 2008-2024 C. Ross Jam. Built using Pelican. Theme based upon Giulio Fidente’s original svbhack, and slightly modified by crossjam.