
Embeddings with llm-gguf

I find the ability to create multi-dimensional embedding vectors from deep learning models quite fascinating. There’s an obvious application pattern in Retrieval Augmented Generation (RAG) with current LLMs. However, useful embedding models come in a much wider range of scales and capabilities than general language models. In principle, it’s quite possible to train custom embedding models at a reasonable cost in terms of compute hardware, data scale, and time.

Last month, Simon Willison updated his llm-gguf plugin to support creating embeddings from GGUF models designed specifically for embedding tasks.
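In rough strokes, usage looks like the following sketch. The model URL and resulting model ID are illustrative assumptions drawn from the plugin's examples, not part of this post:

```shell
# Install the plugin into an existing llm setup
llm install llm-gguf

# Register a GGUF embedding model by URL (example model is an assumption)
llm gguf download-embed-model \
  'https://huggingface.co/mixedbread-ai/mxbai-embed-xsmall-v1/resolve/main/gguf/mxbai-embed-xsmall-v1-q8_0.gguf'

# Embed a single string with the newly registered model
llm embed -m gguf/mxbai-embed-xsmall-v1-q8_0 -c 'hello world'
```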

The LLM docs have extensive coverage of things you can then do with this model, like embedding every row in a CSV file / file in a directory / record in a SQLite database table and running similarity and semantic search against them.
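Concretely, that workflow looks something like this. Collection, directory, and database names here are made up for illustration; the commands and options come from the LLM embeddings documentation:

```shell
# Embed every markdown file under ./notes into a collection named "docs",
# storing vectors (and, via --store, the original text) in a SQLite database
llm embed-multi docs --files ./notes '*.md' -d embeddings.db --store

# Rank stored items by similarity to an ad-hoc query string
llm similar docs -c 'retrieval augmented generation' -d embeddings.db
```

On the first run you would also pass `-m` with the ID of the embedding model to use; the collection remembers it afterwards.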

This could come in handy since I have a few piles of content lying around where using embeddings to supplement search and retrieval would be an interesting experiment.
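For a sense of what "similar" means in that kind of search: `llm similar` scores items by cosine similarity between embedding vectors, which is simple to compute directly. A minimal pure-Python sketch with toy 3-dimensional vectors (real embedding models emit hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": parallel vectors score 1.0, orthogonal vectors score 0.0
doc = [1.0, 2.0, 2.0]
query_close = [2.0, 4.0, 4.0]   # same direction as doc
query_far = [2.0, -1.0, 0.0]    # orthogonal to doc

print(cosine_similarity(doc, query_close))  # -> 1.0
print(cosine_similarity(doc, query_far))    # -> 0.0
```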

© 2008-2024 C. Ross Jam. Built using Pelican. Theme based upon Giulio Fidente’s original svbhack, and slightly modified by crossjam.