🦛 Chonkie ✨
The lightweight ingestion library for fast, efficient, and robust RAG pipelines
Ever found yourself making a RAG pipeline yet again (your 2,342,148th one), only to realize you’re stuck having to write your ingestion logic with bloated software library X or the painfully feature-less library Y?
WHY CAN’T THIS JUST BE SIMPLE, UGH?
Well, look no further than Chonkie! (chonkie boi is a gud boi 🦛)
To be honest, although I'm making a lot of progress with retrocast, it will incorporate my first RAG pipeline. I figured there must be something easier than LangChain or LlamaIndex for tokenizing and chunking text ahead of RAG indexing. Sure enough, I was right.
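To make "chunking ahead of RAG indexing" concrete: the idea is to split a document into fixed-size token windows, usually with a small overlap so retrieval-relevant context isn't severed at a chunk boundary. The sketch below is illustrative only, not Chonkie's actual API, and it uses whitespace splitting as a stand-in for a real tokenizer:

```python
# Minimal, dependency-free sketch of fixed-size token chunking with overlap.
# NOTE: this is NOT Chonkie's interface -- just an illustration of the idea.
# Whitespace "tokens" stand in for a real tokenizer's output.

def chunk_tokens(text, chunk_size=512, overlap=64):
    """Split `text` into overlapping windows of at most `chunk_size` tokens."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    tokens = text.split()  # stand-in for a real tokenizer
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + chunk_size]
        chunks.append(" ".join(window))
        if start + chunk_size >= len(tokens):
            break  # the last window already covers the tail of the text
    return chunks

chunks = chunk_tokens("one two three four five six seven eight",
                      chunk_size=4, overlap=1)
print(chunks)
# ['one two three four', 'four five six seven', 'seven eight']
```

Libraries like Chonkie wrap this same pattern behind a proper tokenizer and nicer ergonomics; the overlap is what keeps a sentence that straddles two windows retrievable from either chunk.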
What are Chonkie’s core values?
Chonkie is a very opinionated library, and it all stems from innate human mortality. We are all going to die one day, and we have no reason to waste time figuring out how to chunk documents. Just use Chonkie.
Chonkie is, and always strives to be:
- Simple: We care about how simple it is to use Chonkie. No brainer.
- Fast: We care about your latency. No time to waste.
- Lightweight: We care about your memory. No space to waste.
- Flexible: We care about your customization needs. Hassle free.
Chonkie just works. It’s that simple.