
docling

Link parkin’ docling:

Docling parses documents and exports them to the desired format with ease and speed.

GitHub repo and technical report from IBM:

This technical report introduces Docling, an easy to use, self-contained, MIT-licensed open-source package for PDF document conversion. It is powered by state-of-the-art specialized AI models for layout analysis (DocLayNet) and table structure recognition (TableFormer), and runs efficiently on commodity hardware in a small resource budget. The code interface allows for easy extensibility and addition of new features and models.
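For future reference, here is a minimal sketch of what that code interface looks like in practice, based on the `DocumentConverter` quickstart in the project README as I understand it. The wrapper `convert_to_markdown` and its `converter` parameter are hypothetical names of my own, not part of docling, and the docling calls may differ between versions.

```python
from typing import Any, Optional


def convert_to_markdown(source: str, converter: Optional[Any] = None) -> str:
    """Convert a document (local path or URL) to Markdown text.

    `converter` is a hypothetical injection point: any object with a
    `convert(source)` method whose result has `.document.export_to_markdown()`.
    When omitted, a docling DocumentConverter is created.
    """
    if converter is None:
        # Imported lazily so this module still loads where docling is not installed.
        # Assumes `pip install docling` and the README's quickstart API.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()

    # Run the conversion pipeline on the source document.
    result = converter.convert(source)
    # Export the parsed document as Markdown.
    return result.document.export_to_markdown()
```

Calling `convert_to_markdown("report.pdf")` with docling installed should hand back the document as a Markdown string. The first run may take a while, since docling needs to fetch its model weights.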

Prompted by a mention of spacy-layout.

© 2008-2024 C. Ross Jam. Built using Pelican. Theme based upon Giulio Fidente’s original svbhack, and slightly modified by crossjam.