Unstructured
AI/ML
NuExtract 1.5
Structured extraction - where an LLM helps turn unstructured text (or image content) into structured data - remains one of the most directly useful applications of LLMs. NuExtract is a …
AI-Powered Content Audits for Local News
How to responsibly use AI to help with understanding your coverage
GitHub - DocumindHQ/documind: Open-source platform for extracting structured data from documents using AI.
Open-source platform for extracting structured data from documents using AI. - DocumindHQ/documind
Home - Docling
Docling
MIT licensed document extraction Python library from the Deep Search team at IBM, who released [Docling v2](https://ds4sd.github.io/docling/v2/#changes-in-docling-v2) on October 16th. Here's the [Docling Technical Report](https://arxiv.org/abs/2408.09869) paper from August, which provides …
Run a prompt to generate and execute jq programs using llm-jq
llm-jq is a brand new plugin for LLM which lets you pipe JSON directly into the `llm jq` command along with a human-language description of how you’d like to manipulate …
files-to-prompt 0.4
New release of my [files-to-prompt tool](https://simonwillison.net/2024/Apr/8/files-to-prompt/) adding an option for filtering just for files with a specific extension. The following command will output Claude XML-style markup for all Python and …
VikParuchuri/marker: Convert PDF to markdown quickly with high accuracy
Convert PDF to markdown quickly with high accuracy - VikParuchuri/marker
microsoft/table-transformer: Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS ev...