Found 49 bookmarks
google/langextract: A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization. - google/langextract
·github.com·
Azure AI Foundry | Microsoft Azure
Design, customize, and manage AI applications and agents with Azure AI Foundry. Deploy secure, enterprise-grade AI solutions using unified tools and APIs.
·azure.microsoft.com·
reducto/RolmOCR · Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
·huggingface.co·
MonoQwen-Vision, the first visual document reranker - LightOn
We introduce MonoQwen2-VL-v0.1, the first visual document reranker to enhance the quality of the retrieved visual documents and take these pipelines to the next level. Reranking a small number of candidates with MonoQwen2-VL-v0.1 achieves top results on the ViDoRe leaderboard.
·lighton.ai·
How OpenElections Uses LLMs
The OpenElections project collects detailed election data for the USA, all the way down to the precinct level. This is a surprisingly hard problem: while county and state-level results are …
·simonwillison.net·
Agentic Document Extraction: 17x Faster, Smarter, with LLM-Ready Outputs
Agentic Document Extraction just got faster! We've improved the median document processing time from 135 seconds to 8 seconds! Agentic Document Extraction sees documents visually and uses an iterative workflow to accurately extract text, figures, form fields, charts, and more to create an LLM-ready output. You can use our SDK to parse complex documents and get the extracted content in Markdown and JSON. You can then feed the output to an LLM, RAG application, or other downstream apps. You can also use our Playground to test out Agentic Document Extraction. Try out Agentic Document Extraction: - Playground: https://va.landing.ai/demo/doc-extraction - Library: https://github.com/landing-ai/agentic-doc Learn more: https://landing.ai/agentic-document-extraction
·youtube.com·
Private Local LlamaOCR with a User-Friendly Streamlit Front-End
Optical Character Recognition (OCR) is a powerful tool for extracting text from images, and with the rise of multimodal AI models, it's now easier than ever to implement locally. In this guide, we'll show you how to build a professional OCR application using Llama 3.2-Vision, Ollama for the backend, and Streamlit for the front end. Prerequisites: Before we start, ensure you have the following: 1. Python 3.10 or higher installed. 2. Anaconda (Optional) 3. Ollama installed for local model hosting. Downl
·gpt-labs.ai·
SmolDocling - The SmolOCR Solution?
In this video I look at SmolDocling and how it compares to the other OCR solutions that are out there, both open and proprietary. Blog: https://huggingface.c...
·youtube.com·
Gemma 3 - The NEW Gemma Family Members Have Arrived!!!
In this video, I look at the release of the new Gemma 3 models, which come in four different flavors: a 1B, a 4B, a 12B, and the new Big 27B parameter model. Demo: https://huggingface.co/spaces/huggingface-projects/gemma-3-12b-it Blog: https://blog.google/technology/developers/gemma-3/?linkId=sam_witteveen Model Weights: https://huggingface.co/collections/google/gemma-3-release-67c6c6f89c4f76621268bb6d For more tutorials on using LLMs and building agents, check out my Patreon Patreon: https://www.patreon.com/SamWitteveen Twitter: https://x.com/Sam_Witteveen 🕵️ Interested in building LLM Agents? Fill out the form below Building LLM Agents Form: https://drp.li/dIMes 👨‍💻Github: https://github.com/samwit/llm-tutorials ⏱️Time Stamps:
·youtube.com·
Mistral OCR - Multimodal & Multilingual OCR
In this video, I look at the latest release from Mistral AI, which is their Mistral OCR model. I look at how it works and how it compares to other models, as well as how you can get started using it with code. Colab: https://dripl.ink/Sr4Uk Blog: https://mistral.ai/news/mistral-ocr ⏱️Time Stamps: 00:00 Intro 00:17 Other models 00:35 Mistral OCR Blog 05:45 Mistral OCR Demo 13:47 Mistral OCR Batch inference
·youtube.com·
olmOCR - The Open OCR System
In this video, I look at olmOCR, the OpenOCR system from Allen AI. Colab: https://dripl.ink/HpaK4 Blog: https://olmocr.allenai.org/blog macOS ver: https://jonathansoma.com/words/olmocr-on-macos-with-lm-studio.html ⏱️Time Stamps: 00:00 Intro 00:31 Allen AI Blog 01:20 olmOCR Blog 02:08 olmOCR Hugging Face 04:52 olmOCR GitHub 05:41 Demo 05:59 Running olmOCR on macOS with LM Studio
·youtube.com·
DrSadiqfareed/Full-Page-Handwriting-Recognition: An implementation of a full-page handwriting recognition system using convolutional neural networks and transformers. This project tackles the complex task of recognizing handwritten text without segmentation.
An implementation of a full-page handwriting recognition system using convolutional neural networks and transformers. This project tackles the complex task of recognizing handwritten text without s...
·github.com·