Urn:li:ugc post:7351284834956185600

Document Parsers
GitHub - AdemBoukhris457/Docs_Parsing_Techniques: Jupyter notebooks testing different OCR models for document parsing (Dolphin, MonkeyOCR, Marker, Nanonets, ...)
Jerry Liu (@jerryjliu0) on X
Here’s how to build an AI agent that auto-generates a company risk report over dozens of public filings 📈📉
Batch analyzing a ton of documents and writing up a memo would take 20+ hours of work. Agents have the potential to automate this but they completely fall apart without
Transformation Agent | Weaviate
This Weaviate Agent is in technical preview.
From PDFs to Insights: Structured Outputs from PDFs with Gemini 2.0
Learn how to extract structured data from PDFs with Gemini 2.0 and Pydantic.
GitHub - getomni-ai/zerox: PDF to Markdown with vision models
Qwen2.5-VL/cookbooks at main · QwenLM/Qwen2.5-VL
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
GitHub - X-PLUG/mPLUG-DocOwl: mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
We now support VLMs in smolagents!
EyeLevel | RAG on-Prem
EyeLevel.ai's GroundX APIs are the fastest way to build enterprise-grade RAG on prem or cloud. Trusted by Air France, Dartmouth, UltraCommerce and hundreds more.
Interactive LLM-Powered Data Processing with DocWrangler
DocWrangler is an IDE that provides instant feedback, visual exploration tools, and AI assistance for building and iterating on LLM-powered data processing pipelines