Docling
MIT licensed document extraction Python library from the Deep Search team at IBM, who released [Docling v2](https://ds4sd.github.io/docling/v2/#changes-in-docling-v2) on October 16th. Here's the [Docling Technical Report](https://arxiv.org/abs/2408.09869) paper from August, which provides …
·simonwillison.net·
GitHub - JaidedAI/EasyOCR: Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
·github.com·
Langchain gpt-3.5-turbo models reads files - problem
I am making a really simple (and just for fun) LangChain project. A model can read a PDF file, and I can then ask it questions about that specific PDF file. Everything works fine (this is a working example) from P...
·stackoverflow.com·
microsoft/table-transformer: Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS ev...
·github.com·
Ask Your PDF
Your gateway to dynamic, interactive, and intelligent conversations with any PDF document
·askyourpdf.com·
NLP+CSS 201 Tutorials
Tutorials for advanced natural language processing methods designed for computational social science research.
·nlp-css-201-tutorials.github.io·
NER Powered Semantic Search in Python
Semantic search is a compelling technology allowing us to search using abstract concepts and meaning rather than relying on specific words. However, sometimes a simple keyword search can be just as valuable, especially if we know the exact wording of what we're searching for. Pinecone allows you to pair semantic search with a basic keyword filter. If you know that the document you're looking for contains a specific word or set of words, you simply tell Pinecone to restrict the search to only include documents with those keywords. We even support functionality for keyword search using sets of words with AND, OR, NOT logic. In this video, we will explore these features through a start-to-finish example of basic keyword search in Pinecone.

🌲 Pinecone Docs Page: https://www.pinecone.io/docs/examples/metadata-filtered-search/
🤖 70% Discount on the NLP With Transformers in Python course: https://bit.ly/3DFvvY5
🎉 Subscribe for Article and Video Updates! https://jamescalam.medium.com/subscribe https://medium.com/@jamescalam/membership
👾 Discord: https://discord.gg/c5QtDB9RAP

00:00 NER Powered Semantic Search
01:19 Dependencies and Hugging Face Datasets Prep
04:18 Creating NER Entities with Transformers
07:00 Creating Embeddings with Sentence Transformers
07:48 Using Pinecone Vector Database
11:33 Indexing the Full Medium Articles Dataset
15:09 Making Queries to Pinecone
17:01 Final Thoughts
·youtube.com·