Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models
We release Qwen3 Embedding series, a new proprietary model of the Qwen model family. These models are specifically designed for text embedding, retrieval, and reranking tasks, built on the Qwen3 foundation model. Leveraging Qwen3’s robust multilingual text understanding capabilities, the series achieves state-of-the-art performance across multiple benchmarks for text embedding and reranking tasks. We have open-sourced this series of text embedding and reranking models under the Apache 2.
·qwenlm.github.io·
Qwen3 Embedding
New family of embedding models from Qwen, in three sizes: 0.6B, 4B, 8B - and two categories: Text Embedding and Text Reranking. The full collection can be browsed on Hugging …
·simonwillison.net·
Docling
MIT licensed document extraction Python library from the Deep Search team at IBM, who released [Docling v2](https://ds4sd.github.io/docling/v2/#changes-in-docling-v2) on October 16th. Here's the [Docling Technical Report](https://arxiv.org/abs/2408.09869) paper from August, which provides …
·simonwillison.net·
GitHub - JaidedAI/EasyOCR: Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
·github.com·
Langchain gpt-3.5-turbo models reads files - problem
I am making a really simple (and for fun) LangChain project. A model can read a PDF file and I can then ask it questions about that specific PDF file. Everything works fine (this is a working example) from P...
·stackoverflow.com·
microsoft/table-transformer: Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS ev...
·github.com·
Ask Your PDF
Your gateway to dynamic, interactive, and intelligent conversations with any PDF document
·askyourpdf.com·
NLP+CSS 201 Tutorials
Tutorials for advanced natural language processing methods designed for computational social science research.
·nlp-css-201-tutorials.github.io·
NER Powered Semantic Search in Python
Semantic search is a compelling technology allowing us to search using abstract concepts and meaning rather than relying on specific words. However, sometimes a simple keyword search can be just as valuable, especially if we know the exact wording of what we're searching for. Pinecone allows you to pair semantic search with a basic keyword filter. If you know that the document you're looking for contains a specific word or set of words, you simply tell Pinecone to restrict the search to only include documents with those keywords. We even support functionality for keyword search using sets of words with AND, OR, NOT logic. In this video, we will explore these features through a start-to-finish example of basic keyword search in Pinecone.
🌲 Pinecone Docs Page: https://www.pinecone.io/docs/examples/metadata-filtered-search/
🤖 70% Discount on the NLP With Transformers in Python course: https://bit.ly/3DFvvY5
🎉 Subscribe for Article and Video Updates! https://jamescalam.medium.com/subscribe https://medium.com/@jamescalam/membership
👾 Discord: https://discord.gg/c5QtDB9RAP
00:00 NER Powered Semantic Search
01:19 Dependencies and Hugging Face Datasets Prep
04:18 Creating NER Entities with Transformers
07:00 Creating Embeddings with Sentence Transformers
07:48 Using Pinecone Vector Database
11:33 Indexing the Full Medium Articles Dataset
15:09 Making Queries to Pinecone
17:01 Final Thoughts
·youtube.com·