From shiny object to sober reality: The vector database story, two years later
GPT-5 Thinking in ChatGPT (aka Research Goblin) is shockingly good at search
“Don’t use chatbots as search engines” was great advice for several years... until it wasn’t. I wrote about how good OpenAI’s o3 was at using its Bing-backed search tool back …
RetrievalTutorials/tutorials/LevelsOfTextSplitting/5_Levels_Of_Text_Splitting.ipynb at main · FullStackRetrieval-com/RetrievalTutorials
Contribute to FullStackRetrieval-com/RetrievalTutorials development by creating an account on GitHub.
Evaluating Chunking Strategies for Retrieval | Chroma Research
Modern Information Retrieval Evaluation In The RAG Era
35% off our upcoming evals course: https://bit.ly/evals-ai
Modern IR Evaluation in the RAG Era w/ Nandan Thakur. Learn about future directions in RAG evaluatio...
Optimizing RAG With Reasoning Models
Orion Weller presents new frontiers in information retrieval, focusing on how instruction following and reasoning capabilities from large language models can be integrated into retrieval systems. He introduces Promptriever, a fast embedder that can follow instructions, and Rank1, a powerful but slower reasoning reranker, demonstrating their ability to unlock new types of queries and significantly improve performance.
00:00 - New Frontiers in IR: Instruction Following and Reasoning
00:07 - Language Models (LLMs) & Their Key Capabilities
00:20 - Instruction Following
00:57 - Reasoning (Test-Time Compute)
01:41 - Bridging LLMs to Information Retrieval (IR)
01:52 - Evolution of Search (Google 1999 vs. Today)
02:17 - SearchGPT and Its Limitations
02:38 - Search Hasn't Changed Fundamentally
03:16 - Keyword Search (Traditional IR)
04:11 - Semantic Search (Modern IR)
04:38 - Instruction-Based Search (Proposed IR)
05:25 - Challenge: Reranking Alone Isn't Enough
06:02 - Prompt & Reasoning-Based Search (Advanced IR)
06:42 - What is an Instruction in IR? (Attributes & NLU)
07:31 - Call to Action: Prompt Retrievers Like LLMs
07:46 - Introducing Promptriever & Rank1
08:23 - Bi-Encoder vs. Cross-Encoder Architecture
09:10 - Can We Make Promptable Retrievers? (Promptriever's Idea)
10:08 - Generating Synthetic Instructions
10:34 - Promptriever Experimental Settings
11:20 - Promptriever Evaluation Data (FollowIR & InstructIR)
12:28 - Promptriever Instruction Following Results
12:59 - Promptriever Results: Out-of-Domain (OOD) with Generic Prompts
13:10 - Promptriever: Generic Prompt Examples
13:58 - Promptriever Performance with Generic Prompts (BEIR OOD)
14:44 - Promptriever: Robustness to Paraphrased Prompts
15:16 - Promptriever Summary
16:04 - Introducing Rank1 (Test-Time Compute for IR)
16:22 - Test-Time Compute in LLMs (O1 AIME example)
17:08 - What Does Test-Time Compute Look Like in IR? (Rank1 Example)
18:01 - Rank1 Evaluation Data (BRIGHT dataset)
18:50 - Rank1: Example of Model Reasoning (Leetcode Problem)
19:35 - Rank1 Results (BRIGHT, NevIR, mFollowIR)
20:15 - Rank1: Direct Comparison of Reasoning Chain
20:33 - Rank1: Finding New Relevant Documents (DL19/DL20)
21:05 - Re-judging Old Data (Explanation)
22:05 - Rank1 Summary
22:37 - The Goal: IR That Works Like LLMs
22:56 - Implications for Downstream Users
23:36 - Open Data/Open Source & Contact Info
23:45 - Q&A Session - Promptriever & Bi-Encoder
24:23 - Q&A Session - Operationalizing Promptriever
26:04 - Q&A Session - Cross-Encoder Integration
26:33 - Q&A Session - Meta-Search/Human-Provided Prompts
27:56 - Q&A Session - Rank1 vs. Frontier Reasoning Models
28:07 - Clarification on Rank1's Training Focus
28:30 - How Rank1 Compares to O3/Gemini
29:32 - Q&A Session - Fine-Tuning Rank1
30:19 - Q&A Session - Where to Find the Models
30:45 - Conclusion of Q&A
I don't use RAG, I just retrieve documents
Our evals course with 35% discount code: https://bit.ly/evals-ai
ggozad/haiku.rag: Retrieval Augmented Generation based on SQLite
Retrieval Augmented Generation based on SQLite. Contribute to ggozad/haiku.rag development by creating an account on GitHub.
Vector Search RAG Tutorial – Combine Your Data with LLMs with Advanced Search
Learn how to use vector search and embeddings to easily combine your data with large language models like GPT-4. You will first learn the concepts and then create three projects.
✏️ Course developed by Beau Carnes.
💻 Code: https://github.com/beaucarnes/vector-search-tutorial
🔗 Access MongoDB Atlas: https://cloud.mongodb.com/
🏗️ MongoDB provided a grant to make this course possible.
⭐️ Contents ⭐️
⌨️ (00:00) Introduction
⌨️ (01:18) What are vector embeddings?
⌨️ (02:39) What is vector search?
⌨️ (03:40) MongoDB Atlas vector search
⌨️ (04:30) Project 1: Semantic search for movie database
⌨️ (32:55) Project 2: RAG with Atlas Vector Search, LangChain, OpenAI
⌨️ (54:36) Project 3: Chatbot connected to your documentation
An Intro to RAG with sqlite-vec & llamafile!
A brief introduction to using llamafile (a single-file tool for working with large language models) and sqlite-vec (a SQLite extension for vector search) to build a Retrieval-Augmented Generation (RAG) application.
This was a live online event hosted on Dec 17th 2024 in the Mozilla AI Discord. Join us for the next event at https://discord.gg/Ve7WeCJFXk
LINKS:
- Doc w/ links to all mentioned projects/blog posts: https://docs.google.com/document/d/17GYLzlGUyJF9EDeaa1P-dFFZnkwxATnBcg5KnNgpvPE/edit?usp=sharing
- Slides: https://docs.google.com/presentation/d/14Szda-VnZzepL-1U9Nb7sXQg_TTf56OQ-KtUIMQ5xug/edit?usp=sharing
asg017/sqlite-vec: A vector search SQLite extension that runs anywhere!
A vector search SQLite extension that runs anywhere! - asg017/sqlite-vec
Sqlite can totally do embeddings now with Alex Garcia, creator of sqlite-vec
Vector databases are kind of everywhere these days. There is a big pool of VCs pouring money into the ecosystem too. But while all of that is happening, SQLite has also gotten support for vector search. In this episode we talk to Alex Garcia, the maintainer of sqlite-vec, and discuss how the project got created and what the future has in store.
00:00 Introduction
00:40 Dataviz
04:39 Chromebook matters
10:30 Why sqlite rocks
17:32 Facebook and VR stuff
26:19 Datasette & Simon
38:31 Towards sqlite-vec
46:46 Getting attention
52:38 Current work
Sqlite-vec Github repo:
https://github.com/asg017/sqlite-vec
Alex Garcia blog:
https://alexgarcia.xyz/blog/2024/sqlite-vec-hybrid-search/index.html
Datasette discord:
https://discord.com/invite/ktd74dm5mw
Sqlite-vec channel on Mozilla Discord:
https://discord.gg/Ve7WeCJFXk
We have a Discord these days, feel free to discuss the podcast with us there!
https://discord.probabl.ai
You can follow the podcast on most podcast players including apple podcasts, spotify and rss.com.
- https://podcasts.apple.com/us/podcast/sample-space/id1739598572
- https://open.spotify.com/show/0BnwEHuyOlHgeZfselpn1n
- https://rss.com/podcasts/sample-space/
This podcast is part of the open efforts over at probabl. To learn more, you can check out the website or reach out to us on social media.
Website: https://probabl.ai/
Bluesky: https://bsky.app/profile/probabl.bsky.social
LinkedIn: https://www.linkedin.com/company/probabl
Twitter: https://x.com/probabl_ai
#probabl
How sqlite-vec Works for Storing and Querying Vector Embeddings
Vector search has become a foundational tool for modern applications — from powering recommendation engines to enabling semantic search in…
Curiosity - AI search for everything
The ultimate AI productivity app that protects your privacy. Bring all your apps and data into one AI-powered search and assistant. Get it for you and for your teams today.
Building search-based RAG using Claude, Datasette and Val Town
Retrieval Augmented Generation (RAG) is a technique for adding extra “knowledge” to systems built on LLMs, allowing them to answer questions against custom information not included in their training data. …
Ask HN: I have many PDFs – what is the best local way to leverage AI for search? | Hacker News