Optimizing RAG With Reasoning Models
Orion Weller presents new frontiers in information retrieval, focusing on how the instruction-following and reasoning capabilities of large language models can be built into retrieval systems. He introduces two models: Promptriever, a fast embedder that follows instructions, and Rank1, a slower but more powerful reasoning reranker. He shows how they support new types of queries and reports their results on benchmarks such as FollowIR, InstructIR, and BRIGHT.
00:00 - New Frontiers in IR: Instruction Following and Reasoning
00:07 - Large Language Models (LLMs) & Their Key Capabilities
00:20 - Instruction Following
00:57 - Reasoning (Test-Time Compute)
01:41 - Bridging LLMs to Information Retrieval (IR)
01:52 - Evolution of Search (Google 1999 vs. Today)
02:17 - SearchGPT and Its Limitations
02:38 - Search Hasn't Changed Fundamentally
03:16 - Keyword Search (Traditional IR)
04:11 - Semantic Search (Modern IR)
04:38 - Instruction-Based Search (Proposed IR)
05:25 - Challenge: Reranking Alone Isn't Enough
06:02 - Prompt & Reasoning-Based Search (Advanced IR)
06:42 - What is an Instruction in IR? (Attributes & NLU)
07:31 - Call to Action: Prompt Retrievers Like LLMs
07:46 - Introducing Promptriever & Rank1
08:23 - Bi-Encoder vs. Cross-Encoder Architecture
09:10 - Can We Make Promptable Retrievers? (Promptriever's Idea)
10:08 - Generating Synthetic Instructions
10:34 - Promptriever Experimental Settings
11:20 - Promptriever Evaluation Data (FollowIR & InstructIR)
12:28 - Promptriever Instruction Following Results
12:59 - Promptriever Results: Out-of-Domain (OOD) with Generic Prompts
13:10 - Promptriever: Generic Prompt Examples
13:58 - Promptriever Performance with Generic Prompts (BEIR OOD)
14:44 - Promptriever: Robustness to Paraphrased Prompts
15:16 - Promptriever Summary
16:04 - Introducing Rank1 (Test-Time Compute for IR)
16:22 - Test-Time Compute in LLMs (o1 AIME Example)
17:08 - What Does Test-Time Compute Look Like in IR? (Rank1 Example)
18:01 - Rank1 Evaluation Data (BRIGHT dataset)
18:50 - Rank1: Example of Model Reasoning (LeetCode Problem)
19:35 - Rank1 Results (BRIGHT, NevIR, mFollowIR)
20:15 - Rank1: Direct Comparison of Reasoning Chain
20:33 - Rank1: Finding New Relevant Documents (DL19/DL20)
21:05 - Re-judging Old Data (Explanation)
22:05 - Rank1 Summary
22:37 - The Goal: IR That Works Like LLMs
22:56 - Implications for Downstream Users
23:36 - Open Data/Open Source & Contact Info
23:45 - Q&A Session - Promptriever & Bi-Encoder
24:23 - Q&A Session - Operationalizing Promptriever
26:04 - Q&A Session - Cross-Encoder Integration
26:33 - Q&A Session - Meta-Search/Human-Provided Prompts
27:56 - Q&A Session - Rank1 vs. Frontier Reasoning Models
28:07 - Clarification on Rank1's Training Focus
28:30 - How Rank1 Compares to o3/Gemini
29:32 - Q&A Session - Fine-Tuning Rank1
30:19 - Q&A Session - Where to Find the Models
30:45 - Conclusion of Q&A