AI/ML

The Hidden Metric That Determines AI Product Success
Co-authored by Assaf Elovic and Harrison Chase. You can also find a version of this post published on Assaf's Medium. Why do some AI products explode in adoption while others struggle to gain traction? After a decade of building AI products and watching hundreds of launches across the industry, we’
·blog.langchain.com·
Kimi K2
While most people focused on Grok, there was another model release that got uniformly high praise: Kimi K2 from Moonshot.ai. …
·lesswrong.com·
LLM Daydreaming
Proposal & discussion of how default mode networks for LLMs are an example of missing capabilities for search and novelty in contemporary AI systems.
·gwern.net·
DSPy 3.0 — and DSPy at Databricks
The DSPy OSS team at Databricks and beyond is excited to present DSPy 3.0, targeted for release close to DAIS 2025. We will present what DSPy is and how it evolved over the past year. We will discuss greatly improved prompt optimization and finetuning/RL capabilities, improved productionization and observability via thorough and native integration with MLflow, and lessons from usage of DSPy in various Databricks R&D and professional services contexts. Talk by: Krista Opsahl-Ong, Research Engineer, Databricks; Omar Khattab, Research Scientist, Databricks.
·youtube.com·
Let the LLM Write the Prompts: An Intro to DSPy in Compound AI Pipelines
Large Language Models (LLMs) excel at understanding messy, real-world data, but integrating them into production systems remains challenging. Prompts can be unruly to write, vary by model, and can be difficult to manage in the large context of a pipeline. In this session, we'll demonstrate incorporating LLMs into a geospatial conflation pipeline, using DSPy. We'll discuss how DSPy works under the covers and highlight the benefits it provides pipeline creators and managers. Talk by: Drew Breunig, Data Science Leader & Strategist, Overture Maps Foundation.
·youtube.com·
The BEST Way to Chunk Text for RAG
·youtube.com·
MonoQwen-Vision, the first visual document reranker - LightOn
We introduce MonoQwen2-VL-v0.1, the first visual document reranker to enhance the quality of the retrieved visual documents and take these pipelines to the next level. Reranking a small number of candidates with MonoQwen2-VL-v0.1 achieves top results on the ViDoRe leaderboard.
·lighton.ai·
If it cites em dashes as proof, it came from a tool.
It's a safe bet that most of us have encountered the age-old admonition to "never judge a book by its cover" at some point in our lives. There is a deep wisdom in that advice---wisdom that seems to go completely out the window as soon as a certain type of person spots a certain type of punctuation.
·scottsmitelli.com·
Optimizing RAG With Reasoning Models
Orion Weller presents new frontiers in information retrieval, focusing on how instruction following and reasoning capabilities from large language models can be integrated into retrieval systems. He introduces Promptriever, a fast embedder that can follow instructions, and Rank1, a powerful but slower reasoning reranker, demonstrating their ability to unlock new types of queries and significantly improve performance.
00:00 - New Frontiers in IR: Instruction Following and Reasoning
00:07 - Language Models (LLMs) & Their Key Capabilities
00:20 - Instruction Following
00:57 - Reasoning (Test-Time Compute)
01:41 - Bridging LLMs to Information Retrieval (IR)
01:52 - Evolution of Search (Google 1999 vs. Today)
02:17 - SearchGPT and Its Limitations
02:38 - Search Hasn't Changed Fundamentally
03:16 - Keyword Search (Traditional IR)
04:11 - Semantic Search (Modern IR)
04:38 - Instruction-Based Search (Proposed IR)
05:25 - Challenge: Reranking Alone Isn't Enough
06:02 - Prompt & Reasoning-Based Search (Advanced IR)
06:42 - What is an Instruction in IR? (Attributes & NLU)
07:31 - Call to Action: Prompt Retrievers Like LLMs
07:46 - Introducing Promptriever & Rank1
08:23 - Bi-Encoder vs. Cross-Encoder Architecture
09:10 - Can We Make Promptable Retrievers? (Promptriever's Idea)
10:08 - Generating Synthetic Instructions
10:34 - Promptriever Experimental Settings
11:20 - Promptriever Evaluation Data (FollowIR & InstructIR)
12:28 - Promptriever Instruction Following Results
12:59 - Promptriever Results: Out-of-Domain (OOD) with Generic Prompts
13:10 - Promptriever: Generic Prompt Examples
13:58 - Promptriever Performance with Generic Prompts (BEIR OOD)
14:44 - Promptriever: Robustness to Paraphrased Prompts
15:16 - Promptriever Summary
16:04 - Introducing Rank1 (Test-Time Compute for IR)
16:22 - Test-Time Compute in LLMs (O1 AIME example)
17:08 - What Does Test-Time Compute Look Like in IR? (Rank1 Example)
18:01 - Rank1 Evaluation Data (BRIGHT dataset)
18:50 - Rank1: Example of Model Reasoning (Leetcode Problem)
19:35 - Rank1 Results (BRIGHT, NevIR, mFollowIR)
20:15 - Rank1: Direct Comparison of Reasoning Chain
20:33 - Rank1: Finding New Relevant Documents (DL19/DL20)
21:05 - Re-judging Old Data (Explanation)
22:05 - Rank1 Summary
22:37 - The Goal: IR That Works Like LLMs
22:56 - Implications for Downstream Users
23:36 - Open Data/Open Source & Contact Info
23:45 - Q&A Session - Promptriever & Bi-Encoder
24:23 - Q&A Session - Operationalizing Promptriever
26:04 - Q&A Session - Cross-Encoder Integration
26:33 - Q&A Session - Meta-Search/Human-Provided Prompts
27:56 - Q&A Session - Rank1 vs. Frontier Reasoning Models
28:07 - Clarification on Rank1's Training Focus
28:30 - How Rank1 Compares to O3/Gemini
29:32 - Q&A Session - Fine-Tuning Rank1
30:19 - Q&A Session - Where to Find the Models
30:45 - Conclusion of Q&A
·youtube.com·
The Prompt Foreman
Writing about technology, culture, media, data, and the ways they interact.
·dbreunig.com·