Found 109 bookmarks
Building a Biomedical GraphRAG: When Knowledge Graphs Meet Vector Search

I built a RAG system for biomedical research that uses both vector search and knowledge graphs.

Turns out, you need both.

Vector databases, such as Qdrant, are excellent at handling semantic similarity, but they struggle with relationship queries.

𝐓𝐡𝐞 𝐢𝐬𝐬𝐮𝐞: Author networks, citations, and institutional collaborations aren't semantic similarities. They're structured relationships that don't live in embeddings.

𝐓𝐡𝐞 𝐡𝐲𝐛𝐫𝐢𝐝 𝐚𝐩𝐩𝐫𝐨𝐚𝐜𝐡

I combined Qdrant for semantic retrieval with Neo4j for relationship queries, using OpenAI's tool-calling to orchestrate between them.

The workflow:

1️⃣ User asks a question
2️⃣ Qdrant retrieves semantically relevant papers
3️⃣ LLM analyzes the query and decides which graph enrichment tools to call
4️⃣ Neo4j returns structured relationship data
5️⃣ Both sources combine into one answer

Same query with the hybrid system: Returns 4 specific collaborators with paper counts, plus relevant research context.

𝐈𝐦𝐩𝐥𝐞𝐦𝐞𝐧𝐭𝐚𝐭𝐢𝐨𝐧 𝐧𝐨𝐭𝐞𝐬

  • I initially tried having the LLM generate Cypher queries directly, but tool-calling worked much better. The LLM only decides which pre-built tool to call; the tools themselves contain reliable Cypher queries, because LLMs are not yet good enough at generating Cypher.

  • For domains with complex relationships, such as biomedical research, legal documents, and enterprise knowledge, combining vector search with knowledge graphs gives you capabilities neither has alone.
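
The tool-calling pattern from the implementation notes can be sketched roughly as follows. This is a minimal illustration, not the author's code: the tool names, the graph schema (Author, Paper, WROTE, CITES), and the `dispatch` helper are all assumptions. The point is that the LLM only supplies a tool name and JSON arguments, while the Cypher text stays fixed and parameterized.

```python
import json

# Hypothetical pre-built tools: each name maps to a fixed, parameterized
# Cypher query. The schema (Author, Paper, WROTE, CITES) is assumed.
CYPHER_TOOLS = {
    "get_collaborators": (
        "MATCH (a:Author {name: $author})-[:WROTE]->(p:Paper)<-[:WROTE]-(c:Author) "
        "WHERE c <> a "
        "RETURN c.name AS collaborator, count(p) AS papers "
        "ORDER BY papers DESC LIMIT $limit"
    ),
    "get_citing_papers": (
        "MATCH (p:Paper {pmid: $pmid})<-[:CITES]-(citing:Paper) "
        "RETURN citing.title AS title LIMIT $limit"
    ),
}

# Arguments each hypothetical tool expects from the LLM.
REQUIRED_ARGS = {
    "get_collaborators": {"author", "limit"},
    "get_citing_papers": {"pmid", "limit"},
}


def dispatch(tool_name: str, arguments_json: str) -> tuple[str, dict]:
    """Turn an LLM tool call (name + JSON arguments) into a query and parameters."""
    if tool_name not in CYPHER_TOOLS:
        raise ValueError(f"unknown tool: {tool_name}")
    params = json.loads(arguments_json)
    missing = REQUIRED_ARGS[tool_name] - set(params)
    if missing:
        raise ValueError(f"missing arguments: {sorted(missing)}")
    # The query text never comes from the model; only the parameters do.
    return CYPHER_TOOLS[tool_name], params


query, params = dispatch("get_collaborators", '{"author": "Jane Doe", "limit": 4}')
```

The returned `query` and `params` would then be handed to a graph database driver; that step is left out so the sketch runs without a database.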

https://www.linkedin.com/posts/activity-7397237155716063232-0pku/

·aiechoes.substack.com·
Ontologies bring context
I used the o word last week and it hit a few nerves. Ontologies bring context. But context engineering is very poorly understood. Agent engineers talk about it and expect that everyone is doing it, but almost everyone is winging it. Here's what context engineering is definitely not: longer prompts. What it actually is: the right information, with the right meaning, at the right time. Not more information, but the right information with the right meaning. Sounds super abstract. That's why there's a brief video that actually breaks down how to load context. Okay, not brief. But context needs context.
·linkedin.com·
ATOM is finally here! A scalable and fast approach that can build and continuously update temporal knowledge graphs, inspired by atomic bonds.
Alhamdulillah, ATOM is finally here! A scalable and fast approach that can build and continuously update temporal knowledge graphs, inspired by atomic bonds. Just as matter is formed from atoms, and galaxies are formed from stars, knowledge is likely to be formed from atomic knowledge graphs.

Atomic knowledge graphs were born from our intention to solve two common problems in LLM-based KG construction methods: exhaustivity and stability. Often, these methods produce unstable KGs that change when the construction process is rerun, even when nothing else has changed. Moreover, they fail to capture all facts in the input documents and usually overlook the temporal and dynamic aspects of real-world data.

What is the solution? Atomic facts that are temporally aware. Instead of constructing knowledge graphs from raw documents, we split the documents into atomic facts, which are self-contained and concise propositions. A temporal atomic KG is constructed from each atomic fact. Then, we defined how the temporal atomic KGs would be merged at the atomic level and how the temporal aspects would be handled. We designed a binary merge algorithm that combines two TKGs and a parallel merge process that merges all TKGs simultaneously. The entire architecture operates in parallel.

ATOM employs dual-time modeling that distinguishes observation time from validity time and has 3 main modules:
- Module 1 (Atomic Fact Decomposition) splits input documents observed at time t into atomic facts using LLM-based prompting, where each temporal atomic fact is a short, self-contained snippet that conveys exactly one piece of information.
- Module 2 (Atomic TKGs Construction) extracts 5-tuples in parallel from each atomic fact to construct atomic temporal KGs, while embedding nodes and relations and addressing temporal resolution during extraction.
- Module 3 (Parallel Atomic Merge) employs a binary merge algorithm to merge pairs of atomic TKGs through iterative pairwise merging in parallel until convergence, with three resolution phases: (1) entity resolution, (2) relation name resolution, and (3) temporal resolution that merges observation and validity time sets for relations with similar (e_s, r_p, e_o). The resulting TKG snapshot is then merged with the previous DTKG to yield the updated DTKG.

Results: Empirical evaluations demonstrate that ATOM achieves ~18% higher exhaustivity, ~17% better stability, and over 90% latency reduction compared to baseline methods (including iText2KG), demonstrating strong scalability potential for dynamic TKG construction.

Check out ATOM's architecture and code:
Preprint Paper: https://lnkd.in/dsJzDaQc
Code: https://lnkd.in/drZUyisV
Website: (coming soon)
Example use cases: (coming soon)

Special thanks to the dream team: Ludovic Moncla, Khalid Benabdeslem, Rémy Cazabet, Pierre Cléau 📚📡
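
The pairwise merge described for Module 3 can be illustrated with a simplified sketch. It is not ATOM's implementation: the `TimedFact` layout, the function names, and the sample facts are assumptions, triples are matched by exact equality rather than by embedding similarity, and entity and relation-name resolution are skipped. Only the temporal resolution step (taking the union of validity and observation time sets for matching triples) and the iterative pairwise reduction are shown.

```python
from dataclasses import dataclass, field

Triple = tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class TimedFact:
    """Time sets attached to one triple (an assumed layout, not ATOM's exact 5-tuple)."""
    valid_times: set[str] = field(default_factory=set)
    observed_times: set[str] = field(default_factory=set)


AtomicTKG = dict[Triple, TimedFact]


def binary_merge(left: AtomicTKG, right: AtomicTKG) -> AtomicTKG:
    """Merge two TKGs; identical triples get the union of their time sets."""
    merged = {
        triple: TimedFact(set(fact.valid_times), set(fact.observed_times))
        for triple, fact in left.items()
    }
    for triple, fact in right.items():
        target = merged.setdefault(triple, TimedFact())
        target.valid_times |= fact.valid_times
        target.observed_times |= fact.observed_times
    return merged


def merge_all(graphs: list[AtomicTKG]) -> AtomicTKG:
    """Merge pairs round by round until a single TKG remains."""
    graphs = list(graphs)
    if not graphs:
        return {}
    while len(graphs) > 1:
        # Pairs within a round are independent, so they could run in parallel;
        # this sketch runs them sequentially.
        paired = [
            binary_merge(graphs[i], graphs[i + 1])
            for i in range(0, len(graphs) - 1, 2)
        ]
        if len(graphs) % 2:
            paired.append(graphs[-1])  # odd one out waits for the next round
        graphs = paired
    return graphs[0]


g1 = {("Alice", "works_at", "Acme"): TimedFact({"2020"}, {"2024-01-01"})}
g2 = {("Alice", "works_at", "Acme"): TimedFact({"2021"}, {"2024-02-01"})}
g3 = {("Bob", "works_at", "Acme"): TimedFact({"2019"}, {"2024-02-01"})}
snapshot = merge_all([g1, g2, g3])
```

Because each round halves the number of graphs, n atomic TKGs need about log2(n) rounds, which is what makes the pairwise scheme attractive for parallel execution.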
·linkedin.com·
Beyond RDF vs LPG: Operational Ontologies, Hybrid Semantics, and Why We Still Chose a Property Graph | LinkedIn
How to stay sane about “semantic Graph RAG” when your job is shipping reliable systems, not winning ontology theology wars. You don’t wake up in the morning thinking about OWL profiles or SPARQL entailment regimes.
·linkedin.com·
Text2KGBench-LettrIA: A Refined Benchmark for Text2Graph Systems
🚀 LLMs can be powerful tools to extract information from texts and automatically populate Knowledge Graphs guided by ontologies given as inputs. BUT how good are they? To answer this question, we need benchmarks!

💡 With Lettria, we built the Text2KGBench-LettrIA benchmark, covering 19 different ontologies in various domains (company, film, food, politician, sports, monument, etc.) and consisting of nearly 5k sentences strictly annotated with triples conforming to these ontologies (208 classes, 426 properties), yielding more than 17k triples. What's more? We threw a lot of compute at comparing the performance and efficiency of numerous closed LLMs and variants (GPT4, Claude 3, Gemini) and numerous fine-tuned open-weights models (Mistral 3, Qwen 3, Gemma 3, Phi 4).

✨ Key take-away: when provided with high-quality data, fine-tuned open models largely outperform larger, proprietary counterparts!

📄 Curious about the detailed results? Read our paper at https://lnkd.in/e-EZCjWm See our presentation at https://lnkd.in/eEdCCpdA that I have just presented at the Knowledge Base Construction from Pre-Trained Language Models Workshop co-located with the ISWC - International Semantic Web Conference. Want to use these results in your operations? Sign up to use the newly released PERSEUS model, https://lnkd.in/e7exyJHc

Joint work with Julien PLU, Oscar Moreno Escobar, Edouard Trouillez, Axelle Gapin, Pasquale Lisena, Thibault Ehrhart #iswc2025 #LLMs #KnowledgeGraphs #NLP #Research EURECOM, Charles Borderie
·linkedin.com·
Your agents NEED a semantic layer
Your agents NEED a semantic layer 🫵

Traditional RAG systems embed documents, retrieve similar chunks, and feed them to LLMs. This works for simple Q&A. It fails catastrophically for agents that need to reason across systems.

Why? Because semantic similarity doesn't capture relationships. Your vector database can tell you that two documents are "about bonds." It can't tell you that Document A contains the official pricing methodology, Document B is a customer complaint referencing that methodology, and Document C is an assembly guide that superseded both. These relationships are invisible to embeddings.

What semantic layers provide:

  • Entity resolution across data silos. When "John Smith" in your CRM, "J. Smith" in email, and "john.smith@company.com" in logs all map to the same person node, agents can traverse the complete context.

  • Cross-domain entity linking through knowledge graphs. Products in your database connect to assembly guides, which link to customer reviews, which reference support tickets. Single-query traversal instead of application-level joins.

  • Provenance-tracked derivations. Every extracted entity, inferred relationship, and generated embedding maintains lineage to source data. Critical for regulatory compliance and debugging agent behavior.

  • Ontology-grounded reasoning. Financial instruments mapped to FIBO standards. Products mapped to domain taxonomies. Agents reason with structured vocabulary, not statistical word associations.

The technical implementation pattern:

  • Layer 1: Unified graph database supporting vector, structured, and semi-structured data types in single queries.

  • Layer 2: Entity extraction pipeline with coreference resolution and deduplication across sources.

  • Layer 3: Relationship inference and cross-domain linking using both explicit identifiers and contextual signals.

  • Layer 4: Separation of first-party data from derived artifacts with clear tagging for safe regeneration.

The result: Agents can traverse "Product → described_in → AssemblyGuide → improved_by → CommunityTip → authored_by → Expert" in a single graph query instead of five API calls with application-level joins.

Model Context Protocol is emerging as the open standard for semantic tool modeling. Not just describing APIs, but encoding what tools do, when to use them, and how outputs compose. This enables agents to discover and reason about capabilities dynamically.

The competitive moat isn't your model choice. The moat is your knowledge graph architecture and the accumulated entity relationships that took years to build.
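
The single-query traversal in the post can be illustrated with a small in-memory sketch. Everything here is a stand-in: the edge list, node identifiers, and the `build_index` and `traverse` helpers are hypothetical, and a real deployment would express the same path as one query in its graph database rather than in Python.

```python
from collections import defaultdict

# Toy edge list of (subject, relation, object); all identifiers are made up.
EDGES = [
    ("product:desk", "described_in", "guide:desk-v2"),
    ("guide:desk-v2", "improved_by", "tip:use-washers"),
    ("tip:use-washers", "authored_by", "expert:maria"),
    ("product:lamp", "described_in", "guide:lamp-v1"),
]


def build_index(edges):
    """Index outgoing edges by (node, relation) for constant-time hops."""
    index = defaultdict(list)
    for subject, relation, obj in edges:
        index[(subject, relation)].append(obj)
    return index


def traverse(index, start, relations):
    """Follow a fixed sequence of relation types and return the nodes reached."""
    frontier = {start}
    for relation in relations:
        frontier = {
            target
            for node in frontier
            for target in index.get((node, relation), [])
        }
    return frontier


INDEX = build_index(EDGES)
PATH = ["described_in", "improved_by", "authored_by"]
experts = traverse(INDEX, "product:desk", PATH)
```

Starting from `product:lamp` with the same path yields an empty set, since that guide has no `improved_by` edge; the traversal simply stops where the graph does.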
·linkedin.com·
Can LLMs Really Build Knowledge Graphs We Can Trust?
🕸️ Can LLMs Really Build Knowledge Graphs We Can Trust?

There’s a growing trend: “Let’s use LLMs to build knowledge graphs.” It sounds like the perfect shortcut - take unstructured data, prompt an LLM, and get a ready-to-use graph. But… are we sure those graphs are trustworthy?

Before that, let’s pause for a second: 💡 Why build knowledge graphs at all? Because they solve one of AI’s biggest weaknesses - lack of structure and reasoning. Graphs let us connect facts, entities, and relationships in a way that’s transparent, queryable, and explainable. They give context, memory, and logic - everything that raw text or embeddings alone can’t provide.

Yet, here’s the catch when using LLMs to build them:

🔹 Short context window - LLMs can only “see” a limited amount of data at once, losing consistency across larger corpora.
🔹 Hallucinations - when context runs out or ambiguity appears, models confidently invent facts or relations that never existed.
🔹 Lack of provenance - LLM outputs don’t preserve why or how a link was made. Without traceability, you can’t audit or explain your graph.
🔹 Temporal instability - the same prompt can yield different graphs tomorrow, because stochastic generation ≠ deterministic structure.
🔹 Scalability & cost - large-scale graph construction requires persistent context and reasoning, which LLMs weren’t designed for.

Building knowledge graphs isn’t just data extraction - it’s engineering meaning. It demands consistency, provenance, and explainability, not just text generation. LLMs can assist in this process, but they shouldn’t be the architect. The next step is finding a way to make graphs both trustworthy and instant - without compromising one for the other.
·linkedin.com·
Where is GraphRAG actually working in production?
"GraphRAG chatter is louder than its footprint in production." That line from Ben Lorica's piece on Gradient Flow stopped me in my tracks: https://lnkd.in/dmC-ykAu

I was reading it because of my deep interest in graph-based reasoning, and while the content is excellent, I was genuinely surprised by the assessment of GraphRAG adoption. The article suggests that a year after the initial buzz, GraphRAG remains mostly confined to graph vendors and specialists, with little traction in mainstream AI engineering teams.

Here's the thing: at GraphAware, we have GraphRAG running in production: our AskTheDocs conversational interface in Hume uses this approach to help customers query documentation, and the feedback has been consistently positive. It's not an experiment; it's a production feature our users rely on daily.

So I have a question for my network (yes, I know you're a bit biased, since many of you are graph experts, after all 😊): Where is GraphRAG actually working in production? I'm not looking for POCs, experiments, or "we're exploring it." I want to hear about real, deployed systems serving actual users. Success stories. Production use cases. The implementations that are quietly delivering value while the tech commentary wonders if anyone is using this stuff.

If you have direct or indirect experience with GraphRAG in production, I'd love to hear from you:
- Drop a comment below
- Send me a DM
- Email me directly

I want to give these cases a voice and learn from what's actually working out there. Who's building with GraphRAG beyond the buzz? #GraphRAG #KnowledgeGraphs #AI #ProductionAI #RAG
·linkedin.com·
Is OpenAI quietly moving toward knowledge graphs?
Is OpenAI quietly moving toward knowledge graphs? Yesterday’s OpenAI DevDay was all about new no-code tools to create agents. Impressive. But what caught my attention wasn’t what they announced… it’s what they didn’t talk about.

During the summer, OpenAI released a Cookbook update introducing the concept of Temporal Agents (see below), connecting it to Subject–Predicate–Object triples: the very foundation of a knowledge graph. If you’ve ever worked with graphs, you know this means something big: they’re not just building agents anymore, they’re building memory, relationships, and meaning. When you see “London – isCapitalOf – United Kingdom” in their official docs, you realize they’re experimenting with how to represent knowledge itself. And with any good knowledge graph… comes an ontology.

So here’s my prediction: ChatGPT-6 will come with a built-in graph that connects everything about you. The question is: do you want their AI to know everything about you? Or do you want to build your own sovereign AI, one that you own, built from open-source intelligence and collective knowledge?

Would love to know what you think. Is that me hallucinating or is that a weak signal?👇
·linkedin.com·
Automatic Ontology Generation Still Falls Short & Why Applied Ontologists Deliver the ROI | LinkedIn
For all the excitement around large language models, the latest research from Simona-Vasilica Oprea and Georgiana Stănescu (Electronics 14:1313, 2025) offers a reality check. Automatic ontology generation, even with novel prompting techniques like Memoryless CQ-by-CQ and Ontogenia, remains a partial
·linkedin.com·
Algorithmic vs. Symbolic Reasoning: Is Graph Data Science a critical, transformative layer for GraphRAG?
Is Graph Data Science a critical, transformative layer for GraphRAG? The field of enterprise Artificial Intelligence (AI) is undergoing a significant architectural evolution. The initial enthusiasm for Large Language Models (LLMs) has matured into a pragmatic recognition of their limitations, partic
·linkedin.com·
Flexible-GraphRAG
𝗙𝗹𝗲𝘅𝗶𝗯𝗹𝗲 𝗚𝗿𝗮𝗽𝗵𝗥𝗔𝗚 𝗼𝗿 𝗥𝗔𝗚 is now flexing to the max using LlamaIndex, supports 𝟳 𝗴𝗿𝗮𝗽𝗵 𝗱𝗮𝘁𝗮𝗯𝗮𝘀𝗲𝘀, 𝟭𝟬 𝘃𝗲𝗰𝘁𝗼𝗿 𝗱𝗮𝘁𝗮𝗯𝗮𝘀𝗲𝘀, 𝟭𝟯 𝗱𝗮𝘁𝗮 𝘀𝗼𝘂𝗿𝗰𝗲𝘀, 𝗟𝗟𝗠𝘀, Docling 𝗱𝗼𝗰 𝗽𝗿𝗼𝗰𝗲𝘀𝘀𝗶𝗻𝗴, 𝗮𝘂𝘁𝗼 𝗰𝗿𝗲𝗮𝘁𝗲 𝗞𝗚𝘀, 𝗚𝗿𝗮𝗽𝗵𝗥𝗔𝗚, 𝗛𝘆𝗯𝗿𝗶𝗱 𝗦𝗲𝗮𝗿𝗰𝗵, 𝗔𝗜 𝗖𝗵𝗮𝘁 (shown Hyland products web page data src) 𝗔𝗽𝗮𝗰𝗵𝗲 𝟮.𝟬 𝗢𝗽𝗲𝗻 𝗦𝗼𝘂𝗿𝗰𝗲 𝗚𝗿𝗮𝗽𝗵: Neo4j ArcadeDB FalkorDB Kuzu NebulaGraph, powered by Vesoft (coming Memgraph and 𝗔𝗺𝗮𝘇𝗼𝗻 𝗡𝗲𝗽𝘁𝘂𝗻𝗲) 𝗩𝗲𝗰𝘁𝗼𝗿: Qdrant, Elastic, OpenSearch Project, Neo4j 𝘃𝗲𝗰𝘁𝗼𝗿, Milvus, created by Zilliz (coming Weaviate, Chroma, Pinecone, 𝗣𝗼𝘀𝘁𝗴𝗿𝗲𝗦𝗤𝗟 + 𝗽𝗴𝘃𝗲𝗰𝘁𝗼𝗿, LanceDB) Docling 𝗱𝗼𝗰𝘂𝗺𝗲𝗻𝘁 𝗽𝗿𝗼𝗰𝗲𝘀𝘀𝗶𝗻𝗴 𝗗𝗮𝘁𝗮 𝗦𝗼𝘂𝗿𝗰𝗲𝘀: using LlamaIndex readers: working: Web Pages, Wikipedia, Youtube, untested: Google Drive, Msft OneDrive, S3, Azure Blob, GCS, Box, SharePoint, previous: filesystem, Alfresco, CMIS. 𝗟𝗟𝗠𝘀: 𝗟𝗹𝗮𝗺𝗮𝗜𝗻𝗱𝗲𝘅 𝗟𝗟𝗠𝘀 (OpenAI, Ollama, Claude, Gemini, etc.) 𝗥𝗲𝗮𝗰𝘁, 𝗩𝘂𝗲, 𝗔𝗻𝗴𝘂𝗹𝗮𝗿 𝗨𝗜𝘀, 𝗠𝗖𝗣 𝘀𝗲𝗿𝘃𝗲𝗿, 𝗙𝗮𝘀𝘁𝗔𝗣𝗜 𝘀𝗲𝗿𝘃𝗲𝗿 𝗚𝗶𝘁𝗛𝘂𝗯 𝘀𝘁𝗲𝘃𝗲𝗿𝗲𝗶𝗻𝗲𝗿/𝗳𝗹𝗲𝘅𝗶𝗯𝗹𝗲-𝗴𝗿𝗮𝗽𝗵𝗿𝗮𝗴: https://lnkd.in/eUEeF2cN 𝗫.𝗰𝗼𝗺 𝗣𝗼𝘀𝘁 𝗼𝗻 𝗙𝗹𝗲𝘅𝗶𝗯𝗹𝗲 𝗚𝗿𝗮𝗽𝗵𝗥𝗔𝗚 𝗼𝗿 𝗥𝗔𝗚 𝗺𝗮𝘅 𝗳𝗹𝗲𝘅𝗶𝗻𝗴 https://lnkd.in/gHpTupAr 𝗜𝗻𝘁𝗲𝗴𝗿𝗮𝘁𝗲𝗱 𝗦𝗲𝗺𝗮𝗻𝘁𝗶𝗰𝘀 𝗕𝗹𝗼𝗴: https://lnkd.in/ehpjTV7d
·linkedin.com·
Efficient and Transferable Agentic Knowledge Graph RAG via Reinforcement Learning
KG-R1, Why Knowledge Graph RAG Systems Are Too Expensive to Deploy (And How One Team Fixed It) ...

What if I told you that most knowledge graph systems require multiple AI models just to answer a single question? That's exactly the problem plaguing current KG-RAG deployments.

👉 The Cost Problem
Traditional knowledge graph retrieval systems use a pipeline approach: one model for planning, another for reasoning, a third for reviewing, and a fourth for responding. Each step burns through tokens and compute resources, making deployment prohibitively expensive for most organizations. Even worse? These systems are built for specific knowledge graphs. Change your data source, and you need to retrain everything.

👉 A Single-Agent Solution
Researchers from MIT and IBM just published KG-R1, which replaces this entire multi-model pipeline with one lightweight agent that learns through reinforcement learning. Here's the clever part: instead of hardcoding domain-specific logic, the system uses four simple, universal operations:
- Get relations from an entity
- Get entities from a relation
- Navigate forward or backward through connections
These operations work on any knowledge graph without modification.

👉 The Results Are Striking
Using just a 3B parameter model, KG-R1:
- Matches accuracy of much larger foundation models
- Uses 60% fewer tokens per query than existing methods
- Transfers across different knowledge graphs without retraining
- Processes queries in under 7 seconds on a single GPU

The system learned to retrieve information strategically through multi-turn interactions, optimized end-to-end rather than stage-by-stage. This matters because knowledge graphs contain some of our most valuable structured data - from scientific databases to legal documents. Making them accessible and affordable could unlock entirely new applications.
https://arxiv.org/abs/2509.26383v1 Efficient and Transferable Agentic Knowledge Graph RAG via Reinforcement Learning
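
The four schema-agnostic operations mentioned in the post can be sketched over a plain list of triples. This is one reading of the post's description (relation lookup and entity lookup, each in the forward and backward direction); the class, the method names, and the sample triples are hypothetical and not taken from the KG-R1 code.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny triple store exposing four graph-agnostic retrieval operations."""

    def __init__(self, triples):
        self._out = defaultdict(set)  # head -> {(relation, tail)}
        self._in = defaultdict(set)   # tail -> {(relation, head)}
        for head, relation, tail in triples:
            self._out[head].add((relation, tail))
            self._in[tail].add((relation, head))

    def get_tail_relations(self, entity):
        """Relations leaving the entity (forward direction)."""
        return sorted({r for r, _ in self._out.get(entity, ())})

    def get_head_relations(self, entity):
        """Relations arriving at the entity (backward direction)."""
        return sorted({r for r, _ in self._in.get(entity, ())})

    def get_tail_entities(self, entity, relation):
        """Entities reached by following the relation forward from the entity."""
        return sorted(t for r, t in self._out.get(entity, ()) if r == relation)

    def get_head_entities(self, entity, relation):
        """Entities that point at the entity through the relation."""
        return sorted(h for r, h in self._in.get(entity, ()) if r == relation)


kg = KnowledgeGraph([
    ("London", "capital_of", "United Kingdom"),
    ("Paris", "capital_of", "France"),
    ("London", "located_on", "River Thames"),
])
relations = kg.get_tail_relations("London")
country = kg.get_tail_entities("London", "capital_of")
capital = kg.get_head_entities("France", "capital_of")
```

None of the operations refer to a specific schema, which is why the same small action set could be reused across different graphs; the agent that chooses which operation to call at each turn is outside the scope of this sketch.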
·linkedin.com·
Introducing the GitLab Knowledge Graph
Today, I'd like to introduce the GitLab Knowledge Graph. This release includes a code indexing engine, written in Rust, that turns your codebase into a live, embeddable graph database for LLM RAG. You can install it with a simple one-line script, parse local repositories directly in your editor, and connect via MCP to query your workspace and over 50,000 files in under 100 milliseconds.

We also saw GKG agents scoring up to 10% higher on the SWE-Bench-lite benchmarks, with just a few tools and a small prompt added to opencode (an open-source coding agent). On average, we observed a 7% accuracy gain across our eval runs, and GKG agents were able to solve tasks that the baseline agents did not. You can read more from the team's research here https://lnkd.in/egiXXsaE.

This release is just the first step: we aim for this local version to serve as the backbone of a Knowledge Graph service that enables you to query the entire GitLab Software Development Life Cycle, from an Issue down to a single line of code.

I am incredibly proud of the work the team has done. Thank you, Michael U., Jean-Gabriel Doyon, Bohdan Parkhomchuk, Dmitry Gruzd, Omar Qunsul, and Jonathan Shobrook.

You can watch Bill Staples and me present this and more in the GitLab 18.4 release here: https://lnkd.in/epvjrhqB Try it today at: https://lnkd.in/eAypneFA Roadmap: https://lnkd.in/eXNYQkEn Watch more below for a complete, in-depth tutorial on what we've built:
·linkedin.com·
GraphSearch: An Agentic Deep‑Search Workflow for Graph Retrieval‑Augmented Generation
GraphSearch: An Agentic Deep‑Search Workflow for Graph Retrieval‑Augmented Generation ...

Why Current AI Search Falls Short When You Need Real Answers

What happens when you ask an AI system a complex question that requires connecting multiple pieces of information? Most current approaches retrieve some relevant documents, generate an answer, and call it done. But this single-pass strategy often misses critical evidence.

👉 The Problem with Shallow Retrieval
Traditional retrieval-augmented generation (RAG) systems work like a student who only skims the first few search results before writing an essay. They grab what seems relevant on the surface but miss deeper connections that would lead to better answers. When researchers tested these systems on complex multi-hop questions, they found a consistent pattern: the AI would confidently provide answers based on incomplete evidence, leading to logical gaps and missing key facts.

👉 A New Approach: Deep Searching with Dual Channels
Researchers from IDEA Research and Hong Kong University of Science and Technology developed GraphSearch, which works more like a thorough investigator than a quick searcher. The system breaks down complex questions into smaller, manageable pieces, then searches through both text documents and structured knowledge graphs. Think of it as having two different research assistants: one excellent at finding descriptive information in documents, another skilled at tracing relationships between entities.

👉 How It Actually Works
Instead of one search-and-answer cycle, GraphSearch uses six coordinated modules:
- Query decomposition splits complex questions into atomic sub-questions
- Context refinement filters out noise from retrieved information
- Query grounding fills in missing details from previous searches
- Logic drafting organizes evidence into coherent reasoning chains
- Evidence verification checks if the reasoning holds up
- Query expansion generates new searches to fill identified gaps
The system continues this process until it has sufficient evidence to provide a well-grounded answer.

👉 Real Performance Gains
Testing across six different question-answering benchmarks showed consistent improvements. On the MuSiQue dataset, for example, answer accuracy jumped from 35% to 51% when GraphSearch was integrated with existing graph-based systems. The approach works particularly well under constrained conditions - when you have limited computational resources for retrieval, the iterative searching strategy maintains performance better than single-pass methods.

This research points toward more reliable AI systems that can handle the kind of complex reasoning we actually need in practice.

Paper: "GraphSearch: An Agentic Deep Searching Workflow for Graph Retrieval-Augmented Generation" by Yang et al.
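
The iterate-until-sufficient idea behind the workflow can be shown with a toy loop. This is a heavily simplified sketch, not the GraphSearch implementation: it keeps only decomposition, retrieval, and a combined verification and expansion step, it searches a single in-memory text lookup rather than the dual text and graph channels, and every function and sample document is a stand-in.

```python
def deep_search(question, decompose, retrieve, find_gaps, max_rounds=3):
    """Retrieve, check for missing evidence, and re-query until nothing is missing."""
    evidence = []
    queries = decompose(question)
    for _ in range(max_rounds):
        for query in queries:
            for item in retrieve(query):
                if item not in evidence:
                    evidence.append(item)
        # Verification and expansion are folded into one callback here:
        # it returns follow-up queries, or an empty list when evidence suffices.
        queries = find_gaps(question, evidence)
        if not queries:
            break
    return evidence


# Stand-in retrieval corpus keyed by query string.
DOCS = {
    "who directed Inception": ["Inception was directed by Christopher Nolan."],
    "where was Christopher Nolan born": ["Christopher Nolan was born in London."],
}


def decompose(question):
    # Hard-coded first sub-question; a real system would use an LLM here.
    return ["who directed Inception"]


def retrieve(query):
    return DOCS.get(query, [])


def find_gaps(question, evidence):
    text = " ".join(evidence)
    if "Christopher Nolan" in text and "born" not in text:
        return ["where was Christopher Nolan born"]
    return []


evidence = deep_search(
    "Where was the director of Inception born?", decompose, retrieve, find_gaps
)
```

A single-pass retriever would stop after the first lookup and never fetch the birthplace sentence; the second round exists only because the gap check asked for it.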
·linkedin.com·