AutoSchemaKG: Building Billion-Node Knowledge Graphs Without Human Schemas

👉 Why This Matters
Traditional knowledge graphs face a paradox: they require expert-crafted schemas to organize information, and that requirement creates bottlenecks for scalability and adaptability. This limits their ability to handle dynamic real-world knowledge or cross-domain applications effectively.

👉 What Changed
AutoSchemaKG eliminates manual schema design through three innovations:
1. Dynamic schema induction: LLMs automatically create conceptual hierarchies while extracting entities/events
2. Event-aware modeling: Captures temporal relationships and procedural knowledge missed by entity-only approaches
3. Multi-level conceptualization: Organizes instances into semantic categories through abstraction layers

The system processed 50M+ documents to build ATLAS, a family of KGs with:
- 900M+ nodes (entities/events/concepts)
- 5.9B+ relationships
- 95% alignment with human-created schemas (zero manual intervention)

👉 How It Works
1. Triple extraction pipeline:
- LLMs identify entity-entity, entity-event, and event-event relationships
- Processes text at document level rather than sentence level for context preservation
2. Schema induction:
- Automatically groups instances into conceptual categories
- Creates hierarchical relationships between specific facts and abstract concepts
3. Scale optimization:
- Handles web-scale corpora through GPU-accelerated batch processing
- Maintains semantic consistency across 3 distinct domains (Wikipedia, academic papers, Common Crawl)

👉 Proven Impact
- Boosts multi-hop QA accuracy by 12-18% over state-of-the-art baselines
- Improves LLM factuality by up to 9% on specialized domains like medicine and law
- Enables complex reasoning through conceptual bridges between disparate facts

👉 Key Insight
The research demonstrates that billion-scale KGs with dynamic schemas can effectively complement parametric knowledge in LLMs when they reach critical mass (1B+ facts). This challenges the assumption that retrieval augmentation needs domain-specific tuning to be effective.

Question for Discussion
As autonomous KG construction becomes viable, how should we rethink the role of human expertise in knowledge representation? Should curation shift from schema design to validation and ethical oversight?

#KnowledgeGraph #AI #LLM #research #GenAI

·linkedin.com·Jun 4, 2025

DRAG introduces a novel distillation framework that transfers RAG capabilities from LLMs to SLMs through Evidence-based distillation and Graph-based structuring

Small Models, Big Knowledge: How DRAG Bridges the AI Efficiency-Accuracy Gap

👉 Why This Matters
Modern AI systems face a critical tension: large language models (LLMs) deliver impressive knowledge recall but demand massive computational resources, while smaller models (SLMs) struggle with factual accuracy and "hallucinations." Traditional retrieval-augmented generation (RAG) systems amplify this problem by requiring constant updates to vast knowledge bases.

👉 The Innovation
DRAG introduces a novel distillation framework that transfers RAG capabilities from LLMs to SLMs through two key mechanisms:
1. Evidence-based distillation: Filters and ranks factual snippets from teacher LLMs
2. Graph-based structuring: Converts retrieved knowledge into relational graphs to preserve critical connections

This dual approach reduces model size requirements by 10-100x while improving factual accuracy by up to 27.7% compared to prior methods like MiniRAG.

👉 How It Works
1. Evidence generation: A large teacher LLM produces multiple context-relevant facts
2. Semantic filtering: Combines cosine similarity and LLM scoring to retain top evidence
3. Knowledge graph creation: Extracts entity relationships to form structured context
4. Distilled inference: SLMs generate answers using both filtered text and graph data

The process mimics how humans combine raw information with conceptual understanding, enabling smaller models to "think" like their larger counterparts without the computational overhead.

👉 Privacy Bonus
DRAG adds a privacy layer by:
- Local query sanitization before cloud processing
- Returning only de-identified knowledge graphs

Tests show a 95.7% reduction in potential personal data leakage while maintaining answer quality.

👉 Why It’s Significant
This work addresses three critical challenges simultaneously:
- Makes advanced RAG capabilities accessible on edge devices
- Reduces hallucination rates through structured knowledge grounding
- Preserves user privacy in cloud-based AI interactions

The GitHub repository provides full implementation details, enabling immediate application in domains like healthcare diagnostics, legal analysis, and educational tools where accuracy and efficiency are non-negotiable.
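
The "semantic filtering" step in the DRAG summary above (cosine similarity combined with an LLM score) can be pictured roughly as follows. This is only a sketch under stated assumptions: the function names, the equal weighting `alpha=0.5`, the toy two-dimensional embeddings, and the hand-written LLM scores are all illustrative and are not taken from the DRAG paper or repository.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def filter_evidence(query_vec, evidence, top_k=2, alpha=0.5):
    """Rank evidence by a weighted mix of similarity and an LLM-assigned score.

    evidence: list of (text, embedding, llm_score) with llm_score in [0, 1].
    alpha is a hypothetical mixing weight, not a value from the paper.
    """
    scored = []
    for text, vec, llm_score in evidence:
        combined = alpha * cosine(query_vec, vec) + (1 - alpha) * llm_score
        scored.append((combined, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]

# Toy data: embeddings and LLM scores are made up for illustration.
query = [1.0, 0.0]
evidence = [
    ("Paris is the capital of France.", [0.9, 0.1], 0.9),
    ("France borders Spain.", [0.6, 0.4], 0.5),
    ("Bananas are yellow.", [0.0, 1.0], 0.1),
]
kept = filter_evidence(query, evidence)
```

In a real pipeline the embeddings would come from an embedding model and the scores from the teacher LLM; only the ranking logic is shown here.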

#LLM #research #AI #KnowledgeGraph

·linkedin.com·Jun 4, 2025

Semantically Composable Architectures

I'm happy to share the draft of the "Semantically Composable Architectures" mini-paper. It is the culmination of approximately four years' work, which began with Coreless Architectures and has now evolved into something much bigger.

LLMs are impressive, but a real breakthrough will occur once we surpass the cognitive capabilities of a single human brain. Enabling autonomous large-scale system reverse engineering and large-scale autonomous transformation with minimal to no human involvement, while still making it understandable to humans if they choose to look, is a central pillar of making truly groundbreaking changes.

We hope the ideas we shared will be beneficial to humanity and advance our civilization further. The draft is not final and will require some clarification and improvements, but the key concepts are present. Happy to hear your thoughts and feedback. Some of these concepts underpin the design of the Product X system.

Part of the core team + external contribution: Andrew Barsukov, Andrey Kolodnitsky, Sapta Girisa N, Keith E. Glendon, Gurpreet Sachdeva, Saurav Chandra, Mike Diachenko, Oleh Sinkevych

#research #AI #KnowledgeGraph #LLM #semantics #technical

·linkedin.com·Jun 1, 2025

Leveraging Large Language Models for Realizing Truly Intelligent...

The number of published scholarly articles is growing at a significant rate, making scholarly knowledge organization increasingly important. Various approaches have been proposed to organize...

#KnowledgeGraph #LLM #research #semantics #AI

·arxiv.org·Jun 1, 2025

Want to explore Anthropic's Transformer Circuits as a queryable graph?

Want to explore Anthropic's Transformer Circuits as a queryable graph? I wrote a script to import the graph JSON into Neo4j; the code is in a Gist. https://lnkd.in/eT4NjQgY https://lnkd.in/e38TfQpF Next step: write directly from the circuit-tracer library to the graph DB. https://lnkd.in/eVU_t6mS
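
To give a feel for what importing a graph JSON into Neo4j involves, here is a minimal sketch that turns a node/link document into parameterized Cypher statements. It is not the author's Gist. The JSON shape (`nodes` with `id` and `label`, `links` with `source`, `target`, `weight`), the `Feature` node label, and the `INFLUENCES` relationship type are all assumptions made for illustration; the real circuit-tracer export may look different.

```python
import json

def graph_json_to_cypher(raw_json):
    """Convert an assumed node/link JSON document into (statement, params) pairs."""
    graph = json.loads(raw_json)
    statements = []
    for node in graph["nodes"]:
        # MERGE keeps the import idempotent: re-running does not duplicate nodes.
        statements.append((
            "MERGE (n:Feature {id: $id}) SET n.label = $label",
            {"id": node["id"], "label": node.get("label", "")},
        ))
    for link in graph["links"]:
        statements.append((
            "MATCH (a:Feature {id: $source}), (b:Feature {id: $target}) "
            "MERGE (a)-[r:INFLUENCES]->(b) SET r.weight = $weight",
            {
                "source": link["source"],
                "target": link["target"],
                "weight": link.get("weight", 0.0),
            },
        ))
    return statements

# Hypothetical two-node sample, not real attribution-graph data.
sample = json.dumps({
    "nodes": [{"id": "f1", "label": "token: Paris"}, {"id": "f2", "label": "capital"}],
    "links": [{"source": "f1", "target": "f2", "weight": 0.75}],
})
statements = graph_json_to_cypher(sample)
```

Each pair could then be handed to a database session, for example with the Neo4j Python driver's `session.run(statement, parameters)`. Keeping statement generation separate from execution makes the mapping easy to inspect before anything is written.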

#GraphDB #KnowledgeGraph #AI #LLM #technical #open source

·linkedin.com·May 30, 2025

Introducing FACT: Fast Augmented Context Tools (3.2x faster, 90% cost reduction vs RAG)

RAG had its run, but it’s not built for agentic systems. Vectors are fuzzy, slow, and blind to context. They work fine for static data, but once you enter recursive, real-time workflows, where agents need to reason, act, and reflect, RAG collapses under its own ambiguity. That’s why I built FACT: Fast Augmented Context Tools.

Traditional Approach: User Query → Database → Processing → Response (2-5 seconds)
FACT Approach: User Query → Intelligent Cache → [If Miss] → Optimized Processing → Response (50ms)

It replaces vector search in RAG pipelines with a combination of intelligent prompt caching and deterministic tool execution via MCP. Instead of guessing which chunk is relevant, FACT explicitly retrieves structured data through SQL queries, live APIs, and internal tools, then caches the result if it’s useful downstream.

The prompt caching isn’t just basic storage. It builds on the prompt caches from Anthropic and other LLM providers and is tuned for feedback-driven loops: static elements get reused, transient ones expire, and the system adapts in real time. Some things you always want cached, such as schemas and domain prompts. Others, like live data, need freshness. Traditional RAG is particularly bad at this. Ask anyone forced to frequently update vector DBs.

I'm also using Arcade.dev to handle secure, scalable execution across both local and cloud environments, giving FACT hybrid intelligence for complex pipelines and automatic tool selection.

If you're building serious agents, skip the embeddings. RAG is a workaround. FACT is a foundation. It’s cheaper, faster, and designed for how agents actually work: with tools, memory, and intent.

To get started, point your favorite coding agent at: https://lnkd.in/gek_akem
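
The cache-first flow described in the FACT post (reuse static elements, let transient ones expire, run a deterministic tool on a miss) can be sketched with a small in-process cache. This is an illustrative stand-in only: it does not use any provider's prompt-caching API or MCP, and `ContextCache`, `load_schema`, and `fetch_price` are hypothetical names.

```python
import time

class ContextCache:
    """Tiny cache where entries either never expire or expire after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at or None)

    def get_or_compute(self, key, compute, ttl=None):
        """Return (value, was_hit). On a miss, call compute() and store the result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None:
            value, expires_at = entry
            if expires_at is None or now < expires_at:
                return value, True
        value = compute()  # deterministic tool call, e.g. a SQL query
        expires_at = None if ttl is None else now + ttl
        self._entries[key] = (value, expires_at)
        return value, False

# A fake clock keeps the demo deterministic.
fake_now = [0.0]
cache = ContextCache(clock=lambda: fake_now[0])
calls = []

def load_schema():
    calls.append("schema")
    return "orders(id, total)"

def fetch_price():
    calls.append("price")
    return 101.5

cache.get_or_compute("schema", load_schema)            # static: cached forever
cache.get_or_compute("price", fetch_price, ttl=5.0)    # transient: 5 second TTL
fake_now[0] = 10.0                                     # time passes
_, schema_hit = cache.get_or_compute("schema", load_schema)
_, price_hit = cache.get_or_compute("price", fetch_price, ttl=5.0)
```

After the clock advances, the schema lookup is served from the cache while the price lookup is recomputed, which mirrors the static versus live-data distinction the post draws.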

#LLM #AI #GenAI #technical #KnowledgeGraph

·linkedin.com·May 28, 2025

A-MEM Transforms AI Agent Memory with Zettelkasten Method, Atomic Notes, Dynamic Linking & Continuous Evolution

This novel memory system replaces rigid structures with adaptable, evolving, and interconnected knowledge networks, delivering 2x performance in complex reasoning tasks. This is what I learned:

Why Traditional Memory Falls Short
Most AI agents today rely on simplistic storage and retrieval but break down when faced with complex, multi-step reasoning tasks.
Common limitations:
- Fixed schemas: Conventional memory systems require predefined structures that limit flexibility.
- Limited adaptability: When new information arises, old memories remain static and disconnected, reducing an agent’s ability to build on past experiences.
- Ineffective long-term retention: AI agents often struggle to recall relevant past interactions, leading to redundant processing and inefficiencies.

A-MEM: Atomic Notes and Dynamic Linking
A-MEM organizes knowledge in a way that mirrors how humans create and refine ideas over time.
How it works:
- Atomic notes: Information is broken down into small, self-contained knowledge units, ensuring clarity and easy integration with future knowledge.
- Dynamic linking: Instead of relying on static categories, A-MEM automatically creates connections between related knowledge, forming a network of interrelated ideas.

Proven Performance Advantage
A-MEM delivers measurable improvements. Empirical results demonstrate:
- Over 2x performance improvement in complex reasoning tasks, where AI must synthesize multiple pieces of information across different timeframes.
- Superior efficiency across top foundation models, including GPT, Llama, and Qwen, proving its versatility and broad applicability.

Inside A-MEM
Note construction:
- AI-generated structured notes that capture essential details and contextual insights.
- Each memory is assigned metadata, including keywords and summaries, for faster retrieval.
Link generation:
- The system autonomously connects new memories to relevant past knowledge.
- Relationships between concepts emerge naturally, allowing AI to recognize patterns over time.
Memory evolution:
- Older memories are continuously updated as new insights emerge.
- The system dynamically refines knowledge structures, mimicking the way human memory strengthens connections over time.
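
The note construction, link generation, and memory evolution steps listed above can be illustrated with a toy note store. This is a sketch under loud assumptions: it links notes by keyword overlap (Jaccard similarity) with a hypothetical threshold, whereas the post describes links and updates being produced by the system itself, presumably with an LLM. `Note`, `NoteStore`, and `min_overlap` are illustrative names, not the A-MEM API.

```python
from dataclasses import dataclass, field

@dataclass
class Note:
    """One atomic note: a small, self-contained unit with metadata."""
    note_id: int
    text: str
    keywords: set
    links: set = field(default_factory=set)

class NoteStore:
    def __init__(self, min_overlap=0.25):
        self.notes = {}
        self.min_overlap = min_overlap  # hypothetical linking threshold

    def add(self, text, keywords):
        note = Note(len(self.notes), text, set(keywords))
        for other in self.notes.values():
            union = note.keywords | other.keywords
            overlap = len(note.keywords & other.keywords) / len(union) if union else 0.0
            if overlap >= self.min_overlap:
                # Link generation: connect the new note to related older notes.
                note.links.add(other.note_id)
                other.links.add(note.note_id)
                # Memory evolution stand-in: the older note absorbs new keywords.
                other.keywords |= note.keywords
        self.notes[note.note_id] = note
        return note

store = NoteStore()
a = store.add("User prefers window seats", {"travel", "seats"})
b = store.add("User booked a flight to Oslo", {"travel", "flight", "oslo"})
c = store.add("User likes sourdough", {"baking"})
```

Here the two travel notes end up linked and the older one picks up the newer keywords, while the unrelated baking note stays isolated.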

#KnowledgeGraph #AI #research #LLM #technical #GenAI

·linkedin.com·May 28, 2025

RAG vs Graph RAG, explained visually

RAG vs Graph RAG, explained visually. (It's a popular LLM interview question.)

Imagine you have a long document, say a biography, about an individual (X) who has accomplished several things in their life.
↳ Chapter 1: Talks about Accomplishment-1.
↳ Chapter 2: Talks about Accomplishment-2.
...
↳ Chapter 10: Talks about Accomplishment-10.

Summarizing all these accomplishments via RAG might never be possible, since the summary requires the entire context but the system may only fetch the top-k relevant chunks from the vector DB. Moreover, since traditional RAG systems retrieve each chunk independently, the LLM is often left to infer the connections between them (provided the chunks are retrieved at all).

Graph RAG solves this. The idea is to first create a graph (entities & relationships) from the documents and then traverse that graph during the retrieval phase. See how Graph RAG solves the above problems:
- First, a system (typically an LLM) creates the graph by understanding the biography.
- This produces a full graph of entities (nodes) & relationships, and a subgraph will look like this:
↳ X → → Accomplishment-1.
↳ X → → Accomplishment-2.
...
↳ X → → Accomplishment-N.

When summarizing these accomplishments, the retrieval phase can do a graph traversal to fetch all the relevant context related to X's accomplishments. This context, when passed to the LLM, will produce a more coherent and complete answer than traditional RAG.

Another reason Graph RAG systems are so effective is that LLMs are inherently adept at reasoning with structured data. Graph RAG gives them that structure through its retrieval mechanism.

👉 Over to you: What are some other issues with traditional RAG systems that Graph RAG solves?

Find me → Avi Chawla. Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.
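
The retrieval-by-traversal idea in the biography example can be sketched in a few lines. This is a minimal illustration, not a Graph RAG implementation: the graph is a plain dictionary, and the relationship name `ACCOMPLISHED` is a placeholder invented here, since the post does not name the relation on its arrows.

```python
from collections import deque

def traverse(graph, start, max_hops=1):
    """Breadth-first walk returning (source, relation, target) facts near start."""
    facts, seen, queue = [], {start}, deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for relation, target in graph.get(node, []):
            facts.append((node, relation, target))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return facts

# One edge per accomplishment; "ACCOMPLISHED" is a hypothetical relation name.
graph = {"X": [("ACCOMPLISHED", f"Accomplishment-{i}") for i in range(1, 11)]}
facts = traverse(graph, "X")
```

A single one-hop traversal from X returns all ten accomplishments, whereas a top-k vector search with k smaller than ten could not, which is the gap the post describes.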

#AI #KnowledgeGraph #LLM #technical #GenAI

·linkedin.com·May 28, 2025

🌐 From Unstructured Chaos to Structured Insight: Building a Graph-RAG-Ready Knowledge Graph

In the age of AI-driven applications, the ability to ground large language models (LLMs) with trustworthy, contextual information is more…

#KnowledgeGraph #AI #LLM #technical #GenAI

·medium.com·May 27, 2025

A Pilot Empirical Study on When and How to Use Knowledge Graphs as...

The integration of Knowledge Graphs (KGs) into the Retrieval Augmented Generation (RAG) framework has attracted significant interest, with early studies showing promise in mitigating...

#KnowledgeGraph #research #LLM #GenAI

·arxiv.org·May 24, 2025

CDF-RAG: Causal Dynamic Feedback for Adaptive Retrieval-Augmented...

Retrieval-Augmented Generation (RAG) has significantly enhanced large language models (LLMs) in knowledge-intensive tasks by incorporating external knowledge retrieval. However, existing RAG...

Code repository: elakhatibi/CDF-RAG

#research #AI #LLM #KnowledgeGraph #GenAI

·arxiv.org·May 23, 2025

Paper page - Hyper-RAG: Combating LLM Hallucinations using Hypergraph-Driven Retrieval-Augmented Generation

#research #LLM #KnowledgeGraph #GenAI #AI

·huggingface.co·May 23, 2025

Paper page - HeteRAG: A Heterogeneous Retrieval-augmented Generation Framework with Decoupled Knowledge Representations

#research #GenAI #KnowledgeGraph #LLM

·huggingface.co·May 23, 2025

Ontology Reasoning Imperative for Intelligent GraphRAG (Part 1 of 2)

Comparative Review of Semantic Knowledge Graph vs Property Knowledge Graph

#KnowledgeGraph #AI #LLM #GenAI

·medium.com·May 23, 2025

Agentic GraphRAG for Commercial Contracts

Structuring legal information as a knowledge graph to increase answer accuracy using a LangGraph agent

#KnowledgeGraph #AI #LLM #technical

·medium.com·May 23, 2025

LLMs generate possibilities; knowledge graphs remember what works

LLMs generate possibilities; knowledge graphs remember what works. Together, they forge the recursive memory and creative engine that enables AI systems to truly evolve themselves. Combining neural components (like large language models) with symbolic verification creates a powerful framework for self-evolution that overcomes limitations of either approach used independently.

AlphaEvolve demonstrates that self-evolving systems face a fundamental tension between generating novel solutions and ensuring those solutions actually work. The paper shows how AlphaEvolve addresses this through a hybrid architecture where:
- Neural components (LLMs) provide creative generation of code modifications by drawing on patterns learned from vast training data
- Symbolic components (code execution) provide ground truth verification through deterministic evaluation

Without this combination, a system would either generate interesting but incorrect solutions (neural-only approach) or be limited to small, safe modifications within known patterns (symbolic-only approach).

The system can operate at multiple levels of abstraction depending on the problem: raw solution evolution, constructor function evolution, search algorithm evolution, or co-evolution of intermediate solutions and search algorithms. This capability emanates directly from the neurosymbolic integration, where:
- Neural networks excel at working with continuous, high-dimensional spaces and recognizing patterns across abstraction levels
- Symbolic systems provide precise representations of discrete structures and logical relationships

This enables AlphaEvolve to modify everything from specific lines of code to entire algorithmic approaches.

While AlphaEvolve currently uses an evolutionary database, a knowledge graph structure could significantly enhance self-evolution by:
- Capturing evolutionary relationships between solutions
- Identifying patterns of code changes that consistently lead to improvements
- Representing semantic connections between different solution approaches
- Supporting transfer learning across problem domains

Automated, objective evaluation is the core foundation enabling self-evolution. The main limitation of AlphaEvolve is that it can only handle problems for which an automated evaluator can be devised. This evaluation component provides the "ground truth" feedback that guides evolution, allowing the system to:
- Differentiate between successful and unsuccessful modifications
- Create selection pressure toward better-performing solutions
- Avoid hallucinations or non-functional solutions that might emerge from neural components alone

When applied to optimize Gemini's training kernels, the system essentially improved the very LLM technology that powers it.

#KnowledgeGraph #AI #LLM #GenAI #research #technical

·linkedin.com·May 21, 2025

I added a Knowledge Graph to Cursor using MCP

I added a Knowledge Graph to Cursor using MCP. You gotta see this working! Knowledge graphs are a game-changer for AI Agents, and this is one example of how you can take advantage of them.

How this works:
1. Cursor connects to Graphiti's MCP Server. Graphiti is a very popular open-source Knowledge Graph library for AI agents.
2. Graphiti connects to Neo4j running locally.

Now, every time I interact with Cursor, the information is synthesized and stored in the knowledge graph. In short, Cursor now "remembers" everything about our project. Huge!

Here is the video I recorded. To get this working on your computer, follow the instructions on this link: https://lnkd.in/eeZ_4dkb

Something super cool about using Graphiti's MCP server: You can use one model to develop the requirements and a completely different model to implement the code. This is a huge plus because you could use the stronger model at each stage.

Also, Graphiti supports custom entities, which you can use when running the MCP server. You can use these custom entities to structure and recall domain-specific information, which will increase the accuracy of your results tenfold. Here is an example of what these look like: https://lnkd.in/efv7kTaH

By the way, knowledge graphs for agents are a big thing. A few ridiculous and eye-opening benchmarks comparing an AI Agent using knowledge graphs with state-of-the-art methods:
• 94.8% accuracy versus 93.4% in the Deep Memory Retrieval (DMR) benchmark.
• 71.2% accuracy versus 60.2% on conversations simulating real-world enterprise use cases.
• 2.58s of latency versus 28.9s.
• 38.4% improvement in temporal reasoning.

You'll find these benchmarks in this paper: https://fnf.dev/3CLQjBK

#KnowledgeGraph #AI #GenAI #LLM

·linkedin.com·May 20, 2025

Introducing NLWeb: Bringing conversational interfaces directly to the web - Source

#KnowledgeGraph #semantics #SEO #AI #LLM

·news.microsoft.com·May 19, 2025

GraphFlow (Workflows) — AutoGen

#AI #LLM #technical #KnowledgeGraph

·microsoft.github.io·May 16, 2025

PathRAG: The Enterprise Knowledge Graph Roadmap from Data Burden to Corporate Wisdom — and…

Imagine you’re handed a mountain of books in a language you barely speak. Traditional RAG systems do just that: they treat every text…

#KnowledgeGraph #LLM #technical #AI

·medium.com·May 16, 2025

How GraphRAG Works Step-by-Step

Perhaps you’ve come across the paper From Local to Global: A GraphRAG Approach to Query-Focused Summarization, which is Microsoft…

#AI #KnowledgeGraph #LLM #technical #GenAI

·pub.towardsai.net·May 16, 2025

Fine-tune an LLM model for triplet extraction

Do you want to fine-tune an LLM model for triplet extraction? These findings from a recently published paper (first comment) could save you much time.

✅ Does the choice of coding vs natural language prompts significantly impact performance?
When fine-tuning these small, open-weight LLMs, the choice between code and natural language prompts has a limited impact on performance.

✅ Does training fine-tuned models to include chain-of-thought (rationale) sections in their outputs improve KG construction (KGC) performance?
It is ineffective at best and highly detrimental at worst for fine-tuned models. This performance decrease is observed regardless of the number of in-context learning examples provided. Attention analysis suggests this might be due to the model's attention being dispersed on redundant information when rationale is used. Without rationale lists occupying prompt space, the model's attention can focus directly on the ICL examples while extracting relations.

✅ How do the fine-tuned smaller, open-weight LLMs perform compared to the CodeKGC baseline, which uses larger, closed-source models (GPT-3.5)?
The selected lightweight LLMs significantly outperform the much larger CodeKGC baseline after fine-tuning. The best fine-tuned models improve upon the CodeKGC baseline by as much as 15–20 absolute F1 points across the dataset.

✅ Does model size matter for KGC performance when fine-tuning with a small amount of training data?
Yes, but not in a straightforward way. The 70B-parameter versions yielded worse results than the 1B, 3B, and 8B models when undergoing the same small amount of training. This implies that for KGC with limited fine-tuning, smaller models can perform better than much larger ones.

✅ For instruction-tuned models without fine-tuning, does prompt language or rationale help?
For models without fine-tuning, using code prompts generally yields the best results for both code LLMs and the Mistral natural language model. In addition, using rationale generally seems to help these models, with most of the best results obtained when including rationale lists in the prompt.

✅ What do the errors made by the models suggest about the difficulty of the KGC task?
They point to difficulty in predicting relations, entities, and their order, especially when dealing with specialized terminology or specific domain knowledge, which poses a challenge even after fine-tuning. Some errors include adding superfluous adjectives or mistaking entity instances for class names.

✅ What is the impact of the number of in-context learning (ICL) examples during fine-tuning?
The greatest performance benefit is obtained when moving from 0 to 3 ICL examples. However, additional ICL examples beyond 3 do not lead to any significant performance delta and can even lead to worse results. This further indicates that the fine-tuning process itself is the primary driver of performance gain, allowing the model to learn the task from the input text and target output.

#LLM #GenAI #AI #research #KnowledgeGraph

·linkedin.com·May 14, 2025

NodeRAG restructures knowledge into a heterograph: a rich, layered, musical graph where each node plays a different role

NodeRAG restructures knowledge into a heterograph: a rich, layered, musical graph where each node plays a different role. It’s not just smarter retrieval. It’s structured memory for AI agents.

Why NodeRAG?
Most Retrieval-Augmented Generation (RAG) methods retrieve chunks of text. Good enough, until you need reasoning, precision, and multi-hop understanding. This is how NodeRAG solves these problems:

Step 1: Graph Decomposition
NodeRAG begins by decomposing raw text into smart building blocks:
- Semantic Units (S): Little event nuggets ("Hinton won the Nobel Prize.")
- Entities (N): Key names or concepts ("Hinton", "Nobel Prize")
- Relationships (R): Links between entities ("awarded to")
This is like teaching your AI to recognize the actors, actions, and scenes inside any document.

Step 2: Graph Augmentation
Decomposition alone isn't enough. NodeRAG augments the graph by identifying important hubs:
- Node Importance: Using K-Core and Betweenness Centrality to find critical nodes. Important entities get special attention: their attributes are summarized into new nodes (A).
- Community Detection: Grouping related nodes into communities and summarizing them into high-level insights (H). Each community gets a "headline" overview node (O) for quick retrieval.
It's like adding context and intuition to raw facts.

Step 3: Graph Enrichment
Knowledge without detail is brittle. So NodeRAG enriches the graph:
- Original Text: Full chunks are linked back into the graph (Text nodes, T)
- Semantic Edges: Using HNSW for fast, meaningful similarity connections
Only smart nodes are embedded (not everything!), saving huge storage space. Dual search (exact + vector) makes retrieval laser-sharp.
It’s like turning a 2D map into a 3D living world.

Step 4: Graph Searching
Now comes the magic.
- Dual Search: First find strong entry points (by name or by meaning)
- Shallow Personalized PageRank (PPR): Expand carefully from entry points to nearby relevant nodes.
No wandering into irrelevant parts of the graph. The search is surgical. Retrieval includes fine-grained semantic units, attributes, and high-level elements: everything you need, nothing you don't.
It’s like sending out agents into a city, and they return not with everything they saw, but exactly what you asked for, summarized and structured.

Results: NodeRAG's Performance
Compared to GraphRAG, LightRAG, NaiveRAG, and HyDE, NodeRAG wins across every major domain: Tech, Science, Writing, Recreation, and Finance.

NodeRAG isn’t just a better graph. NodeRAG is a new operating system for memory.
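
The "shallow Personalized PageRank" expansion in Step 4 can be sketched as a few power-iteration rounds that always restart at the entry points. This is a generic PPR sketch, not NodeRAG's code: the restart weight `alpha=0.5`, the two iterations, and the toy graph are assumptions chosen to keep the arithmetic easy to follow. Nodes without neighbours simply stop passing on their score in this simplified version.

```python
def shallow_ppr(adjacency, seeds, alpha=0.5, iterations=2):
    """Run a few Personalized PageRank iterations from the seed (entry) nodes.

    adjacency: dict mapping every node to a list of neighbour nodes.
    Few iterations keep the expansion "shallow": score cannot travel
    further than `iterations` hops from a seed.
    """
    nodes = list(adjacency)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(restart)
    for _ in range(iterations):
        spread = {n: 0.0 for n in nodes}
        for node, neighbours in adjacency.items():
            if not neighbours:
                continue
            share = scores[node] / len(neighbours)
            for neighbour in neighbours:
                spread[neighbour] += share
        scores = {n: alpha * restart[n] + (1 - alpha) * spread[n] for n in nodes}
    return scores

# Toy chain graph; "Distant" and "Remote" sit more than two hops from the seed.
adjacency = {
    "Hinton": ["Nobel Prize"],
    "Nobel Prize": ["Hinton", "Physics"],
    "Physics": ["Nobel Prize", "Distant"],
    "Distant": ["Physics", "Remote"],
    "Remote": ["Distant"],
}
scores = shallow_ppr(adjacency, seeds={"Hinton"})
```

With two iterations the far end of the chain receives no score at all, which is one way to read the post's "no wandering into irrelevant parts of the graph".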

#AI #GenAI #KnowledgeGraph #LLM #technical #research

·linkedin.com·May 14, 2025

Announcing general availability of Amazon Bedrock Knowledge Bases GraphRAG with Amazon Neptune Analytics | Amazon Web Services

Today, Amazon Web Services (AWS) announced the general availability of Amazon Bedrock Knowledge Bases GraphRAG (GraphRAG), a capability in Amazon Bedrock Knowledge Bases that enhances Retrieval-Augmented Generation (RAG) with graph data in Amazon Neptune Analytics. In this post, we discuss the benefits of GraphRAG and how to get started with it in Amazon Bedrock Knowledge Bases.

#GraphDB #AI #KnowledgeGraph #LLM #technical #GenAI

·aws.amazon.com·May 13, 2025

Trends from KGC 2025

Last week I was fortunate to attend the Knowledge Graph Conference in NYC! Here are a few trends that span multiple presentations and conversations.

- AI and LLM Integration: A major focus [again this year] was how LLMs can be used to enrich knowledge graphs and how knowledge graphs, in turn, can improve LLM outputs. This included using LLMs for entity extraction, verification, inference, and query generation. Many presentations demonstrated how grounding LLMs in knowledge graphs leads to more accurate, contextual, and explainable AI responses.
- Semantic Layers and Enterprise Knowledge: There was a strong emphasis on building semantic layers that act as gateways to structured, connected enterprise data. These layers facilitate data integration, governance, and more intelligent AI agents. Decentralized semantic data products (DPROD) were discussed as a framework for internal enterprise data ecosystems.
- From Data to Knowledge: Many speakers highlighted that AI is just the “tip of the iceberg” and the true power lies in the data beneath. Converting raw data into structured, connected knowledge was seen as crucial. The hidden costs of ignoring semantics were also discussed, emphasizing the need for consistent data preparation, cleansing, and governance.
- Ontology Management and Change: Managing changes and governance in ontologies was a recurring theme. Strategies such as modularization, version control, and semantic testing were recommended. The concept of “SemOps” (Semantic Operations) was discussed, paralleling DevOps for software development.
- Practical Tools and Demos: The conference included numerous demos of tools and platforms for building, querying, and visualizing knowledge graphs. These ranged from embedded databases like KuzuDB and RDFox to conversational AI interfaces for KGs, such as those from Metaphacts and Stardog.

I especially enjoyed catching up with the Semantic Arts team (Mark Wallace, Dave McComb and Steve Case), talking Gist Ontology and SemOps. I also appreciated the detailed Neptune Q&A I had with Brian O'Keefe, the vision of Ora Lassila, and the chance to meet Adrian Gschwend for the first time, where we connected on LinkML and Elmo as a means to help with bidirectional dataflows. I was so excited by these conversations that I planned to have two team members join me in June at the Data Centric Architecture Workshop Forum, https://www.dcaforum.com/

#KnowledgeGraph #AI #LLM #semantics #events

·linkedin.com·May 13, 2025
