Found 22 bookmarks
Custom sorting
Fine-tue an LLM model for triplet extraction
Fine-tue an LLM model for triplet extraction
Do you want to fine-tune an LLM model for triplet extraction? These findings from a recently published paper (first comment) could save you much time. ✅ Does the choice of coding vs natural language prompts significantly impact performance? When fine-tuning these open weights and small LLMs, the choice between code and natural language prompts has a limited impact on performance. ✅ Does training fine-tuned models to include chain-of-thought (rationale) sections in their outputs improve KG construction (KGC) performance? It is ineffective at best and highly detrimental at worst for fine-tuned models. This performance decrease is observed regardless of the number of in-context learning examples provided. Attention analysis suggests this might be due to the model's attention being dispersed on redundant information when rationale is used. Without rationale lists occupying prompt space, the model's attention can focus directly on the ICL examples while extracting relations. ✅ How do the fine-tuned smaller, open-weight LLMs perform compared to the CodeKGC baseline, which uses larger, closed-source models (GPT-3.5)? The selected lightweight LLMs significantly outperform the much larger CodeKGC baseline after fine-tuning. The best fine-tuned models improve upon the CodeKGC baseline by as much as 15–20 absolute F1 points across the dataset. ✅ Does model size matter for KGC performance when fine-tuning with a small amount of training data? Yes, but not in a straightforward way. The 70 B-parameter versions yielded worse results than the 1B, 3B, and 8B models when undergoing the same small amount of training. This implies that for KGC with limited fine-tuning, smaller models can perform better than much larger ones. ✅ For instruction-tuned models without fine-tuning, does prompt language or rationale help? For models without fine-tuning, using code prompts generally yields the best results for both code LLMs and the Mistral natural language model. In addition, using rationale generally seems to help these models, with most of the best results obtained when including rationale lists in the prompt. ✅ What do the errors made by the models suggest about the difficulty of the KGC task? difficulty in predicting relations, entities, and their order, especially when dealing with specialized terminology or specific domain knowledge, which poses a challenge even after fine-tuning. Some errors include adding superfluous adjectives or mistaking entity instances for class names. ✅ What is the impact of the number of in-context learning (ICL) examples during fine-tuning? The greatest performance benefit is obtained when moving from 0 to 3 ICL examples. However, additional ICL examples beyond 3 do not lead to any significant performance delta and can even lead to worse results. This further indicates that the fine-tuning process itself is the primary driver of performance gain, allowing the model to learn the task from the input text and target output.
fine-tune an LLM model for triplet extraction
·linkedin.com·
Fine-tue an LLM model for triplet extraction
NodeRAG restructures knowledge into a heterograph: a rich, layered, musical graph where each node plays a different role
NodeRAG restructures knowledge into a heterograph: a rich, layered, musical graph where each node plays a different role
NodeRAG restructures knowledge into a heterograph: a rich, layered, musical graph where each node plays a different role. It’s not just smarter retrieval. It’s structured memory for AI agents. 》 Why NodeRAG? Most Retrieval-Augmented Generation (RAG) methods retrieve chunks of text. Good enough — until you need reasoning, precision, and multi-hop understanding. This is how NodeRAG solves these problems: 》 🔹Step 1: Graph Decomposition NodeRAG begins by decomposing raw text into smart building blocks: ✸ Semantic Units (S): Little event nuggets ("Hinton won the Nobel Prize.") ✸ Entities (N): Key names or concepts ("Hinton", "Nobel Prize") ✸ Relationships (R): Links between entities ("awarded to") ✩ This is like teaching your AI to recognize the actors, actions, and scenes inside any document. 》 🔹Step 2: Graph Augmentation Decomposition alone isn't enough. NodeRAG augments the graph by identifying important hubs: ✸ Node Importance: Using K-Core and Betweenness Centrality to find critical nodes ✩ Important entities get special attention — their attributes are summarized into new nodes (A). ✸ Community Detection: Grouping related nodes into communities and summarizing them into high-level insights (H). ✩ Each community gets a "headline" overview node (O) for quick retrieval. It's like adding context and intuition to raw facts. 》 🔹 Step 3: Graph Enrichment Knowledge without detail is brittle. So NodeRAG enriches the graph: ✸ Original Text: Full chunks are linked back into the graph (Text nodes, T) ✸ Semantic Edges: Using HNSW for fast, meaningful similarity connections ✩ Only smart nodes are embedded (not everything!) — saving huge storage space. ✩ Dual search (exact + vector) makes retrieval laser-sharp. It’s like turning a 2D map into a 3D living world. 》 🔹 Step 4: Graph Searching Now comes the magic. ✸ Dual Search: First find strong entry points (by name or by meaning) ✸ Shallow Personalized PageRank (PPR): Expand carefully from entry points to nearby relevant nodes. ✩ No wandering into irrelevant parts of the graph. The search is surgical. ✩ Retrieval includes fine-grained semantic units, attributes, high-level elements — everything you need, nothing you don't. It’s like sending out agents into a city — and they return not with everything they saw, but exactly what you asked for, summarized and structured. 》 Results: NodeRAG's Performance Compared to GraphRAG, LightRAG, NaiveRAG, and HyDE — NodeRAG wins across every major domain: Tech, Science, Writing, Recreation, and Finance. NodeRAG isn’t just a better graph. NodeRAG is a new operating system for memory. ≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣ ⫸ꆛ Want to build Real-World AI agents? Join My 𝗛𝗮𝗻𝗱𝘀-𝗼𝗻 𝗔𝗜 𝗔𝗴𝗲𝗻𝘁 𝗧𝗿𝗮𝗶𝗻𝗶𝗻𝗴 TODAY! ➠ Build Real-World AI Agents + RAG Pipelines ➠ Learn 3 Tools: LangGraph/LangChain | CrewAI | OpenAI Swarm ➠ Work with Text, Audio, Video and Tabular Data 👉𝗘𝗻𝗿𝗼𝗹𝗹 𝗡𝗢𝗪 (𝟯𝟰% 𝗱𝗶𝘀𝗰𝗼𝘂𝗻𝘁): https://lnkd.in/eGuWr4CH | 20 comments on LinkedIn
NodeRAG restructures knowledge into a heterograph: a rich, layered, musical graph where each node plays a different role
·linkedin.com·
NodeRAG restructures knowledge into a heterograph: a rich, layered, musical graph where each node plays a different role
Announcing general availability of Amazon Bedrock Knowledge Bases GraphRAG with Amazon Neptune Analytics | Amazon Web Services
Announcing general availability of Amazon Bedrock Knowledge Bases GraphRAG with Amazon Neptune Analytics | Amazon Web Services
Today, Amazon Web Services (AWS) announced the general availability of Amazon Bedrock Knowledge Bases GraphRAG (GraphRAG), a capability in Amazon Bedrock Knowledge Bases that enhances Retrieval-Augmented Generation (RAG) with graph data in Amazon Neptune Analytics. In this post, we discuss the benefits of GraphRAG and how to get started with it in Amazon Bedrock Knowledge Bases.
·aws.amazon.com·
Announcing general availability of Amazon Bedrock Knowledge Bases GraphRAG with Amazon Neptune Analytics | Amazon Web Services
Lessons Learned from Evaluating NodeRAG vs Other RAG Systems
Lessons Learned from Evaluating NodeRAG vs Other RAG Systems
🔎 Lessons Learned from Evaluating NodeRAG vs Other RAG Systems I recently dug into the NodeRAG paper (https://lnkd.in/gwaJHP94) and it was eye-opening not just for how it performed, but for what it revealed about the evolution of RAG (Retrieval-Augmented Generation) systems. Some key takeaways for me: 👉 NaiveRAG is stronger than you think. Brute-force retrieval using simple vector search sometimes beats graph-based methods, especially when graph structures are too coarse or noisy. 👉 GraphRAG was an important step, but not the final answer. While it introduced knowledge graphs and community-based retrieval, GraphRAG sometimes underperformed NaiveRAG because its communities could be too coarse, leading to irrelevant retrieval. 👉 LightRAG reduced token cost, but at the expense of accuracy. By focusing on retrieving just 1-hop neighbors instead of traversing globally, LightRAG made retrieval cheaper — but often missed important multi-hop reasoning paths, losing precision. 👉 NodeRAG shows what mature RAG looks like. NodeRAG redesigned the graph structure itself: Instead of homogeneous graphs, it uses heterogeneous graphs with fine-grained semantic units, entities, relationships, and high-level summaries — all as nodes. It combines dual search (exact match + semantic search) and shallow Personalized PageRank to precisely retrieve the most relevant context. The result? 🚀 Highest accuracy across multi-hop and open-ended benchmarks 🚀 Lowest token retrieval (i.e., lower inference costs) 🚀 Faster indexing and querying 🧠 Key takeaway: In the RAG world, it’s no longer about retrieving more — it’s about retrieving better. Fine-grained, explainable, efficient retrieval will define the next generation of RAG systems. If you’re working on RAG architectures, NodeRAG’s design principles are well worth studying! Would love to hear how others are thinking about the future of RAG systems. 🚀📚 #RAG #KnowledgeGraphs #AI #LLM #NodeRAG #GraphRAG #LightRAG #MachineLearning #GenAI #KnowledegGraphs
·linkedin.com·
Lessons Learned from Evaluating NodeRAG vs Other RAG Systems
Google Cloud & Neo4j: Teaming Up at the Intersection of Knowledge Graphs, Agents, MCP, and Natural Language Interfaces - Graph Database & Analytics
Google Cloud & Neo4j: Teaming Up at the Intersection of Knowledge Graphs, Agents, MCP, and Natural Language Interfaces - Graph Database & Analytics
We’re thrilled to announce new Text2Cypher models and Google’s MCP Toolbox for Databases from the collaboration between Google Cloud and Neo4j.
·neo4j.com·
Google Cloud & Neo4j: Teaming Up at the Intersection of Knowledge Graphs, Agents, MCP, and Natural Language Interfaces - Graph Database & Analytics
Choosing the Right Format: How Knowledge Graph Layouts Impact AI Reasoning
Choosing the Right Format: How Knowledge Graph Layouts Impact AI Reasoning
Choosing the Right Format: How Knowledge Graph Layouts Impact AI Reasoning ... 👉 Why This Matters Most AI systems blend knowledge graphs (structured data) with large language models (flexible reasoning). But there’s a hidden variable: "how" you translate the graph into text for the AI. Researchers discovered that the formatting choice alone can swing performance by up to "17.5%" on reasoning tasks. Imagine solving 1 in 5 more problems correctly just by adjusting how you present data. 👉 What They Built KG-LLM-Bench is a new benchmark to test how language models reason with knowledge graphs. It includes five tasks: - Triple verification (“Does this fact exist?”) - Shortest path finding (“How are two concepts connected?”) - Aggregation (“How many entities meet X condition?”) - Multi-hop reasoning (“Which entities linked to A also have property B?”) - Global analysis (“Which node is most central?”) The team tested seven models (Claude, GPT-4o, Gemini, Llama, Nova) with five ways to “textualize” graphs, from simple edge lists to structured JSON and semantic web formats like RDF Turtle. 👉 Key Insights 1. Format matters more than assumed:   - Structured JSON and edge lists performed best overall, but results varied by task.   - For example, JSON excels at aggregation tasks (data is grouped by entity), while edge lists help identify central nodes (repeated mentions highlight connections). 2. Models don’t cheat: Replacing real entity names with fake ones (e.g., “France” → “Verdania”) caused only a 0.2% performance drop, proving models rely on context, not memorized knowledge. 3. Token efficiency:   - Edge lists used ~2,600 tokens vs. JSON-LD’s ~13,500. Shorter formats free up context space for complex reasoning.   - But concise ≠ always better: structured formats improved accuracy for tasks requiring grouped data. 4. Models struggle with directionality:   Counting outgoing edges (e.g., “Which countries does France border?”) is easier than incoming ones (“Which countries border France?”), likely due to formatting biases. 👉 Practical Takeaways - Optimize for your task: Use JSON for aggregation, edge lists for centrality. - Test your model: The best format depends on the LLM—Claude thrived with RDF Turtle, while Gemini preferred edge lists. - Don’t fear pseudonyms: Masking real names minimally impacts performance, useful for sensitive data. The benchmark is openly available, inviting researchers to add new tasks, graphs, and models. As AI handles larger knowledge bases, choosing the right “data language” becomes as critical as the reasoning logic itself. Paper: [KG-LLM-Bench: A Scalable Benchmark for Evaluating LLM Reasoning on Textualized Knowledge Graphs] Authors: Elan Markowitz, Krupa Galiya, Greg Ver Steeg, Aram Galstyan
Choosing the Right Format: How Knowledge Graph Layouts Impact AI Reasoning
·linkedin.com·
Choosing the Right Format: How Knowledge Graph Layouts Impact AI Reasoning
Knowledge graphs for LLM grounding and avoiding hallucination
Knowledge graphs for LLM grounding and avoiding hallucination
This blog post is part of a series that dives into various aspects of SAP’s approach to Generative AI, and its technical underpinnings. In previous blog posts of this series, you learned about how to use large language models (LLMs) for developing AI applications in a trustworthy and reliable manner...
·community.sap.com·
Knowledge graphs for LLM grounding and avoiding hallucination
Multi-Layer Agentic Reasoning: Connecting Complex Data and Dynamic Insights in Graph-Based RAG Systems
Multi-Layer Agentic Reasoning: Connecting Complex Data and Dynamic Insights in Graph-Based RAG Systems
Multi-Layer Agentic Reasoning: Connecting Complex Data and Dynamic Insights in Graph-Based RAG Systems 🛜 At the most fundamental level, all approaches rely… | 11 comments on LinkedIn
Multi-Layer Agentic Reasoning: Connecting Complex Data and Dynamic Insights in Graph-Based RAG Systems
·linkedin.com·
Multi-Layer Agentic Reasoning: Connecting Complex Data and Dynamic Insights in Graph-Based RAG Systems
Build your hybrid-Graph for RAG & GraphRAG applications using the power of NLP | LinkedIn
Build your hybrid-Graph for RAG & GraphRAG applications using the power of NLP | LinkedIn
Build a graph for RAG application for a price of a chocolate bar! What is GraphRAG for you? What is GraphRAG? What does GraphRAG mean from your perspective? What if you could have a standard RAG and a GraphRAG as a combi-package, with just a query switch? The fact is, there is no concrete, universal
·linkedin.com·
Build your hybrid-Graph for RAG & GraphRAG applications using the power of NLP | LinkedIn