GraphNews

4773 bookmarks
How can we create general-purpose graph foundation models?
How can we create general-purpose graph foundation models?
How can we create general-purpose graph foundation models? (by Dmitry Eremeev) For a long time, we believed that general-purpose graph foundation models were impossible to create. Indeed, graphs are used to represent data across many different domains, and thus graph machine learning must handle tasks on extremely diverse datasets, such as social, information, transportation, and co-purchasing networks, or models of various physical, biological, or engineering systems. Given the vast differences in structure, features, and labels among these datasets, it seemed unlikely that a single model could achieve robust cross-domain generalization and perform well on all of them. However, we noticed that tabular machine learning faces a similar challenge of working with diverse datasets containing different features and labels. And yet, this field has recently witnessed the emergence of the first successful foundation models such as TabPFNv2, which are based on the prior-data fitted networks (PFNs) paradigm. Thus, we have…
·t.me·
How can we create general-purpose graph foundation models?
Flexible-GraphRAG
Flexible-GraphRAG
Flexible GraphRAG or RAG is now flexing to the max using LlamaIndex: it supports 7 graph databases, 10 vector databases, 13 data sources, LLMs, Docling doc processing, auto-created KGs, GraphRAG, hybrid search, and AI chat (shown with the Hyland products web page data source). Apache 2.0 open source. Graph: Neo4j, ArcadeDB, FalkorDB, Kuzu, NebulaGraph (coming: Memgraph and Amazon Neptune). Vector: Qdrant, Elastic, OpenSearch, Neo4j vector, Milvus (coming: Weaviate, Chroma, Pinecone, PostgreSQL + pgvector, LanceDB). Docling document processing. Data sources (via LlamaIndex readers): working: web pages, Wikipedia, YouTube; untested: Google Drive, Microsoft OneDrive, S3, Azure Blob, GCS, Box, SharePoint; previous: filesystem, Alfresco, CMIS. LLMs: LlamaIndex LLMs (OpenAI, Ollama, Claude, Gemini, etc.). React, Vue, Angular UIs; MCP server; FastAPI server. GitHub stevereiner/flexible-graphrag: https://lnkd.in/eUEeF2cN X.com post on Flexible GraphRAG or RAG max flexing: https://lnkd.in/gHpTupAr Integrated Semantics blog: https://lnkd.in/ehpjTV7d
·linkedin.com·
Flexible-GraphRAG
Exploring Network-Knowledge Graph Duality: A Case Study in Agentic Supply Chain Risk Analysis
Exploring Network-Knowledge Graph Duality: A Case Study in Agentic Supply Chain Risk Analysis
Exploring Network-Knowledge Graph Duality: A Case Study in Agentic Supply Chain Risk Analysis ... What happens when you ask an AI about supply chain vulnerabilities and it misses the most critical dependencies? Most AI systems treat business relationships like isolated facts in a database. They might know Apple uses lithium batteries, but they miss the web of connections that create real risk. 👉 The Core Problem Standard AI retrieval treats every piece of information as a standalone point. But supply chain risk lives in the relationships between companies, products, and locations. When conflict minerals from the DRC affect smartphone production, it's not just about one supplier - it's about cascading effects through interconnected networks. Vector similarity search finds related documents but ignores the structural dependencies that matter most for risk assessment. 👉 A Different Approach New research from UC Berkeley and MSCI demonstrates how to solve this by treating supply chains as both networks and knowledge graphs simultaneously. The key insight: economic relationships like "Company A produces Product B" are both structural network links and semantic knowledge graph triples. This duality lets you use network science to find the most economically important paths. 👉 How It Works Instead of searching for similar text, the system: - Maps supply chains as networks with companies, products, and locations as nodes - Uses centrality measures to identify structurally important paths - Wraps quantitative data in descriptive language so AI can reason about what numbers actually mean - Retrieves specific relationship paths rather than generic similar content When asked about cobalt risks, it doesn't just find articles about cobalt. It traces the actual path from DRC mines through battery manufacturers to final products, revealing hidden dependencies. 
The system generates risk narratives that connect operational disruptions to financial impacts without requiring specialized training or expensive graph databases. This approach shows how understanding the structure of business relationships - not just their content - can make AI genuinely useful for complex domain problems.
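The duality described above is easy to sketch: each economic relationship is simultaneously a KG triple and a network edge, so "retrieving a path" is plain graph traversal plus verbalization. A minimal illustration in Python (entities and relation names are invented for the example, not taken from the paper):

```python
from collections import defaultdict, deque

# Each economic relationship is both a KG triple and a network edge.
# Entities and relations here are illustrative only.
triples = [
    ("DRC", "supplies", "cobalt"),
    ("cobalt", "used_in", "batteries"),
    ("batteries", "used_in", "smartphones"),
    ("Apple", "produces", "smartphones"),
]

# Network view: adjacency over entities, traversable in both directions.
adj = defaultdict(list)
for head, rel, tail in triples:
    adj[head].append((rel, tail))
    adj[tail].append((f"inverse_{rel}", head))

def trace_path(source, target):
    """BFS over the entity network; returns the triple path linking source to target."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for rel, nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None

# KG view: verbalize the retrieved path so an LLM can reason over it.
path = trace_path("DRC", "smartphones")
narrative = "; ".join(f"{h} --{r}--> {t}" for h, r, t in path)
print(narrative)
```

The point of the verbalization step is the paper's "wrap quantitative data in descriptive language" idea: the model receives the structural path as text it can reason over, not just similar documents.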
Exploring Network-Knowledge Graph Duality: A Case Study in Agentic Supply Chain Risk Analysis
·linkedin.com·
Exploring Network-Knowledge Graph Duality: A Case Study in Agentic Supply Chain Risk Analysis
Another awesome Graph RAG paper from February: Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG
Another awesome Graph RAG paper from February: Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG
Another awesome Graph RAG paper from February: Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG by Aditi Singh, Ph.D., Abul E., Saket Kumar and Tala Talaei Khoei, Ph.D. Section 3 is a complete breakdown of the entire RAG and agentic stack from vector RAG to agentic RAG, which I highly recommend for an introduction... there are lots of figures and plain language, so it is useful even if you're not that technical. The paper then provides an Agentic RAG taxonomy and gives a thorough, high-level overview of different forms of Agentic RAG. It outlines the concepts relating to Agentic RAG, which it refers to as having: - Multiple, autonomous agents - Dynamic decision-making - Iterative refinement and workflow optimization - Adaptability to real-time changes - Scalability for multi-domain tasks - High accuracy What does that mean? Traditional RAG systems, with their static workflows and limited adaptability, often struggle to handle dynamic, multistep reasoning and complex real-world tasks. These limitations have spurred the integration of agentic intelligence, resulting in Agentic RAG. By incorporating autonomous agents capable of dynamic decision-making, iterative reasoning, and adaptive retrieval strategies, Agentic RAG builds on the modularity of earlier paradigms while overcoming their inherent constraints. This evolution enables more complex, multi-domain tasks to be addressed with enhanced precision and contextual understanding, positioning Agentic RAG as a cornerstone for next-generation AI applications. In particular, Agentic RAG systems reduce latency through optimized workflows and refine outputs iteratively, tackling the very challenges that have historically hindered traditional RAG's scalability and effectiveness. That sounds cool... now how does it work?
Agentic RAG is a combination of tool-calling agents iteratively accessing different kinds of data stores and APIs, in collaboration with other agents that may split tasks in parallel, check one another's work, or perform different parts of a chain of prompts. Single-agent Agentic RAG, the simplest form, can supplement RAG retrieval by routing queries to different forms of search (filesystem, relational DB, graph DB, semantic search, web search), as well as to external APIs such as DuckDuckGo, Serp, Splunk, Wikipedia, Salesforce, Outlook, Dropbox, Google Workspace, Slack, or Discord, or to multiple, iterative tool usage via your favorite MCP servers. The possibilities for crafting your own agentic RAG workflows are enormous! The paper is on arXiv here: https://lnkd.in/gZ8ypXYf
Another awesome Graph RAG paper from February: Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG
·linkedin.com·
Another awesome Graph RAG paper from February: Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG
Protocols move bits. Semantics move value.
Protocols move bits. Semantics move value.
Protocols move bits. Semantics move value. The reports on agents are starting to sound samey: go vertical not horizontal; redesign workflows end-to-end; clean your data; stop doing pilots that automate inefficiencies; price for outcomes when the agent does the work. All true. All necessary. All needing repetition ad nauseam. So it’s refreshing to see a switch-up in Bain’s Technology Report 2025: the real leverage now sits with semantics. A shared layer of meaning. Bain notes that protocols are maturing. MCP and A2A let agents pass tool calls, tokens, and results between layers. Useful plumbing. But there’s still no shared vocabulary that says what an invoice, policy, or work order is, how it moves through states, and how it maps to APIs, tables, and approvals. Without that, cross-vendor reliability will keep stalling. They go further: whoever lands a pragmatic semantic layer first gets winner-takes-most network effects. Define the dictionary and you steer the value flow. This isn’t just a feature. It’s a control point. Bain frames the stack clearly: - Systems of record (data, rules, compliance) - Agent operating systems (orchestration, planning, memory) - Outcome interfaces (natural language requests, user-facing actions) The bottleneck is semantics. And there’s a pricing twist. If agents do the work, semantics define what “done” means. That unlocks outcome-based pricing, charging for tasks completed or value delivered, not log-ons. Bain is blunt: the open, any-to-any agent utopia will smash against vendor incentives, messy data, IP, and security. Translation: walled gardens lead first. Start where governance is clear and data is good enough, then use that traction to shape the semantics others will later adopt. This is where I’m seeing convergence. In practice, a knowledge graph can provide that shared meaning, identity, relationships, and policy. 
One workable pattern: the agent plans with an LLM, resolves entities and checks rules in the graph, then acts through typed APIs, writing back as events the graph can audit. That’s the missing vocabulary and the enforcement that protocols alone can’t cover. Tony Seale puts it well: “Neural and symbolic systems are not rivals; they are complements… a knowledge graph provides the symbolic backbone… to ground AI in shared semantics and enforce consistency.” To me, this is optimistic, because it moves the conversation from “make the model smarter” to “make the system understandable.” Agents don’t need perfection if they are predictable, composable, and auditable. Semantics deliver that. It’s also how smaller players compete with hyperscalers: you don’t need to win the model race to win the meaning race. With semantics, agents become infrastructure. The next few years won’t be won by who builds the biggest model. It’ll be won by who defines the smallest shared meaning.
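That plan, resolve, act, audit loop can be sketched in a few lines. All names below are illustrative; a real system would back the "graph" with an actual knowledge graph store and act through real typed APIs:

```python
from dataclasses import dataclass, field

@dataclass
class Graph:
    """Toy stand-in for a knowledge graph: identity, policy, and audit trail."""
    entities: dict                      # canonical id -> attributes
    policies: dict                      # action -> predicate over an entity
    events: list = field(default_factory=list)

    def resolve(self, name):
        # Entity resolution: map a surface name to a canonical node id.
        for eid, attrs in self.entities.items():
            if name.lower() in (eid.lower(), attrs.get("label", "").lower()):
                return eid
        return None

    def allowed(self, action, eid):
        # Rule check: the graph, not the LLM, decides what is permitted.
        check = self.policies.get(action)
        return bool(check and check(self.entities[eid]))

    def audit(self, event):
        self.events.append(event)       # write-back the graph can later audit

def act(graph, plan):
    """Execute a plan of (action, entity_name) pairs, e.g. from an LLM planner."""
    results = []
    for action, name in plan:
        eid = graph.resolve(name)
        if eid and graph.allowed(action, eid):
            graph.audit((action, eid, "executed"))  # a typed API call would go here
            results.append((action, eid, "ok"))
        else:
            results.append((action, name, "blocked"))
    return results

graph = Graph(
    entities={"INV-001": {"label": "March invoice", "state": "approved"}},
    policies={"pay": lambda e: e["state"] == "approved"},
)
print(act(graph, [("pay", "March invoice"), ("pay", "unknown invoice")]))
```

The design point is that the LLM only proposes; identity, policy, and the audit trail live in the graph, which is exactly the "shared semantics plus enforcement" the post argues protocols alone can't supply.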
Protocols move bits. Semantics move value.
·linkedin.com·
Protocols move bits. Semantics move value.
The single most undervalued fact of graph theory: Every board is a graph in disguise
The single most undervalued fact of graph theory: Every board is a graph in disguise
The single most undervalued fact of graph theory: Every board is a graph in disguise. Here’s the 3-step mapping that turns messy “rooms” into clean, countable components. 0/ You’re given a map of walls and floor tiles. By eye, you see there are three rooms. But how do you get a computer to see them too? 1/ Start by modeling the board as a graph. Treat every floor tile as a node. Define valid moves as edges. In our case, moves are the four directions: • Up • Down • Left • Right Walls simply remove edges because you can’t step through them. 2/ Number the floor tiles arbitrarily so you can reference nodes. Now you’ve converted the board to an undirected graph. Why do this? Because two common board questions become standard graph problems. 1. “Shortest path between two tiles?” becomes “shortest path between two nodes.” 2. “How many rooms?” becomes “how many connected components?” That second one is our target. A “room” is just a maximal set of tiles reachable from each other without crossing walls. In graph terms, that’s a connected component. So the count of rooms equals the count of connected components. Here’s the practical recipe I use: • Nodes = all floor tiles. • Edges = pairs of floor tiles one step apart (U/D/L/R). • Walls = missing edges. • Rooms = connected components. • Answer = number of connected components. 3/ Run a DFS or BFS from every unvisited node and mark all reachable tiles. Each fresh start increments the room counter by one. That’s it. No heuristics, no guesswork, just graph structure doing the heavy lifting. Once you see boards as graphs, these problems stop feeling ad hoc. They become repeatable templates you can code in minutes. If this helped, repost so more people learn the “rooms = components” pattern.
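The recipe compiles directly to code. A minimal sketch, assuming the board arrives as a list of strings with '#' for walls and '.' for floor tiles:

```python
from collections import deque

def count_rooms(board):
    """Count connected components ('rooms') of floor tiles.

    board: list of strings, '#' = wall, '.' = floor tile.
    Nodes are floor tiles; edges connect tiles one step apart (U/D/L/R).
    """
    rows, cols = len(board), len(board[0])
    seen = set()
    rooms = 0
    for r in range(rows):
        for c in range(cols):
            if board[r][c] == '.' and (r, c) not in seen:
                rooms += 1                      # fresh BFS start = new room
                queue = deque([(r, c)])
                seen.add((r, c))
                while queue:
                    cr, cc = queue.popleft()
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and board[nr][nc] == '.'
                                and (nr, nc) not in seen):
                            seen.add((nr, nc))
                            queue.append((nr, nc))
    return rooms

board = [
    "..#..",
    "..#..",
    "#####",
    "..#..",
]
print(count_rooms(board))  # 4 separate rooms
```

Swap the `deque` for an explicit stack and `popleft` for `pop` and the same code becomes iterative DFS; either traversal gives the same component count.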
The single most undervalued fact of graph theory: Every board is a graph in disguise
·linkedin.com·
The single most undervalued fact of graph theory: Every board is a graph in disguise
G-REASONER: foundation models for unified reasoning over graph-structured knowledge
G-REASONER: foundation models for unified reasoning over graph-structured knowledge
G-REASONER: foundation models for unified reasoning over graph-structured knowledge ... Why Graph-Enhanced AI Still Struggles with Complex Reasoning (And How G-REASONER Fixes It) Ever wondered why current AI systems still fail at connecting the dots across complex knowledge domains? The answer lies in how they handle structured information. 👉 The Core Problem Large language models excel at reasoning but hit a wall when dealing with interconnected knowledge. Traditional retrieval systems treat information as isolated fragments, missing the rich relationships that make knowledge truly useful. Current graph-enhanced approaches face three critical limitations: - They're designed for specific graph types only - They rely on expensive agent-based reasoning - They can't generalize across different domains 👉 What G-REASONER Brings to the Table Researchers from Monash University and collaborating institutions introduce G-REASONER, a unified framework that bridges graph and language foundation models. The key innovation is QuadGraph - a standardized four-layer structure that unifies diverse knowledge sources: - Community layer for global context - Document layer for textual information - Knowledge graph layer for factual relationships - Attribute layer for common properties 👉 How It Works in Practice G-REASONER employs a 34M-parameter graph foundation model that jointly processes graph topology and text semantics. Unlike previous approaches, it uses knowledge distillation to learn from large-scale datasets with weak supervision. The system implements distributed message-passing across multiple GPUs, enabling efficient scaling. Mixed-precision training reduces memory usage by 17.5% while doubling training throughput. Testing across six benchmarks shows consistent improvements over state-of-the-art baselines, with particularly strong performance on multi-hop reasoning tasks requiring complex knowledge connections. 
The framework demonstrates remarkable generalization - the same model works effectively across medical records, legal documents, and encyclopedia data without domain-specific fine-tuning. This represents a significant step toward AI systems that can reason over structured knowledge as fluidly as humans navigate interconnected concepts.
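The unified-graph idea can be pictured with a toy example: nodes from all four QuadGraph layers share one graph and exchange messages. This is not the paper's 34M-parameter model, just one round of mean-aggregation message passing in plain Python, with made-up node names and features:

```python
# Toy unified graph: each node carries a layer tag and a feature vector.
nodes = {
    "community:1": [1.0, 0.0],
    "doc:intro":   [0.0, 1.0],
    "kg:entity":   [1.0, 1.0],
    "attr:metal":  [0.5, 0.5],
}
edges = [  # undirected links that cross layer boundaries
    ("community:1", "doc:intro"),
    ("doc:intro", "kg:entity"),
    ("kg:entity", "attr:metal"),
]

def message_pass(nodes, edges):
    """One round: each node averages its own vector with its neighbors'."""
    neighbors = {n: [] for n in nodes}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    out = {}
    for n, vec in nodes.items():
        msgs = [nodes[m] for m in neighbors[n]] + [vec]
        out[n] = [sum(xs) / len(msgs) for xs in zip(*msgs)]
    return out

updated = message_pass(nodes, edges)
```

A real graph foundation model learns the aggregation instead of averaging, but the structural point survives: community, document, KG, and attribute nodes influence each other through one shared topology.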
G-REASONER: foundation models for unified reasoning over graph-structured knowledge
·linkedin.com·
G-REASONER: foundation models for unified reasoning over graph-structured knowledge
Announcing the formation of a Data Façades W3C Community Group
Announcing the formation of a Data Façades W3C Community Group
I am excited to announce the formation of a Data Façades W3C Community Group. Façade-X, initially introduced at SEMANTICS 2021 and successfully implemented by the SPARQL Anything project, provides a simple yet powerful, homogeneous view over diverse and heterogeneous data sources (e.g., CSV, JSON, XML, and many others). With the recent v1.0.0 release of SPARQL Anything, the time was right to work on the long-term stability and widespread adoption of this approach by developing an open, vendor-neutral technology. The Façade-X concept was born to allow SPARQL users to query data in any structured format in plain SPARQL. Therefore, the choice of a W3C community group to lead efforts on specifications is just natural. Specifications will enhance its reliability, foster innovation, and encourage various vendors and projects, including graph database developers, to provide their own compatible implementations. The primary goals of the Data Façades Community Group are to: Define the core specification of the Façade-X method. Define Standard Mappings: Formalize the required mappings and profiles for connecting Façade-X to common data formats. Define the specification of the query dialect: Provide a reference for the SPARQL dialect, configuration conventions (like SERVICE IRIs), and the functions/magic properties used. Establish Governance: Create a monitored, robust process for adding support for new data formats. Foster Collaboration: Build connections with relevant W3C groups (e.g., RDF & SPARQL, Data Shapes) and encourage involvement from developers, businesses, and adopters. Join us! With Luigi Asprino Ivo Velitchkov Justin Dowdy Paul Mulholland Andy Seaborne Ryan Shaw ... CG: https://lnkd.in/eSxuqsvn GitHub: https://lnkd.in/dkHGT8N3 SPARQL Anything #RDF #SPARQL #W3C #FX
announce the formation of a Data Façades W3C Community Group
·linkedin.com·
Announcing the formation of a Data Façades W3C Community Group
Introducing the GitLab Knowledge Graph
Introducing the GitLab Knowledge Graph
Today, I'd like to introduce the GitLab Knowledge Graph. This release includes a code indexing engine, written in Rust, that turns your codebase into a live, embeddable graph database for LLM RAG. You can install it with a simple one-line script, parse local repositories directly in your editor, and connect via MCP to query your workspace and over 50,000 files in under 100 milliseconds. We also saw GKG agents scoring up to 10% higher on the SWE-Bench-lite benchmarks, with just a few tools and a small prompt added to opencode (an open-source coding agent). On average, we observed a 7% accuracy gain across our eval runs, and GKG agents were able to solve new tasks compared to the baseline agents. You can read more from the team's research here: https://lnkd.in/egiXXsaE. This release is just the first step: we aim for this local version to serve as the backbone of a Knowledge Graph service that enables you to query the entire GitLab Software Development Life Cycle, from an Issue down to a single line of code. I am incredibly proud of the work the team has done. Thank you, Michael U., Jean-Gabriel Doyon, Bohdan Parkhomchuk, Dmitry Gruzd, Omar Qunsul, and Jonathan Shobrook. You can watch Bill Staples and me present this and more in the GitLab 18.4 release here: https://lnkd.in/epvjrhqB Try today at: https://lnkd.in/eAypneFA Roadmap: https://lnkd.in/eXNYQkEn Watch more below for a complete, in-depth tutorial on what we've built:
introduce the GitLab Knowledge Graph
·linkedin.com·
Introducing the GitLab Knowledge Graph
GraphSearch: An Agentic Deep‑Search Workflow for Graph Retrieval‑Augmented Generation
GraphSearch: An Agentic Deep‑Search Workflow for Graph Retrieval‑Augmented Generation
GraphSearch: An Agentic Deep‑Search Workflow for Graph Retrieval‑Augmented Generation ... Why Current AI Search Falls Short When You Need Real Answers What happens when you ask an AI system a complex question that requires connecting multiple pieces of information? Most current approaches retrieve some relevant documents, generate an answer, and call it done. But this single-pass strategy often misses critical evidence. 👉 The Problem with Shallow Retrieval Traditional retrieval-augmented generation (RAG) systems work like a student who only skims the first few search results before writing an essay. They grab what seems relevant on the surface but miss deeper connections that would lead to better answers. When researchers tested these systems on complex multi-hop questions, they found a consistent pattern: the AI would confidently provide answers based on incomplete evidence, leading to logical gaps and missing key facts. 👉 A New Approach: Deep Searching with Dual Channels Researchers from IDEA Research and Hong Kong University of Science and Technology developed GraphSearch, which works more like a thorough investigator than a quick searcher. The system breaks down complex questions into smaller, manageable pieces, then searches through both text documents and structured knowledge graphs. Think of it as having two different research assistants: one excellent at finding descriptive information in documents, another skilled at tracing relationships between entities. 
👉 How It Actually Works Instead of one search-and-answer cycle, GraphSearch uses six coordinated modules: - Query decomposition splits complex questions into atomic sub-questions - Context refinement filters out noise from retrieved information - Query grounding fills in missing details from previous searches - Logic drafting organizes evidence into coherent reasoning chains - Evidence verification checks if the reasoning holds up - Query expansion generates new searches to fill identified gaps The system continues this process until it has sufficient evidence to provide a well-grounded answer. 👉 Real Performance Gains Testing across six different question-answering benchmarks showed consistent improvements. On the MuSiQue dataset, for example, answer accuracy jumped from 35% to 51% when GraphSearch was integrated with existing graph-based systems. The approach works particularly well under constrained conditions - when you have limited computational resources for retrieval, the iterative searching strategy maintains performance better than single-pass methods. This research points toward more reliable AI systems that can handle the kind of complex reasoning we actually need in practice. Paper: "GraphSearch: An Agentic Deep Searching Workflow for Graph Retrieval-Augmented Generation" by Yang et al.
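The six-module loop reads almost like pseudocode already. A schematic sketch of the control flow, where the retrievers and the LLM are placeholder callables rather than the authors' actual components:

```python
def deep_search(question, text_retriever, graph_retriever, llm, max_rounds=3):
    """Schematic of an iterative, dual-channel deep-search loop.

    text_retriever / graph_retriever / llm are placeholder callables; the
    prompts below are illustrative, not GraphSearch's actual prompts.
    """
    # Query decomposition: split the question into atomic sub-questions.
    sub_questions = llm(f"Decompose into atomic sub-questions: {question}").splitlines()
    evidence = []
    for _ in range(max_rounds):
        # Dual channels: documents for descriptions, graph for relations.
        for sq in sub_questions:
            evidence += text_retriever(sq) + graph_retriever(sq)
        # Context refinement: filter out noise from retrieved information.
        evidence = [e for e in evidence if llm(f"Relevant to {question}? {e}") == "yes"]
        # Logic drafting: organize evidence into a reasoning chain.
        draft = llm(f"Draft reasoning chain from: {evidence}")
        # Evidence verification: stop once the chain holds up.
        if llm(f"Is the chain fully supported? {draft}") == "yes":
            break
        # Query expansion: new searches targeting the identified gaps.
        sub_questions = llm(f"What evidence is still missing for: {draft}").splitlines()
    return llm(f"Answer {question} using: {evidence}")
```

The contrast with single-pass RAG is the loop: retrieval keeps running until verification passes or the round budget is spent, rather than answering from whatever the first search returned.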
GraphSearch: An Agentic Deep‑Search Workflow for Graph Retrieval‑Augmented Generation
·linkedin.com·
GraphSearch: An Agentic Deep‑Search Workflow for Graph Retrieval‑Augmented Generation
Product management makes or breaks AI. The role of graph
Product management makes or breaks AI. The role of graph
Product management makes or breaks AI. That includes 𝐝𝐚𝐭𝐚. The role of 𝐝𝐚𝐭𝐚 𝐏𝐌 is shifting. For years, the focus was BI - dashboards, reports, warehouses. But AI demands more: context, retrieval, real time, and integration into the flow of work. Data PMs who understand AI requirements will define the next generation of enterprise success. Here’s how my team thinks about BI-ready vs AI-ready data 👇
Product management makes or breaks AI.
·linkedin.com·
Product management makes or breaks AI. The role of graph
You Don't Need a Graph DB
You Don't Need a Graph DB
Many teams adopt graph databases believing they need specialized tools for relationship data, adding unnecessary complexity to their stack. This session reveals that for most use cases, the performance benefits don't justify the overhead. You'll learn to evaluate whether you truly need graph DB capabilities and how to implement graph patterns using simpler alternatives.
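One concrete example of such a simpler alternative (our illustration, not taken from the session): a plain edge table plus a recursive CTE already answers reachability questions in any relational database, shown here with Python's built-in SQLite:

```python
import sqlite3

# Graph pattern without a graph DB: an edge table + a recursive CTE.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE edges (src TEXT, dst TEXT);
    INSERT INTO edges VALUES ('a','b'), ('b','c'), ('c','d'), ('x','y');
""")

# All nodes reachable from 'a' by following edges transitively.
reachable = conn.execute("""
    WITH RECURSIVE reach(node) AS (
        SELECT 'a'
        UNION
        SELECT e.dst FROM edges e JOIN reach r ON e.src = r.node
    )
    SELECT node FROM reach ORDER BY node;
""").fetchall()
print([n for (n,) in reachable])
```

For deep traversals over billions of edges a native graph store can still win, which is exactly the evaluation the session proposes making explicit before adding one to the stack.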
·maven.com·
You Don't Need a Graph DB
City2graph is a Python library that turns urban datasets such as streets, buildings, transit networks, and mobility flows into graph structures ready for Graph Neural Networks.
City2graph is a Python library that turns urban datasets such as streets, buildings, transit networks, and mobility flows into graph structures ready for Graph Neural Networks.
city2graph is a Python library for converting geospatial datasets into graphs for GNNs, with an integrated interface to GeoPandas, NetworkX, and PyTorch Geometric across ...
·city2graph.net·
City2graph is a Python library that turns urban datasets such as streets, buildings, transit networks, and mobility flows into graph structures ready for Graph Neural Networks.