Standing on Giants' Shoulders: What Happens When Formal Ontology Meets Modern Verification? 🚀 | LinkedIn
Building on Decades of Foundational Research The formal ontology community has given us incredible foundations - Barry Smith's BFO framework, Alan Ruttenberg's CLIF axiomatizations, and Microsoft Research's Z3 theorem prover. What happens when we combine these mature technologies with modern graph d
On request, this is the complete slide deck I used in my course at the C-FORS summer school on Foundational Ontologies (see https://lnkd.in/e9Af5JZF) at the University of Oslo, Norway.
If you want to know more, here are some papers related to the talk:
On the ontology itself:
a) for a gentle introduction to UFO: https://lnkd.in/egS5FsQ
b) to understand the UFO history and ecosystem (including OntoUML): https://lnkd.in/emCaX5pF
c) a more formal paper on the axiomatization of UFO but also with examples (in OntoUML): https://lnkd.in/e_bUuTMa
d) focusing on UFO's theory of Types and Taxonomic Structures: https://lnkd.in/eGPXHeh
e) focusing on its Theory of Relations (including relationship reification): https://lnkd.in/eTFFRBy8 and https://lnkd.in/eMNmi7-B
f) focusing on Qualities and Modes (aspect reification): https://lnkd.in/eNXbrKrW and https://lnkd.in/eQtNC9GH
g) focusing on events and processes: https://lnkd.in/e3Z8UrCD, https://lnkd.in/ePZEaJh9, https://lnkd.in/eYnirFv6, https://lnkd.in/ev-cb7_e, https://lnkd.in/e_nTwBc7
On the tools:
a) Model Auto-repair and Constraint Learning: https://lnkd.in/esuYSU9i
b) Model Validation and Anti-Pattern Detection: https://lnkd.in/e2SxvVzS
c) Ontological Patterns and Pattern Grammars: https://lnkd.in/exMFMgpT and https://lnkd.in/eCeRtMNz
d) Multi-Level Modeling: https://lnkd.in/eVavvURk and https://lnkd.in/e8t3sMdU
e) Complexity Management: https://lnkd.in/eq3xWp-U
f) FAIR catalog of models and Pattern Mining: https://lnkd.in/eaN5d3QR and https://lnkd.in/ecjhfp8e
g) Anti-Patterns on Wikidata: https://lnkd.in/eap37SSU
h) Model Transformation/implementation: https://lnkd.in/eh93u5Hg, https://lnkd.in/e9bU_9NC, https://lnkd.in/eQtNC9GH, https://lnkd.in/esGS8ZTb
#ontology #UFO #ontologies #foundationalontology #toplevelontology #TLO
Semantics, Cybersecurity, and Services (SCS)/University of Twente
A Pragmatic Introduction to Knowledge Graphs | LinkedIn
Audience: This blog is written for engineering leaders, architects, and decision-makers who want to understand what a knowledge graph is, when it makes sense, and when it doesn’t. It is not a deep technical dive, but a strategic overview.
Graph RAG open source stack to generate and visualize knowledge graphs
A serious knowledge graph effort is much more than a bit of GitHub, but customers and adventurous minds keep asking me if there is an easy-to-use (read: POC, click-and-go) graph RAG open source stack they can use to generate knowledge graphs.
So, here is my list of projects I keep an eye on. Mind you, nothing is simple once you venture into graphs, despite all the claims and marketing. Things like graph machine learning, graph layout, and distributed graph analytics are more than a bit of pip install.
The best solutions are hidden inside multinationals, custom made. Equity firms and investors sometimes ask me to evaluate innovations. It's amazing what talented people develop that never shows up in the news, or on GitHub.
TrustGraph - The Knowledge Platform for AI https://trustgraph.ai/ The only one with a distributed architecture and made for enterprise KG.
itext2kg - https://lnkd.in/e-eQbwV5 Clean and plain. Wrapped prompts done right.
Fast GraphRAG - https://lnkd.in/e7jZ9GZH Popular and with some basic visualization.
ZEP - https://lnkd.in/epxtKtCU Geared towards agentic memory.
Triplex - https://lnkd.in/eGV8FR56 LLM to extract triples.
GraphRAG Local with UI - https://lnkd.in/ePGeqqQE Another starting point for small KG efforts. Or to convince your investors.
GraphRAG visualizer - https://lnkd.in/ePuMmfkR Makes pretty pictures but not for drill-downs.
Neo4j's GraphRAG - https://lnkd.in/ex_A52RU A python package with a focus on getting data into Neo4j.
OpenSPG - https://lnkd.in/er4qUFJv Takes a different, more academic approach.
Microsoft GraphRAG - https://lnkd.in/e_a-mPum A classic but I don't think anyone is using this beyond experimentation.
yWorks - https://www.yworks.com If you are serious about interactive graph layout.
Ogma - https://lnkd.in/evwnJCBK If you are serious about graph data viz.
Orbifold Consulting - https://lnkd.in/e-Dqg4Zx If you are serious about your KG journey.
#GraphRAG #GraphViz #GraphMachineLearning #KnowledgeGraphs
Beyond Gruber: Rethinking Ontologies in the Enterprise Landscape | LinkedIn
1. Introduction: A Tension in Knowledge Representation The classic definition by Thomas Gruber—"an ontology is a formal, explicit specification of a shared conceptualization"—has long served as a foundational reference in knowledge representation.
LLMs generate possibilities; knowledge graphs remember what works
LLMs generate possibilities; knowledge graphs remember what works. Together, they forge the recursive memory and creative engine that enables AI systems to truly evolve themselves.
Combining neural components (like large language models) with symbolic verification creates a powerful framework for self-evolution that overcomes limitations of either approach used independently.
AlphaEvolve demonstrates that self-evolving systems face a fundamental tension between generating novel solutions and ensuring those solutions actually work.
The paper shows how AlphaEvolve addresses this through a hybrid architecture where:
Neural components (LLMs) provide creative generation of code modifications by drawing on patterns learned from vast training data
Symbolic components (code execution) provide ground truth verification through deterministic evaluation
Without this combination, a system would either generate interesting but incorrect solutions (neural-only approach) or be limited to small, safe modifications within known patterns (symbolic-only approach).
The system can operate at multiple levels of abstraction depending on the problem: raw solution evolution, constructor function evolution, search algorithm evolution, or co-evolution of intermediate solutions and search algorithms.
This capability emanates directly from the neurosymbolic integration, where:
Neural networks excel at working with continuous, high-dimensional spaces and recognizing patterns across abstraction levels
Symbolic systems provide precise representations of discrete structures and logical relationships
This enables AlphaEvolve to modify everything from specific lines of code to entire algorithmic approaches.
While AlphaEvolve currently uses an evolutionary database, a knowledge graph structure could significantly enhance self-evolution by:
Capturing evolutionary relationships between solutions
Identifying patterns of code changes that consistently lead to improvements
Representing semantic connections between different solution approaches
Supporting transfer learning across problem domains
Automated, objective evaluation is the core foundation enabling self-evolution:
The main limitation of AlphaEvolve is that it handles problems for which it is possible to devise an automated evaluator.
This evaluation component provides the "ground truth" feedback that guides evolution, allowing the system to:
Differentiate between successful and unsuccessful modifications
Create selection pressure toward better-performing solutions
Avoid hallucinations or non-functional solutions that might emerge from neural components alone.
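The generate-verify-select loop described above can be sketched in a few lines of plain Python. This is a toy stand-in, not AlphaEvolve itself: the "neural" mutation here is random, where the real system uses an LLM, and the evaluator is a trivial objective, where the real system executes code.

```python
import random

# Toy stand-in for AlphaEvolve's hybrid loop. The "neural" part is a
# random mutation here; in the real system an LLM proposes code edits.
TARGET = [3, 1, 4, 1, 5]

def evaluate(candidate):
    # Symbolic ground truth: deterministic scoring, the analogue of
    # actually executing generated code against tests or benchmarks.
    return -sum((a - b) ** 2 for a, b in zip(candidate, TARGET))

def mutate(candidate):
    # Creative generation (stand-in for the LLM).
    child = list(candidate)
    i = random.randrange(len(child))
    child[i] += random.choice([-1, 1])
    return child

def evolve(seed, generations):
    best, best_score = seed, evaluate(seed)
    for _ in range(generations):
        cand = mutate(best)
        score = evaluate(cand)        # verification gates every change,
        if score > best_score:        # creating selection pressure
            best, best_score = cand, score
    return best, best_score

random.seed(0)
best, score = evolve([0, 0, 0, 0, 0], generations=2000)
print(best, score)  # converges to TARGET with score 0
```

The point of the sketch is the division of labor: without `evaluate`, the mutations are just interesting noise; without `mutate`, the system never leaves its starting point.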
When applied to optimize Gemini's training kernels, the system essentially improved the very LLM technology that powers it.
I added a Knowledge Graph to Cursor using MCP.
You gotta see this working!
Knowledge graphs are a game-changer for AI Agents, and this is one example of how you can take advantage of them.
How this works:
1. Cursor connects to Graphiti's MCP Server. Graphiti is a very popular open-source Knowledge Graph library for AI agents.
2. Graphiti connects to Neo4j running locally.
Now, every time I interact with Cursor, the information is synthesized and stored in the knowledge graph. In short, Cursor now "remembers" everything about our project.
Huge!
Here is the video I recorded.
To get this working on your computer, follow the instructions on this link:
https://lnkd.in/eeZ_4dkb
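For reference, hooking an MCP server into Cursor comes down to a small JSON config. The shape below is only illustrative: the command, script name, and environment variables depend on your Graphiti install, so follow the linked instructions for the real values.

```json
{
  "mcpServers": {
    "graphiti": {
      "command": "uv",
      "args": ["run", "graphiti_mcp_server.py"],
      "env": {
        "NEO4J_URI": "bolt://localhost:7687",
        "NEO4J_USER": "neo4j",
        "NEO4J_PASSWORD": "<your-password>",
        "OPENAI_API_KEY": "<your-key>"
      }
    }
  }
}
```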
Something super cool about using Graphiti's MCP server:
You can use one model to develop the requirements and a completely different model to implement the code. This is a huge plus because you could use the stronger model at each stage.
Also, Graphiti supports custom entities, which you can use when running the MCP server.
You can use these custom entities to structure and recall domain-specific information, which can improve the accuracy of your results tenfold.
Here is an example of what these look like:
https://lnkd.in/efv7kTaH
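To give a flavor of what a custom entity type looks like: Graphiti's real entity types are Pydantic models with described fields, but the sketch below uses stdlib dataclasses so it runs anywhere. The class and field names are invented for illustration; see the link above for the actual format.

```python
from dataclasses import dataclass

# Illustrative only: Graphiti's real custom entity types are Pydantic
# models; stdlib dataclasses are used here so the sketch is
# self-contained. Class and field names are invented.

@dataclass
class Requirement:
    """A product requirement captured during a design conversation."""
    name: str
    description: str

@dataclass
class Preference:
    """A user preference the agent should recall later."""
    category: str   # e.g. "coding-style", "library-choice"
    statement: str

# Registering types like these tells the extractor what structure to
# look for, instead of dumping everything into generic nodes.
ENTITY_TYPES = {"Requirement": Requirement, "Preference": Preference}

pref = Preference(category="library-choice",
                  statement="Prefer FastAPI over Flask for new services")
print(pref.category)
```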
By the way, knowledge graphs for agents are a big thing.
A few ridiculous and eye-opening benchmarks comparing an AI Agent using knowledge graphs with state-of-the-art methods:
• 94.8% accuracy versus 93.4% in the Deep Memory Retrieval (DMR) benchmark.
• 71.2% accuracy versus 60.2% on conversations simulating real-world enterprise use cases.
• 2.58s of latency versus 28.9s.
• 38.4% improvement in temporal reasoning.
You'll find these benchmarks in this paper: https://fnf.dev/3CLQjBK
Efficient Graph Storage for Entity Resolution Using Clique-Based Compression | Towards Data Science
Entity resolution systems face challenges with dense, interconnected graphs, and clique-based graph compression offers an efficient solution by reducing storage overhead and improving system performance during data deletion and reprocessing.
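The trick itself fits in a few lines: a resolved entity cluster of n records is fully connected, so its n*(n-1)/2 internal edges can be replaced by n edges to one virtual "clique node" that stands for the entity. A toy sketch of the idea, not the article's implementation:

```python
# Toy sketch of clique-based compression (not the article's code).
def compress_clique(edges, clique, virtual_id):
    members = set(clique)
    # Drop edges that live entirely inside the clique...
    kept = [e for e in edges if not (e[0] in members and e[1] in members)]
    # ...and record membership with a single edge per node instead.
    kept += [(node, virtual_id) for node in clique]
    return kept

clique = ["r1", "r2", "r3", "r4", "r5"]
edges = [(a, b) for i, a in enumerate(clique) for b in clique[i + 1:]]
print(len(edges))  # 10 edges before compression
compressed = compress_clique(edges, clique, "entity#1")
print(len(compressed))  # 5 edges after: n instead of n*(n-1)/2
```

Deleting or reprocessing a record then touches one membership edge rather than a whole fan of pairwise links, which is where the savings on deletion come from.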
Six years later, where are we?
Interested in your feedback on the evolution of the graph technologies landscape and on what the current landscape looks like.
https://lnkd.in/eEPkExH
Thought for the Day: What if we could encapsulate everything a person knows (their entire bubble of knowledge, what I'd call a Personal Knowledge Domain or, better, our Semantic Self) and represent it in an RDF graph? From that foundation, we could create Personal Agents that act on our behalf. Each of us would own our agent, with the ability to share or lease it for collaboration with other agents.
If we could make these agents secure, continuously updatable, and interoperable, what kind of power might we unlock for the human race?
Is this idea so far-fetched? It has solid grounding in knowledge representation, identity theory, and agent-based systems. It fits right in with current trends: AI assistants, the semantic web, Web3 identity, and digital twins. Yes, the technical and ethical hurdles are significant, but this could become the backbone of a future architecture for personalized AI and cooperative knowledge ecosystems.
Pieces of the puzzle already exist: Tim Berners-Lee’s Solid Project, digital twins for individuals, Personal AI platforms like personal.ai, Retrieval-Augmented Language Model agents (ReALM), and Web3 identity efforts such as SpruceID, architectures such as MCP and inter-agent protocols such as A2A. We see movement in human-centric knowledge graphs like FOAF and SIOC, learning analytics, personal learning environments, and LLM-graph hybrids.
What we still need is a unified architecture that:
* Employs RDF or similar for semantic richness
* Ensures user ownership and true portability
* Enables secure agent-to-agent collaboration
* Supports continuous updates and trust mechanisms
* Integrates with LLMs for natural, contextual reasoning
These are certainly not novel notions, for example:
* MyPDDL (My Personal Digital Life) and the PDS (Personal Data Store) concept from MIT and the EU’s DECODE project.
* The Human-Centric AI Group at Stanford and the Augmented Social Cognition group at PARC have also published research around lifelong personal agents and social memory systems.
However, one wonders if anyone is working on combining all of the ingredients into a fully baked cake - after which we can enjoy dessert while our personal agents do our bidding.
Ora's Rule of Knowledge Graph Implementation | LinkedIn
The other day something reminded me of Greenspun's Tenth Rule: "Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp." https://en.
We talk about knowledge management and systems for knowledge (like knowledge graphs) a lot these days. Especially with the rising interest in #semantics, #metadata, #taxonomies and #ontologies, thanks to AI.
But what makes for knowledge that is operational and actionable?
Less often discussed is knowledge infrastructure.
Fundamental to knowledge management and knowledge repositories, as derived from the field of library and information science, is a service-oriented approach.
Knowledge infrastructure is focused on creating systems that deliver information and knowledge that is accurate and satisfies the requirements of:
⚪️ Creators: those who generate knowledge (researchers, experts, content authors, data producers)
⚪️ Products: the formal outputs of knowledge (e.g., documents, datasets, models, applications, platforms, chatbots/AI assistants)
⚪️ Distributors: systems and platforms that make knowledge available (repositories, databases, APIs)
⚪️ Disseminators: communicators and interpreters (educators, marketers, dashboards, wikis)
⚪️ Users: individuals or systems that apply the knowledge (decision-makers, AI agents, learners, stakeholders)
Let’s put this into perspective. Without supporting knowledge infrastructures, knowledge becomes a one-off, relegated to silos or single use instances.
We see this with products. When we manage knowledge only as a product, we fail to cast a wider net: we judge success by metrics localized to the product rather than by signals distributed across all inputs and outputs.
If knowledge is not managed as infrastructure, we create anti-patterns for the business and AI systems. A recognizable symptom of these anti-patterns is silos.
I’ll be publishing an article soon about knowledge infrastructure, and what it takes to build and manage a knowledge infrastructure program.
#ai #ia #knowledgeinfrastructure
For reference, an excerpt from Richard E. Rubin's MLS textbook, Foundations of Information and Library Science, is in the comments 👇👇👇
The new AI-powered Analytics stack is here, says Gartner's Afraz Jaffri! A key element of that stack is an ontology-powered Semantic Layer
The new AI-powered Analytics stack is here, says Gartner's Afraz Jaffri! A key element of that stack is an ontology-powered Semantic Layer that serves as the brain for AI agents to act on knowledge of your internal data and deliver timely, accurate, and hallucination-free insights!
#semanticlayer #knowledgegraphs #genai #decisionintelligence
Relational Graph Transformers: A New Frontier in AI for Relational Data - Kumo
Relational Graph Transformers represent the next evolution in Relational Deep Learning, allowing AI systems to seamlessly navigate and learn from data spread across multiple tables. By treating relational databases as the rich, interconnected graphs they inherently are, these models eliminate the need for extensive feature engineering and complex data pipelines that have traditionally slowed AI adoption.
In this post, we'll explore how Relational Graph Transformers work, why they're uniquely suited for enterprise data challenges, and how they're already revolutionizing applications from customer analytics and recommendation systems to fraud detection and demand forecasting.
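In miniature, "treating a relational database as a graph" means rows become typed nodes and foreign keys become typed edges. A stdlib sketch with invented table and column names:

```python
# Rows become typed nodes, foreign keys become typed edges. Table and
# column names are invented for illustration.
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Lin"}]
orders = [
    {"id": 10, "customer_id": 1, "total": 30.0},
    {"id": 11, "customer_id": 1, "total": 12.5},
    {"id": 12, "customer_id": 2, "total": 99.0},
]

# Node ids are (table, primary_key) pairs, so node types are preserved.
nodes = {("customer", c["id"]): c for c in customers}
nodes.update({("order", o["id"]): o for o in orders})

# One edge type per foreign-key relationship.
edges = [(("order", o["id"]), "placed_by", ("customer", o["customer_id"]))
         for o in orders]

# A relational graph transformer message-passes over these typed nodes
# and edges instead of consuming hand-engineered aggregate features.
print(len(nodes), len(edges))  # 5 nodes, 3 edges
```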
Do you want to fine-tune an LLM model for triplet extraction?
These findings from a recently published paper (first comment) could save you much time.
✅ Does the choice of code vs. natural language prompts significantly impact performance? When fine-tuning these small, open-weight LLMs, the choice between code and natural language prompts has a limited impact on performance.
✅ Does training fine-tuned models to include chain-of-thought (rationale) sections in their outputs improve KG construction (KGC) performance? It is ineffective at best and highly detrimental at worst for fine-tuned models. This performance decrease is observed regardless of the number of in-context learning examples provided. Attention analysis suggests this might be due to the model's attention being dispersed on redundant information when rationale is used. Without rationale lists occupying prompt space, the model's attention can focus directly on the ICL examples while extracting relations.
✅ How do the fine-tuned smaller, open-weight LLMs perform compared to the CodeKGC baseline, which uses larger, closed-source models (GPT-3.5)? The selected lightweight LLMs significantly outperform the much larger CodeKGC baseline after fine-tuning. The best fine-tuned models improve upon the CodeKGC baseline by as much as 15–20 absolute F1 points across the dataset.
✅ Does model size matter for KGC performance when fine-tuning with a small amount of training data? Yes, but not in a straightforward way. The 70B-parameter versions yielded worse results than the 1B, 3B, and 8B models when undergoing the same small amount of training. This implies that for KGC with limited fine-tuning, smaller models can perform better than much larger ones.
✅ For instruction-tuned models without fine-tuning, does prompt language or rationale help? For models without fine-tuning, using code prompts generally yields the best results for both code LLMs and the Mistral natural language model. In addition, using rationale generally seems to help these models, with most of the best results obtained when including rationale lists in the prompt.
✅ What do the errors made by the models suggest about the difficulty of the KGC task? They point to difficulty in predicting relations, entities, and their order, especially when dealing with specialized terminology or domain-specific knowledge, which remains a challenge even after fine-tuning. Some errors include adding superfluous adjectives or mistaking entity instances for class names.
✅ What is the impact of the number of in-context learning (ICL) examples during fine-tuning? The greatest performance benefit is obtained when moving from 0 to 3 ICL examples. However, additional ICL examples beyond 3 do not lead to any significant performance delta and can even lead to worse results. This further indicates that the fine-tuning process itself is the primary driver of performance gain, allowing the model to learn the task from the input text and target output.
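As a concrete illustration of the winning recipe above (a few in-context examples, no rationale section), a prompt builder might look like the sketch below. The example sentences and relation names are invented, not taken from the paper.

```python
# Example sentences and relation names are invented for illustration.
ICL_EXAMPLES = [
    ("Marie Curie discovered polonium.",
     [("Marie Curie", "discovered", "polonium")]),
    ("Oslo is the capital of Norway.",
     [("Oslo", "capital_of", "Norway")]),
    ("Z3 was developed at Microsoft Research.",
     [("Z3", "developed_by", "Microsoft Research")]),
]

def build_prompt(text, examples=ICL_EXAMPLES):
    # 3 ICL examples, no chain-of-thought rationale, per the findings.
    lines = ["Extract (subject, relation, object) triples from the text."]
    for sent, triples in examples[:3]:
        lines.append(f"Text: {sent}")
        lines.append("Triples: " +
                     "; ".join(f"({s}, {r}, {o})" for s, r, o in triples))
    lines.append(f"Text: {text}")
    lines.append("Triples:")
    return "\n".join(lines)

prompt = build_prompt("Alan Turing worked at Bletchley Park.")
print(prompt.count("Text:"))  # 4: three examples plus the query
```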
NodeRAG restructures knowledge into a heterograph: a rich, layered, musical graph where each node plays a different role
NodeRAG restructures knowledge into a heterograph: a rich, layered, musical graph where each node plays a different role.
It’s not just smarter retrieval. It’s structured memory for AI agents.
》 Why NodeRAG?
Most Retrieval-Augmented Generation (RAG) methods retrieve chunks of text. Good enough — until you need reasoning, precision, and multi-hop understanding.
This is how NodeRAG solves these problems:
》 🔹Step 1: Graph Decomposition
NodeRAG begins by decomposing raw text into smart building blocks:
✸ Semantic Units (S): Little event nuggets ("Hinton won the Nobel Prize.")
✸ Entities (N): Key names or concepts ("Hinton", "Nobel Prize")
✸ Relationships (R): Links between entities ("awarded to")
✩ This is like teaching your AI to recognize the actors, actions, and scenes inside any document.
》 🔹Step 2: Graph Augmentation
Decomposition alone isn't enough. NodeRAG augments the graph by identifying important hubs:
✸ Node Importance: Using K-Core and Betweenness Centrality to find critical nodes
✩ Important entities get special attention — their attributes are summarized into new nodes (A).
✸ Community Detection: Grouping related nodes into communities and summarizing them into high-level insights (H).
✩ Each community gets a "headline" overview node (O) for quick retrieval.
It's like adding context and intuition to raw facts.
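K-core, one of the two hub measures named above, is simple enough to sketch: keep deleting nodes of degree < k until everything left has degree ≥ k. A toy illustration, not NodeRAG's implementation:

```python
# K-core by iterative pruning: repeatedly delete nodes with degree < k;
# whatever survives is densely embedded and worth summarizing.
def k_core(adj, k):
    adj = {n: set(nbrs) for n, nbrs in adj.items()}
    pruned = True
    while pruned:
        pruned = False
        for n in [n for n, nbrs in adj.items() if len(nbrs) < k]:
            for m in adj.pop(n):
                if m in adj:
                    adj[m].discard(n)
            pruned = True
    return set(adj)

graph = {
    "Hinton": {"Nobel Prize", "Toronto", "backprop", "footnote"},
    "Nobel Prize": {"Hinton", "Toronto", "backprop"},
    "Toronto": {"Hinton", "Nobel Prize"},
    "backprop": {"Hinton", "Nobel Prize"},
    "footnote": {"Hinton"},   # weakly attached, gets pruned
}
core = k_core(graph, 2)
print(sorted(core))  # the dense hub survives; "footnote" is gone
```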
》 🔹 Step 3: Graph Enrichment
Knowledge without detail is brittle. So NodeRAG enriches the graph:
✸ Original Text: Full chunks are linked back into the graph (Text nodes, T)
✸ Semantic Edges: Using HNSW for fast, meaningful similarity connections
✩ Only smart nodes are embedded (not everything!) — saving huge storage space.
✩ Dual search (exact + vector) makes retrieval laser-sharp.
It’s like turning a 2D map into a 3D living world.
》 🔹 Step 4: Graph Searching
Now comes the magic.
✸ Dual Search: First find strong entry points (by name or by meaning)
✸ Shallow Personalized PageRank (PPR): Expand carefully from entry points to nearby relevant nodes.
✩ No wandering into irrelevant parts of the graph. The search is surgical.
✩ Retrieval includes fine-grained semantic units, attributes, high-level elements — everything you need, nothing you don't.
It’s like sending out agents into a city — and they return not with everything they saw, but exactly what you asked for, summarized and structured.
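Step 4 can be sketched as a few rounds of personalized PageRank over an adjacency list, with the restart mass pinned to the entry points so relevance stays local. Toy graph and parameter values, purely illustrative:

```python
# Shallow personalized PageRank: only a few propagation rounds, with
# restart mass pinned to the entry points found by dual search.
def shallow_ppr(adj, seeds, alpha=0.15, iters=3):
    seed_mass = 1.0 / len(seeds)
    rank = {n: (seed_mass if n in seeds else 0.0) for n in adj}
    for _ in range(iters):          # "shallow": stays near the seeds
        nxt = {n: (alpha * seed_mass if n in seeds else 0.0) for n in adj}
        for n, nbrs in adj.items():
            if nbrs:
                share = (1 - alpha) * rank[n] / len(nbrs)
                for m in nbrs:
                    nxt[m] += share
        rank = nxt
    return rank

adj = {
    "Hinton": ["Nobel Prize", "S1"], "Nobel Prize": ["Hinton", "S1"],
    "S1": ["Hinton", "Nobel Prize"],          # a semantic-unit node
    "Unrelated": ["Other"], "Other": ["Unrelated"],
}
scores = shallow_ppr(adj, seeds=["Hinton"])
print(max(scores, key=scores.get))   # mass stays near the seed
print(scores["Unrelated"])           # 0.0 -- never reached
```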
》 Results: NodeRAG's Performance
Compared to GraphRAG, LightRAG, NaiveRAG, and HyDE — NodeRAG wins across every major domain: Tech, Science, Writing, Recreation, and Finance.
NodeRAG isn’t just a better graph. NodeRAG is a new operating system for memory.
≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣
⫸ Want to build real-world AI agents?
Join my Hands-on AI Agent Training TODAY!
➠ Build real-world AI agents + RAG pipelines
➠ Learn 3 tools: LangGraph/LangChain | CrewAI | OpenAI Swarm
➠ Work with text, audio, video, and tabular data
👉 Enroll NOW (34% discount):
https://lnkd.in/eGuWr4CH
Announcing general availability of Amazon Bedrock Knowledge Bases GraphRAG with Amazon Neptune Analytics | Amazon Web Services
Today, Amazon Web Services (AWS) announced the general availability of Amazon Bedrock Knowledge Bases GraphRAG (GraphRAG), a capability in Amazon Bedrock Knowledge Bases that enhances Retrieval-Augmented Generation (RAG) with graph data in Amazon Neptune Analytics. In this post, we discuss the benefits of GraphRAG and how to get started with it in Amazon Bedrock Knowledge Bases.
Last week I was fortunate to attend the Knowledge Graph Conference in NYC!
Here are a few trends that span multiple presentations and conversations.
- AI and LLM Integration: A major focus [again this year] was how LLMs can be used to enrich knowledge graphs and how knowledge graphs, in turn, can improve LLM outputs. This included using LLMs for entity extraction, verification, inference, and query generation. Many presentations demonstrated how grounding LLMs in knowledge graphs leads to more accurate, contextual, and explainable AI responses.
- Semantic Layers and Enterprise Knowledge: There was a strong emphasis on building semantic layers that act as gateways to structured, connected enterprise data. These layers facilitate data integration, governance, and more intelligent AI agents. Decentralized semantic data products (DPROD) were discussed as a framework for internal enterprise data ecosystems.
- From Data to Knowledge: Many speakers highlighted that AI is just the “tip of the iceberg” and the true power lies in the data beneath. Converting raw data into structured, connected knowledge was seen as crucial. The hidden costs of ignoring semantics were also discussed, emphasizing the need for consistent data preparation, cleansing, and governance.
- Ontology Management and Change: Managing changes and governance in ontologies was a recurring theme. Strategies such as modularization, version control, and semantic testing were recommended. The concept of “SemOps” (Semantic Operations) was discussed, paralleling DevOps for software development.
- Practical Tools and Demos: The conference included numerous demos of tools and platforms for building, querying, and visualizing knowledge graphs. These ranged from embedded databases like KuzuDB and RDFox to conversational AI interfaces for KGs, such as those from Metaphacts and Stardog.
I especially enjoyed catching up with the Semantic Arts team (Mark Wallace, Dave McComb and Steve Case), talking Gist Ontology and SemOps. I also appreciated the detailed Neptune Q&A I had with Brian O'Keefe, the vision of Ora Lassila, and a chance meeting with Adrian Gschwend for the first time, where we connected on LinkML and Elmo as a means to help with bidirectional dataflows. I was so excited by these conversations that I planned to have two team members join me in June at the Data Centric Architecture Workshop Forum, https://www.dcaforum.com/
On the different roles of ontologies (& machine learning) | LinkedIn
In a previous post I was touching on how ontologies are foundational to many data activities, yet "obscure". As a consequence, the different roles of ontologies are not always known among people that make use of them, as they may focus only on some of the aspects relevant for specific use cases.
Pretty stoked that our paper on the vectorised #SPARQL execution engine in Stardog got accepted to the GRADES-NDA workshop at #SIGMOD2025. It's a cool piece of work describing how modern vectorised join algorithms, widely known in the SQL world, make graph query processing much more efficient. We're talking up to an order of magnitude difference on analytical queries and large-scale joins.
Hugely proud of my brilliant co-authors Simon Grätzer (the lead engineer on the BARQ project) and Lars Heling. It was their idea to do this work, and I couldn't be more proud that it worked out in the end.
The preprint is now on arXiv: https://lnkd.in/eqXtVMqe
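The core contrast behind vectorised execution is row-at-a-time vs. batch-at-a-time processing. A miniature hash join over column batches gives the flavor (toy data and variable names; BARQ's actual engine is far more sophisticated):

```python
# Miniature batch-at-a-time hash join over variable bindings,
# with each relation stored as parallel columns.
def hash_join(build_keys, build_vals, probe_keys, probe_vals):
    # Build phase: hash table over one input's join column.
    table = {}
    for k, v in zip(build_keys, build_vals):
        table.setdefault(k, []).append(v)
    # Probe phase: stream the other input's columns through the table.
    out_key, out_build, out_probe = [], [], []
    for k, v in zip(probe_keys, probe_vals):
        for bv in table.get(k, ()):
            out_key.append(k); out_build.append(bv); out_probe.append(v)
    return out_key, out_build, out_probe

# ?person :worksAt ?org  JOIN  ?person :name ?name, as columns
works_at = (["alice", "bob", "carol"], ["Stardog", "Stardog", "Acme"])
names = (["alice", "carol"], ["Alice", "Carol"])
person, org, name = hash_join(*works_at, *names)
print(list(zip(person, org, name)))
```

Operating on whole columns keeps the hot loop tight and cache-friendly, which is roughly where the order-of-magnitude wins on analytical joins come from.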