Neuro-symbolic AI: The key to truly intelligent systems
Unlock true AI potential with neuro-symbolic AI. Learn how combining LLMs with knowledge graphs addresses unreliable and inaccurate outputs for enterprise success.
What is an ontology? Well, it depends on who's talking.
Ontology talk has sprung up a lot in data circles over the last couple of years. You may have read in the news that the Department of Defense adopted an ontology, Juan will tell you enterprise AI needs an ontology, Jessica will tell you how to build an ontology pipeline, and Palantir will gladly sell you one. Few people actually spell out what they mean when talking about "ontology," and unsurprisingly they're not all talking about the same thing.
Ontology is a borrowed word for information scientists, who took it from philosophy, where ontology is an account of the fundamental things around us. Some of you no doubt read Plato's Republic with the allegory of the cave, which introduces the theory of forms. Aristotle had two ontologies, one in the Categories and another in the Metaphysics. (My friend Jessica would call the former a taxonomy.) When I talk about ontology as a philosopher, I'm interested in the fundamental nature of reality. Is it made up of medium-sized dry goods or subatomic wave functions?
Information scientists arenโt interested in the fundamental nature of reality, but they are interested in how we organize our data about reality. So when they talk about ontologies they actually mean one of several different technologies.
When Juan talks about ontologies, I know in my head he means knowledge graphs. This introduces a regress, because knowledge graphs can be implemented in a number of different ways, though the Resource Description Framework (RDF) is probably the most popular. If you've ever built a website, RDF will look familiar because it's simply URIs that represent subject-predicate-object triples (Juan - works at - ServiceNow). Because we're technologists, there are a number of different ways to represent, store, and query a knowledge graph. (See XKCD 927.)
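For readers who have never touched RDF, the triple idea fits in a few lines of plain Python. This is only an illustrative sketch: real RDF identifies subjects, predicates, and objects with full URIs, and the `ex:` names and `match` helper below are invented for the example.

```python
# Triples as plain tuples; real RDF uses full URIs instead of "ex:" stand-ins.
triples = [
    ("ex:Juan", "ex:worksAt", "ex:ServiceNow"),
    ("ex:ServiceNow", "ex:headquarteredIn", "ex:SantaClara"),
]

def match(triples, s=None, p=None, o=None):
    """Return every triple matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# "What do we know about Juan?"
print(match(triples, s="ex:Juan"))
```

Pattern matching with wildcards like this is, in miniature, what SPARQL basic graph patterns do over an RDF store.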
Knowledge graphs are cool and all, but they're not the only approach to ontologies. When the DoD went shopping for an ontology, they started with an upper formal ontology, specifically the Basic Formal Ontology (BFO). I think BFO is cool, if only because it's highly influenced by philosophy through the work of the philosopher Barry Smith (Buffalo). Formal ontologies can organize the concepts, relations, and axioms across large domains like healthcare, but they're best suited to slowly evolving industries. While BFO might be the most popular upper ontology, it's certainly not the only one on the market.
My own view is that in data we're all engaged in ontological work in a broad sense. If you're building a data model, you need a good account of "what there is" for the business domain. At what grain do we count inventory? Bottles, cases, pallets? The more specific we get in doing ontological work, the harder the deliverables become: knowledge graphs are harder to build than data models, and formal ontologies are harder to build than knowledge graphs. Most organizations need good data models more than they need formal ontologies.
A database tells you what is connected. A knowledge graph tells you why.
• SQL hides semantics in schema logic. Foreign keys don't explain relationships, they just enforce them.
• Knowledge graphs make relationships explicit. Edges have meaning, context, synonyms, hierarchies.
• Traversal in SQL = JOIN gymnastics. Traversal in a KG = natural multi-hop reasoning.
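To make the multi-hop point concrete, here is a minimal sketch in plain Python (the edge names are hypothetical): over a graph structure, each extra hop is just one more iteration of the same loop, rather than another JOIN.

```python
# Illustrative multi-hop traversal over an in-memory edge list.
from collections import defaultdict

edges = [
    ("alice", "works_on", "project_x"),
    ("project_x", "owned_by", "team_data"),
    ("team_data", "part_of", "org_platform"),
]

adj = defaultdict(list)
for s, p, o in edges:
    adj[s].append((p, o))

def hops(start, n):
    """Collect every node reachable from `start` in exactly n hops."""
    frontier = {start}
    for _ in range(n):
        frontier = {o for node in frontier for _, o in adj[node]}
    return frontier

print(hops("alice", 3))  # {'org_platform'}
```

The equivalent three-hop question in SQL would typically take three self-joins (or a recursive CTE) over an edge table.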
Benchmarks show LLMs answered enterprise questions correctly 16.7% of the time over SQL … vs. 54.2% over the same data in a KG. Same data, different representation.
Sure, you can bolt ontologies, synonyms, and metadata onto SQL. But at that point, youโve basically reinvented a knowledge graph.
So the real question is:
Do you want storage, or do you want reasoning?
#KnowledgeGraphs #AI #LLM #Agents #DataEngineering
Tried Automating Knowledge Graphs. Ended Up Rewriting Everything I Knew
This post captures the desire for a shortcut to #KnowledgeGraphs, the inability of #LLMs to reliably generate #StructuredKnowledge, and the lengths folks will go to realize even basic #semantic queries (the author manually encoded 1,000 #RDF triples, but didn't use #OWL). https://lnkd.in/eJE_27gS

#Ontologists by nature are generally rigorous, if not a tad pedantic, as they seek to structure #domain knowledge. 25 years of #SemanticWeb and this is still primarily a manual, tedious, time-consuming and error-prone process. In part, #DeepLearning is a reaction to #structured, #labelled, manually #curated #data (#SymbolicAI). When #GenAI exploded on the scene a couple of years ago, #Ontologists were quick to note the limitations of LLMs.

Now some #Ontologists are having a "Road to Damascus" moment: they are aspirationally looking to language models as an interface for #Ontologies to lower the barrier to ontology creation and use, which are then used for #GraphRAG. But this is a circular firing squad, given the LLM weaknesses they have decried. This isn't a solution, it's a Hail Mary. They are lowering the standards on quality and setting up the even more tedious task of identifying non-obvious, low-level LLM errors in an #Ontology (the same issue developers have run into with LLM CodeGen: good for prototypes, not for production code).

The answer is not to resign ourselves and subordinate ontologies to LLMs, but to take the high road, using #UpperOntologies to ease and speed the design, use and maintenance of #KGs. An upper ontology is a graph of high-level concepts, types and policies independent of a specific #domain implementation. It provides an abstraction layer with re-usable primitives, building blocks and services that streamline and automate domain modeling tasks (i.e., a #DSL for DSLs). Importantly, an upper ontology drives well-formed and consistent objects and relationships and provides for governance (e.g., security/identity, change management).
This is what we do at EnterpriseWeb. #Deterministic, reliable, trusted ontologies should be the center of #BusinessArchitecture, not a side-car to an LLM.
Every knowledge system has to wrestle with a deceptively simple question: what do we assert, and what do we derive? That line between assertion and derivation is where Object-Role Modeling (ORM) and the Resource Description Framework (RDF) with the Web Ontology Language (OWL) go in radically different directions.
Enterprise Adoption of GraphRAG: The CRUD Challenge
GraphRAG and other retrieval-augmented generation (RAG) workflows are currently attracting a lot of attention. Their prototypes are impressive, with data ingestion, embedding generation, knowledge graph creation, and answer generation all functioning smoothly.
However, without proper CRUD (Create, Read, Update, Delete) support, these systems are limited to academic experimentation rather than becoming enterprise-ready solutions.
Update: knowledge is constantly evolving. Regulations change, medical guidelines are updated, and product catalogues are revised. If a system cannot reliably update its information, it will produce outdated answers and quickly lose credibility.
Delete: Incorrect or obsolete information must be deleted. In regulated industries such as healthcare, finance and law, retaining deleted data can lead to compliance issues. Without a deletion mechanism, incorrect or obsolete information can persist in the system long after it should have been removed.
This is an issue that many GraphRAG pilots face. Although the proof of concept looks promising, limitations become evident when someone asks, "What happens when the source of truth changes?"
While reading and creation are straightforward, updates and deletions determine whether a system remains a prototype or becomes a reliable enterprise tool. Most implementations stop at 'reading', and while retrieval and answer generation work, real-world enterprise systems never stand still.
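As a sketch of what first-class Update and Delete could look like in a GraphRAG fact store — the class and field names below are invented for illustration, not any product's API:

```python
# Each fact carries its source so it can be superseded or retracted
# when the source of truth changes.
from dataclasses import dataclass

@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    source: str

class FactStore:
    def __init__(self):
        self.facts = []

    def create(self, fact):
        self.facts.append(fact)

    def read(self, subject, predicate):
        return [f for f in self.facts
                if f.subject == subject and f.predicate == predicate]

    def update(self, subject, predicate, new_obj, source):
        # Replace, don't append: otherwise stale values keep answering queries.
        self.delete(subject, predicate)
        self.create(Fact(subject, predicate, new_obj, source))

    def delete(self, subject, predicate):
        self.facts = [f for f in self.facts
                      if not (f.subject == subject and f.predicate == predicate)]

store = FactStore()
store.create(Fact("drug_x", "max_daily_dose", "40mg", "guideline_2023"))
store.update("drug_x", "max_daily_dose", "20mg", "guideline_2025")
print(store.read("drug_x", "max_daily_dose"))  # only the 2025 value survives
```

A production system would also need versioning and cascade rules for derived edges and embeddings, which is exactly where most pilots stop.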
In order for GraphRAG, and RAG in general, to transition from research labs to widespread enterprise adoption, support for CRUD must be a fundamental aspect of the design process.
#GraphRAG #RAG #KnowledgeGraph #EnterpriseAI #CRUD #EnterpriseAdoption #TrustworthyAI #DataManagement
A Knowledge Graph for "No Other Choice", the dark comedy thriller by Park Chan-wook that's leading the Venice buzz.
The film that's about to win Venice? Let's map it in a Knowledge Graph
Just asked Agent WordLift to retrieve the Google Knowledge Graph details for "No Other Choice", the dark comedy thriller by Park Chan-wook that's leading the Venice buzz.
Here are the results:
• Google KG entity ID: /g/11w2hjvh3j
• Wikidata: Q129906152
• Cast, director, source material, and more, in under a second.
• Fallback-safe, deeply enriched with attributes from Wikidata.
This is Enhanced Entity Research via MCP (Model Context Protocol).
✔ Pulls data from Google's Enterprise KG
✔ Enriches with Wikidata (gender, occupation, relationships, etc.)
✔ Builds 3-5x richer profiles, instantly ready for clustering, schema, content.
Why does it matter?
This is what AI-ready structured data looks like: live, real-time, and grounded. From the Venice Film Festival to your Knowledge Graph in milliseconds.
Don't let a cascade of incompetence pollute your AI workflows.
Combine structured knowledge with agentic AI to drive precision, context, and trust across multi-step tasks.
Enjoy the artifact (on Claude): https://lnkd.in/dYYMby26
Full workflow: https://lnkd.in/dNHJ_DpP
Google Cloud releases new Agentspace Knowledge Graph, built on Spanner Graph
It's great to see the launch of Google Cloud's new Agentspace Knowledge Graph, built on Spanner Graph.
Agentspace Knowledge Graph (https://lnkd.in/gYM6xZQS) allows an AI agent to understand the real-world context of your organization: the web of relationships between people, projects, and products. This is the difference between finding a document and understanding who wrote it, what team they're on, and what project it's for.
Because this context is a network, the problem is uniquely suited for a graph model. Spanner Graph (https://lnkd.in/gkwbGFbS) provides a natural way to model this reality, allowing an AI agent to instantly traverse complex connections to find not just data, but genuine insight.
This is how we move from AI that finds information to AI that understands it. The ability to reason over the "why" behind the data is a true game-changer.
#GoogleCloud #GenAI #Agentspace #SpannerGraph #KnowledgeGraph
Why is ontology engineering such a mess?
There's a simple reason: proprietary data models, proprietary inference engines, and proprietary query engines.
Some ontology traditions have always been open standards: Prolog/Datalog, RDF, conceptual graphs all spring to mind.
However, startups in ontology often take government money, apparently under conditions that inspire them to close their standards. One notable closed-standard ontology vendor is Palantir. If you look into the world of graph databases, you will discover many more vendors operating on closed standards, as well as some vendors who've implemented less popular and in my view less user-friendly open standards.
My advice to ontology consultants and to their clients is to prioritize vendors that implement open standards. Given that this list includes heavyweights like Oracle and AWS, it isn't hard to remain within one's comfort zone while embracing open standards. Prolog and RDF are likely the most popular and widely known standards for automated inference, knowledge representation, etc. There are more potential engineers and computer scientists and modelers who've trained on these standards than any vendor you may wish to name with a closed standard, there are prebuilt ontologies and query rewriting approaches and inference engine profiles, there are constraint programming approaches for both, etc.
Oracle and AWS have chosen to go with open standards rather than inventing some new graph data model and yet another query processor to handle the same inference and business rule workloads we've been handling with various technologies since the 1950s. Learn from their example, and please quit wasting all of our time on Earth by reinventing the semantic network.
Debunking Urban Myths about RDF and Explaining How Ontologies Help GraphRAG | LinkedIn
I recently came across some misconceptions about why the LPG graph model is more effective than RDF for GraphRAG, and I wrote this article to debunk them. At the end, I also elaborate on two principal advantages of RDF when it comes to provision of context and grounding to LLMs (i) schema languages
Guy van den Broeck (UCLA), "Theoretical Aspects of Trustworthy AI": https://simons.berkeley.edu/talks/guy-van-den-broeck-ucla-2025-04-29
Integrating Knowledge Graphs into Autonomous Vehicle Technologies: A Survey of Current State and Future Directions
Autonomous vehicles (AVs) represent a transformative innovation in transportation, promising enhanced safety, efficiency, and sustainability. Despite these promises, achieving robustness, reliability, and adherence to ethical standards in AV systems remains challenging due to the complexity of integrating diverse technologies. This survey reviews literature from 2017 to 2023, analyzing over 90 papers to explore the integration of knowledge graphs (KGs) into AV technologies. Our findings indicate that KGs significantly enhance AV systems by providing structured semantic understanding, improving real-time decision-making, and ensuring compliance with regulatory standards. The paper identifies that while KGs contribute to better environmental perception and contextual reasoning, challenges remain in their seamless integration with existing systems and in maintaining processing speed. We also address the ethical dimensions of AV decision-making, advocating for frameworks that prioritize safety and transparency. This review underscores the potential of KGs to address critical challenges in AV technologies, offering a hopeful and optimistic outlook for the development of robust, reliable, and socially responsible autonomous transportation solutions.
Blue Morpho: A new solution for building AI apps on top of knowledge bases
Blue Morpho helps you build AI agents that understand your business context, using ontologies and knowledge graphs.
Knowledge Graphs work great with LLMs. The problem is that building KGs from unstructured data is hard.
Blue Morpho promises a system that turns PDFs and text files into knowledge graphs. KGs are then used to augment LLMs with the right context to answer queries, make decisions, produce reports, and automate workflows.
How it works:
1. Upload documents (pdf or txt).
2. Define your ontology: concepts, properties, and relationships. (Coming soon: ontology generation via AI assistant.)
3. Extract a knowledge graph from documents based on that ontology. Entities are automatically deduplicated across chunks and documents, so every mention of "Walmart," for example, resolves to the same node.
4. Build agents on top. Connect external ones via MCP, or use Blue Morpho: Q&A ("text-to-cypher") and Dashboard Generation agents.
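Blue Morpho's actual deduplication logic isn't public; the sketch below only illustrates the underlying idea that different surface forms of "Walmart" should resolve to one canonical node, using a crude normalization key. Real systems add embedding similarity and AI review for the hard cases.

```python
# Hypothetical normalization-based entity deduplication.
import re

def canonical_key(mention):
    """Crude normalization: lowercase, strip punctuation and legal suffixes."""
    key = re.sub(r"[^a-z0-9 ]", "", mention.lower())
    key = re.sub(r"\b(inc|corp|co|ltd)\b", "", key)
    return " ".join(key.split())

nodes = {}
for mention in ["Walmart", "Walmart Inc.", "WALMART", "walmart inc"]:
    nodes.setdefault(canonical_key(mention), []).append(mention)

print(len(nodes))  # 1 -- all four mentions share a node
```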
Blue Morpho differentiation:
- Strong focus on reliability. Guardrails in place to make sure LLMs follow instructions and the ontology.
- Entity deduplication, with AI reviewing edge cases.
- Easy to iterate on ontologies: they are versioned, extraction runs are versioned as well with all their parameters, and changes only trigger necessary recomputes.
- Vector embeddings are only used in very special circumstances, coupled with other techniques.
Link in comments. Jérémy Thomas
#KnowledgeGraph #AI #Agents #MCP #NewRelease #Ontology #LLMs #GenAI #Application
--
Connected Data London 2025 is coming! 20-21 November, Leonardo Royal Hotel London Tower Bridge
Join us for all things #KnowledgeGraph #Graph #analytics #datascience #AI #graphDB #SemTech #Ontology
Ticket sales are open. Benefit from early bird prices with discounts up to 30%. https://lnkd.in/diXHEXNE
Sponsorship opportunities are available. Maximize your exposure with early onboarding. Contact us at info@connected-data.london for more.
Box's Invisible Moat: The permission graph driving 28% operating margins
Everyone's racing to build AI agents.
Few are thinking about data permissions.
Box spent two decades building a boring moat: a detailed map of who can touch what document, when, why, and with what proof.
This invisible metadata layer is now their key moat against irrelevance.
Q2 FY26:
• Revenue: $294M (+9% YoY)
• Gross margin: 81.4%
• Operating margin: 28.6%
• Net retention: 103%
• Enterprise Advanced: 10% of revenue (up from 5%)
Slow-growth, high-margin business at a crossroads.
The Permission Graph
Every document in Box has a shadow: its permission metadata. Who created it, modified it, can access it. What compliance rules govern it. Which systems can call it.
When an AI agent requests a contract, it needs more than the PDF. It needs proof it's allowed to see it, verification it's the right version, an audit trail.
Twenty years of accumulated governance that can't be easily replicated.
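Box's implementation is proprietary, but the pattern the post describes — check the permission graph and write an audit record before any content reaches an agent — can be sketched in a few lines. Every name here is hypothetical.

```python
# Hypothetical permission check + audit trail for an agent's document request.
import datetime

permissions = {"contract_42": {"alice", "legal_team"}}
audit_log = []

def fetch_for_agent(doc_id, principal):
    allowed = principal in permissions.get(doc_id, set())
    # Log every attempt, allowed or not: the trail is part of the moat.
    audit_log.append(
        (datetime.datetime.now(datetime.timezone.utc), principal, doc_id, allowed)
    )
    if not allowed:
        raise PermissionError(f"{principal} may not read {doc_id}")
    return f"<contents of {doc_id}>"

print(fetch_for_agent("contract_42", "alice"))
```

The hard part is not this check but keeping the `permissions` map correct across twenty years of sharing, versioning, and compliance rules.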
Why This Matters Now
The CEO Aaron Levie recently told CNBC: "If you don't maintain access controls well, AI agents will find the wrong information - leading to wrong answers or security incidents."
Every enterprise faces the same AI crisis: scattered data with inconsistent permissions, no unified governance, one breach risking progress.
The permission graph solves this.
The Context Control Problem
Box recently launched Enterprise Advanced: AI agents, workflow automation, document generation. They are adding contextual layers because they see a future where AI agents call their API while users never see Box.
Microsoft owns the experience.
Box becomes plumbing.
This push is their attempt to stay visible. But it's still Product Rails, not Operating Rails. They're adding features to documents, not deepening their permission moat.
The Bull vs Bear Case
Bull: Enterprises will pay for bulletproof governance even if transformation happens elsewhere. The permission graph remains valuable.
Bear: Microsoft acquires or partners with Varonis + Cloudfuze to recreate the graph. The moat may not be deep enough.
Every SaaS Company's Dilemma
Box isn't alone. Every legacy SaaS faces the same question: how do you avoid becoming invisible infrastructure?
They're all trying the same failing playbook. Add AI features, claim "AI-native," hope the moat holds.
Box's advantage: the permission graph is genuinely hard to replicate.
Box's disadvantage: they still think like a document storage company.
Market's View
Box has 81% gross margins on commodity storage because of the permission graph. Yet the market values them at 24x forward P/E, without pricing in a premium for the graph.
The other factor is that Box is led by Aaron Levie. He's a founder who's spent two decades obsessing over one problem: enterprise content governance.
That obsession matters now more than ever.
The question isn't whether the permission graph has value. It's whether Box can deepen the moat before others make it irrelevant.
(Full version sent to subscribers)
Third edition of the Knowledge Graphs course at Ghent University.
In February 2026 I will start teaching the third edition of the Knowledge Graphs course at Ghent University. This is an elective course in which I teach everything I know about creating interoperable data ecosystems.
As in the previous editions, we are opening this elective course to professionals as well, via a micro-credential. Feel like going back to school? We poured our heart and soul into this one.
https://lnkd.in/euUiiEwJ
Co-teachers include Ruben Verborgh, Ben De Meester, Ruben Taelman and yourself (there's a peer teaching assignment).
Can a Relational Database Be a Knowledge Graph?
Not all knowledge graphs are equal. A semantic KG (RDF/OWL/Stardog) isn't the same as a property graph (Neo4j), and both differ from enforcing graph-like structures in a relational DB (CockroachDB/Postgres). Each has strengths and trade-offs:
Semantic KGs excel at reasoning and inference over ontologies.
Property graphs shine when exploring relationships with intuitive query patterns.
Relational approaches enforce graph-like models via schema, FKs, indexes, recursive CTEs, with the added benefits of scale, distributed TXs, and decades of maturity.
Graph Traversals in SQL
Recursive CTEs let SQL "walk the graph." Start with a base case (movie + actors), then repeatedly join back to discover multi-hop paths (actors → movies → actors → movies). This simulates "friends-of-friends" traversals in a few lines of SQL.
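That recursive-CTE pattern can be run as-is against SQLite, which supports `WITH RECURSIVE`. The table and names below are invented for the example.

```python
# Co-star traversal via a recursive CTE: actors -> movies -> actors -> ...
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE appears(actor TEXT, movie TEXT);
INSERT INTO appears VALUES
  ('Ana','M1'), ('Ben','M1'),
  ('Ben','M2'), ('Cleo','M2'),
  ('Cleo','M3'), ('Dev','M3');
""")

rows = con.execute("""
WITH RECURSIVE costar(actor, depth) AS (
  SELECT 'Ana', 0                       -- base case
  UNION
  SELECT a2.actor, c.depth + 1          -- join back through shared movies
  FROM costar c
  JOIN appears a1 ON a1.actor = c.actor
  JOIN appears a2 ON a2.movie = a1.movie AND a2.actor != c.actor
  WHERE c.depth < 3
)
SELECT DISTINCT actor FROM costar WHERE actor != 'Ana'
""").fetchall()

print(sorted(r[0] for r in rows))  # ['Ben', 'Cleo', 'Dev']
```

Dev is three hops from Ana, reachable only because the CTE keeps recursing; a fixed number of JOINs would have to be rewritten for each depth.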
RAG, GraphRAG, and LLMs
RAG and GraphRAG give LLMs grounding in structured data, reducing hallucinations and injecting context. Whether via RDF triples, LPG edges, or SQL joins, the principle is the same: real relationships fuel better answers.
The 3-Hop Argument
Some vendors claim SQL breaks down after 3 hops. In reality, recursive CTEs traverse arbitrary depth. SQL may not be as compact as Cypher or GQL, but it's expressive and efficient; the "3-hop wall" is outdated FUD.
Loading Data at Scale
One graph DB is notorious for slow, resource-heavy CSV loads. Distributed RDBMS like CockroachDB can bulk ingest 100s of GB to TBs efficiently.
No Stale Data
Too often, data must move from TX systems into a graph before use; by then, it's stale. For AI-driven apps, that lag means hallucinations, missed insights, and poor UX.
Why This Matters for AI
As AI apps go multi-regional and global, they demand low latency + strong consistency. Centralized graph DBs hit lag, hotspots, scaling pain. Distributed SQL delivers expressive queries and global consistency, exactly what AI workloads need.
You don't need to pick "graph" or "relational" as religion. Choose the right model for scale, consistency, and AI grounding. Sometimes RDF. Sometimes LPG. And sometimes, graph-enforced in SQL.
#KnowledgeGraph #ArtificialIntelligence #GenerativeAI #DistributedSQL #CockroachDB
Build a knowledge graph from structured & unstructured data [Code Tutorial]
Looking into building knowledge graphs? Check out this code tutorial on how we built a knowledge graph of the latest 'La Liga' standings! Google Coll...
Webinar: Semantic Graphs in Action - Bridging LPG and RDF Frameworks - Enterprise Knowledge
As organizations increasingly prioritize linked data capabilities to connect information across the enterprise, selecting the right graph framework to leverage has become more important than ever. In this webinar, graph technology experts from Enterprise Knowledge Elliot Risch, James Egan, David Hughes, and Sara Nash shared the best ways to manage and apply a selection of these frameworks to meet enterprise needs.
A new notebook exploring Semantic Entity Resolution & Extraction using DSPy and Google's new LangExtract library.
Just released a new notebook exploring Semantic Entity Resolution & Extraction using DSPy (Community) and Google's new LangExtract library.
Inspired by Russell Jurney's excellent work on semantic entity resolution, this demo follows his approach of combining:
• embeddings,
• kNN blocking,
• and LLM matching with DSPy (Community).
On top of that, I added a general extraction layer to test-drive LangExtract, a Gemini-powered, open-source Python library for reliable structured information extraction. The goal? Detect and merge mentions of the same real-world entities across text.
Itโs an end-to-end flow tackling one of the most persistent data challenges.
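The notebook pairs embeddings and kNN blocking with an LLM matcher; as a self-contained stand-in, the sketch below uses character-trigram count vectors for the "embedding" and cosine similarity for the blocking step. The corpus and the top-k choice are illustrative, and in the real pipeline the surviving candidates would then go to the LLM for the final match decision.

```python
# kNN blocking: cheaply narrow the candidate set before expensive matching.
import math
from collections import Counter

def trigram_vector(text):
    t = f"  {text.lower()} "          # pad so short strings still get trigrams
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

def block_candidates(query, corpus, k=2):
    """Keep only the k most similar records for the downstream matcher."""
    qv = trigram_vector(query)
    scored = sorted(corpus, key=lambda c: cosine(qv, trigram_vector(c)),
                    reverse=True)
    return scored[:k]

corpus = ["Acme Corporation", "ACME Corp.", "Globex Inc", "Initech LLC"]
print(block_candidates("acme corp", corpus))
```

Blocking turns an O(n²) all-pairs comparison into a handful of candidate pairs per record, which is what makes LLM-based matching affordable at scale.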
Check it out, experiment with your own data, enjoy the summer, and let me know your thoughts!
cc Paco Nathan, you might like this
https://wor.ai/8kQ2qa
Stop manually building your company's brain.
Having reviewed the excellent DeepLearning.AI lecture on Agentic Knowledge Graph Construction by Andreas Kollegger, and while writing a book on agentic graph systems with Sam Julien, it has become clear to me that agentic systems represent a shift in how we build and maintain knowledge graphs (KGs).
Most organizations are sitting on a goldmine of data spread across CSVs, documents, and databases.
The dream is to connect it all into a unified Knowledge Graph, an intelligent brain that understands your entire business.
The reality? It's a brutal, expensive, and unscalable manual process.
But a new approach is changing everything.
Here's the new playbook for building intelligent systems:
Deploy an AI Agent Workforce
Instead of rigid scripts, you use a cognitive assembly line of specialized AI agents. A Proposer agent designs the data model, a Critic refines it, and an Extractor pulls the facts.
This modular approach is proven to reduce errors and improve the accuracy and coherence of the final graph.
Treat AI as a Designer, Not Just a Doer
The agents act as data architects. In discovery mode, they analyze unstructured data (like customer reviews) and propose a new logical structure from scratch.
In an enterprise with an existing data model, they switch to alignment mode, mapping new information to the established structure.
Use a 3-Part Graph Architecture
This technique is key to managing data quality and uncertainty. You create three interconnected graphs:
The Domain Graph: Your single source of truth, built from trusted, structured data.
The Lexical Graph: The raw, original text from your documents, preserving the evidence.
The Subject Graph: An AI-generated bridge that connects them. It holds extracted insights that are validated before being linked to your trusted data.
Jaro-Winkler is a string comparison algorithm that measures the similarity or edit distance between two strings. It can be used here for entity resolution, the process of identifying and linking entities from the unstructured text (Subject Graph) to the official entities in the structured database (Domain Graph).
For example, the algorithm compares a product name extracted from a customer review (e.g., "the gothenburg table") with the official product names in the database. If the Jaro-Winkler similarity score is above a certain threshold, the system automatically creates a CORRESPONDS_TO relationship, effectively linking the customer's comment to the correct product in the supply chain graph.
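Jaro-Winkler is simple enough to implement directly. The version below is a standard textbook implementation with the conventional defaults (0.1 prefix weight, 4-character prefix cap), not code from the lecture, and the threshold and product names are illustrative.

```python
# Standard Jaro-Winkler string similarity, stdlib only.

def jaro(s1, s2):
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if not len1 or not len2:
        return 0.0
    # Characters count as matching if equal and within this distance.
    match_dist = max(len1, len2) // 2 - 1
    s1_matched = [False] * len1
    s2_matched = [False] * len2
    matches = 0
    for i, c in enumerate(s1):
        lo, hi = max(0, i - match_dist), min(i + match_dist + 1, len2)
        for j in range(lo, hi):
            if not s2_matched[j] and s2[j] == c:
                s1_matched[i] = s2_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count transpositions among the matched characters.
    transpositions, k = 0, 0
    for i in range(len1):
        if s1_matched[i]:
            while not s2_matched[k]:
                k += 1
            if s1[i] != s2[k]:
                transpositions += 1
            k += 1
    t = transpositions / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3

def jaro_winkler(s1, s2, p=0.1, max_prefix=4):
    """Boost the Jaro score for strings sharing a common prefix."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)

# Entity resolution as described above: link a review mention to the
# catalog entry when the similarity clears a (here arbitrary) threshold.
mention, official = "the gothenburg table", "Gothenburg Table"
score = jaro_winkler(mention.lower(), official.lower())
if score > 0.7:
    print(f"CORRESPONDS_TO ({score:.3f})")
```

On the classic example pair "martha"/"marhta" this returns 0.9611, matching the published value for the algorithm.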
Augment Humans, Don't Replace Them
The workflow is Propose, then Approve. AI does the heavy lifting, but a human expert makes the final call.
This process is made reliable by tools like Pydantic and Outlines, which enforce a rigid contract on the AI's output, ensuring every piece of data is perfectly structured and consistent.
And once discovered and validated, a schema can be enforced.
by J Bittner. John Sowa once observed: "In logic, the existential quantifier ∃ is a notation for asserting that something exists. But logic itself has no vocabulary for describing the things that exist."