Why AI Hallucinates: The Shallow Semantics Problem | LinkedIn
By J Bittner. Part 1 in our 5-part series, From Hallucination to Reasoning—The Case for Ontology-Driven AI. Welcome to “Semantically Speaking”, a new series on what makes AI systems genuinely trustworthy, explainable, and future-proof. This is Part 1 in a 5-part journey, exploring why so many AI system…
‘The Relational Model Always Wins,' RelationalAI CEO Says
The tech industry has a voracious appetite for the Next Big Thing. But sometimes, it’s the older thing that ends up being the right tool for a new job.
Multi-modal graphs are everywhere in the digital world.
Yet the tools used to understand them haven't evolved as much as one would expect.
What if the same model could handle your social network analysis, molecular discovery, AND urban planning tasks?
A new paper from Tsinghua University proposes Multi-modal Graph Large Language Models (MG-LLM) - a paradigm shift in how we process complex interconnected data that combines text, images, audio, and structured relationships.
Think of it as ChatGPT for graphs, metaphorically speaking: a model with eyes, ears, and structural understanding.
Their key insight? Treating all graph tasks as generative problems.
Instead of training separate models for node classification, link prediction, or graph reasoning, MG-LLM frames everything as transforming one multi-modal graph into another.
This unified approach means the same model that predicts protein interactions could also analyze social media networks or urban traffic patterns.
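The unified framing can be sketched in a few lines. This is a hypothetical illustration of the idea, not the paper's actual API: every task, whether node classification or link prediction, consumes a multi-modal graph and emits another graph, so one interface covers all of them. All names (`Node`, `MultiModalGraph`, `transform`) are illustrative.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the MG-LLM framing: every graph task is a
# graph-to-graph transformation. The bodies below are trivial stand-ins
# for what the paper's generative model would produce.

@dataclass
class Node:
    id: str
    modality: str          # "text", "image", "audio", ...
    content: object = None
    label: str = ""

@dataclass
class MultiModalGraph:
    nodes: dict = field(default_factory=dict)   # id -> Node
    edges: set = field(default_factory=set)     # (src, relation, dst)

def transform(graph: MultiModalGraph, task: str) -> MultiModalGraph:
    """One generative interface for all tasks: the output is itself a graph."""
    out = MultiModalGraph(dict(graph.nodes), set(graph.edges))
    if task == "node_classification":
        for node in out.nodes.values():         # model would generate labels
            node.label = node.label or "predicted-class"
    elif task == "link_prediction":
        ids = sorted(out.nodes)
        if len(ids) >= 2:                       # model would generate edges
            out.edges.add((ids[0], "predicted_link", ids[-1]))
    return out
```

The point of the framing is visible in the signature alone: no task-specific heads or output formats, just graphs in and graphs out.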
What makes this particularly exciting is the vision for natural language interaction with graph data. Imagine querying complex molecular structures or editing knowledge graphs using plain English, without learning specialized query languages.
The challenges remain substantial - from handling the multi-granularity of data (pixels to full images) to managing multi-scale tasks (entire graph input, single node output).
But if successful, this could fundamentally change the level of graph-based insights across industries that have barely scratched the surface of AI adoption.
Want to keep up? Join my newsletter with 50k+ readers and be the first to learn about the latest AI research: llmwatch.com 💡
A Knowledge Graph Approach for the Standardization, Integration and Exploitation of Carbon Emissions Data
The European Union's (EU) Carbon Border Adjustment Mechanism (CBAM) aims to prevent carbon leakage and accelerate global decarbonization by aligning carbon…
Gartner 2025 AI Hype Cycle: The focus is shifting from hype to foundational innovations
Knowledge Graphs are a key part of the shift, positioned on the slope of enlightenment
By Haritha Khandabattu and Birgi Tamersoy:
AI investment remains strong, but focus is shifting from GenAI hype to foundational innovations like AI-ready data, AI agents, AI engineering and ModelOps.
This research helps leaders prioritize high-impact, emerging AI techniques while navigating regulatory complexity and operational scaling.
As Gartner notes, Generative AI capabilities are advancing at a rapid pace and the tools that will become available over the next 2-5 years will be transformative.
The rapid evolution of these technologies and techniques continues unabated, as does the corresponding hype, making this tumultuous landscape difficult to navigate.
These conditions mean GenAI continues to be a top priority for the C-suite.
Weaving in another foundational concept: Systems of Intelligence, as coined by Geoffrey Moore and referenced by David Vellante and George Gilbert:
Systems of Intelligence are the linchpin of modern enterprise architecture because [AI] agents are only as smart as the state of the business represented in the knowledge graph.
If a platform controls that graph, it becomes the default policymaker for “why is this happening, what comes next, and what should we do?”
For enterprises, there is only one feasible answer to the "who controls the graph" question: you should.
To do that, start working on your enterprise knowledge graph today, if you haven't already.
And if you are looking for the place to learn, network, and share experience and knowledge, look no further 👇
Connected Data London 2025 has been announced! 20-21 November, Leonardo Royal Hotel London Tower Bridge
Join us for all things #KnowledgeGraph #Graph #analytics #datascience #AI #graphDB #SemTech
🎟️ Ticket sales are open. Benefit from early bird prices with discounts up to 30%. 2025.connected-data.london
📋 Call for submissions is open. Check topics of interest, submission process and evaluation criteria https://lnkd.in/dhbAeYtq
📺 Sponsorship opportunities are available. Maximize your exposure with early onboarding. Contact us at info@connected-data.london for more.
HippoRAG takes cues from the brain to improve LLM retrieval
HippoRAG is a technique inspired by the interactions between the cortex and hippocampus to improve knowledge retrieval for large language models (LLMs).
Knowledge graphs as the foundation for Systems of Intelligence
In this Breaking Analysis we examine how Snowflake moves beyond walled gardens and enters a world where it faces new competitive dynamics from SaaS vendors like Salesforce, ServiceNow, Palantir and, of course, Databricks.
Beyond Walled Gardens: How Snowflake Navigates New Competitive Dynamics
AI Engineer World's Fair 2025: GraphRAG Track Spotlight
📣 AI Engineer World's Fair 2025: GraphRAG Track Spotlight! 🚀
So grateful to have hosted the GraphRAG Track at the Fair. The sessions were great, highlighting the depth and breadth of graph thinking for AI.
Shoutouts to...
- Mitesh Patel: "HybridRAG", a fusion of graph and vector retrieval designed to master complex data interpretation and specialized terminology for question answering
- Chin Keong Lam: "Wisdom Discovery at Scale", using Knowledge Augmented Generation (KAG) in a multi-agent system with n8n
- Sam Julien: "When Vectors Break Down", carefully explaining how a graph-based RAG architecture achieved a whopping 86.31% accuracy on dense enterprise knowledge
- Daniel Chalef: "Stop Using RAG as Memory", exploring temporally-aware knowledge graphs, built with the open-source Graphiti framework, that provide precise, context-rich memory for agents
- Ola Mabadeje: "Witness the Power of Multi-Agent AI & Network Knowledge Graphs", showing dramatic improvements in ticket resolution efficiency and overall execution quality in network operations
- Thomas Smoker: "Beyond Documents", casually mentioning scraping the entire internet to distill a knowledge graph for legal agents
- Mark Bain: hosting an excellent "Agentic Memory with Knowledge Graphs" lunch & learn, with expansive thoughts and demos from Vasilije Markovic, Daniel Chalef and Alexander Gilmore
Also, of course, huge congrats to Shawn swyx W and Benjamin Dunphy on an excellent conference. 🎩
#graphrag Neo4j AI Engineer
OntoAligner: A Comprehensive Modular and Robust Python Toolkit for...
Ontology Alignment (OA) is fundamental for achieving semantic interoperability across diverse knowledge systems. We present OntoAligner, a comprehensive, modular, and robust Python toolkit for...
Last week, I was happy to be able to attend the 22nd European Semantic Web Conference. I’m a regular at this conference and it’s great to see many friends and colleagues as well as meet…
Building more Expressive Knowledge Graph Nodes | LinkedIn
In a knowledge graph, more expressive nodes are clearly more useful and dramatically more valuable, provided we focus on the right nodes. This was a key lesson I learned building knowledge graphs at LinkedIn with the terrific team I assembled.
Optimizing the Interface Between Knowledge Graphs and LLMs for...
Integrating Large Language Models (LLMs) with Knowledge Graphs (KGs) results in complex systems with numerous hyperparameters that directly affect performance. While such systems are increasingly...
Unified graph architecture for Agentic AI based on Postgres and Apache AGE
Picture an AI agent that seamlessly traverses knowledge graphs while performing semantic vector searches, applies probabilistic predictions alongside deterministic rules, reasons about temporal evolution and spatial relationships, and resolves contradictions between multiple data sources—all within a single atomic transaction.
It is a PostgreSQL-based architecture that consolidates traditionally distributed data systems into a single, coherent platform.
This architecture doesn't just store different data types; it enables every conceivable form of reasoning—deductive, inductive, abductive, analogical, causal, and spatial—transforming isolated data modalities into a coherent intelligence substrate where graph algorithms, embeddings, tabular predictions, and ontological inference work in perfect harmony.
It changes how agentic systems operate by eliminating the complexity and inconsistencies inherent in multi-database architectures while enabling sophisticated multi-modal reasoning capabilities.
Conventional approaches typically distribute agent knowledge across multiple specialized systems: vector databases for semantic search, graph databases for relationship reasoning, relational databases for structured data, and separate ML platforms for predictions. This fragmentation creates synchronization nightmares, latency penalties, and operational complexity that can cripple agent performance and reliability.
Apache AGE brings native graph database capabilities to PostgreSQL, enabling complex relationship traversal and graph algorithms without requiring a separate graph database. Similarly, pgvector enables semantic search through vector embeddings, while extensions like TabICL provide zero-shot machine learning predictions directly within the database. This extensibility allows PostgreSQL to serve as a unified substrate for all data modalities that agents require.
While AGE may not match the pure performance of dedicated graph databases like Neo4j for certain specialized operations, it excels in the hybrid queries that agents typically require. An agent rarely needs just graph traversal or just vector search; it needs to combine these operations with structured queries and ML predictions in coherent reasoning chains. The ability to perform these operations within single ACID transactions eliminates entire classes of consistency bugs that plague distributed systems.
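A hybrid query of the kind described might look like the following. This is a sketch only: the graph name, tables, and columns are hypothetical, and the function merely composes the SQL rather than executing it, since running it would require a live PostgreSQL instance with the AGE and pgvector extensions installed. AGE's `cypher()` function and pgvector's `<=>` (cosine distance) operator are real features of those extensions.

```python
# Sketch of a hybrid AGE + pgvector query wrapped in one transaction.
# 'agent_graph', 'documents', and 'embedding' are hypothetical names.

def hybrid_query_sql(limit: int = 5) -> str:
    return f"""
    BEGIN;                                     -- one ACID transaction
    SELECT d.id, d.body
    FROM cypher('agent_graph', $$              -- AGE: graph traversal
        MATCH (a:Agent)-[:OBSERVED]->(doc:Document)
        RETURN doc.id
    $$) AS g(doc_id agtype)
    JOIN documents d ON d.id::text = g.doc_id::text
    ORDER BY d.embedding <=> %(query_vec)s     -- pgvector: semantic ranking
    LIMIT {limit};
    COMMIT;
    """

sql = hybrid_query_sql()
```

The structure shows the post's point: graph traversal, relational join, and vector ranking live in one statement, so there is no cross-system synchronization to get wrong.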
Foundational models eliminate traditional ML complexity. TabICL and TabSTAR enable instant predictions on new data patterns without training, deployment, or complex MLOps pipelines. This capability is particularly crucial for agentic systems that must adapt quickly to new situations and data types without human intervention or retraining cycles.
The unified architecture simplifies every aspect of system management: one backup strategy instead of multiple, unified security through PostgreSQL's mature RBAC system, consistent monitoring, and simplified debugging.
Want to Fix LLM Hallucination? Neurosymbolic Alone Won’t Cut It
The Conversation’s new piece makes a clear case for neurosymbolic AI—integrating symbolic logic with statistical learning—as the long-term fix for LLM hallucinations. It’s a timely and necessary argument:
“No matter how large a language model gets, it can’t escape its fundamental lack of grounding in rules, logic, or real-world structure. Hallucination isn’t a bug, it’s the default.”
But what’s crucial—and often glossed over—is that symbolic logic alone isn’t enough. The real leap comes from adding formal ontologies and semantic constraints that make meaning machine-computable. OWL, Shapes Constraint Language (SHACL), and frameworks like BFO, Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE), the Suggested Upper Merged Ontology (SUMO), and the Common Core Ontologies (CCO) don’t just “represent rules”—they define what exists, what can relate, and under what conditions inference is valid. That’s the difference between “decorating” a knowledge graph and engineering one that can detect, explain, and prevent hallucinations in practice.
I’d go further:
• Most enterprise LLM hallucinations are just semantic errors—mislabeling, misattribution, or class confusion that only formal ontologies can prevent.
• Neurosymbolic systems only deliver if their symbolic half is grounded in ontological reality, not just handcrafted rules or taxonomies.
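The class-confusion point can be made concrete with a toy domain/range checker. A real stack would express these constraints in OWL or SHACL and hand them to a reasoner or validator; this minimal sketch, with entirely illustrative class and relation names, only shows the principle that an ontology's relation signatures make certain assertions detectably wrong.

```python
# Toy illustration: ontological domain/range constraints catch a
# "semantic error" (class confusion) that fluent text generation cannot.

CLASS_OF = {"aspirin": "Drug", "headache": "Disease", "acme": "Company"}

# relation -> (required subject class, required object class)
SIGNATURES = {
    "treats":  ("Drug", "Disease"),
    "employs": ("Company", "Person"),
}

def violations(triples):
    """Return the triples whose subject/object classes break the signature."""
    bad = []
    for s, p, o in triples:
        dom, rng = SIGNATURES[p]
        if CLASS_OF.get(s) != dom or CLASS_OF.get(o) != rng:
            bad.append((s, p, o))
    return bad

# An LLM asserting ("acme", "treats", "headache") is not a fluency problem:
# 'acme' is a Company, and 'treats' demands a Drug as its subject.
```

That is the difference the post describes between decorating a knowledge graph and engineering one: the signature table is what lets the system explain *why* an assertion is invalid, not merely that it is unusual.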
The upshot:
We need to move beyond mere integration of symbols and neurons. We need semantic scaffolding—ontologies as infrastructure—to ensure AI isn’t just fluent, but actually right.
Curious if others are layering formal ontologies (BFO, DOLCE, SUMO) into their AI stacks yet? Or are we still hoping that more compute and prompt engineering will do the trick?
#NeuroSymbolicAI #SemanticAI #Ontology #LLMs #AIHallucination #KnowledgeGraphs #AITrust #AIReasoning
AutoSchemaKG: Building Billion-Node Knowledge Graphs Without Human Schemas
👉 Why This Matters
Traditional knowledge graphs face a paradox: they require expert-crafted schemas to organize information, creating bottlenecks for scalability and adaptability. This limits their ability to handle dynamic real-world knowledge or cross-domain applications effectively.
👉 What Changed
AutoSchemaKG eliminates manual schema design through three innovations:
1. Dynamic schema induction: LLMs automatically create conceptual hierarchies while extracting entities/events
2. Event-aware modeling: Captures temporal relationships and procedural knowledge missed by entity-only approaches
3. Multi-level conceptualization: Organizes instances into semantic categories through abstraction layers
The system processed 50M+ documents to build ATLAS - a family of KGs with:
- 900M+ nodes (entities/events/concepts)
- 5.9B+ relationships
- 95% alignment with human-created schemas (zero manual intervention)
👉 How It Works
1. Triple extraction pipeline:
- LLMs identify entity-entity, entity-event, and event-event relationships
- Processes text at document level rather than sentence level for context preservation
2. Schema induction:
- Automatically groups instances into conceptual categories
- Creates hierarchical relationships between specific facts and abstract concepts
3. Scale optimization:
- Handles web-scale corpora through GPU-accelerated batch processing
- Maintains semantic consistency across 3 distinct domains (Wikipedia, academic papers, Common Crawl)
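The schema-induction step above can be sketched as follows. This is a hedged illustration, not AutoSchemaKG's implementation: the `conceptualize()` stub stands in for the LLM call the system would make to abstract an instance into a concept, and all entity and relation names are made up for the example.

```python
from collections import defaultdict

# Sketch of dynamic schema induction: instance-level triples are lifted
# into concept-level schema edges. conceptualize() is a lookup stub
# standing in for an LLM-based abstraction step.

def conceptualize(instance: str) -> str:
    """Stub for LLM abstraction (instance -> concept)."""
    lookup = {"aspirin": "Drug", "ibuprofen": "Drug", "fda": "Agency"}
    return lookup.get(instance.lower(), "Entity")

def induce_schema(triples):
    """Group extracted triples under auto-induced concept-level edges."""
    schema = defaultdict(set)
    for s, p, o in triples:
        schema[(conceptualize(s), p, conceptualize(o))].add((s, p, o))
    return schema

schema = induce_schema([
    ("aspirin", "approved_by", "FDA"),
    ("ibuprofen", "approved_by", "FDA"),
])
# Both facts collapse into one schema edge: (Drug, approved_by, Agency)
```

The hierarchy emerges from the data rather than from a hand-written schema, which is exactly the bottleneck the paper claims to remove.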
👉 Proven Impact
- Boosts multi-hop QA accuracy by 12-18% over state-of-the-art baselines
- Improves LLM factuality by up to 9% on specialized domains like medicine and law
- Enables complex reasoning through conceptual bridges between disparate facts
👉 Key Insight
The research demonstrates that billion-scale KGs with dynamic schemas can effectively complement parametric knowledge in LLMs when they reach critical mass (1B+ facts). This challenges the assumption that retrieval augmentation needs domain-specific tuning to be effective.
Question for Discussion
As autonomous KG construction becomes viable, how should we rethink the role of human expertise in knowledge representation? Should curation shift from schema design to validation and ethical oversight?