Our SPARQL Notebook extension for Visual Studio Code makes it super easy to document SPARQL queries and run them, either against live endpoints or directly on local RDF files. I just (finally!) published a 15-minute walkthrough on our YouTube channel Giant Global Graph. It gives you a quick overview of how it works and how you can get started.
Link in the comments.
Fun fact: I recorded this two years ago and apparently forgot to hit publish. Since then, we've added new features like improved table renderers with pivoting support, so it's even more useful now. Check it out!
Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need
Instead of just pulling facts, the system samples multi-step paths within the graph, such as a causal chain from a disease to a symptom, and translates these paths into natural language reasoning tasks complete with a step-by-step thinking trace
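As a rough illustration of that idea, the sketch below samples a multi-step path from a toy graph and renders it as a question with a step-by-step trace. The graph contents, the `sample_path` and `path_to_task` helpers, and the wording templates are all hypothetical; none of them come from the paper.

```python
import random

# Toy knowledge graph: head -> list of (relation, tail). Contents are illustrative only.
GRAPH = {
    "influenza": [("causes", "inflammation")],
    "inflammation": [("leads to", "fever")],
    "fever": [("presents as", "chills")],
}

def sample_path(graph, start, hops, rng):
    """Random walk of up to `hops` edges starting at `start`."""
    path, node = [], start
    for _ in range(hops):
        edges = graph.get(node)
        if not edges:
            break  # dead end: stop the walk early
        relation, tail = rng.choice(edges)
        path.append((node, relation, tail))
        node = tail
    return path

def path_to_task(path):
    """Render a sampled path as a question, a thinking trace, and an answer."""
    head, tail = path[0][0], path[-1][2]
    trace = [f"Step {i}: {h} {r} {t}." for i, (h, r, t) in enumerate(path, 1)]
    return {
        "question": f"How is {head} connected to {tail}?",
        "trace": trace,
        "answer": tail,
    }

task = path_to_task(sample_path(GRAPH, "influenza", hops=3, rng=random.Random(0)))
print(task["question"])
print("\n".join(task["trace"]))
```

In a real pipeline the templates would presumably be replaced by an LLM that verbalizes the path, but the sampled path is what keeps the trace tied to the graph.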
Alhamdulillah, iText2KG v0.0.8 is finally out!
(Yes, I’ve been quite busy these past few months 😅)
...and it can now build dynamic knowledge graphs. The GIF below shows a dynamic KG generated from OpenAI tweets between June 18 and July 17.
(Note: Temporal/logical conflicts aren't handled yet in this version, but you can still resolve them with a post-processing filter.)
Here are the main updated features:
- iText2KG_Star: Introduced a simpler and more efficient version of iText2KG that eliminates the separate entity extraction step. Instead of extracting entities and relations separately, iText2KG_Star directly extracts triplets from text. This approach is more efficient as it reduces processing time and token consumption and does not need to handle invented/isolated entities.
- Facts-Based KG Construction: Enhanced the framework with facts-based knowledge graph construction using the Document Distiller to extract structured facts from documents, which are then used for incremental KG building. This approach provides more exhaustive and precise knowledge graphs.
- Dynamic Knowledge Graphs: iText2KG now supports building dynamic knowledge graphs that evolve. By leveraging the incremental nature of the framework and document snapshots with observation dates, users can track how knowledge changes and grows.
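To make the "snapshots with observation dates" idea concrete, here is a minimal plain-Python sketch. It does not use the iText2KG API: `add_snapshot`, `as_of`, the triples, and the dates are all made up for illustration.

```python
from datetime import date

def add_snapshot(graph, triples, observed_on):
    """Merge one snapshot's triples, recording first and last observation dates."""
    for triple in triples:
        entry = graph.setdefault(
            triple, {"first_seen": observed_on, "last_seen": observed_on}
        )
        entry["first_seen"] = min(entry["first_seen"], observed_on)
        entry["last_seen"] = max(entry["last_seen"], observed_on)
    return graph

def as_of(graph, day):
    """Triples that had already been observed on or before `day`."""
    return {t for t, seen in graph.items() if seen["first_seen"] <= day}

# Hypothetical snapshots; the entity names and dates are placeholders.
kg = {}
fact_a = ("ExampleOrg", "announced", "feature A")
fact_b = ("ExampleOrg", "announced", "feature B")
add_snapshot(kg, [fact_a], date(2024, 1, 1))
add_snapshot(kg, [fact_a, fact_b], date(2024, 2, 1))

print(len(as_of(kg, date(2024, 1, 15))), len(as_of(kg, date(2024, 2, 1))))
```

Keeping both dates per triple is one simple way to see how the graph grows over time. As the post notes, conflicting facts would still need a separate post-processing step.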
Check out the new version and an example of OpenAI Dynamic KG Construction in the first comment.
Why Businesses Must Ground Their AI in Knowledge Graphs | LinkedIn
Here, I clearly explain why businesses must transition from raw tabular data to RDF-based knowledge graphs, and why this is essential to ground AI in logic-driven, traceable inference rather than black-box prediction: 1. Your tabular data is dumb.
Millions of G∈AR-s: Extending GraphRAG to Millions of Documents
Scaling GraphRAG to Millions of Documents: Lessons from the SIGIR 2025 LiveRAG Challenge
👉 WHY THIS MATTERS
Retrieval-augmented generation (RAG) struggles with multi-hop questions that require connecting information across documents. While graph-based RAG methods like GEAR improve reasoning by structuring knowledge as entity-relationship triples, scaling these approaches to web-sized datasets (millions/billions of documents) remains a bottleneck. The culprit? Traditional methods rely heavily on LLMs to extract triples—a process too slow and expensive for large corpora.
👉 WHAT THEY DID
Researchers from Huawei and the University of Edinburgh reimagined GEAR to sidestep costly offline triple extraction.
Their solution:
- Pseudo-alignment: Link retrieved passages to existing triples in Wikidata via sparse retrieval.
- Iterative expansion: Use a lightweight LLM (Falcon-3B-Instruct) to iteratively rewrite queries and retrieve additional evidence through Wikidata’s graph structure.
- Multi-step filtering: Combine Reciprocal Rank Fusion (RRF) and prompt-based filtering to reconcile noisy alignments between Wikidata and document content.
This approach achieved 87.6% correctness and 53% faithfulness on the SIGIR 2025 LiveRAG benchmark, despite challenges in aligning Wikidata’s generic triples with domain-specific document content.
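For readers who haven't met Reciprocal Rank Fusion, here is a minimal sketch of the standard formula: each document scores the sum of 1 / (k + rank) over the ranked lists it appears in. The passage ids are hypothetical, and k = 60 is the commonly used default rather than a value confirmed for this paper.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))

sparse_hits = ["p3", "p1", "p7"]  # hypothetical passage ids from sparse retrieval
graph_hits = ["p1", "p9", "p3"]   # hypothetical ids reached via graph expansion
fused = reciprocal_rank_fusion([sparse_hits, graph_hits])
print(fused)
```

Passages that rank well in both lists rise to the top, which is why RRF is a reasonable first filter before a more expensive prompt-based check.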
👉 KEY INSIGHTS
1. Trade-offs in alignment: Linking Wikidata triples to documents works best for general knowledge but falters with niche topics (e.g., "Pacific geoduck reproduction" mapped incorrectly to oyster biology).
2. Cost efficiency: Avoiding LLM-based triple extraction reduced computational overhead, enabling scalability.
3. The multi-step advantage: Query rewriting and iterative retrieval improved performance on complex questions requiring 2+ reasoning hops.
👉 OPEN QUESTIONS
- How can we build asymmetric semantic models to better align text and graph data?
- Can hybrid alignment strategies (e.g., blending domain-specific KGs with Wikidata) mitigate topic drift?
- Does graph expansion improve linearly with scale, or are diminishing returns inevitable?
Why read this paper?
It’s a pragmatic case study in balancing scalability with reasoning depth in RAG systems. The code and prompts are fully disclosed, offering a blueprint for adapting GraphRAG to real-world, large-scale applications.
Paper: "Millions of G∈AR-s: Extending GraphRAG to Millions of Documents" (Shen et al., SIGIR 2025). Preprint: arXiv:2507.17399.
As the fifth most popular website on the Internet, keeping Wikipedia running smoothly is no small feat. The free encyclopedia hosts more than 65 million
This is the title of my upcoming book. And it’s all about the Shapes Constraint Language (SHACL). Expected release before November 1st 2025. The book is written and illustrated by Veronika He…
I've spent long, hard years learning how to talk about knowledge graphs and semantics with software engineers who have little training in linguistics. I feel quite fluent at this point, after investing huge amounts of effort into understanding statistics (I was a humanities undergrad) and into unpac
What’s the difference between context engineering and ontology engineering?
We hear a lot about “context engineering” these days in AI wonderland. A lot of good things are being said, but it’s worth noting what’s missing.
Yes, context matters. But context without structure is narrative, not knowledge. And if AI is going to scale beyond demos and copilots into systems that reason, track memory, and interoperate across domains… then context alone isn’t enough.
We need ontology engineering.
Here’s the difference:
- Context engineering is about curating inputs: prompts, memory, user instructions, embeddings. It’s the art of framing.
- Ontology engineering is about modeling the world: defining entities, relations, axioms, and constraints that make reasoning possible.
In other words:
Context guides attention. Ontology shapes understanding.
What’s dangerous is that many teams stop at context, assuming that if you feed the right words to an LLM, you’ll get truth, traceability, or decisions you can trust. This is what I call “hallucination of control”.
Ontologies provide what LLMs lack: grounding, consistency, and interoperability. But they are hard to build without the right methods. Those methods have to be adapted from a discipline that started 20+ years ago with the semantic web, and now it’s time to work them out for the LLM era.
If you’re serious about scaling AI across business processes or mission-critical systems, the real challenge is more than context, it’s shared meaning. And tech alone cannot solve this.
That’s why we need to put the ontology discussion in the boardroom, because integrating AI into organizations is much more complicated than just providing the right context in a prompt or a context window.
That’s it for today. More tomorrow!
I’m trying to get back to journaling here every day. 🤙 I hope you will find something useful in what I write.
how both OWL and SHACL can be employed during the decision-making phase for AI Agents when using a knowledge graph instead of relying on an LLM that hallucinates
𝙏𝙝𝙤𝙪𝙜𝙝𝙩 𝙛𝙤𝙧 𝙩𝙝𝙚 𝙙𝙖𝙮: I've been mulling over how both OWL and SHACL can be employed during the decision-making phase for AI Agents when using a knowledge graph instead of relying on an LLM that hallucinates. In this way, the LLM can still be used for assessment and sensory feedback, but it augments the graph, not the other way around. OWL and SHACL serve different roles. SHACL is not just a preprocessing validator; it can play an active role in constraining, guiding, or triggering decisions, especially when integrated into AI pipelines. However, OWL is typically more central to inferencing and reasoning tasks.
SHACL can actively participate in decision-making, especially when decisions require data integrity, constraint enforcement, or trigger-based logic. In complex agents, OWL provides the inferencing engine, while SHACL acts as the constraint gatekeeper and occasionally contributes to rule-based decision-making.
For example, an AI agent processes RDF data describing an applicant's skills, degree, and experience. SHACL validates the data's structure, ensuring required fields are present and correctly formatted. OWL reasoning infers that the applicant is qualified for a technical role and matches the profile of a backend developer. SHACL is then used again to check policy compliance. With all checks passed, the applicant is shortlisted, and a follow-up email is triggered.
In AI agent decision-making, OWL and SHACL often work together in complementary ways. SHACL is commonly used as a preprocessing step to validate incoming RDF data. If the data fails validation, it's flagged or excluded, ensuring only clean, structurally sound data reaches the OWL reasoner. In this role, SHACL acts as a gatekeeper.
They can also operate in parallel or in an interleaved manner within a pipeline. As decisions evolve, SHACL shapes may be checked mid-process. Some AI agents even use SHACL as a rule engine—to trigger alerts, detect actionable patterns, or constrain reasoning paths—while OWL continues to handle more complex semantic inferences, such as class hierarchies or property logic.
Finally, SHACL can augment decision-making by confirming whether OWL-inferred actions comply with specific constraints. OWL may infer that “A is a type of B, so do X,” and SHACL then determines whether doing X adheres to a policy or requirement. Because SHACL supports closed-world assumptions (which OWL does not), it plays a valuable role in enforcing policies or compliance rules during decision execution.
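The applicant example can be sketched as a validate, infer, check-policy pipeline. The code below uses plain-Python stand-ins, not a SHACL engine or an OWL reasoner, and every field name, rule, and threshold in it is hypothetical. It only shows the order in which the two roles would be called.

```python
# Plain-Python stand-ins: a real agent would delegate to a SHACL validator
# and an OWL reasoner instead of these hand-written functions.
REQUIRED_FIELDS = {"name": str, "degree": str, "skills": list, "years_experience": int}

def validate_shape(applicant):
    """Stand-in for SHACL structural validation: required fields with expected types."""
    return [f"missing or malformed: {name}"
            for name, expected in REQUIRED_FIELDS.items()
            if not isinstance(applicant.get(name), expected)]

def infer_roles(applicant):
    """Stand-in for OWL classification: derive class membership from asserted facts."""
    roles = set()
    if {"python", "sql"} <= set(applicant["skills"]):
        roles.add("BackendDeveloper")
    if roles and applicant["degree"] in {"BSc", "MSc"}:
        roles.add("QualifiedTechnicalCandidate")
    return roles

def check_policy(applicant, roles):
    """Stand-in for a second SHACL pass: policy constraints on the inferred decision."""
    violations = []
    if "QualifiedTechnicalCandidate" in roles and applicant["years_experience"] < 2:
        violations.append("policy: at least 2 years of experience required")
    return violations

def decide(applicant):
    errors = validate_shape(applicant)
    if errors:
        return "rejected", errors  # gatekeeper: bad data never reaches the reasoner
    roles = infer_roles(applicant)
    violations = check_policy(applicant, roles)
    if "QualifiedTechnicalCandidate" in roles and not violations:
        return "shortlisted", sorted(roles)
    return "not shortlisted", violations

applicant = {"name": "A. Example", "degree": "MSc",
             "skills": ["python", "sql"], "years_experience": 4}
print(decide(applicant))
```

The follow-up email from the example would hang off the "shortlisted" outcome. The point of the sketch is that constraint checks run both before and after the inference step.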
It’s already the end of Sunday — I hope you all had a wonderful week. Mine was exceptionally busy, with the GUG seminar and the upcoming tutorial preparation. I usually take time for a personal…
I'm trying to build a Knowledge Graph. Our team has done experiments with current libraries available (𝐋𝐥𝐚𝐦𝐚𝐈𝐧𝐝𝐞𝐱, 𝐌𝐢𝐜𝐫𝐨𝐬𝐨𝐟𝐭'𝐬 𝐆𝐫𝐚𝐩𝐡𝐑𝐀𝐆, 𝐋𝐢𝐠𝐡𝐫𝐚𝐠, 𝐆𝐫𝐚𝐩𝐡𝐢𝐭𝐢 etc.) From a Product perspective, they seem to be missing the basic, common-sense features.
𝐒𝐭𝐢𝐜𝐤 𝐭𝐨 𝐚 𝐅𝐢𝐱𝐞𝐝 𝐓𝐞𝐦𝐩𝐥𝐚𝐭𝐞:
My business organizes information in a specific way. I need the system to use our predefined entities and relationships, not invent its own. The output has to be consistent and predictable every time.
𝐒𝐭𝐚𝐫𝐭 𝐰𝐢𝐭𝐡 𝐖𝐡𝐚𝐭 𝐖𝐞 𝐀𝐥𝐫𝐞𝐚𝐝𝐲 𝐊𝐧𝐨𝐰:
We already have lists of our products, departments, and key employees. The AI shouldn't have to guess this information from documents. I want to seed this data upfront so that the graph can be built on this foundation of truth.
𝐂𝐥𝐞𝐚𝐧 𝐔𝐩 𝐚𝐧𝐝 𝐌𝐞𝐫𝐠𝐞 𝐃𝐮𝐩𝐥𝐢𝐜𝐚𝐭𝐞𝐬:
The graph I currently get is messy. It sees "First Quarter Sales" and "Q1 Sales Report" as two completely different things. This is probably easy to fix, but I want to make sure it does not happen.
𝐅𝐥𝐚𝐠 𝐖𝐡𝐞𝐧 𝐒𝐨𝐮𝐫𝐜𝐞𝐬 𝐃𝐢𝐬𝐚𝐠𝐫𝐞𝐞:
If one chunk says our sales were $10M and another says $12M, I need the library to flag this disagreement, not just silently pick one. It also needs to show me exactly which documents the numbers came from so we can investigate.
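To show what the last two requirements might look like in code, here is a small sketch that merges aliases onto one canonical entity and flags disagreeing values together with their sources. It is not taken from any of the libraries named above. The alias table, field names, and file names are hypothetical.

```python
from collections import defaultdict

# Hypothetical alias table mapping surface forms to one canonical entity.
ALIASES = {"first quarter sales": "Q1 Sales", "q1 sales report": "Q1 Sales"}

def canonical(name):
    """Resolve a surface form to its canonical entity name, if one is known."""
    cleaned = name.strip()
    return ALIASES.get(cleaned.lower(), cleaned)

def find_conflicts(facts):
    """Group facts by (entity, attribute) and flag keys with more than one value."""
    grouped = defaultdict(lambda: defaultdict(list))
    for fact in facts:
        key = (canonical(fact["entity"]), fact["attribute"])
        grouped[key][fact["value"]].append(fact["source"])  # keep provenance per value
    return {key: dict(values) for key, values in grouped.items() if len(values) > 1}

facts = [
    {"entity": "First Quarter Sales", "attribute": "revenue",
     "value": "$10M", "source": "report_a.pdf"},
    {"entity": "Q1 Sales Report", "attribute": "revenue",
     "value": "$12M", "source": "memo_b.docx"},
]
conflicts = find_conflicts(facts)
print(conflicts)
```

A hand-maintained alias table is the crudest possible entity resolution. The useful part is that every value keeps its source list, so a disagreement can be traced back to the documents.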
Has anyone solved this? I'm looking for a library that gets these fundamentals right.
❓ Why I Wrote This Book?
In the past two to three years, we've witnessed a revolution. First with ChatGPT, and now with autonomous AI agents. This is only the beginning. In the years ahead, AI will transform not only how we work but how we live. At the core of this transformation lies a single breakthrough technology: large language models (LLMs). That’s why I decided to write this book.
This book explores what an LLM is, how it works, and how it develops its remarkable capabilities. It also shows how to put these capabilities into practice, like turning an LLM into the beating heart of an AI agent. Dissatisfied with the overly simplified or fragmented treatments found in many current books, I’ve aimed to provide both solid theoretical foundations and hands-on demonstrations. You'll learn how to build agents using LLMs, integrate technologies like retrieval-augmented generation (RAG) and knowledge graphs, and explore one of today’s most fascinating frontiers: multi-agent systems. Finally, I’ve included a section on open research questions (areas where today’s models still fall short, ethical issues, doubts, and so on), and where tomorrow’s breakthroughs may lie.
🧠 Who is this book for?
Anyone curious about LLMs, how they work, and how to use them effectively. Whether you're just starting out or already have experience, this book offers both accessible explanations and practical guidance. It's for those who want to understand the theory and apply it in the real world.
🛑 Who is this book not for?
Those who dismiss AI as a passing fad or have no interest in what lies ahead. But for everyone else this book is for you. Because AI agents are no longer speculative. They’re real, and they’re here.
A huge thanks to my co-author Gabriele Iuculano and the Packt team: Gebin George, Sanjana Gupta, Ali A., Sonia Chauhan, Vignesh Raju, and Malhar Deshpande.
#AI #LLMs #KnowledgeGraphs #AIagents #RAG #GenerativeAI #MachineLearning #NLP #Agents #DeepLearning
What makes the "𝐒𝐞𝐦𝐚𝐧𝐭𝐢𝐜 𝐃𝐚𝐭𝐚 𝐏𝐫𝐨𝐝𝐮𝐜𝐭" so valid in data conversations today?💡 𝐁𝐨𝐮𝐧𝐝𝐞𝐝 𝐂𝐨𝐧𝐭𝐞𝐱𝐭 and Right-to-Left Flow from consumers to raw materials.
Tony Seale perfectly defines the value of bounded context.
…𝘵𝘰 𝘴𝘶𝘴𝘵𝘢𝘪𝘯 𝘪𝘵𝘴𝘦𝘭𝘧, 𝘢 𝘴𝘺𝘴𝘵𝘦𝘮 𝘮𝘶𝘴𝘵 𝘮𝘪𝘯𝘪𝘮𝘪𝘴𝘦 𝘪𝘵𝘴 𝘧𝘳𝘦𝘦 𝘦𝘯𝘦𝘳𝘨𝘺- 𝘢 𝘮𝘦𝘢𝘴𝘶𝘳𝘦 𝘰𝘧 𝘶𝘯𝘤𝘦𝘳𝘵𝘢𝘪𝘯𝘵𝘺. 𝘔𝘪𝘯𝘪𝘮𝘪𝘴𝘪𝘯𝘨 𝘪𝘵 𝘦𝘲𝘶𝘢𝘵𝘦𝘴 𝘵𝘰 𝘭𝘰𝘸 𝘪𝘯𝘵𝘦𝘳𝘯𝘢𝘭 𝘦𝘯𝘵𝘳𝘰𝘱𝘺. 𝘈 𝘴𝘺𝘴𝘵𝘦𝘮 𝘢𝘤𝘩𝘪𝘦𝘷𝘦𝘴 𝘵𝘩𝘪𝘴 𝘣𝘺 𝘧𝘰𝘳𝘮𝘪𝘯𝘨 𝘢𝘤𝘤𝘶𝘳𝘢𝘵𝘦 𝘱𝘳𝘦𝘥𝘪𝘤𝘵𝘪𝘰𝘯𝘴 𝘢𝘣𝘰𝘶𝘵 𝘵𝘩𝘦 𝘦𝘹𝘵𝘦𝘳𝘯𝘢𝘭 𝘦𝘯𝘷 𝘢𝘯𝘥 𝘶𝘱𝘥𝘢𝘵𝘪𝘯𝘨 𝘪𝘵𝘴 𝘪𝘯𝘵𝘦𝘳𝘯𝘢𝘭 𝘴𝘵𝘢𝘵𝘦𝘴 𝘢𝘤𝘤𝘰𝘳𝘥𝘪𝘯𝘨𝘭𝘺, 𝘢𝘭𝘭𝘰𝘸𝘪𝘯𝘨 𝘧𝘰𝘳 𝘢 𝘥𝘺𝘯𝘢𝘮𝘪𝘤 𝘺𝘦𝘵 𝘴𝘵𝘢𝘣𝘭𝘦 𝘪𝘯𝘵𝘦𝘳𝘢𝘤𝘵𝘪𝘰𝘯 𝘸𝘪𝘵𝘩 𝘪𝘵𝘴 𝘴𝘶𝘳𝘳𝘰𝘶𝘯𝘥𝘪𝘯𝘨𝘴. 𝘖𝘯𝘭𝘺 𝘱𝘰𝘴𝘴𝘪𝘣𝘭𝘦 𝘰𝘯 𝘥𝘦𝘭𝘪𝘯𝘦𝘢𝘵𝘪𝘯𝘨 𝘢 𝘣𝘰𝘶𝘯𝘥𝘢𝘳𝘺 𝘣𝘦𝘵𝘸𝘦𝘦𝘯 𝘪𝘯𝘵𝘦𝘳𝘯𝘢𝘭 𝘢𝘯𝘥 𝘦𝘹𝘵𝘦𝘳𝘯𝘢𝘭 𝘴𝘺𝘴𝘵𝘦𝘮𝘴. 𝘋𝘪𝘴𝘤𝘰𝘯𝘯𝘦𝘤𝘵𝘦𝘥 𝘴𝘺𝘴𝘵𝘦𝘮𝘴 𝘴𝘪𝘨𝘯𝘢𝘭 𝘸𝘦𝘢𝘬 𝘣𝘰𝘶𝘯𝘥𝘢𝘳𝘪𝘦𝘴.
Data Products enable a way to bind context to specific business purposes or use cases. This enables data to become:
✅ Purpose-driven
✅ Accurately Discoverable
✅ Easily Understandable & Addressable
✅ Valuable as an independent entity
𝐓𝐡𝐞 𝐒𝐨𝐥𝐮𝐭𝐢𝐨𝐧: The Data Product Model. A conceptual model that precisely captures the business context through an interface operable by business users or domain experts.
We have often referred to this as The Data Product Prototype, which is essentially a semantic model and captures information on:
➡️ Popular Metrics the Business wants to drive
➡️ Measures & Dimensions
➡️ Relationships & formulas
➡️ Further context with tags, descriptions, synonyms, & observability metrics
➡️ Quality SLOs - or simply, conditions necessary
➡️ Additional policy specs contributed by Governance Stewards
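As one way to picture such a prototype, the sketch below captures the listed elements as plain Python dataclasses and adds a check that could run before the model is given the green flag. The class names, fields, and sample values are hypothetical and do not represent any specific product's schema.

```python
from dataclasses import dataclass, field

@dataclass
class Metric:
    name: str
    formula: str   # human-readable formula over measures
    uses: list     # measures the formula depends on
    synonyms: list = field(default_factory=list)

@dataclass
class DataProductModel:
    name: str
    measures: list
    dimensions: list
    metrics: list = field(default_factory=list)
    quality_slos: dict = field(default_factory=dict)  # condition name -> threshold
    policies: list = field(default_factory=list)      # specs from governance stewards

    def undefined_measures(self):
        """Measures referenced by metrics but not declared in the model."""
        used = {measure for metric in self.metrics for measure in metric.uses}
        return sorted(used - set(self.measures))

model = DataProductModel(
    name="sales_performance",
    measures=["revenue", "orders"],
    dimensions=["region", "month"],
    metrics=[
        Metric("average_order_value", "revenue / orders", ["revenue", "orders"]),
        Metric("margin", "profit / revenue", ["profit", "revenue"]),
    ],
    quality_slos={"freshness_hours": 24},
    policies=["mask customer_id"],
)
print(model.undefined_measures())
```

A check like `undefined_measures` is the kind of gap that can be caught at the prototype stage, before any data engineering work starts.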
Once the Prototype is validated and given a green flag, development efforts kickstart. Note how all data engineering efforts (left-hand side) are not looped in until this point, saving massive costs and time drainage.
The DE teams, who only have a partial view of the business landscape, are no longer held accountable for lacking a strong business understanding. 𝐓𝐡𝐞 𝐨𝐰𝐧𝐞𝐫𝐬𝐡𝐢𝐩 𝐨𝐟 𝐭𝐡𝐞 𝐃𝐚𝐭𝐚 𝐏𝐫𝐨𝐝𝐮𝐜𝐭 𝐦𝐨𝐝𝐞𝐥 𝐢𝐬 𝐞𝐧𝐭𝐢𝐫𝐞𝐥𝐲 𝐰𝐢𝐭𝐡 𝐁𝐮𝐬𝐢𝐧𝐞𝐬𝐬.
🫠 DEs have a blueprint to refer to and simply map sources or source data products to the prescribed Data Product Model. Any new request comes through this prototype itself, managed by Data Product Managers in collaboration with business users, which dissolves all bottlenecks from centralised data engineering teams.
At this level, necessary transformations are delivered,
🔌 that activate the SLOs
🔌 enable interoperability with native tools and upstream data products,
🔌 allow reusability of pre-existing transforms in the form of Source or Aggregate data products.
#datamanagement #dataproducts
A Graph-Native Workflow Application using Neo4j/Cypher | Medium
A full working Cypher script that simulates a Tendering System with multiple workflows, AI agent interactions, conversations, approvals, and more — all modeled and executed natively in a Graph.
GraphRAG in Action: A Simple Agent for Know-Your-Customer Investigations | Towards Data Science
This blog post provides a hands-on guide for AI engineers and developers on how to build an initial KYC agent prototype with the OpenAI Agents SDK. We'll explore how to equip our agent with a suite of tools (including MCP Server tools) to uncover and investigate potential fraud patterns.
Transform Claude's Hidden Memory Into Interactive Knowledge Graphs
Universal tool to visualize any Claude user's memory.json in beautiful interactive graphs. Transform your Claude Memory MCP data into stunning interactive visualizations to see how your AI assistant's knowledge connects and evolves over time.
Enterprise teams using Claude lack visibility into how their AI assistant accumulates and organizes institutional knowledge. Claude Memory Viz provides zero-configuration visualization that automatically finds memory files and displays 72 entities with 93 relationships in real-time force-directed layouts. Teams can filter by entity type, search across all data, and explore detailed connections through rich tooltips.
The technical implementation supports Claude's standard NDJSON memory format, automatically detecting and color-coding entity types from personality profiles to technical tools. Node size reflects connection count, while adjustable physics parameters enable optimal spacing for large knowledge graphs. Built with Cytoscape.js for performance optimization.
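A minimal sketch of the "node size reflects connection count" step, assuming each NDJSON line is a record with a `type` of `entity` or `relation`. The field names (`name`, `entityType`, `from`, `to`) are an assumption about the memory file layout, and the sizing constants are arbitrary. None of this is taken from the tool's source.

```python
import json
from collections import Counter

def load_memory(lines):
    """Parse NDJSON records, assuming each line has a "type" of "entity" or "relation"."""
    entities, relations = {}, []
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        if record.get("type") == "entity":
            entities[record["name"]] = record.get("entityType", "unknown")
        elif record.get("type") == "relation":
            relations.append((record["from"], record["to"]))
    return entities, relations

def node_sizes(entities, relations, base=10, per_edge=4):
    """Size each node by its connection count (base and per_edge are arbitrary)."""
    degree = Counter()
    for source, target in relations:
        degree[source] += 1
        degree[target] += 1
    return {name: base + per_edge * degree[name] for name in entities}

# Made-up sample records standing in for a memory file.
sample = [
    '{"type": "entity", "name": "Alice", "entityType": "person", "observations": []}',
    '{"type": "entity", "name": "Cytoscape.js", "entityType": "tool", "observations": []}',
    '{"type": "entity", "name": "Bob", "entityType": "person", "observations": []}',
    '{"type": "relation", "from": "Alice", "to": "Cytoscape.js", "relationType": "uses"}',
    '{"type": "relation", "from": "Bob", "to": "Alice", "relationType": "knows"}',
]
entities, relations = load_memory(sample)
sizes = node_sizes(entities, relations)
print(sizes)
```

The resulting name-to-size mapping is the sort of value a graph renderer would consume as a per-node style property.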
Built with the philosophy "Solve it once and for all," the tool works for any Claude user with zero configuration. The visualizer automatically searches common memory file locations, provides demo data fallback, and offers clear guidance when files aren't found. Integration requires just git clone and one command execution.
This matters because AI memory has been invisible to users, creating trust and accountability gaps in enterprise AI deployment. When teams can visualize how their AI assistant organizes knowledge, they gain insights into decision-making patterns and can optimize their AI collaboration strategies.
👩💻 https://lnkd.in/e__RQh_q
This is it.
This is the conversation every leadership team needs to be having right now.
"The Orchestration Graph" by WRITER product leader Matan-Paul Shetrit, linked in the comments, is a must-read.
The primary constraint on business is no longer execution. It's supervision.
For a century, we built companies to overcome the high cost of getting things done.
We built hierarchies, departments, and complex processes — all to manage labor-intensive execution.
That era is over.
With AI agents, execution is becoming abundant, on-demand, and programmatic.
The new bottleneck is our ability to direct, govern, and orchestrate this immense new capacity.
The firm is evolving from a factory into an "operating system."
Your ORG CHART is no longer the map.
The real map is the Orchestration Graph: the dynamic, software-defined network of humans, models, and agents that actually does the work.
This isn't just a new tool or a productivity hack. It's a fundamental rewiring of the enterprise. It demands we rethink everything:
Structure: How do we manage systems, not just people?
Strategy: What work do we insource to our agentic "OS" versus outsource to models-as-a-service?
Metrics: Are we still measuring human activity, or are we measuring system throughput and intelligence?
This is the WRITER call to arms: The companies that win won't just adopt AI; they will restructure themselves around it. They will build their own Orchestration Graph, with governance and institutional memory at the core.
They will treat AI not as a feature, but as the new foundation.
At WRITER, this is the future we are building every single day — giving companies the platform to create their own secure, governed, and intelligent orchestration layer.
The time to act is now.
Read the article. Start the conversation with your leaders. And begin rewiring your firm.
When people discuss how LLMs "reason," you’ll often hear that they rely on transduction rather than abduction. It sounds technical, but the distinction matters - especially as we start wiring LLMs into systems that are supposed to think.
🔵 Transduction is case-to-case reasoning. It doesn’t build theories; it draws fuzzy connections based on resemblance. Think: “This metal conducts electricity, and that one looks similar - so maybe it does too.”
🔵 Abduction, by contrast, is about generating explanations. It’s what scientists (and detectives) do: “This metal is conducting - maybe it contains free electrons. That would explain it.”
The claim is that LLMs operate more like transducers - navigating high-dimensional spaces of statistical similarity, rather than forming crisp generalisations. But this isn’t the whole picture. In practice, it seems to me that LLMs also perform a kind of induction - abstracting general patterns from oceans of text. They learn the shape of ideas and apply them in novel ways. That’s closer to “All metals of this type have conducted in the past, so this one probably will.”
Now add tools to the mix - code execution, web search, Elon Musk's tweet history 😉 - and LLMs start doing something even more interesting: program search and synthesis. It's messy, probabilistic, and not at all principled or rigorous. But it’s inching toward a form of abductive reasoning.
Which brings us to a more principled approach for reasoning within an enterprise domain: the neuro-symbolic loop - a collaboration between large language models and knowledge graphs. The graph provides structure: formal semantics, ontologies, logic, and depth. The LLM brings intuition: flexible inference, linguistic creativity, and breadth. One grounds. The other leaps.
💡 The real breakthrough could come when the grounding isn’t just factual, but conceptual - when the ontology encodes clean, meaningful generalisations. That’s when the LLM’s leaps wouldn’t just reach further - they’d rise higher, landing on novel ideas that hold up under formal scrutiny. 💡
So where do metals fit into this new framing?
🔵 Transduction: “This metal conducts. That one looks the same - it probably does too.”
🔵 Induction: “I’ve tested ten of these. All conducted. It’s probably a rule.”
🔵 Abduction: “This metal is conducting. It shares properties with the ‘conductive alloy’ class - especially composition and crystal structure. The best explanation is a sea of free electrons.”
LLMs, in isolation, are limited in their ability to perform structured abduction. But when embedded in a system that includes a formal ontology, logical reasoning, and external tools, they can begin to participate in richer forms of reasoning. These hybrid systems are still far from principled scientific reasoners - but they hint at a path forward: a more integrated and disciplined neuro-symbolic architecture that moves beyond mere pattern completion.
S&P Global Unlocks the Future of AI-driven insights with AI-Ready Metadata on S&P Global Marketplace
🚀 When I shared our 2025 goals for the Enterprise Data Organization, one of the things I alluded to was machine-readable column-level metadata. Let’s unpack what that means—and why it matters.
🔍 What: For datasets we deliver via modern cloud distribution, we now provide human- and machine-readable metadata at the column level. Each column has an immutable URL (no auth, no CAPTCHA) that hosts name/value metadata - synonyms, units of measure, descriptions, and more - in multiple human languages. It’s semantic context that goes far beyond what a traditional data dictionary can convey. We can't embed it, so we link to it.
💡 Why: Metadata is foundational to agentic, precise consumption of structured data. Our customers are investing in semantic layers, data catalogs, and knowledge graphs - and they shouldn’t have to copy-paste from a PDF to get there. Use curl, Python, Bash - whatever works - to automate ingestion. (We support content negotiation and conditional GETs.)
🧠 Under the hood? It’s RDF. Love it or hate it, you don’t need to engage with the plumbing unless you want to.
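For anyone automating ingestion, a sketch of content negotiation plus a conditional GET using only the Python standard library could look like the following. The URL is a placeholder, and both `text/turtle` as an available serialization and ETag-based revalidation are assumptions. The post only says that content negotiation and conditional GETs are supported.

```python
import urllib.error
import urllib.request

def build_request(url, etag=None, accept="text/turtle"):
    """Ask for an RDF serialization and, if we hold an ETag, make the GET conditional."""
    request = urllib.request.Request(url, headers={"Accept": accept})
    if etag:
        request.add_header("If-None-Match", etag)
    return request

def fetch_if_changed(url, etag=None):
    """Return (body, new_etag), or (None, etag) when the server answers 304 Not Modified."""
    try:
        with urllib.request.urlopen(build_request(url, etag), timeout=10) as response:
            return response.read().decode("utf-8"), response.headers.get("ETag")
    except urllib.error.HTTPError as error:
        if error.code == 304:
            return None, etag  # cached copy is still current
        raise

# Placeholder URL for illustration; real column URLs ship with each dataset.
request = build_request("https://example.org/metadata/column/123", etag='"abc"')
# urllib normalizes header names, hence the "If-none-match" spelling on lookup.
print(request.get_header("Accept"), request.get_header("If-none-match"))
```

Storing the returned ETag alongside the cached metadata lets a nightly job skip columns that have not changed.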
✨ To our knowledge, this hasn’t been done before. This is our MVP. We’re putting it out there to learn what works - and what doesn’t. It’s vendor-neutral, web-based, and designed to scale across:
📊 Breadth of datasets across S&P
🧬 Depth of metadata
🔗 Choice of linking venue
🙏 It took a village to make this happen. I can’t name everyone without writing a book, but I want to thank our executive leadership for the trust and support to go build this.
Let us know what you think!
🔗 https://lnkd.in/gbe3NApH
Martina Cheung, Saugata Saha, Swamy Kocherlakota, Dave Ernsberger, Mark Eramo, Frank Tarsillo, Warren Breakstone, Hamish B., Erica Robeen, Laura Miller, Justine S Iverson
metaphacts unveils metis, the new Knowledge-driven AI platform for Enterprises
Introducing metis: an enterprise AI platform from metaphacts. Get trusted, context-aware, knowledge-driven AI for actionable insights & intelligent agents.
Should ontologies be treated as organizational resources for semantic capabilities?
💡 Should ontologies be treated as organizational resources for semantic capabilities?
More and more organizations are investing in data platforms, modeling tools, and integration frameworks. But one key capability is often underused or misunderstood: ontologies as semantic infrastructure.
While databases handle facts and BI platforms handle queries, ontologies structure meaning. They define what things are, not just what data says. When treated as living organizational resources, ontologies can bring:
🔹 Shared understanding across silos
🔹 Reasoning and inference beyond data queries
🔹 Semantic integration of diverse systems
🔹 Clarity and coherence in enterprise models
But here’s the challenge — ontologies don’t operate in isolation. They must be positioned alongside:
🔸 Data-oriented technologies (RDF, RDF-star, quad stores) that track facts and provenance
🔸 Enterprise modeling tools (e.g., ArchiMate) that describe systems and views
🔸 Exploratory approaches (like semantic cartography) that support emergence over constraint
These layers each come with their own logic — epistemic vs. ontologic, structural vs. operational, contextual vs. formal.
✅ Building semantic capabilities requires aligning all these dimensions.
✅ It demands governance, tooling, and a culture of collaboration between ontologists, data managers, architects, and domain experts.
✅ And it opens the door to richer insight, smarter automation, and more agile knowledge flows.
🔍 With projects like ArchiCG (semantic interactive cartography), I aim to explore how we can visually navigate this landscape — not constrained by predefined viewpoints, but guided by logic, meaning, and emergent perspectives.
What do you think? Are ontologies ready to take their place as core infrastructure in your organization?