Specifications to define data assets managed as products
📚 In recent years, several specifications have emerged to define data assets managed as products. Today, two main types of specifications exist:
1️⃣ 𝗗𝗮𝘁𝗮 𝗖𝗼𝗻𝘁𝗿𝗮𝗰𝘁 𝗦𝗽𝗲𝗰𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻 (𝗗𝗖𝗦): Focused on describing the data asset and its associated metadata.
2️⃣ 𝗗𝗮𝘁𝗮 𝗣𝗿𝗼𝗱𝘂𝗰𝘁 𝗦𝗽𝗲𝗰𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻 (𝗗𝗣𝗦): Focused on describing the data product that manages and exposes the data asset.
👉 The 𝗢𝗽𝗲𝗻 𝗗𝗮𝘁𝗮 𝗖𝗼𝗻𝘁𝗿𝗮𝗰𝘁 𝗦𝘁𝗮𝗻𝗱𝗮𝗿𝗱 (𝗢𝗗𝗖𝗦) by Bitol is an example of the first specification type, while the 𝗗𝗮𝘁𝗮 𝗣𝗿𝗼𝗱𝘂𝗰𝘁 𝗗𝗲𝘀𝗰𝗿𝗶𝗽𝘁𝗼𝗿 𝗦𝗽𝗲𝗰𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻 (𝗗𝗣𝗗𝗦) by the Open Data Mesh Initiative represents the second.
🤔 But what are the key differences between these two approaches? Where do they overlap, and how can they complement each other? More broadly, are they friends, enemies, or frenemies?
🔎 I explored these questions in my latest blog post. The image below might give away some spoilers, but if you're curious about the full reasoning, read the post.
❤️ I'd love to hear your thoughts!
#TheDataJoy #DataContracts #DataProducts #DataGovernance
SymAgent: A Neural-Symbolic Self-Learning Agent Framework for Complex Reasoning over Knowledge Graphs
LLMs that automatically fill knowledge gaps - too good to be true?
Large Language Models (LLMs) often stumble in logical tasks due to hallucinations, especially when relying on incomplete Knowledge Graphs (KGs).
Current methods naively trust KGs as exhaustive truth sources - a flawed assumption in real-world domains like healthcare or finance where gaps persist.
SymAgent is a new framework that approaches this problem by making KGs active collaborators, not passive databases.
Its dual-module design combines symbolic logic with neural flexibility:
1. Agent-Planner extracts implicit rules from KGs (e.g., "If drug X interacts with Y, avoid co-prescription") to decompose complex questions into structured steps.
2. Agent-Executor dynamically pulls external data when KG triples are missing, bypassing the "static repository" limitation.
Perhaps most impressively, SymAgent’s self-learning observes failed reasoning paths to iteratively refine its strategy and flag missing KG connections - achieving 20-30% accuracy gains over raw LLMs.
Equipped with SymAgent, even 7B models rival their much larger counterparts by leveraging this closed-loop system.
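To make the planner/executor loop above concrete, here is a minimal sketch. It is not SymAgent's actual code: the triple format, the function names `execute_step` and `answer_question`, and the `external_lookup` callback are all hypothetical, and the "plan" is simply a list of relations that a planner is assumed to have already produced. It only illustrates the idea of falling back to an external source when a KG triple is missing and recording the gap so the KG can be completed later.

```python
from typing import Callable, Optional

# A KG triple is (head, relation, tail). This flat representation is an assumption.
Triple = tuple[str, str, str]
Lookup = Callable[[str, str], Optional[str]]


def execute_step(kg: set[Triple], head: str, relation: str,
                 external_lookup: Lookup, missing: list[Triple]) -> Optional[str]:
    """Resolve one reasoning step, preferring the KG over the external source."""
    for h, r, t in kg:
        if h == head and r == relation:
            return t
    # KG gap: ask the (hypothetical) external source instead of giving up.
    tail = external_lookup(head, relation)
    if tail is not None:
        # Flag the triple the KG was missing so it can be added later.
        missing.append((head, relation, tail))
    return tail


def answer_question(kg: set[Triple], start: str, plan: list[str],
                    external_lookup: Lookup) -> tuple[Optional[str], list[Triple]]:
    """Follow the planned chain of relations, collecting missing triples."""
    missing: list[Triple] = []
    current: Optional[str] = start
    for relation in plan:
        current = execute_step(kg, current, relation, external_lookup, missing)
        if current is None:
            break  # neither the KG nor the external source could resolve this step
    return current, missing


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    kg = {("drug_x", "interacts_with", "drug_y")}
    external = {("drug_y", "treats"): "condition_z"}
    result, gaps = answer_question(
        kg, "drug_x", ["interacts_with", "treats"],
        lambda h, r: external.get((h, r)),
    )
    print(result, gaps)
```

The list of flagged triples is the part that corresponds to the "flag missing KG connections" idea; how the real framework learns from failed paths is not shown here.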
It would be great if LLMs were able to autonomously curate knowledge and adapt to domain shifts without costly retraining.
But are we there yet? Are hybrid architectures like SymAgent the future?
What are the key ontology standards you should have in mind?
Ontology standards are crucial for knowledge representation and reasoning in AI and data…
Dynamic Reasoning Graphs + LLMs = 🤝
Large Language Models (LLMs) often stumble on complex tasks when confined to linear reasoning.
What if they could…
And so we set out to understand _feedforward_ graphs (i.e. graphs w/o back edges) ⏩
Turns out these graphs are rather understudied for how often they are…
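As a small illustration of the "no back edges" property: assuming a feedforward graph here means a directed acyclic graph (the truncated preview does not spell out the exact definition), one common way to test it is Kahn's topological sort. The function name `is_feedforward` and the edge-list input format are choices made for this sketch, not something taken from the post.

```python
from collections import deque


def is_feedforward(num_nodes: int, edges: list[tuple[int, int]]) -> bool:
    """Return True if the directed graph has no cycles (and hence no back edges)."""
    indegree = [0] * num_nodes
    successors: list[list[int]] = [[] for _ in range(num_nodes)]
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    # Start from nodes with no incoming edges and peel the graph layer by layer.
    queue = deque(n for n in range(num_nodes) if indegree[n] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)

    # Any node never reached sits on (or behind) a cycle.
    return visited == num_nodes
```

For example, `is_feedforward(3, [(0, 1), (1, 2), (0, 2)])` holds, while adding the edge `(2, 0)` closes a cycle and makes the check fail.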
🌟 Pathway to Artificial General Intelligence (AGI) 🌟 This is my view on the evolutionary steps toward AGI: 1️⃣ Large Language Models (LLMs): Language models…
Graph Databases after 15 Years – Where Are They Headed?
Speaker: Gábor Szárnyas (LDBC). Event: Data Analytics developer room at FOSDEM 2025. Talk page: https://fosdem.org/2025/schedule/track/analytics/ Slides: https://...
We're very happy to announce our latest release of Kùzu, version 0.8.0, is now available and ready to use! This release brings an exciting new feature that…
Nakala : from an RDF dataset to a query UI in minutes - SHACL automated generation and Sparnatural - Sparna Blog
Here is a use case of an automated version of Sparnatural, submitted as an example for Veronika Heimsbakk’s upcoming book SHACL for the Practitioner, about the Shapes Constraint Language (SHACL). “The Sparnatural knowledge graph explorer leverages SHACL specifications to drive a user interface (UI) that allows end users to easily discover the content of an RDF graph. What…
In my last post, AI Supported Taxonomy Term Generation, I used an LLM to help generate candidate terms for the revision of a topic taxonomy that had fallen out of sync with the content it was meant to tag. In that example, the taxonomy in question is for the "Insights" articles on my consulting webs
Ontology is not only about data! Many people think that ontologies are only about data (information). But an information model provides only one perspective…
GitHub - apache/incubator-hugegraph: A graph database that supports more than 100+ billion data, high performance and scalability (Include OLTP Engine & REST-API & Backends)
Organisations have oceans of data, but most remains siloed, fragmented, and underutilized. Enterprise Knowledge Graphs are a practical and scalable solution…
How crazy is it that over 20 years ago, Berners-Lee, Hendler, and Lassila laid out a vision in 'The Semantic Web' that we're only now fully appreciating as the…
Bluesky starter pack on KGs and Semantic Web Technologies
Bluesky, I created a starter pack on KGs and Semantic Web Technologies
KAG: Boosting LLMs in Professional Domains via Knowledge Augmented...
The recently developed retrieval-augmented generation (RAG) technology has enabled the efficient construction of domain-specific applications. However, it also has limitations, including the gap...
A comparison between ChatGPT and DeepSeek capabilities writing a valid Cypher query
Today, I conducted a comparison between ChatGPT and DeepSeek chat capabilities by providing them with a schema and a natural language question. I tasked them with writing a valid Cypher query to answer the question.