Third edition of the Knowledge Graphs course at Ghent University.
In February 2026 I will start teaching the third edition of the Knowledge Graphs course at Ghent University. This is an elective course in which I teach everything I know about creating interoperable data ecosystems.
As in the previous editions, we are also opening up this elective course to professionals through a micro-credential. Feel like going back to school? We poured our heart and soul into this one.
https://lnkd.in/euUiiEwJ
Co-teachers include Ruben Verborgh, Ben De Meester, Ruben Taelman and yourself (there's a peer teaching assignment).
Can a Relational Database Be a Knowledge Graph?
Not all knowledge graphs are equal. A semantic KG (RDF/OWL/Stardog) isn't the same as a property graph (Neo4j), and both differ from enforcing graph-like structures in a relational DB (CockroachDB/Postgres). Each has strengths and trade-offs:
Semantic KGs excel at reasoning and inference over ontologies.
Property graphs shine when exploring relationships with intuitive query patterns.
Relational approaches enforce graph-like models via schema, FKs, indexes, recursive CTEs, with added benefits of scale, distributed TXs, and decades of maturity.
Graph Traversals in SQL
Recursive CTEs let SQL "walk the graph." Start with a base case (movie + actors), then repeatedly join back to discover multi-hop paths (actors → movies → actors → movies). This simulates "friends-of-friends" traversals in a few lines of SQL.
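To make the base case and the repeated join concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `roles` table, the actor and movie names, and the `co_actor_distances` helper are all made up for illustration; the post itself targets CockroachDB/Postgres, where the same `WITH RECURSIVE` shape should apply.

```python
import sqlite3

# Hypothetical toy data: which actor appeared in which movie.
ROLES = [
    ("Alice", "Movie A"),
    ("Bob", "Movie A"),
    ("Bob", "Movie B"),
    ("Carol", "Movie B"),
    ("Carol", "Movie C"),
    ("Dave", "Movie C"),
]

QUERY = """
WITH RECURSIVE reach(actor, hops) AS (
    -- Base case: the starting actor, zero hops away.
    SELECT :start, 0
    UNION
    -- Recursive step: actor -> movie -> co-actor, one hop further.
    SELECT r2.actor, reach.hops + 1
    FROM reach
    JOIN roles AS r1 ON r1.actor = reach.actor
    JOIN roles AS r2 ON r2.movie = r1.movie AND r2.actor <> r1.actor
    WHERE reach.hops < :max_hops
)
SELECT actor, MIN(hops) AS hops
FROM reach
GROUP BY actor
ORDER BY hops, actor
"""


def co_actor_distances(start, max_hops):
    """Return (actor, fewest hops) for everyone reachable within max_hops."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE roles (actor TEXT, movie TEXT)")
    conn.executemany("INSERT INTO roles VALUES (?, ?)", ROLES)
    rows = conn.execute(QUERY, {"start": start, "max_hops": max_hops}).fetchall()
    conn.close()
    return rows


print(co_actor_distances("Alice", 3))
```

The hop limit is only a parameter of this sketch: it keeps the recursion finite on cyclic data and can be raised to walk deeper paths.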
RAG, GraphRAG, and LLMs
RAG and GraphRAG give LLMs grounding in structured data, reducing hallucinations and injecting context. Whether via RDF triples, LPG edges, or SQL joins, the principle is the same: real relationships fuel better answers.
The 3-Hop Argument
Some vendors claim SQL breaks down after 3 hops. In reality, recursive CTEs traverse arbitrary depth. SQL may not be as compact as Cypher or GQL, but it's expressive and efficient; the "3-hop wall" is outdated FUD.
Loading Data at Scale
One graph DB is notorious for slow, resource-heavy CSV loads. Distributed RDBMS like CockroachDB can bulk ingest 100s of GB to TBs efficiently.
No Stale Data
Too often, data must move from TX systems into a graph before use; by then, it's stale. For AI-driven apps, that lag means hallucinations, missed insights, and poor UX.
Why This Matters for AI
As AI apps go multi-regional and global, they demand low latency + strong consistency. Centralized graph DBs hit lag, hotspots, scaling pain. Distributed SQL delivers expressive queries and global consistency, exactly what AI workloads need.
You don't need to pick "graph" or "relational" as religion. Choose the right model for scale, consistency, and AI grounding. Sometimes RDF. Sometimes LPG. And sometimes, graph-enforced in SQL.
#KnowledgeGraph #ArtificialIntelligence #GenerativeAI #DistributedSQL #CockroachDB
Build a knowledge graph from structured & unstructured data [Code Tutorial]
Looking into building knowledge graphs? Check out this code tutorial on how we built a knowledge graph of the latest 'La Liga' standings! Google Coll...
Webinar: Semantic Graphs in Action - Bridging LPG and RDF Frameworks - Enterprise Knowledge
As organizations increasingly prioritize linked data capabilities to connect information across the enterprise, selecting the right graph framework has become more important than ever. In this webinar, graph technology experts from Enterprise Knowledge (Elliot Risch, James Egan, David Hughes, and Sara Nash) shared the best ways to manage and apply a selection of these frameworks to meet enterprise needs.
Tackle the core challenges related to enterprise-ready graph representation and learning. With this hands-on guide, applied data scientists, machine learning engineers, and... - Selection from Scaling Graph Learning for the Enterprise [Book]
A new notebook exploring Semantic Entity Resolution & Extraction using DSPy and Google's new LangExtract library.
Just released a new notebook exploring Semantic Entity Resolution & Extraction using DSPy (Community) and Google's new LangExtract library.
Inspired by Russell Jurney's excellent work on semantic entity resolution, this demo follows his approach of combining:
- embeddings,
- kNN blocking,
- and LLM matching with DSPy (Community).
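As a rough illustration of the blocking step in the list above (not the notebook's actual code), the sketch below pairs each record with its k nearest neighbours by cosine similarity. Only those candidate pairs would then go to the more expensive LLM matcher. The three-dimensional vectors are hand-made stand-ins for real embeddings, and `knn_blocks` is a hypothetical helper name.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_blocks(embeddings, k):
    """Return candidate pairs: each record paired with its k nearest neighbours."""
    pairs = set()
    for name, vec in embeddings.items():
        scored = [
            (cosine(vec, other_vec), other)
            for other, other_vec in embeddings.items()
            if other != name
        ]
        scored.sort(reverse=True)  # most similar first
        for _, other in scored[:k]:
            pairs.add(tuple(sorted((name, other))))
    return pairs


# Hypothetical stand-in vectors; a real pipeline would use an embedding model.
EMBEDDINGS = {
    "Acme Corp": [0.9, 0.1, 0.0],
    "ACME Corporation": [0.88, 0.12, 0.02],
    "Globex": [0.1, 0.9, 0.1],
    "Globex Inc.": [0.12, 0.88, 0.08],
}

candidates = knn_blocks(EMBEDDINGS, k=1)
print(sorted(candidates))
```

With four records there are six possible pairs; blocking with k=1 leaves two for the matcher to examine.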
On top of that, I added a general extraction layer to test-drive LangExtract, a Gemini-powered, open-source Python library for reliable structured information extraction. The goal? Detect and merge mentions of the same real-world entities across text.
It's an end-to-end flow tackling one of the most persistent data challenges.
Check it out, experiment with your own data, enjoy the summer and let me know your thoughts!
cc Paco Nathan, you might like this
https://wor.ai/8kQ2qa
Stop manually building your company's brain.
Having reviewed the excellent DeepLearning.AI lecture on Agentic Knowledge Graph Construction by Andreas Kollegger, and while writing a book on agentic graph systems with Sam Julien, I am convinced that agentic systems represent a shift in how we build and maintain knowledge graphs (KGs).
Most organizations are sitting on a goldmine of data spread across CSVs, documents, and databases.
The dream is to connect it all into a unified Knowledge Graph, an intelligent brain that understands your entire business.
The reality? It's a brutal, expensive, and unscalable manual process.
But a new approach is changing everything.
Here's the new playbook for building intelligent systems:
Deploy an AI Agent Workforce
Instead of rigid scripts, you use a cognitive assembly line of specialized AI agents. A Proposer agent designs the data model, a Critic refines it, and an Extractor pulls the facts.
This modular approach is proven to reduce errors and improve the accuracy and coherence of the final graph.
Treat AI as a Designer, Not Just a Doer
The agents act as data architects. In discovery mode, they analyze unstructured data (like customer reviews) and propose a new logical structure from scratch.
In an enterprise with an existing data model, they switch to alignment mode, mapping new information to the established structure.
Use a 3-Part Graph Architecture
This technique is key to managing data quality and uncertainty. You create three interconnected graphs:
The Domain Graph: Your single source of truth, built from trusted, structured data.
The Lexical Graph: The raw, original text from your documents, preserving the evidence.
The Subject Graph: An AI-generated bridge that connects them. It holds extracted insights that are validated before being linked to your trusted data.
Jaro-Winkler is a string comparison algorithm that scores the similarity of two strings between 0 and 1, giving extra weight to a shared prefix. It can be used here for entity resolution, the process of identifying and linking entities from the unstructured text (Subject Graph) to the official entities in the structured database (Domain Graph).
For example, the algorithm compares a product name extracted from a customer review (e.g., "the gothenburg table") with the official product names in the database. If the Jaro-Winkler similarity score is above a certain threshold, the system automatically creates a CORRESPONDS_TO relationship, effectively linking the customer's comment to the correct product in the supply chain graph.
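A minimal pure-Python sketch of that linking step follows. The `jaro` and `jaro_winkler` functions follow the textbook definition; `link_mention`, the 0.85 threshold, the product catalogue, and the stripping of a leading "the" are illustrative assumptions, not details from the lecture.

```python
def jaro(s1, s2):
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted for a common prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def normalise(text):
    """Assumed clean-up: lowercase and drop a leading 'the '."""
    text = text.lower().strip()
    return text[4:] if text.startswith("the ") else text


def link_mention(mention, product_names, threshold=0.85):
    """Return the best-matching official name, or None if below the threshold."""
    scored = [(jaro_winkler(normalise(mention), normalise(name)), name)
              for name in product_names]
    score, best = max(scored)
    return best if score >= threshold else None


CATALOGUE = ["Malmo Chair", "Gothenburg Table"]  # hypothetical product names
print(link_mention("the gothenburg table", CATALOGUE))
print(link_mention("gothenberg table", CATALOGUE))
```

Because Jaro-Winkler rewards a shared prefix, a mention with a leading article scores lower than the bare name would, which is why this sketch normalises first. A returned name is where the system would create the CORRESPONDS_TO relationship.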
Augment Humans, Don't Replace Them
The workflow is Propose, then Approve. AI does the heavy lifting, but a human expert makes the final call.
This process is made reliable by tools like Pydantic and Outlines, which enforce a rigid contract on the AI's output, ensuring every piece of data is perfectly structured and consistent.
And once discovered and validated, a schema can be enforced.
by J Bittner. John Sowa once observed: In logic, the existential quantifier ∃ is a notation for asserting that something exists. But logic itself has no vocabulary for describing the things that exist.
FinReflectKG: Agentic Construction and Evaluation of Financial Knowledge Graphs
Sharing our recent research, FinReflectKG: Agentic Construction and Evaluation of Financial Knowledge Graphs. It is the largest financial knowledge graph built from unstructured data. The preprint of our article is out on arXiv now (link is in the comments). It is co-authored with Abhinav Arun, Fabrizio Dimino, and Tejas Prakash Agrawal.
While LLMs make it easier than ever to generate knowledge graphs, the real challenge lies in ensuring quality without hallucinations, with strong coverage, precision, comprehensiveness, and relevance. FinReflectKG tackles this through an iterative, evaluation-driven agentic approach, carefully optimized across multiple evaluation metrics to deliver a trustworthy and high-quality knowledge graph.
Designed to power use cases like entity search, question answering, signal generation, predictive modeling, and financial network analysis, FinReflectKG sets a new benchmark for building reliable financial KGs and showcases the potential of agentic workflows in LLM-driven systems.
We will be creating a suite of benchmarks using FinReflectKG for KG related tasks in financial services. More details to come soon.
barnard59 is a toolkit to automate extract, transform and load (ETL) tasks. It allows you to generate RDF out of non-RDF data sources.
Reliability in data pipelines depends on knowing what went wrong before your users do. With the new OpenTelemetry integration in our RDF ETL framework barnard59, every pipeline and API integration is now fully traceable!
Errors, validation results and performance metrics are automatically collected and visualised in Grafana. Instead of hunting through logs, you immediately see where time was spent and where an error occurred. This makes RDF-based ETL pipelines far more transparent and easier to operate at scale.
SynaLinks is an open-source framework designed to make it easier to partner language models (LMs) with your graph technologies. Since most companies are not in a position to train their own language models from scratch, SynaLinks empowers you to adapt existing LMs on the market to specialized tasks.
In the history of data standards, a recurring pattern should concern anyone working in semantics today. A new standard emerges, promises interoperability, gains adoption across industries or agencies, and for a time seems to solve the immediate need.
MIRAGE: Scaling Test-Time Inference with Parallel Graph-Retrieval-Augmented Reasoning Chains
When AI Diagnoses Patients, Should Reasoning Be a Team Sport?
Why Existing Approaches Fall Short
Medical question answering demands precision, but current AI methods struggle with two key issues:
1. Error Accumulation: Linear reasoning chains (like Chain-of-Thought) risk compounding mistakes; if the first step is wrong, the entire answer falters.
2. Flat Knowledge Retrieval: Traditional retrieval-augmented methods treat medical facts as unrelated text snippets, ignoring complex relationships between symptoms, diseases, and treatments.
This leads to unreliable diagnoses and opaque decision-making, a critical problem when patient outcomes are at stake.
What MIRAGE Does Differently
MIRAGE transforms reasoning from a solo sprint into a coordinated team effort:
- Parallel Detective Work: Instead of one linear chain, multiple specialized "detectives" (reasoning chains) investigate different symptoms or entities in parallel.
- Structured Evidence Hunting: Retrieval operates on medical knowledge graphs, tracing connections between symptoms (e.g., "face pain → lead poisoning") rather than scanning documents.
- Cross-Check Consensus: Answers from parallel chains are verified against each other to resolve contradictions, like clinicians discussing differential diagnoses.
How It Works (Without the Jargon)
1. Break It Down
   - Splits complex queries ("Why am I fatigued with knee pain?") into focused sub-questions grounded in specific symptoms/entities.
   - Example: "Conditions linked to fatigue" and "Causes of knee lumps" become separate investigation threads.
2. Graph-Guided Retrieval
   - Each thread explores a medical knowledge graph like a map:
     - Anchor Mode: Examines direct connections (e.g., diseases causing a symptom).
     - Bridge Mode: Hunts multi-step relationships (e.g., toxin exposure → neurological symptoms → joint pain).
3. Vote & Verify
   - Combines evidence from all threads, prioritizing answers supported by multiple independent chains.
   - Discards conflicting hypotheses (e.g., ruling out lupus if only one chain suggests it without corroboration).
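The three steps above can be sketched on a toy graph. This is only an illustration of the idea, not the paper's implementation: the graph, the placeholder node names such as `condition_A`, and the `anchor_retrieve`, `bridge_retrieve`, and `vote` helpers are all hypothetical, and the chains run one after another here rather than in parallel.

```python
from collections import Counter, deque

# Hypothetical toy graph: entity -> directly linked nodes (placeholders only).
GRAPH = {
    "fatigue": ["condition_A", "condition_B"],
    "knee pain": ["condition_B", "condition_C"],
    "toxin exposure": ["neurological symptoms"],
    "neurological symptoms": ["condition_B"],
}


def anchor_retrieve(entity):
    """Anchor-style lookup: only the direct neighbours of one entity."""
    return set(GRAPH.get(entity, []))


def bridge_retrieve(entity, max_hops=3):
    """Bridge-style lookup: breadth-first search up to max_hops away."""
    seen = {entity}
    frontier = deque([(entity, 0)])
    found = set()
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in GRAPH.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                found.add(nxt)
                frontier.append((nxt, hops + 1))
    return found


def vote(chains, min_support=2):
    """Keep candidates supported by at least min_support independent chains."""
    counts = Counter(candidate for chain in chains for candidate in chain)
    return sorted(c for c, n in counts.items() if n >= min_support)


# One retrieval thread per sub-question, then cross-check the results.
chains = [
    anchor_retrieve("fatigue"),
    anchor_retrieve("knee pain"),
    bridge_retrieve("toxin exposure"),
]
answers = vote(chains)
print(answers)
```

Candidates suggested by a single chain are dropped at the voting step, which mirrors the "discard without corroboration" rule in the list above.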
Why This Matters
Tested on three medical benchmarks (including real clinician queries), MIRAGE:
- Outperformed GPT-4 and Tree-of-Thought variants in accuracy (84.8% vs. 80.2%)
- Reduced error propagation by 37% compared to linear retrieval-augmented methods
- Produced answers with traceable evidence paths, critical for auditability in healthcare
The Big Picture
MIRAGE shifts AI reasoning from brittle, opaque processes to collaborative, structured exploration. By mirroring how clinicians synthesize information from multiple angles, it highlights a path toward AI systems that are both smarter and more trustworthy in high-stakes domains.
Paper: Wei et al. MIRAGE: Scaling Test-Time Inference with Parallel Graph-Retrieval-Augmented Reasoning Chains
Hot take on the "faster than Dijkstra" headlines:
The recent result given in the paper: https://lnkd.in/dQSbqrhD is a breakthrough for theory. It beats Dijkstra's classic worst-case bound for single-source shortest paths on directed graphs with non-negative weights. That's big for the research community.
But it doesn't "rewrite" production routing.
In practice, large-scale systems (maps, logistics, ride-hailing) moved past plain Dijkstra years ago. They rely on heavy preprocessing. Contraction Hierarchies, Hub Labels and other methods are used to answer point-to-point queries in milliseconds, even on large, continental networks.
Why the disconnect?
 • Different goals: The paper targets single-source shortest paths; production prioritizes point-to-point queries at interactive latencies.
 • Asymptotics vs. constants: Beating O(m + n log n) matters in principle, but real systems live and die by constants, cache behavior, and integration with traffic/turn costs.
 • Preprocessing wins: Once you allow preprocessing, the speedups from hierarchical/labeling methods dwarf Dijkstra and likely any drop-in replacement without preprocessing.
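For reference, this is roughly what the "plain Dijkstra" baseline looks like: a minimal textbook sketch for single-source shortest paths, not the new algorithm and not a production router. It uses Python's binary heap, which gives O((m + n) log n); the O(m + n log n) bound quoted above assumes a Fibonacci heap. The small example graph is made up.

```python
import heapq


def dijkstra(graph, source):
    """Single-source shortest paths for non-negative edge weights.

    graph maps each node to a list of (neighbour, weight) pairs.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical directed graph with non-negative weights.
GRAPH = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 7)],
    "d": [],
}

print(dijkstra(GRAPH, "a"))
```

The heap is what orders vertices by distance, and that ordering cost is the part a sorting-barrier result addresses.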
We should celebrate the theoretical advance and keep an eye on practical implementations. Just don't confuse a sorting-barrier result with an immediate upgrade for Google Maps.
Bottom line: Great theory milestone. Production routing already "changed the rules" years ago with preprocessing and smart graph engineering.
4.7 times better write query price-performance with AWS Graviton4 R8g instances using Amazon Neptune v1.4.5 | Amazon Web Services
Amazon Neptune version 1.4.5 introduces engine improvements and support for AWS Graviton-based r8g instances. In this post, we show you how these updates can improve your graph database performance and reduce costs. We walk you through the benchmark results for Gremlin and openCypher comparing Neptune v1.4.5 on r8g instances against previous versions. You'll see performance improvements of up to 4.7x for write throughput and 3.7x for read throughput, along with the cost implications.
Faster than Dijkstra? Tsinghua University's new shortest path algorithm just rewrote the rules of graph traversal.
For 65+ years, Dijkstra's algorithm was the gold standard for finding shortest paths in weighted graphs. But now, a team from Tsinghua University has introduced a recursive partial ordering method that outperforms Dijkstra, especially on directed graphs.
What's different?
Instead of sorting all vertices by distance (which adds log-time overhead), this new approach uses a clever recursive structure that breaks the O(m + n log n) barrier.
It's faster, leaner, and already winning awards at STOC 2025.
Why it matters:
Think Google Maps, Uber routing, disaster evacuation planning, circuit design: any system that relies on real-time pathfinding across massive graphs.
Paper: https://lnkd.in/dGTdRj2X
#Algorithms #ComputerScience #Engineering #Dijkstra #routing #planning #logistic
Quality metrics: mathematical functions designed to measure the "goodness" of a network visualization
I'm proud to share an exciting piece of work by my PhD student, Simon van Wageningen, whom I have the pleasure of supervising. Simon asked a bold question that challenges the state of the art in our field!
A bit of background first: together with Simon, we study network visualizations, those diagrams made of dots and lines. They're more than just pretty pictures: they help us gain intuition about the structure of networks around us, such as social networks, protein networks, or even money-laundering networks. But how do we know if a visualization really shows the structure well? That's where quality metrics come in: mathematical functions designed to measure the "goodness" of a network visualization. Many of these metrics correlate nicely with human intuition. Yet, in our community, there has long been a belief (more of a tacit knowledge) that these metrics fail in certain cases.
This is exactly where Simon's work comes in: he set out to make this tacit knowledge explicit. Take a look at the dancing man and the network in the slider: they represent the same network with very similar quality metric values. And yet, the dancing man clearly does not show the network's structure. This tells us something important: we can't blindly rely on quality metrics.
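To give a feel for what such a metric looks like in code, here is a minimal sketch of one classic readability measure, the number of edge crossings in a straight-line drawing. The post does not say which metrics Simon's work examines, so this choice, the `edge_crossings` helper, and the two tiny layouts are assumptions for illustration only.

```python
from itertools import combinations


def _orient(p, q, r):
    """Sign of the turn p -> q -> r (positive = counter-clockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def _properly_cross(a, b, c, d):
    """True if segments ab and cd cross at a single interior point."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def edge_crossings(positions, edges):
    """Count pairs of edges that cross in a straight-line drawing."""
    count = 0
    for (u1, v1), (u2, v2) in combinations(edges, 2):
        if {u1, v1} & {u2, v2}:
            continue  # edges sharing a node meet there but do not cross
        if _properly_cross(positions[u1], positions[v1],
                           positions[u2], positions[v2]):
            count += 1
    return count


# The same 4-cycle drawn two ways (hypothetical coordinates).
EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
SQUARE = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}
BOWTIE = {"a": (0, 0), "b": (1, 0), "c": (0, 1), "d": (1, 1)}

print(edge_crossings(SQUARE, EDGES), edge_crossings(BOWTIE, EDGES))
```

A metric like this maps a whole layout to a single number, so very different drawings can end up with similar values.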
Simon's work will be presented at the International Symposium on Graph Drawing and Network Visualization in Norrköping, Sweden this year.
If you'd like to dive deeper, here's the link to the GitHub repository https://lnkd.in/eqw3nYmZ #graphdrawing #networkvisualization #qualitymetrics #research with Simon van Wageningen and Alex Telea