DeepGraph AI is open-sourcing GraphLite—the first fully open-source embedded graph database implementing the ISO GQL standard
I'm excited to announce that DeepGraph AI is open-sourcing GraphLite—the first fully open-source embedded graph database implementing the ISO GQL standard (ISO/IEC 39075:2024) in Rust.
A Graph RAG (Retrieval-Augmented Generation) chat application that combines OpenAI GPT with knowledge graphs stored in GraphDB
After seeing yet another Graph RAG demo using Neo4j with no ontology, I decided to show what real semantic Graph RAG looks like.
The Problem with Most Graph RAG Demos:
Everyone's building Graph RAG with LPG databases (Neo4j, TigerGraph, ArangoDB, etc.) and calling it "knowledge graphs." But here's the thing:
Without formal ontologies, you don't have a knowledge graph—you just have a graph database.
The difference?
❌ LPG: Nodes and edges are just strings. No semantics. No reasoning. No standards.
✅ RDF/SPARQL: Formal ontologies (RDFS/OWL) that define domain knowledge. Machine-readable semantics. W3C standards. Built-in reasoning.
So I Built a Real Semantic Graph RAG
Using:
- Microsoft Agent Framework - AI orchestration
- Formal ontologies - RDFS/OWL knowledge representation
- Ontotext GraphDB - RDF triple store
- SPARQL - semantic querying
- GPT-5 - ontology-aware extraction
It's all on GitHub, a simple template you can use as boilerplate for your project:
The "Jaguar problem":
What does "Yesterday I was hit by a Jaguar" really mean? It is impossible to know without concept awareness. To demonstrate why ontologies matter, I created a corpus with mixed content:
🐆 Wildlife jaguars (Panthera onca)
🚗 Jaguar cars (E-Type, XK-E)
🎸 Fender Jaguar guitars
I fed this to GPT-5 along with a jaguar conservation ontology.
The result? The LLM automatically extracted ONLY wildlife-related entities—filtering out cars and guitars—because it understood the semantic domain from the ontology.
No post-processing. No manual cleanup. Just intelligent, concept-aware extraction.
This is impossible with LPG databases because they lack formal semantic structure. Labels like (:Jaguar) are just strings—the LLM has no way to know if you mean the animal, car, or guitar.
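To make the filtering step concrete, here is a minimal stdlib sketch of concept-aware extraction. The class and entity names are hypothetical, but the mechanism is the RDFS one the post describes: a subclass hierarchy lets us keep only entities whose type falls under the ontology's domain class, which a bare string label like (:Jaguar) cannot do.

```python
# Hypothetical RDFS-style subclass hierarchy: child -> parent
SUBCLASS_OF = {
    "WildlifeJaguar": "BigCat",
    "BigCat": "Animal",
    "JaguarEType": "Car",
    "FenderJaguar": "Guitar",
}

def is_subclass_of(cls, ancestor):
    """Walk the subclass chain (a transitive rdfs:subClassOf check)."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False

# Extracted entities paired with their ontology types
entities = [
    ("Panthera onca", "WildlifeJaguar"),
    ("Jaguar E-Type", "JaguarEType"),
    ("Fender Jaguar", "FenderJaguar"),
]

# Keep only entities whose type sits under the conservation domain class
wildlife = [name for name, cls in entities if is_subclass_of(cls, "Animal")]
print(wildlife)  # ['Panthera onca']
```

The cars and guitars drop out purely because of where their classes sit in the hierarchy; no string matching on "jaguar" is involved.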
Knowledge Graphs = "Data for AI"
LLMs don't need more data—they need structured, semantic data they can reason over.
That's what formal ontologies provide:
✅ Domain context
✅ Class hierarchies
✅ Property definitions
✅ Relationship semantics
✅ Reasoning rules
This transforms Graph RAG from keyword matching into true semantic retrieval.
Check out the full implementation. The repo includes:
Complete Graph RAG implementation with Microsoft Agent Framework
Working jaguar conservation knowledge graph
Jupyter notebook: ontology-aware extraction from mixed-content text
https://lnkd.in/dmf5HDRm
And if you have gotten this far, you realize that most of this post is written by Cursor ... That goes for the code too. 😁
Your Turn:
I know this is a contentious topic. Many teams are heavily invested in LPG-based Graph RAG. What are your thoughts on RDF vs. LPG for Graph RAG? Drop a comment below!
#GraphRAG #KnowledgeGraphs #SemanticWeb #RDF #SPARQL #AI #MachineLearning #LLM #Ontology #KnowledgeRepresentation #OpenSource #neo4j #graphdb #agentic-framework #ontotext #agenticai
Since I'm not at #ISWC2025, it's easier for me to speak up. There are ginormous issues with QLever and the associated Sparqloscope benchmark by Hannah Bast and colleagues.
The main results table already shows something that's too good to be true. And while I'm sure that table is technically true, the tacit implication that it has any bearing on real-world performance is false.
QLever is faster than the state of the art… at COUNTing. That's it. QLever can count faster. The implication is that this would mean QLever can also produce results faster. Yet we have zero reason to assume it can—until there's proof.
In the real world, query engines rarely compute all results at once. They stream those results. The Sparqloscope benchmark is designed to trick established query engines into actually producing the result set and counting items. And you know what? Sometimes, the established engines are even faster at that than QLever, which seems to be purposefully designed to count fast. Yes—I'm sure QLever is a fast counter. But what on earth does that have to do with real-world streaming query performance? And did I mention that Virtuoso supports SPARQL UPDATE?
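The distinction the post is drawing can be sketched in a few lines (a toy engine, not QLever's or Virtuoso's actual internals): an engine that keeps per-predicate statistics answers COUNT from metadata without ever touching the data, while streaming the result set, which is what a client actually pays for, forces a scan.

```python
# Toy illustration of COUNT-from-metadata vs. streaming actual results.
# This engine is hypothetical; it only shows why a fast COUNT proves
# nothing about how fast the result rows themselves can be produced.

class ToyEngine:
    def __init__(self, triples):
        self.triples = triples
        # Precomputed per-predicate counts, as an index might store
        self.pred_counts = {}
        for s, p, o in triples:
            self.pred_counts[p] = self.pred_counts.get(p, 0) + 1

    def count(self, predicate):
        # O(1): read the statistic, never materialize any result
        return self.pred_counts.get(predicate, 0)

    def stream(self, predicate):
        # O(n): the work a real client consuming rows actually pays for
        for s, p, o in self.triples:
            if p == predicate:
                yield (s, o)

engine = ToyEngine([("a", "knows", "b"), ("b", "knows", "c"), ("a", "likes", "c")])
print(engine.count("knows"))         # 2, answered from metadata alone
print(list(engine.stream("knows")))  # [('a', 'b'), ('b', 'c')]
```

Both calls return consistent answers, but only the second one tells you anything about throughput on the 65-billion-row result the post mentions.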
How can you tell, just from the table? Well, Virtuoso is faster than QLever for just about anything that doesn't rely on pure counting. QLever does “Regex: prefix” or “Filter: English literals” in the ridiculously fast 0.01s? The only rational explanation is that it has a great structure for specifically this kind of counting (again, not results, just count). But Virtuoso is faster for “strbefore”? Well, there you see the real QLever performance when it cannot just count. And only one of those strategies has impact on the real world.
So what if a query engine can count faster than any other to 65,099,859,287 (real result BTW). Call me when you can produce 65,099,859,287 results faster, then we'll have something to talk about.
In the first place, it's a major failure of peer review that a benchmark based on COUNT was accepted. And I'd be very happy to be proven wrong: let's release the benchmark results for all engines, but without COUNT this time. Then we'll continue the conversation.
https://lnkd.in/eT5XrR2k
Open-source Graph Explorer v2.4.0 is now released, and it includes a new SPARQL editor
Calling all Graph Explorers! 📣
I'm excited to share that open-source Graph Explorer v2.4.0 is now released, and it includes a new SPARQL editor!
Release notes: https://lnkd.in/ePhwPQ5W
This means that in addition to being a powerful no-code exploration tool, you can now start your visualization and exploration by writing queries directly in SPARQL. (Gremlin & openCypher too for Property Graph workloads).
This makes Graph Explorer an ideal companion for Amazon Neptune, as it supports connections via all three query languages, but you can connect to other graph databases that support these languages too.
🔹 Run it anywhere (it's open source): https://lnkd.in/ehbErxMV
🔹 Access through the AWS console in a Neptune graph notebook: https://lnkd.in/gZ7CJT8D
Special thanks go to Kris McGinnes for his efforts.
#AWS #AmazonNeptune #GraphExplorer #SPARQL #Gremlin #openCypher #KnowledgeGraph #OpenSource #RDF #LPG
FalkorDB/QueryWeaver: An open-source Text2SQL tool that transforms natural language into SQL using graph-powered schema understanding. Ask your database questions in plain English, QueryWeaver handles the weaving.
QLever's distinguishing features · ad-freiburg/qlever Wiki · GitHub
Graph database implementing the RDF and SPARQL standards. Very fast and scales to hundreds of billions of triples on a single commodity machine. - ad-freiburg/qlever
When we present QLever, people often ask "how is this possible", as our speed and scale are on another dimension. We now have a page in the wiki that goes into a bit more detail on why and how this is possible. In short:
• Purpose built for large scale graph data, not retrofitted
• Indexing optimized for fast queries without full in-memory loading
• Designed in C++ for efficiency and low overhead
• Integrated full text and spatial search in the same engine
• Fast interactive queries even on hundreds of billions of triples
Link to the wiki page in the comments.
Qlever: graph database implementing the RDF and SPARQL standards. Very fast and scales to hundreds of billions of triples on a single commodity machine.
Sounds too good to be true. Has anyone tested this out?
https://lnkd.in/esXKt79J #GraphDatabase #ontology #RDF
Ladybug: The Next Chapter for Embedded Graph Databases | LinkedIn
It's with deep gratitude for the amazing product the #KuzuDB team created, and a mix of necessity and excitement, that I announce the launch of Ladybug. This is a new open-source project and a community-driven fork of the popular embedded graph database.
happy to add support for LadybugDB on G.V() - Graph Database Client & Visualization Tooling, picking right up where we left off with our KuzuDB integration.
Kuzu is no more
The project was archived last night with one last major release.
The communication has not been very clear, but I'd bet Semih Salihoğlu is under a lot of pressure, and I am looking forward to hearing the full story someday.
We liked the product and will fork it and continue supporting it as a way for our users to run local memory workloads on their machines.
We'll not support it in production anymore though, since we are not database developers and don't plan to be.
You can only get so far without growing a mighty Unix beard.
Instead, we'll be going with Neo4j for larger loads and our partner Qdrant for embeddings + extend our FalkorDB and Postgres support.
It does feel a bit strange when your default DB disappears overnight.
That is why cognee is database agnostic and all features that were Kuzu specific will be migrated in about 2 weeks.
This time we were just too fast for our own good.
Last week, the Kùzu Inc team announced that they will no longer actively support the open-source KuzuDB project.
I've been a fan of KuzuDB and think its discontinuation leaves a big gap in the graph ecosystem.
This is especially the case for open-source solutions – over the last few years, many open-source graph database systems were forked, relicensed or discontinued. Currently, users looking for an OSS graph database are left to pick from:
- community editions of systems with enterprise/cloud offerings (Neo4j, Dgraph)
- variants of a heavily-forked system (ArcadeDB / YouTrackDB, HugeGraph)
- projects under non-OSI approved licenses
- experimental systems (e.g., DuckPGQ)
I'm wondering whether this trend continues or whether someone steps up to maintain KuzuDB or create a new OSS system.
A sophisticated knowledge graph memory system that stores interconnected information with rich semantic structure using Neo4j. (shuruheel/mcp-neo4j-shan)
Announcing the formation of a Data Façades W3C Community Group
I am excited to announce the formation of a Data Façades W3C Community Group.
Façade-X, initially introduced at SEMANTICS 2021 and successfully implemented by the SPARQL Anything project, provides a simple yet powerful, homogeneous view over diverse and heterogeneous data sources (e.g., CSV, JSON, XML, and many others). With the recent v1.0.0 release of SPARQL Anything, the time was right to work on the long-term stability and widespread adoption of this approach by developing an open, vendor-neutral technology.
The Façade-X concept was born to allow SPARQL users to query data in any structured format in plain SPARQL. Therefore, the choice of a W3C community group to lead efforts on specifications is just natural. Specifications will enhance its reliability, foster innovation, and encourage various vendors and projects—including graph database developers — to provide their own compatible implementations.
The primary goals of the Data Façades Community Group are to:
Define the core specification of the Façade-X method.
Define Standard Mappings: Formalize the required mappings and profiles for connecting Façade-X to common data formats.
Define the specification of the query dialect: Provide a reference for the SPARQL dialect, configuration conventions (like SERVICE IRIs), and the functions/magic properties used.
Establish Governance: Create a monitored, robust process for adding support for new data formats.
Foster Collaboration: Build connections with relevant W3C groups (e.g., RDF & SPARQL, Data Shapes) and encourage involvement from developers, businesses, and adopters.
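The core Façade-X idea above can be sketched in a few lines of stdlib Python. This is a simplified illustration, not the specification: the property names and the container shape here are hypothetical stand-ins for the formal mappings the group will define. A CSV file is exposed as a homogeneous triple view, so a generic triple-pattern query works over it with no CSV-specific code.

```python
import csv
import io

def csv_to_triples(text):
    """Expose a CSV as a homogeneous, container-style triple view
    (illustrative Facade-X-like mapping, not the official one)."""
    rows = list(csv.DictReader(io.StringIO(text)))
    triples = []
    for i, row in enumerate(rows, start=1):
        subject = f"_:row{i}"
        # container-membership-style link from the root to each row
        triples.append(("_:root", f"rdf:_{i}", subject))
        for header, value in row.items():
            triples.append((subject, header, value))
    return triples

data = "name,genus\njaguar,Panthera\nlion,Panthera\n"
triples = csv_to_triples(data)

# A generic triple pattern now queries the CSV like any other source:
names = [o for s, p, o in triples if p == "name"]
print(names)  # ['jaguar', 'lion']
```

The same query shape would work unchanged if the triples came from JSON or XML, which is the homogeneity the façade provides.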
Join us!
With Luigi Asprino Ivo Velitchkov Justin Dowdy Paul Mulholland Andy Seaborne Ryan Shaw ...
CG: https://lnkd.in/eSxuqsvn
Github: https://lnkd.in/dkHGT8N3
SPARQL Anything #RDF #SPARQL #W3C #FX
Introducing Brahmand: a Graph Database built on top of ClickHouse
Introducing Brahmand: a Graph Database built on top of ClickHouse. Extending ClickHouse with native graph modeling and OpenCypher, merging OLAP speed with graph analysis.
While it’s still in early development, it’s been fun writing my own Cypher parser, query planner with logical plan, analyzer, and optimizer in Rust.
On the roadmap: native JSON support, bolt protocol, missing Cypher features like WITH, EXISTS, and variable-length relationship matches, along with bitmap-based optimizations and distributed cluster support.
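The graph-on-relational idea behind a system like this can be illustrated with a tiny sketch (this is not Brahmand's actual planner; table names and schema are made up): a one-hop Cypher pattern such as `MATCH (a)-[:KNOWS]->(b) RETURN a.name, b.name` compiles to a join over node and edge tables, which a columnar engine like ClickHouse can execute efficiently. Here sqlite3 stands in for the backing store.

```python
import sqlite3

# Hypothetical relational layout for a property graph:
# one nodes table, one edges table keyed by (src, dst, type).
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE edges (src INTEGER, dst INTEGER, type TEXT);
    INSERT INTO nodes VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO edges VALUES (1, 2, 'KNOWS');
""")

# The SQL a one-hop MATCH (a)-[:KNOWS]->(b) could lower to:
sql = (
    "SELECT a.name, b.name FROM nodes a "
    "JOIN edges e ON e.src = a.id AND e.type = ? "
    "JOIN nodes b ON b.id = e.dst"
)
rows = db.execute(sql, ("KNOWS",)).fetchall()
print(rows)  # [('alice', 'bob')]
```

Variable-length relationship matches, mentioned on the roadmap, are harder precisely because they need recursive joins rather than a fixed join chain like this one.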
Feel free to check out the repo: https://lnkd.in/d-Bhh-qD
I’d really appreciate a ⭐ if you find it useful!
Hydra is a unique functional programming language based on the LambdaGraph data model.
In case you were wondering what I have been up to lately, Hydra is a large part of it. This is the open source graph programming language I alluded to last year at the Knowledge Graph Conference. Hydra is almost ready for its 1.0 release, and I am planning on making it into a community project, possibly through the Apache Incubator.
In this initial demo video, we take an arbitrary tabular dataset and use Hydra + Claude to map it into a property graph. More specifically, we use the LLM once to construct a pair of schemas and a mapping. From there, we apply the mapping deterministically and efficiently to each row of data, without additional calls to the LLM. The recording was a little too long for LinkedIn, so I broke it into two parts. I will post part 2 momentarily (edit: part 2 is here: https://lnkd.in/gZmHicXu). More videos will follow as we get closer to the release.
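The pattern in the demo, one LLM call to produce a mapping, then deterministic application to every row, can be sketched as follows. This is not Hydra's syntax or data model; the mapping format here is a made-up stand-in to show the shape of the idea.

```python
import csv
import io

# Hypothetical output of the single up-front LLM call: a declarative
# column -> property mapping plus a node label.
mapping = {
    "label": "Person",
    "properties": {"full_name": "name", "years": "age"},
}

def apply_mapping(row, mapping):
    """Deterministically turn one tabular row into a property-graph node.
    No LLM involved from here on: same input, same output, every time."""
    return {
        "label": mapping["label"],
        "props": {dst: row[src] for src, dst in mapping["properties"].items()},
    }

data = "full_name,years\nAda,36\nAlan,41\n"
nodes = [apply_mapping(r, mapping) for r in csv.DictReader(io.StringIO(data))]
print(nodes[0])  # {'label': 'Person', 'props': {'name': 'Ada', 'age': '36'}}
```

Paying for the LLM once per schema instead of once per row is what makes this efficient and reproducible on large datasets.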
GitHub: https://lnkd.in/g8v2hvd5
Discord: https://bit.ly/lg-discord
Our SPARQL Notebook extension for Visual Studio Code makes it super easy to document SPARQL queries and run them, either against live endpoints or directly on local RDF files. I just (finally!) published a 15-minute walkthrough on our YouTube channel Giant Global Graph. It gives you a quick overview of how it works and how you can get started.
Link in the comments.
Fun fact: I recorded this two years ago and apparently forgot to hit publish. Since then, we've added new features like improved table renderers with pivoting support, so it's even more useful now. Check it out!
A Graph-Native Workflow Application using Neo4j/Cypher | Medium
A full working Cypher script that simulates a Tendering System with multiple workflows, AI agent interactions, conversations, approvals, and more — all modeled and executed natively in a Graph.
Improving Text2Cypher for Graph RAG via schema pruning | Kuzu
In this post, we describe how to improve the quality of the Cypher queries generated by Text2Cypher via graph schema pruning, viewed through the lens of context engineering.
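The shape of schema pruning can be shown with a deliberately simplistic sketch. The post's actual method is more sophisticated than this; the token-overlap heuristic and schema below are illustrative only. The point is that the Text2Cypher prompt receives a smaller, question-relevant schema instead of the full one.

```python
# Hypothetical full graph schema that would bloat the Text2Cypher prompt:
SCHEMA = {
    "nodes": ["Person", "Movie", "Invoice", "Warehouse"],
    "relationships": ["ACTED_IN", "DIRECTED", "SHIPPED_FROM"],
}

def prune_schema(question, schema):
    """Keep only schema elements whose name words appear in the question.
    A crude stand-in for real relevance scoring, shown for shape only."""
    tokens = set(question.lower().replace("?", "").split())
    def keep(name):
        return any(part.lower() in tokens
                   for part in name.replace("_", " ").split())
    return {
        "nodes": [n for n in schema["nodes"] if keep(n)],
        "relationships": [r for r in schema["relationships"] if keep(r)],
    }

q = "Which person acted in which movie?"
print(prune_schema(q, SCHEMA))
# {'nodes': ['Person', 'Movie'], 'relationships': ['ACTED_IN']}
```

A smaller schema in context leaves less room for the model to hallucinate labels and relationships that exist in the database but are irrelevant to the question.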
Use Graph Machine Learning to detect fraud with Amazon Neptune Analytics and GraphStorm | Amazon Web Services
Every year, businesses and consumers lose billions of dollars to fraud, with consumers reporting $12.5 billion lost to fraud in 2024, a 25% increase year over year. People who commit fraud often work together in organized fraud networks, running many different schemes that companies struggle to detect and stop. In this post, we discuss how to use Amazon Neptune Analytics, a memory-optimized graph database engine for analytics, and GraphStorm, a scalable open source graph machine learning (ML) library, to build a fraud analysis pipeline with AWS services.