Workshop from @FalkorDB and ZEP (Graphiti): Building Production Knowledge Graphs from Structured/Unstructured Data Sources. 👩‍💻 Google Colab for the demo:...
Hydra is a unique functional programming language based on the LambdaGraph data model.
In case you were wondering what I have been up to lately, Hydra is a large part of it. This is the open source graph programming language I alluded to last year at the Knowledge Graph Conference. Hydra is almost ready for its 1.0 release, and I am planning on making it into a community project, possibly through the Apache Incubator.
In this initial demo video, we take an arbitrary tabular dataset and use Hydra + Claude to map it into a property graph. More specifically, we use the LLM once to construct a pair of schemas and a mapping. From there, we apply the mapping deterministically and efficiently to each row of data, without additional calls to the LLM. The recording was a little too long for LinkedIn, so I broke it into two parts. I will post part 2 momentarily (edit: part 2 is here: https://lnkd.in/gZmHicXu). More videos will follow as we get closer to the release.
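The "LLM once, then deterministic per row" pattern described above can be sketched in plain Python. This is not Hydra code or Hydra's API: the `MAPPING` structure, the column names, and the labels are all hypothetical, standing in for whatever schemas and mapping the LLM would produce in a single call.

```python
# Hypothetical mapping, standing in for what the LLM would produce once.
# Column names, labels, and the dict layout are assumptions for illustration.
MAPPING = {
    "vertices": [
        {"label": "Person", "id": "person_id",
         "properties": {"name": "person_name"}},
        {"label": "Company", "id": "company_id",
         "properties": {"name": "company_name"}},
    ],
    "edges": [
        {"label": "WORKS_AT", "out": "person_id", "in": "company_id",
         "properties": {"since": "start_year"}},
    ],
}


def apply_mapping(row, mapping):
    """Deterministically turn one table row into vertices and edges (no LLM call)."""
    vertices = [
        {
            "label": v["label"],
            "id": row[v["id"]],
            "properties": {key: row[col] for key, col in v["properties"].items()},
        }
        for v in mapping["vertices"]
    ]
    edges = [
        {
            "label": e["label"],
            "out": row[e["out"]],
            "in": row[e["in"]],
            "properties": {key: row[col] for key, col in e["properties"].items()},
        }
        for e in mapping["edges"]
    ]
    return vertices, edges


# Made-up tabular data.
rows = [
    {"person_id": "p1", "person_name": "Ada", "company_id": "c1",
     "company_name": "Acme", "start_year": 2020},
    {"person_id": "p2", "person_name": "Bob", "company_id": "c1",
     "company_name": "Acme", "start_year": 2022},
]

graph_vertices = {}
graph_edges = []
for row in rows:
    vs, es = apply_mapping(row, MAPPING)
    for v in vs:
        graph_vertices[(v["label"], v["id"])] = v  # de-duplicate by (label, id)
    graph_edges.extend(es)
```

Because the mapping is data rather than a prompt, the same row always yields the same vertices and edges, and the per-row cost does not depend on the LLM.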
GitHub: https://lnkd.in/g8v2hvd5
Discord: https://bit.ly/lg-discord
Semantic Data in Medallion Architecture: Enterprise Knowledge Graphs at Scale | LinkedIn
Building Enterprise Knowledge Graphs Within Modern Data Platforms - Version 26 Louie Franco III Enterprise Architect - Knowledge Graph Architect - Semantics Architect August 3, 2025 In my previous article on Data Vault Medallion Architecture, I outlined how structured data flows through Landing, Bro
Jessica Talisman has been publishing a series of articles on Substack about how to develop more robust AI systems by leveraging vocabularies, thesauri, tax...
A gentle introduction to DSPy for graph data enrichment | Kuzu
📢 Check out our latest blog post by Prashanth Rao, where we introduce the DSPy framework to help you build composable pipelines with LLMs and graphs. In the post, we dive into a fascinating dataset of Nobel laureates and their mentorship networks for a data enrichment task. 👇🏽
✅ The source data that contains the tree structures is enriched with data from the official Nobel Prize API.
✅ We showcase a 2-step methodology that combines the benefits of Kuzu's vector search capabilities with DSPy's powerful primitives to build an LLM-as-a-judge pipeline that helps disambiguate entities in the data.
✅ The DSPy approach is scalable, low-cost and efficient, and is flexible enough to apply to a wide variety of domains and use cases.
PyG (PyTorch Geometric) has evolved significantly since its initial release, establishing itself as a leading framework for Graph Neural Networks. In this paper, we present PyG 2.0 (and its...
Graph-R1: Towards Agentic GraphRAG Framework via End-to-end Reinforcement Learning
Graph-R1
New RAG framework just dropped!
Combines agents, GraphRAG, and RL.
Here are my notes:
Introduces a novel RAG framework that moves beyond traditional one-shot or chunk-based retrieval by integrating graph-structured knowledge, agentic multi-turn interaction, and RL.
Graph-R1 is an agent that reasons over a knowledge hypergraph environment by iteratively issuing queries and retrieving subgraphs using a multi-step “think-retrieve-rethink-generate” loop.
Unlike prior GraphRAG systems that perform fixed retrieval, Graph-R1 dynamically explores the graph based on evolving agent state.
Retrieval is modeled as a dual-path mechanism: entity-based hyperedge retrieval and direct hyperedge similarity, fused via reciprocal rank aggregation to return semantically rich subgraphs. These are used to ground subsequent reasoning steps.
The agent is trained end-to-end using GRPO with a composite reward that incorporates structural format adherence and answer correctness. Rewards are only granted if reasoning follows the proper format, encouraging interpretable and complete reasoning traces.
On six RAG benchmarks (e.g., HotpotQA, 2WikiMultiHopQA), Graph-R1 achieves state-of-the-art F1 and generation scores, outperforming prior methods including HyperGraphRAG, R1-Searcher, and Search-R1. It shows particularly strong gains on harder, multi-hop datasets and under OOD conditions.
The authors find that Graph-R1’s performance degrades sharply without its three key components: hypergraph construction, multi-turn interaction, and RL.
Ablation study supports that graph-based and multi-turn retrieval improves information density and accuracy, while end-to-end RL bridges the gap between structure and language.
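The format-gated reward mentioned in the notes can be illustrated roughly as follows. This is only a sketch: the `<think>`/`<answer>` tag layout, the format reward of 1.0, and the use of token-level F1 for answer correctness are assumptions made for the example, not the paper's exact definition.

```python
import re
from collections import Counter

# Assumed trace format: one <think> block followed by one <answer> block.
TRACE = re.compile(
    r"^\s*<think>.+?</think>\s*<answer>(.+?)</answer>\s*$", re.DOTALL
)


def token_f1(prediction, reference):
    """Token-level F1 between a predicted and a reference answer."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return 0.0
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def composite_reward(trace, reference):
    """Return 0 for malformed traces; otherwise format reward plus answer F1."""
    match = TRACE.match(trace)
    if match is None:
        return 0.0  # no reward at all unless the format is followed
    format_reward = 1.0  # assumed weight
    return format_reward + token_f1(match.group(1).strip(), reference)


good = composite_reward("<think>look it up</think><answer>Paris</answer>", "Paris")
bad = composite_reward("Paris", "Paris")
```

Gating the answer score on the format check means a correct answer without a reasoning trace earns nothing, which is the incentive the notes describe.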
Paper: https://lnkd.in/eGbf4HhX
Our SPARQL Notebook extension for Visual Studio Code makes it super easy to document SPARQL queries and run them, either against live endpoints or directly on local RDF files. I just (finally!) published a 15-minute walkthrough on our YouTube channel Giant Global Graph. It gives you a quick overview of how it works and how you can get started.
Link in the comments.
Fun fact: I recorded this two years ago and apparently forgot to hit publish. Since then, we've added new features like improved table renderers with pivoting support, so it's even more useful now. Check it out!
Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need
Instead of just pulling facts, the system samples multi-step paths within the graph, such as a causal chain from a disease to a symptom, and translates these paths into natural language reasoning tasks complete with a step-by-step thinking trace.
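A toy version of this path-to-task idea might look like the sketch below. The graph content, the function names, and the question template are all made up for illustration; the actual system presumably uses an LLM and a curated knowledge graph rather than string templates.

```python
import random

# Toy knowledge graph: head -> list of (relation, tail). Illustrative content only.
KG = {
    "influenza": [("causes", "fever")],
    "fever": [("leads_to", "dehydration")],
    "dehydration": [("presents_as", "dizziness")],
}


def sample_path(kg, start, max_hops, rng):
    """Random walk from `start`, returning a list of (head, relation, tail) hops."""
    path = []
    node = start
    for _ in range(max_hops):
        edges = kg.get(node)
        if not edges:
            break  # dead end: stop the walk early
        relation, tail = rng.choice(edges)
        path.append((node, relation, tail))
        node = tail
    return path


def path_to_task(path):
    """Turn a sampled path into a question, a step-by-step trace, and an answer."""
    if not path:
        raise ValueError("cannot build a task from an empty path")
    steps = [f"{h} {r.replace('_', ' ')} {t}" for h, r, t in path]
    question = f"How is {path[0][0]} connected to {path[-1][2]}?"
    return {"question": question, "thinking": steps, "answer": path[-1][2]}


rng = random.Random(0)  # seeded for reproducibility
task = path_to_task(sample_path(KG, "influenza", 3, rng))
```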
Alhamdulillah, iText2KG v0.0.8 is finally out!
(Yes, I’ve been quite busy these past few months 😅)
... and it can now build dynamic knowledge graphs. The GIF below shows a dynamic KG generated from OpenAI tweets between June 18 and July 17.
(Note: Temporal/logical conflicts aren't handled yet in this version, but you can still resolve them with a post-processing filter.)
Here are the main updated features:
- iText2KG_Star: Introduced a simpler and more efficient version of iText2KG that eliminates the separate entity extraction step. Instead of extracting entities and relations separately, iText2KG_Star directly extracts triplets from text. This approach is more efficient as it reduces processing time and token consumption and does not need to handle invented/isolated entities.
- Facts-Based KG Construction: Enhanced the framework with facts-based knowledge graph construction using the Document Distiller to extract structured facts from documents, which are then used for incremental KG building. This approach provides more exhaustive and precise knowledge graphs.
- Dynamic Knowledge Graphs: iText2KG now supports building dynamic knowledge graphs that evolve over time. By leveraging the incremental nature of the framework and document snapshots with observation dates, users can track how knowledge changes and grows.
Check out the new version and an example of OpenAI Dynamic KG Construction in the first comment.
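To make the "snapshots with observation dates" idea concrete, here is a minimal sketch in plain Python. It does not use the iText2KG API: `add_snapshot`, `as_of`, and the example triplets are hypothetical, and the triplets are assumed to have been extracted already.

```python
from datetime import date


def add_snapshot(kg, triplets, observed_on):
    """Merge one document snapshot, tracking first and last observation dates."""
    for triplet in triplets:
        first, last = kg.get(triplet, (observed_on, observed_on))
        kg[triplet] = (min(first, observed_on), max(last, observed_on))
    return kg


def as_of(kg, day):
    """Return the triplets that had been observed on or before `day`."""
    return {t for t, (first, _) in kg.items() if first <= day}


# Made-up snapshots; entity and relation names are placeholders.
kg = {}
add_snapshot(kg, [("CompanyX", "announced", "ProductA")], date(2025, 6, 18))
add_snapshot(
    kg,
    [("CompanyX", "announced", "ProductA"), ("CompanyX", "released", "ProductB")],
    date(2025, 7, 17),
)

early_view = as_of(kg, date(2025, 6, 30))
late_view = as_of(kg, date(2025, 7, 31))
```

As the post notes, temporal or logical conflicts are not resolved by this kind of merge; a post-processing filter over the stored dates would be one place to handle them.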
Why Businesses Must Ground Their AI in Knowledge Graphs | LinkedIn
Here, I clearly explain why businesses must transition from raw tabular data to RDF-based knowledge graphs, and why this is essential to ground AI in logic-driven, traceable inference rather than black-box prediction: 1. Your tabular data is dumb.
Millions of G∈AR-s: Extending GraphRAG to Millions of Documents
Scaling GraphRAG to Millions of Documents: Lessons from the SIGIR 2025 LiveRAG Challenge
👉 WHY THIS MATTERS
Retrieval-augmented generation (RAG) struggles with multi-hop questions that require connecting information across documents. While graph-based RAG methods like GEAR improve reasoning by structuring knowledge as entity-relationship triples, scaling these approaches to web-sized datasets (millions/billions of documents) remains a bottleneck. The culprit? Traditional methods rely heavily on LLMs to extract triples—a process too slow and expensive for large corpora.
👉 WHAT THEY DID
Researchers from Huawei and the University of Edinburgh reimagined GEAR to sidestep costly offline triple extraction.
Their solution:
- Pseudo-alignment: Link retrieved passages to existing triples in Wikidata via sparse retrieval.
- Iterative expansion: Use a lightweight LLM (Falcon-3B-Instruct) to iteratively rewrite queries and retrieve additional evidence through Wikidata’s graph structure.
- Multi-step filtering: Combine Reciprocal Rank Fusion (RRF) and prompt-based filtering to reconcile noisy alignments between Wikidata and document content.
This approach achieved 87.6% correctness and 53% faithfulness on the SIGIR 2025 LiveRAG benchmark, despite challenges in aligning Wikidata’s generic triples with domain-specific document content.
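Reciprocal Rank Fusion, named in the filtering step above, merges several ranked lists by summing 1 / (k + rank) for each item. The sketch below uses the commonly cited default k = 60; the passage IDs and the choice of which two rankings to fuse are assumptions for illustration, not details taken from the paper.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of IDs; a higher fused score means a better final rank."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings: one from sparse retrieval, one from graph expansion.
sparse_ranking = ["p3", "p1", "p2"]
graph_ranking = ["p1", "p4", "p3"]

fused = reciprocal_rank_fusion([sparse_ranking, graph_ranking])
# fused == ["p1", "p3", "p4", "p2"]
```

Because RRF only looks at ranks, it can combine retrievers whose raw scores are not comparable, which is useful when one signal comes from text and the other from a graph.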
👉 KEY INSIGHTS
1. Trade-offs in alignment: Linking Wikidata triples to documents works best for general knowledge but falters with niche topics (e.g., "Pacific geoduck reproduction" mapped incorrectly to oyster biology).
2. Cost efficiency: Avoiding LLM-based triple extraction reduced computational overhead, enabling scalability.
3. The multi-step advantage: Query rewriting and iterative retrieval improved performance on complex questions requiring 2+ reasoning hops.
👉 OPEN QUESTIONS
- How can we build asymmetric semantic models to better align text and graph data?
- Can hybrid alignment strategies (e.g., blending domain-specific KGs with Wikidata) mitigate topic drift?
- Does graph expansion improve linearly with scale, or are diminishing returns inevitable?
Why read this paper?
It’s a pragmatic case study in balancing scalability with reasoning depth in RAG systems. The code and prompts are fully disclosed, offering a blueprint for adapting GraphRAG to real-world, large-scale applications.
Paper: "Millions of G∈AR-s: Extending GraphRAG to Millions of Documents" (Shen et al., SIGIR 2025). Preprint: arXiv:2307.17399.
As the fifth most popular website on the Internet, keeping Wikipedia running smoothly is no small feat. The free encyclopedia hosts more than 65 million
This is the title of my upcoming book. And it’s all about the Shapes Constraint Language (SHACL). Expected release before November 1st 2025. The book is written and illustrated by Veronika He…
I've spent long, hard years learning how to talk about knowledge graphs and semantics with software engineers who have little training in linguistics. I feel quite fluent at this point, after investing huge amounts of effort into understanding statistics (I was a humanities undergrad) and into unpac
The future of trustworthy AI.
Powered by graphs.
data² has secured a groundbreaking patent for explainable AI powered by graphs.
🚨 AI hallucinations destroy trust.
That's not acceptable when lives and missions are at stake.
While others rush to patch traditional RAG systems, we've engineered a fundamentally different approach.
Our patented innovation delivers what leaders demand:
🔍 **Complete Transparency**
- Watch AI traverse relationship paths in real-time
- No more black box decisions
📊 **Evidence You Can Trust**
- Every conclusion links to source data
- Full citation trails for audit readiness
How did we build it?
🔗 **Graph-Based Architecture**
- Knowledge graphs capture critical relationships traditional RAG misses
- Every connection adds context and validates accuracy
This isn't just innovation for innovation's sake.
At data² we are solving critical challenges across:
↳ Intelligence operations requiring all-source validation
↳ Cyber threat analysis demanding instant verification
↳ Energy infrastructure decisions where safety is paramount
↳ Financial investigations tracking complex money flows
↳ Supply chain operations in contested environments
While others promise AI accuracy, we've patented how to prove it.
💬 Interested in learning more? Reach out directly.
🔔 Follow me, Daniel Bukowski, for daily insights about delivering transparent AI with graph technology.
Getting Started with the Graph Query Language (GQL): The complete guide to designing, querying, and managing graph databases with GQL: 9781836204015: Computer Science Books @ Amazon.com