GraphNews

4343 bookmarks
knowledge infrastructure
We talk about knowledge management and systems for knowledge (like knowledge graphs) a lot these days, especially with the rising interest in #semantics, #metadata, #taxonomies and #ontologies, thanks to AI. But what makes for knowledge that is operational and actionable? Less often discussed is knowledge infrastructure. Fundamental to knowledge management and knowledge repositories, as derived from the field of library and information science, is a service-oriented approach. Knowledge infrastructure is focused on creating systems that deliver information and knowledge that is accurate and satisfies the requirements of: ⚪️ Creators: those who generate knowledge (researchers, experts, content authors, data producers) ⚪️ Products: the formal outputs of knowledge (e.g., documents, datasets, models, applications, platforms, chatbots/AI assistants) ⚪️ Distributors: systems and platforms that make knowledge available (repositories, databases, APIs) ⚪️ Disseminators: communicators and interpreters (educators, marketers, dashboards, wikis) ⚪️ Users: individuals or systems that apply the knowledge (decision-makers, AI agents, learners, stakeholders) Let’s put this into perspective. Without supporting knowledge infrastructures, knowledge becomes a one-off, relegated to silos or single-use instances. We see this with products. When we manage knowledge as a product, we fail to cast a wider net, assuming successes based on metrics that are localized to the product rather than distributed to be inclusive of all signals, input and output. If knowledge is not managed as infrastructure, we create anti-patterns for the business and AI systems. A recognizable symptom of these anti-patterns is silos. I’ll be publishing an article soon about knowledge infrastructure, and what it takes to build and manage a knowledge infrastructure program.
#ai #ia #knowledgeinfrastructure For reference, see the excerpt from Richard E. Rubin’s MLS textbook, Foundations of Library and Information Science, in the comments 👇👇👇
·linkedin.com·
The new AI-powered Analytics stack is here, says Gartner’s Afraz Jaffri! A key element of that stack is an ontology-powered Semantic Layer
The new AI-powered Analytics stack is here, says Gartner’s Afraz Jaffri! A key element of that stack is an ontology-powered Semantic Layer that serves as the brain for AI agents to act on knowledge of your internal data and deliver timely, accurate and hallucination-free insights! #semanticlayer #knowledgegraphs #genai #decisionintelligence
·linkedin.com·
Relational Graph Transformers: A New Frontier in AI for Relational Data - Kumo
Relational Graph Transformers represent the next evolution in Relational Deep Learning, allowing AI systems to seamlessly navigate and learn from data spread across multiple tables. By treating relational databases as the rich, interconnected graphs they inherently are, these models eliminate the need for extensive feature engineering and complex data pipelines that have traditionally slowed AI adoption. In this post, we'll explore how Relational Graph Transformers work, why they're uniquely suited for enterprise data challenges, and how they're already revolutionizing applications from customer analytics and recommendation systems to fraud detection and demand forecasting.
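To make the "relational database as a graph" idea concrete, here is a minimal sketch, assuming two hypothetical tables (`customers`, `orders`) and a helper named `tables_to_graph` that is invented for this illustration. It only shows the data-structure step (rows become typed nodes, foreign keys become typed edges); it is not Kumo's implementation and does not include any model.

```python
from collections import defaultdict

# Hypothetical toy tables: each row is a dict and "id" is the primary key.
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Lin"}]
orders = [
    {"id": 10, "customer_id": 1, "total": 30.0},
    {"id": 11, "customer_id": 1, "total": 12.5},
    {"id": 12, "customer_id": 2, "total": 99.0},
]


def tables_to_graph(tables, foreign_keys):
    """Turn rows into typed nodes and foreign keys into typed edges.

    tables: {table_name: [row, ...]}
    foreign_keys: [(child_table, fk_column, parent_table), ...]
    """
    # One node per row, keyed by (table name, primary key).
    nodes = {}
    for table, rows in tables.items():
        for row in rows:
            nodes[(table, row["id"])] = row

    # One edge per foreign-key reference, plus a reverse edge so that
    # information can flow in both directions.
    edges = defaultdict(list)
    for child, column, parent in foreign_keys:
        for row in tables[child]:
            src = (child, row["id"])
            dst = (parent, row[column])
            edges[src].append((column, dst))
            edges[dst].append(("rev_" + column, src))
    return nodes, dict(edges)


nodes, edges = tables_to_graph(
    {"customers": customers, "orders": orders},
    [("orders", "customer_id", "customers")],
)
```

In this sketch a customer node ends up linked to all of its order nodes, which is the kind of neighborhood a graph model could aggregate over instead of relying on hand-written join features.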
·kumo.ai·
Fine-tune an LLM for triplet extraction
Do you want to fine-tune an LLM for triplet extraction? These findings from a recently published paper (first comment) could save you much time. ✅ Does the choice of code vs. natural language prompts significantly impact performance? When fine-tuning these small, open-weight LLMs, the choice between code and natural language prompts has a limited impact on performance. ✅ Does training fine-tuned models to include chain-of-thought (rationale) sections in their outputs improve KG construction (KGC) performance? It is ineffective at best and highly detrimental at worst for fine-tuned models. This performance decrease is observed regardless of the number of in-context learning examples provided. Attention analysis suggests this might be due to the model's attention being dispersed on redundant information when rationale is used. Without rationale lists occupying prompt space, the model's attention can focus directly on the ICL examples while extracting relations. ✅ How do the fine-tuned smaller, open-weight LLMs perform compared to the CodeKGC baseline, which uses a larger, closed-source model (GPT-3.5)? The selected lightweight LLMs significantly outperform the much larger CodeKGC baseline after fine-tuning. The best fine-tuned models improve upon the CodeKGC baseline by as much as 15–20 absolute F1 points across the dataset. ✅ Does model size matter for KGC performance when fine-tuning with a small amount of training data? Yes, but not in a straightforward way. The 70B-parameter versions yielded worse results than the 1B, 3B, and 8B models when undergoing the same small amount of training. This implies that for KGC with limited fine-tuning, smaller models can perform better than much larger ones. ✅ For instruction-tuned models without fine-tuning, does prompt language or rationale help? For models without fine-tuning, using code prompts generally yields the best results for both code LLMs and the Mistral natural language model.
In addition, using rationale generally seems to help these models, with most of the best results obtained when including rationale lists in the prompt. ✅ What do the errors made by the models suggest about the difficulty of the KGC task? They point to difficulty in predicting relations, entities, and their order, especially when dealing with specialized terminology or specific domain knowledge, which poses a challenge even after fine-tuning. Some errors include adding superfluous adjectives or mistaking entity instances for class names. ✅ What is the impact of the number of in-context learning (ICL) examples during fine-tuning? The greatest performance benefit is obtained when moving from 0 to 3 ICL examples. However, additional ICL examples beyond 3 do not lead to any significant performance improvement and can even lead to worse results. This further indicates that the fine-tuning process itself is the primary driver of performance gain, allowing the model to learn the task from the input text and target output.
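For readers unfamiliar with how triplet extraction is typically scored, the sketch below computes precision, recall and F1 over exact-match triples. The function name `triple_f1` and the example triples are assumptions made for illustration; the paper's exact matching rules are not given in the post and may differ (for example in how partial matches are treated).

```python
def triple_f1(predicted, gold):
    """Exact-match precision, recall and F1 over (subject, relation, object) triples."""
    pred, ref = set(predicted), set(gold)
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: the second prediction swaps subject and object,
# the kind of ordering error the post mentions, so it counts as a miss.
gold = [("aspirin", "treats", "headache"), ("aspirin", "is_a", "drug")]
predicted = [("aspirin", "treats", "headache"), ("headache", "treats", "aspirin")]

precision, recall, f1 = triple_f1(predicted, gold)
```

Under this metric, an improvement of "15 absolute F1 points" would mean moving from, say, 0.50 to 0.65 on the 0 to 1 scale, as opposed to a relative 15% gain.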
·linkedin.com·
NodeRAG restructures knowledge into a heterograph: a rich, layered, musical graph where each node plays a different role
NodeRAG restructures knowledge into a heterograph: a rich, layered, musical graph where each node plays a different role. It’s not just smarter retrieval. It’s structured memory for AI agents. 》 Why NodeRAG? Most Retrieval-Augmented Generation (RAG) methods retrieve chunks of text. Good enough — until you need reasoning, precision, and multi-hop understanding. This is how NodeRAG solves these problems: 》 🔹Step 1: Graph Decomposition NodeRAG begins by decomposing raw text into smart building blocks: ✸ Semantic Units (S): Little event nuggets ("Hinton won the Nobel Prize.") ✸ Entities (N): Key names or concepts ("Hinton", "Nobel Prize") ✸ Relationships (R): Links between entities ("awarded to") ✩ This is like teaching your AI to recognize the actors, actions, and scenes inside any document. 》 🔹Step 2: Graph Augmentation Decomposition alone isn't enough. NodeRAG augments the graph by identifying important hubs: ✸ Node Importance: Using K-Core and Betweenness Centrality to find critical nodes ✩ Important entities get special attention — their attributes are summarized into new nodes (A). ✸ Community Detection: Grouping related nodes into communities and summarizing them into high-level insights (H). ✩ Each community gets a "headline" overview node (O) for quick retrieval. It's like adding context and intuition to raw facts. 》 🔹 Step 3: Graph Enrichment Knowledge without detail is brittle. So NodeRAG enriches the graph: ✸ Original Text: Full chunks are linked back into the graph (Text nodes, T) ✸ Semantic Edges: Using HNSW for fast, meaningful similarity connections ✩ Only smart nodes are embedded (not everything!) — saving huge storage space. ✩ Dual search (exact + vector) makes retrieval laser-sharp. It’s like turning a 2D map into a 3D living world. 》 🔹 Step 4: Graph Searching Now comes the magic. ✸ Dual Search: First find strong entry points (by name or by meaning) ✸ Shallow Personalized PageRank (PPR): Expand carefully from entry points to nearby relevant nodes. 
✩ No wandering into irrelevant parts of the graph. The search is surgical. ✩ Retrieval includes fine-grained semantic units, attributes, high-level elements: everything you need, nothing you don't. It’s like sending out agents into a city, and they return not with everything they saw, but exactly what you asked for, summarized and structured. 》 Results: NodeRAG's Performance Compared to GraphRAG, LightRAG, NaiveRAG, and HyDE, NodeRAG wins across every major domain: Tech, Science, Writing, Recreation, and Finance. NodeRAG isn’t just a better graph. NodeRAG is a new operating system for memory.
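The "shallow Personalized PageRank" step in Step 4 can be sketched as a few rounds of power iteration that restart at the entry points. This is a generic illustration, not NodeRAG's code: the function name `shallow_ppr`, the restart probability `alpha=0.5`, the two-iteration limit and the toy graph are all assumptions, since the post does not state the actual parameters.

```python
def shallow_ppr(adjacency, entry_points, alpha=0.5, iterations=2):
    """Personalized PageRank truncated after a few iterations.

    adjacency: {node: [neighbor, ...]} with every node present as a key.
    entry_points: nodes found by the exact or vector search.
    alpha: probability of jumping back to an entry point (assumed value).
    """
    restart = {node: 0.0 for node in adjacency}
    for node in entry_points:
        restart[node] = 1.0 / len(entry_points)

    scores = dict(restart)
    for _ in range(iterations):
        nxt = {node: alpha * restart[node] for node in adjacency}
        for node, score in scores.items():
            neighbors = adjacency[node]
            if not neighbors:
                # Dangling node: keep its mass in place.
                nxt[node] += (1 - alpha) * score
                continue
            share = (1 - alpha) * score / len(neighbors)
            for neighbor in neighbors:
                nxt[neighbor] += share
        scores = nxt
    return scores


# Hypothetical path-shaped graph: entity, semantic unit, entity, and two
# further nodes ("S2", "Other") that lie more than two hops away.
graph = {
    "Hinton": ["S1"],
    "S1": ["Hinton", "Nobel Prize"],
    "Nobel Prize": ["S1", "S2"],
    "S2": ["Nobel Prize", "Other"],
    "Other": ["S2"],
}
scores = shallow_ppr(graph, entry_points=["Hinton"])
```

With only two iterations, score can travel at most two hops from the entry point, so "S2" and "Other" stay at zero. That is one way to read the post's "no wandering into irrelevant parts of the graph".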
·linkedin.com·
Announcing general availability of Amazon Bedrock Knowledge Bases GraphRAG with Amazon Neptune Analytics | Amazon Web Services
Today, Amazon Web Services (AWS) announced the general availability of Amazon Bedrock Knowledge Bases GraphRAG (GraphRAG), a capability in Amazon Bedrock Knowledge Bases that enhances Retrieval-Augmented Generation (RAG) with graph data in Amazon Neptune Analytics. In this post, we discuss the benefits of GraphRAG and how to get started with it in Amazon Bedrock Knowledge Bases.
·aws.amazon.com·
Trends from KGC 2025
Last week I was fortunate to attend the Knowledge Graph Conference in NYC! Here are a few trends that span multiple presentations and conversations. - AI and LLM Integration: A major focus [again this year] was how LLMs can be used to enrich knowledge graphs and how knowledge graphs, in turn, can improve LLM outputs. This included using LLMs for entity extraction, verification, inference, and query generation. Many presentations demonstrated how grounding LLMs in knowledge graphs leads to more accurate, contextual, and explainable AI responses. - Semantic Layers and Enterprise Knowledge: There was a strong emphasis on building semantic layers that act as gateways to structured, connected enterprise data. These layers facilitate data integration, governance, and more intelligent AI agents. Decentralized semantic data products (DPROD) were discussed as a framework for internal enterprise data ecosystems. - From Data to Knowledge: Many speakers highlighted that AI is just the “tip of the iceberg” and the true power lies in the data beneath. Converting raw data into structured, connected knowledge was seen as crucial. The hidden costs of ignoring semantics were also discussed, emphasizing the need for consistent data preparation, cleansing, and governance. - Ontology Management and Change: Managing changes and governance in ontologies was a recurring theme. Strategies such as modularization, version control, and semantic testing were recommended. The concept of “SemOps” (Semantic Operations) was discussed, paralleling DevOps for software development. - Practical Tools and Demos: The conference included numerous demos of tools and platforms for building, querying, and visualizing knowledge graphs. These ranged from embedded databases like KuzuDB and RDFox to conversational AI interfaces for KGs, such as those from Metaphacts and Stardog. 
I especially enjoyed catching up with the Semantic Arts team (Mark Wallace, Dave McComb and Steve Case), talking Gist Ontology and SemOps. I also appreciated the detailed Neptune Q&A I had with Brian O'Keefe, the vision of Ora Lassila, and then the chance to meet Adrian Gschwend for the first time, where we connected on LinkML and Elmo as a means to help with bidirectional dataflows. I was so excited by these conversations that I planned to have two team members join me in June at the Data Centric Architecture Workshop Forum, https://www.dcaforum.com/
·linkedin.com·
Stardog vectorised SPARQL execution engine
Pretty stoked that our paper on the vectorised #SPARQL execution engine in Stardog got accepted to the GRADES-NDA workshop at #SIGMOD2025. It's a cool piece of work describing how modern vectorised join algorithms, more widely known in the SQL world, make graph query processing much more efficient. We're talking about up to an order of magnitude difference when it comes to analytical queries and large-scale joins. Hugely proud of my brilliant co-authors Simon Grätzer (the lead engineer on the BARQ project) and Lars Heling. It was their idea to do this work, and I couldn't be more proud that it worked out in the end. The preprint is now on arXiv: https://lnkd.in/eqXtVMqe
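As a rough illustration of what "vectorised" means for a join, the sketch below probes a hash table one batch of rows at a time instead of one row at a time. It is a generic textbook-style hash join written for this note, not the BARQ engine's design; the function name `hash_join_batches`, the batch size and the toy bindings are assumptions. Pure Python will not show the speed-up, which in real engines comes from tight loops over columnar batches.

```python
from collections import defaultdict


def hash_join_batches(left_rows, right_rows, key, batch_size=1024):
    """Join two lists of variable bindings on `key`, yielding output batches."""
    # Build phase: index the right-hand side by join key.
    table = defaultdict(list)
    for row in right_rows:
        table[row[key]].append(row)

    # Probe phase: handle a whole batch of left rows per step.
    for start in range(0, len(left_rows), batch_size):
        batch = left_rows[start:start + batch_size]
        keys = [row[key] for row in batch]            # column of join keys
        matches = [table.get(k, ()) for k in keys]    # probe every key in the batch
        joined = [
            {**left, **right}
            for left, rights in zip(batch, matches)
            for right in rights
        ]
        if joined:
            yield joined


# Hypothetical bindings for two triple patterns sharing the variable "city".
people = [
    {"person": "alice", "city": "Oslo"},
    {"person": "bob", "city": "Lima"},
    {"person": "carol", "city": "Oslo"},
]
cities = [
    {"city": "Oslo", "country": "Norway"},
    {"city": "Lima", "country": "Peru"},
]

result = [row for batch in hash_join_batches(people, cities, "city", batch_size=2)
          for row in batch]
```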
·linkedin.com·
Graph algebra
The best talk at RSA Conference, in my mind, was “Graphs and Algebras of Defense” by John Lambert, Corporate Vice President, CISO, Microsoft. What differentiates industry visionaries from average people is the capability to abstract theory from practice. John came up with an elegant abstraction of graph “algebra” for cybersecurity defense that resonates well with my PhD thesis on manifold learning and graph embedding. The way the algebra operates on cybersecurity graphs is inspiring. I hope more innovations can be sparked by such an elegant framework. Leo Meyerovich Alexander Morisse, PhD #GraphThePlanet
·linkedin.com·
Aerospike Graph scales efficiently from 200GB to 20TB without performance degradation across multiple real-world identity graph workloads
Discover how #Aerospike Graph overcomes #identityresolution limitations. Download our latest benchmark to: 💡 See how #AerospikeGraph scales efficiently from 200GB to 20TB without performance degradation across multiple real-world identity graph workloads 💡 Learn how to deploy high-performance identity graphs with fewer resources 💡 Use the results to plan your own scale-out graph infrastructure Get the benchmark here: https://lnkd.in/gZimB6Sh #AdTech #MarTech Ishaan Biswas Lyndon Bauto Phil Allsopp Matt Bushell Jim Doty
·linkedin.com·
Ontologies, OWL, SHACL - a baseline | LinkedIn
Ontologies, OWL, SHACL: I was going to react to a comment that came through my feed and turned it into this post as it resonated with questions I am often asked about the mentioned technologies, their uses, and their relations and, more broadly, it concerns a key architectural discussion that I've h
·linkedin.com·
𝗧𝗵𝗲 𝗙𝘂𝘁𝘂𝗿𝗲 𝗜𝘀 𝗖𝗹𝗲𝗮𝗿: 𝗔𝗴𝗲𝗻𝘁𝘀 𝗪𝗶𝗹𝗹 𝗡𝗘𝗘𝗗 𝗚𝗿𝗮𝗽𝗵 𝗥𝗔𝗚
🤺 𝗧𝗵𝗲 𝗙𝘂𝘁𝘂𝗿𝗲 𝗜𝘀 𝗖𝗹𝗲𝗮𝗿: 𝗔𝗴𝗲𝗻𝘁𝘀 𝗪𝗶𝗹𝗹 𝗡𝗘𝗘𝗗 𝗚𝗿𝗮𝗽𝗵 𝗥𝗔𝗚 Why? It combines Multi-hop reasoning, Non-Parameterized / Learning-Based Retrieval, Topology-Aware Prompting. ﹌﹌﹌﹌﹌﹌﹌﹌﹌ 🤺 𝗪𝗵𝗮𝘁 𝗜𝘀 𝗚𝗿𝗮𝗽𝗵-𝗘𝗻𝗵𝗮𝗻𝗰𝗲𝗱 𝗥𝗲𝘁𝗿𝗶𝗲𝘃𝗮𝗹-𝗔𝘂𝗴𝗺𝗲𝗻𝘁𝗲𝗱 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝗼𝗻 (𝗥𝗔𝗚)? ✩ LLMs hallucinate. ✩ LLMs forget. ✩ LLMs struggle with complex reasoning. Graphs connect facts. They organize knowledge into neat, structured webs. So when RAG retrieves from a graph, the LLM doesn't just guess — it reasons. It follows the map. ﹌﹌﹌﹌﹌﹌﹌﹌﹌ 🤺 𝗧𝗵𝗲 𝟰-𝗦𝘁𝗲𝗽 𝗪𝗼𝗿𝗸𝗳𝗹𝗼𝘄 𝗼𝗳 𝗚𝗿𝗮𝗽𝗵 𝗥𝗔𝗚 1️⃣ — User Query: The user asks a question. ("Tell me how Einstein used Riemannian geometry?") 2️⃣ — Retrieval Module: The system fetches the most structurally relevant knowledge from a graph. (Entities: Einstein, Grossmann, Riemannian Geometry.) 3️⃣ — Prompting Module: Retrieved knowledge is reshaped into a golden prompt — sometimes as structured triples, sometimes as smart text. 4️⃣ — Output Response: LLM generates a fact-rich, logically sound answer. ﹌﹌﹌﹌﹌﹌﹌﹌﹌ 🤺 𝗦𝘁𝗲𝗽 𝟭: 𝗕𝘂𝗶𝗹𝗱 𝗚𝗿𝗮𝗽𝗵-𝗣𝗼𝘄𝗲𝗿𝗲𝗱 𝗗𝗮𝘁𝗮𝗯𝗮𝘀𝗲𝘀 ✩ Use Existing Knowledge Graphs like Freebase or Wikidata — structured, reliable, but static. ✩ Or Build New Graphs From Text (OpenIE, instruction-tuned LLMs) — dynamic, adaptable, messy but powerful. 🤺 𝗦𝘁𝗲𝗽 𝟮: 𝗥𝗲𝘁𝗿𝗶𝗲𝘃𝗮𝗹 𝗮𝗻𝗱 𝗣𝗿𝗼𝗺𝗽𝘁𝗶𝗻𝗴 𝗔𝗹𝗴𝗼𝗿𝗶𝘁𝗵𝗺𝘀 ✩ Non-Parameterized Retrieval (Deterministic, Probabilistic, Heuristic) ★ Think Dijkstra's algorithm, PageRank, 1-hop neighbors. Fast but rigid. ✩ Learning-Based Retrieval (GNNs, Attention Models) ★ Think "graph convolution" or "graph attention." Smarter, deeper, but heavier. ✩ Prompting Approaches: ★ Topology-Aware: Preserve graph structure — multi-hop reasoning. ★ Text Prompting: Flatten into readable sentences — easier for vanilla LLMs. 🤺 𝗦𝘁𝗲𝗽 𝟯: 𝗚𝗿𝗮𝗽𝗵-𝗦𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗲𝗱 𝗣𝗶𝗽𝗲𝗹𝗶𝗻𝗲𝘀 ✩ Sequential Pipelines: Straightforward query ➔ retrieve ➔ prompt ➔ answer. ✩ Loop Pipelines: Iterative refinement until the best evidence is found. ✩ Tree Pipelines: Parallel exploration ➔ multiple knowledge paths at once. 
🤺 𝗦𝘁𝗲𝗽 𝟰: 𝗚𝗿𝗮𝗽𝗵-𝗢𝗿𝗶𝗲𝗻𝘁𝗲𝗱 𝗧𝗮𝘀𝗸𝘀 ✩ Knowledge Graph QA (KGQA): Answering deep, logical questions with graphs. ✩ Graph Tasks: Node classification, link prediction, graph summarization. ✩ Domain-Specific Applications: Biomedicine, law, scientific discovery, finance.
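The four-step workflow described in the post (query, retrieval, prompting, response) can be sketched as a sequential pipeline using non-parameterized 1-hop retrieval and text prompting. Everything here is an assumption made for illustration: the toy triples, the function names, the substring-based entity matching and the `llm` callable, which stands in for whatever model call a real system would make.

```python
# Toy knowledge graph as (subject, relation, object) triples (illustrative only).
TRIPLES = [
    ("Einstein", "collaborated_with", "Grossmann"),
    ("Grossmann", "introduced_einstein_to", "Riemannian Geometry"),
    ("Einstein", "developed", "General Relativity"),
    ("General Relativity", "is_formulated_in", "Riemannian Geometry"),
    ("Newton", "developed", "Calculus"),
]


def find_entities(query, triples):
    """Step 1-2: link the query to graph nodes by naive substring matching."""
    names = {t[0] for t in triples} | {t[2] for t in triples}
    lowered = query.lower()
    return sorted(name for name in names if name.lower() in lowered)


def retrieve_one_hop(entities, triples):
    """Step 2: non-parameterized retrieval of all triples touching an entity."""
    wanted = set(entities)
    return [t for t in triples if t[0] in wanted or t[2] in wanted]


def build_prompt(query, triples):
    """Step 3: text prompting, flattening triples into readable sentences."""
    facts = "\n".join(f"- {s} {r.replace('_', ' ')} {o}" for s, r, o in triples)
    return f"Answer using only these facts:\n{facts}\n\nQuestion: {query}"


def graph_rag(query, triples, llm):
    """Sequential pipeline: query -> retrieve -> prompt -> answer."""
    entities = find_entities(query, triples)
    evidence = retrieve_one_hop(entities, triples)
    return llm(build_prompt(query, evidence))  # Step 4: the model writes the answer


query = "Tell me how Einstein used Riemannian geometry?"
# Echo "model" so the sketch runs without any external service.
prompt_seen_by_model = graph_rag(query, TRIPLES, llm=lambda prompt: prompt)
```

A loop or tree pipeline would wrap `retrieve_one_hop` in repeated or parallel calls; a topology-aware prompt would pass the triples in structured form instead of flattening them.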
·linkedin.com·
SousLesensVocables is a set of tools developed to manage thesaurus and ontology resources through SKOS, OWL and RDF standards and graph visualisation approaches
·souslesens.github.io·
The new AI Risk “ontology”: A Map with No Rules
The new AI Risk “ontology” (AIRO) maps regulatory concepts from the EU AI Act, ISO/IEC 23894, and ISO 31000. But without formal constraints or ontological grounding in a top-level ontology (TLO), it reads more like a map with no rules. At first glance, AIRO seems well-structured. It defines entities like “AI Provider,” “AI Subject,” and “Capability,” linking them to legal clauses and decision workflows. But it lacks the logical scaffolding that makes semantic models computable. There are no disjointness constraints, no domain or range restrictions, no axioms to enforce identity or prevent contradiction. For example, if “Provider” and “Subject” are just two nodes in a graph, the system has no way to infer that they must be distinct. There’s nothing stopping an implementation from assigning both roles to the same agent. That’s not an edge case. It’s a missing foundation. This is where formal ontologies matter. Logic is not a luxury. It’s what makes it possible to validate, reason, and automate oversight. Without constraints and grounding in a TLO, semantic structures become decorative. They document language, but not the conditions that govern responsible behavior. If we want regulation that adapts with AI instead of chasing it, we need more than a vocabulary. We need logic, constraints, and ontological structure. #AIRegulation #ResponsibleAI #SemanticGovernance #AIAudits #AIAct #Ontologies #LogicMatters
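The Provider/Subject example can be made concrete with a small check that only becomes possible once a disjointness constraint is declared. This is a sketch under stated assumptions: the class names `AIProvider` and `AISubject`, the agents and the function `disjointness_violations` are invented here and are not taken from AIRO, and whether the two roles should really be disjoint is the post's premise, not an established rule.

```python
from collections import defaultdict

# Assumed constraint: no agent may be both an AI provider and an AI subject.
DISJOINT_CLASSES = {frozenset({"AIProvider", "AISubject"})}


def disjointness_violations(assertions, disjoint_pairs):
    """Return (agent, (class_a, class_b)) for every violated disjointness pair.

    assertions: iterable of (agent, class_name) type statements.
    """
    classes_by_agent = defaultdict(set)
    for agent, class_name in assertions:
        classes_by_agent[agent].add(class_name)

    violations = []
    for agent, classes in sorted(classes_by_agent.items()):
        for pair in disjoint_pairs:
            if pair <= classes:  # the agent carries both disjoint classes
                violations.append((agent, tuple(sorted(pair))))
    return violations


# Hypothetical type assertions.
assertions = [
    ("acme_corp", "AIProvider"),
    ("acme_corp", "AISubject"),
    ("jane_doe", "AISubject"),
]
violations = disjointness_violations(assertions, DISJOINT_CLASSES)
```

Without the declared pair, the same data passes silently, which is the post's point. In an OWL setting the equivalent would be a disjoint-classes axiom, where a reasoner would report the ontology as inconsistent instead of returning a list of violations.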
·linkedin.com·
The "Ontology Gap" for property graphs
I was looking forward to speaking at next week's Knowledge Graph Conference, but I had a stroke in early March, so I've had to cut back my activity quite a lot. This short article talks about the overall problem/opportunity, which underpins the work in LDBC (Linked Data Benchmark Council), relating
·linkedin.com·
Graph Learning Will Lose Relevance Due To Poor Benchmarks
📣 Our spicy ICML 2025 position paper: “Graph Learning Will Lose Relevance Due To Poor Benchmarks”. Graph learning is less trendy in the ML world than it was in 2020-2022. We believe the problem is in poor benchmarks that hold the field back - and suggest ways to fix it! We identified three problems: #️⃣ P1: No transformative real-world applications - while LLMs and geometric generative models become more powerful and solve complex tasks every generation (from reasoning to protein folding), how transformative could a GNN on Cora or OGB be? P1 Remedies: The community is overlooking many significant and transformative applications, including chip design and broader ML for systems, combinatorial optimization, and relational data (as highlighted by RelBench). Each of them offers $billions in potential outcomes. #️⃣ P2: While everything can be modeled as a graph, often it should not be. We made a simple experiment and probed a vanilla DeepSet w/o edges and a GNN on Cayley graphs (fixed edges for a certain number of nodes) on molecular datasets and the performance is quite competitive. #️⃣ P3: Bad benchmarking culture (this one hits hard) - it’s a mess :) Small datasets (don’t use Cora and MUTAG in 2025), no standard splits, and in many cases recent models are clearly worse than GCN / Sage from 2020. It gets worse when evaluating generative models. Remedies for P3: We need more holistic benchmarks which are harder to game and saturate - while it’s a common problem for all ML fields, standard graph learning benchmarks are egregiously old and rather irrelevant for the scale of problems doable in 2025. 💡 As a result, it’s hard to build a true foundation model for graphs. Instead of training each model on each dataset, we suggest using GNNs / GTs as processors in the “encoder-processor-decoder” blueprint, train them at scale, and only tune graph-specific encoders/decoders. 
For example, we pre-trained several models on PCQM4M-v2, COCO-SP, and MalNet Tiny, and fine-tuned them on PascalVOC, Peptides-struct, and Stargazers to find that graph transformers benefit from pre-training. --- The project started around NeurIPS 2024 when Christopher Morris gathered us to discuss the peeve points of graph learning and how to continue to do impactful research in this area. I believe the outcomes appear promising, and we can re-imagine graph learning in 2025 and beyond! Massive work with 12 authors (everybody actually contributed): Maya Bechler-Speicher, Ben Finkelshtein, Fabrizio Frasca, Luis Müller, Jan Tönshoff, Antoine Siraudin, Viktor Zaverkin, Michael Bronstein, Mathias Niepert, Bryan Perozzi, and Christopher Morris (Chris you should create a LinkedIn account finally ;)
·linkedin.com·