Found 10 bookmarks

New project makes Wikipedia data more accessible to AI | TechCrunch

Called the Wikidata Embedding Project, the system applies a vector-based semantic search to the existing data on Wikipedia and its sister platforms, consisting of nearly 120 million entries.
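As a rough illustration of what "vector-based semantic search" means, the sketch below ranks a few entries by cosine similarity to a query vector. It is not the project's implementation: the three-dimensional vectors, the `INDEX` contents, and the `search` helper are all made up for this example, and a real system would obtain its vectors from an embedding model and store them in a vector database.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings": hand-written numbers, not output of any real model.
INDEX: Dict[str, List[float]] = {
    "Douglas Adams": [0.9, 0.1, 0.0],
    "Berlin": [0.0, 0.2, 0.9],
    "The Hitchhiker's Guide to the Galaxy": [0.8, 0.3, 0.1],
}


def search(query_vec: List[float], index: Dict[str, List[float]], k: int = 2) -> List[str]:
    """Return the labels of the k entries whose vectors are most similar to query_vec."""
    ranked = sorted(index.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [label for label, _ in ranked[:k]]
```

Calling `search([1.0, 0.2, 0.0], INDEX)` ranks entries by direction in the vector space rather than by shared keywords, which is the property that lets semantically related items surface together.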

#KnowledgeGraph #AI #open data #semantics

·techcrunch.com·Oct 7, 2025

Wikidata:Embedding Project/October 1 2025 Release - Wikidata

#KnowledgeGraph #semantics #AI #open data

·wikidata.org·Oct 7, 2025

Cellosaurus is now available in RDF format

Cellosaurus is now available in RDF format, with a triple store that supports SPARQL queries. If this sounds a bit abstract or unfamiliar…
1) RDF stands for Resource Description Framework. Think of RDF as a way to express knowledge using triplets: Subject – Predicate – Object. Example: HeLa (subject) – is_transformed_by (predicate) – Human papillomavirus type 18 (object). These triplets are like little facts that can be connected together to form a graph of knowledge.
2) A triple store is a database designed specifically to store and retrieve these RDF triplets. Unlike traditional databases (tables, rows), triple stores are optimized for linked data. They allow you to navigate connections between biological entities, like species, tissues, genes, diseases, etc.
3) SPARQL is a query language for RDF data. It lets you ask complex questions, such as:
- Find all cell lines with a *RAS (HRAS, NRAS, KRAS) mutation in p.Gly12
- Find all cell lines from animals belonging to the order "Carnivora"
More specifically, we now offer six new options from the Tool - API submenu:
1) SPARQL Editor (https://lnkd.in/eF2QMsYR). The SPARQL Editor is a tool designed to assist users in developing their SPARQL queries.
2) SPARQL Service (https://lnkd.in/eZ-iN7_e). The SPARQL service is the web service that accepts SPARQL queries over HTTP and returns results from the RDF dataset.
3) Cellosaurus Ontology (https://lnkd.in/eX5ExjMe). An RDF ontology is a formal, structured representation of knowledge. It explicitly defines domain-specific concepts, such as classes and properties, enabling data to be described with meaningful semantics that both humans and machines can interpret. The Cellosaurus ontology is expressed in OWL.
4) Cellosaurus Concept Hopper (https://lnkd.in/e7CH5nj4). The Concept Hopper is a tool that provides an alternative view of the Cellosaurus ontology. It focuses on a single concept at a time, either a class or a property, and shows how that concept is linked to others within the ontology, as well as how it appears in the data.
5) Cellosaurus dereferencing service (https://lnkd.in/eSATMhGb). The RDF dereferencing service is the mechanism that, given a URI, returns an RDF description of the resource identified by that URI, enabling clients to retrieve structured, machine-readable data about the resource from the web in different formats.
6) Cellosaurus RDF files download (https://lnkd.in/emuEYnMD). This allows you to download the Cellosaurus RDF files in Turtle (ttl) format.
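To make the triple and query ideas above more concrete, here is a minimal sketch in plain Python. It is a toy, not the Cellosaurus triple store: only the HeLa triple comes from the post, while the `derived_from_species` predicate, the other rows, the `match` and `build_sparql_get_url` helpers, and the `https://example.org/sparql` endpoint are assumptions made up for illustration. The URL helper relies only on the SPARQL 1.1 Protocol convention of passing the query text in a `query` parameter of an HTTP GET request.

```python
from typing import Iterable, Iterator, Optional, Tuple
from urllib.parse import urlencode

Triple = Tuple[str, str, str]

# Toy data. The first triple is the example from the post; the predicate
# "derived_from_species" and the remaining rows are hypothetical.
TRIPLES = [
    ("HeLa", "is_transformed_by", "Human papillomavirus type 18"),
    ("HeLa", "derived_from_species", "Homo sapiens"),
    ("CHO-K1", "derived_from_species", "Cricetulus griseus"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None is a wildcard, like a SPARQL variable."""
    for triple in triples:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


def build_sparql_get_url(endpoint: str, query: str) -> str:
    """Encode a SPARQL query as an HTTP GET URL using the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # "What is HeLa transformed by?" expressed as a triple pattern.
    for subject, predicate, obj in match(TRIPLES, s="HeLa", p="is_transformed_by"):
        print(subject, predicate, obj)
    # Hypothetical endpoint; substitute the address of a real SPARQL service.
    print(build_sparql_get_url("https://example.org/sparql",
                               "SELECT * WHERE { ?s ?p ?o } LIMIT 5"))
```

A real triple store adds indexing, joins across several patterns, and ontology-aware vocabularies on top of this basic pattern-matching idea.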

#KnowledgeGraph #semantics #technical #open data #open source

·linkedin.com·Jun 25, 2025

The Dataverse Project: 750K FAIR Datasets and a Living Knowledge Graph

"I'm Ukrainian and I'm wearing a suit, so no complaints about me from the Oval Office" - that's the start of my lecture about building Artificial Intelligence with Croissant ML in the Dataverse data platform, for the Bio x AI Hackathon kick-off event in Berlin. https://lnkd.in/ePYHCfJt
* 750,000+ FAIR datasets across the world, forcing innovation across the whole data landscape.
* A knowledge graph with 50M+ triples.
* AI-ready metadata exports.
* Qdrant as vector storage; Google, Meta, and Mistral AI as LLM providers.
* Adrian Gschwend Qlever as the fastest triple store for Dataverse knowledge graphs.
Multilingual, machine-readable, queryable scientific data at scale.
If you're interested, you can also apply for the 2-month #BioAgentHack online hackathon:
• $125K+ prizes
• Mentorship from Biotech and AI leaders
• Build alongside top open-science researchers & devs
More info: https://lnkd.in/eGhvaKdH

#KnowledgeGraph #research #open source #technical #open data #science

·linkedin.com·Apr 11, 2025

WorldFAIR (D2.3) Cross-Domain Interoperability Framework (CDIF) (Report Synthesising Recommendations for Disciplines and Cross-Disciplinary Research Areas)

The Cross-Domain Interoperability Framework (CDIF) is designed to support FAIR implementation by establishing a ‘lingua franca’, based on existing standards and technologies to support interoperability, in both human- and machine-actionable fashion. CDIF is a set of implementation recommendations, based on profiles of common, domain-neutral metadata standards which are aligned to work together to support core functions required by FAIR. This report presents a core set of five CDIF profiles, which address the most important functions for cross-domain FAIR implementation:
- Discovery (discovery of data and metadata resources)
- Data access (specifically, machine-actionable descriptions of access conditions and permitted use)
- Controlled vocabularies (good practices for the publication of controlled vocabularies and semantic artefacts)
- Data integration (description of the structural and semantic aspects of data to make it integration-ready)
- Universals (the description of ‘universal’ elements, time, geography, and units of measurement)
Each of these profiles is supported by specific recommendations, including the set of metadata fields in specific standards to use, and the method of implementation to be employed for machine-level interoperability. A further set of topics is examined, establishing the priorities for further work. These include:
- Provenance (the description of provenance and processing)
- Context (the description of ‘context’ in the form of dependencies between fields within the data and a description of the research setting)
- Perspectives on AI (discussing the impacts of AI and the role that metadata can play)
- Packaging (the creation of archival and dissemination packages)
- Additional Data Formats (support for some of the data formats not fully supported in the initial release, such as NetCDF, Parquet, and HDF5)
In each of these topics, current discussions are documented, and considerations for further work are provided.
Visit WorldFAIR online at http://worldfair-project.eu. WorldFAIR is funded by the EC HORIZON-WIDERA-2021-ERA-01-41 Coordination and Support Action under Grant Agreement No. 101058393.

#KnowledgeGraph #semantics #open data

·zenodo.org·Oct 31, 2024

Home Page | Open Data Portal | S&P Global Commodity Insights

Establishing this open data portal to share our reference data and schema with customers, stakeholders, and partners under a permissive open data license.

#KnowledgeGraph #semantics #open data

·dunl.org·Sep 9, 2024

Linked Data in Production: Moving Beyond Ontologies

#KnowledgeGraph #open data

·slideshare.net·Mar 29, 2024

Confused by SOLID

I keep checking in on the Solid project. But I’m baffled by its lack of functionality. I’ve written up some of my questions.

#KnowledgeGraph #open data #PKG

·blog.ldodds.com·Mar 20, 2024

OpenFact: Factuality Enhanced Open Knowledge Extraction | Transactions of the Association for Computational Linguistics | MIT Press

Abstract. We focus on the factuality property during the extraction of an OpenIE corpus named OpenFact, which contains more than 12 million high-quality knowledge triplets. We break down the factuality property into two important aspects, expressiveness and groundedness, and we propose a comprehensive framework to handle both aspects. To enhance expressiveness, we formulate each knowledge piece in OpenFact based on a semantic frame. We also design templates, extra constraints, and adopt human efforts so that most OpenFact triplets contain enough details. For groundedness, we require the main arguments of each triplet to contain linked Wikidata entities. A human evaluation suggests that the OpenFact triplets are much more accurate and contain denser information compared to OPIEC-Linked (Gashteovski et al., 2019), one recent high-quality OpenIE corpus grounded to Wikidata. Further experiments on knowledge base completion and knowledge base question answering show the effectiveness of OpenFact over OPIEC-Linked as supplementary knowledge to Wikidata as the major KG.

#KnowledgeGraph #open data #open source

·direct.mit.edu·Jul 13, 2023

Companies in Multilingual Wikipedia: Articles Quality and Important Sources of Information

The scientific work of members of our Department was published in the monograph "Information Technology for Management: Approaches to Improving Business and Society", published by Springer. The research concerns the automatic assessment of the quality of Wikipedia articles and the reliability of

#KnowledgeGraph #open data

·kie.ue.poznan.pl·Jun 7, 2023
