Building an Open Knowledge Base

The latest progress with work on a Dutch Open Knowledge Base.

By Maurice Vanderfeesten (VU Amsterdam), Nick Veenstra (RUG), Jeffrey Sweeney (Rotterdam School of Management), Tung Tung Chan (Erasmus), Clifford Tatum (SURF), Gül Akcaova (SURF), Darco Jansen (UNL), John Doove (SURF), Wim Hugo (DANS), Armand Guicherit (TU Delft), Alastair Dunning (TU Delft)

03 November 2022


Earlier in the year we wrote about the idea of an Open Knowledge Base (OKB) for the Netherlands. There's been quite a lot of activity since then.

A research report done by Dialogic

An international meeting to discuss some of the ramifications of building an OKB

A workshop held by SURF to explore the next steps for a Dutch version of an OKB

This workshop led to a business plan, which has been submitted to the rectors of the Dutch universities. The business plan sets out various issues to be tackled and requests the funding to do so.

The publication of the Seven Guiding Principles for Research Information (v2.1)

An informal project group experimenting with building a Dutch OKB.

This blog post relates to the last bullet point.

The earlier blog post considered the OKB in terms of an ecosystem such as Wikidata. Without denying the strength of Wikidata, the discussion in the summer workshop pushed the concept of a PID graph as an easier way of getting things set up.

There's a short introduction to PID graphs here, but some of the key points are:

The approach creates a data graph rather than a database.
A graph consists of (usually) millions of statements, known as triples. Each statement has the form subject-predicate-object.
PID stands for persistent identifier. The subject and object parts of a statement should each refer to a persistent identifier.
In the scholarly communications landscape, persistent identifiers could be things like ORCID (for authors), ROR (for institutions) and DOIs (for the actual scholarly content).
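As a minimal sketch of the subject-predicate-object form described above, the statements can be modelled as plain Python tuples. The identifiers are placeholders rather than real ORCID, ROR or DOI values, and the `ex:` predicate names are invented for illustration; a real graph would use an agreed vocabulary.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement in the graph: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Placeholder PIDs and hypothetical "ex:" predicates, for illustration only.
triples = [
    Triple("https://orcid.org/0000-0000-0000-0001", "ex:authorOf",
           "https://doi.org/10.1234/example.1"),
    Triple("https://orcid.org/0000-0000-0000-0001", "ex:affiliatedWith",
           "https://ror.org/00example1"),
]


def objects_of(statements, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in statements
            if t.subject == subject and t.predicate == predicate]


works = objects_of(triples, "https://orcid.org/0000-0000-0000-0001", "ex:authorOf")
```

Because both ends of each statement are identifiers rather than free-text names, statements from different sources can be combined without first matching on spelling.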

A Dutch version of a PID graph would work by importing metadata from each university's current research information system (CRIS), provided that metadata includes persistent identifiers.

Here some data transformation would need to take place, converting the CRIS data into the triple statements needed for the data graph.
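The transformation step might look roughly like the sketch below. It assumes a CRIS export has already been parsed into a dictionary; the field names (`doi`, `authors`, `orcid`, `ror`) and the `ex:` predicates are hypothetical and do not reflect any particular CRIS product or the project group's actual mapping.

```python
def record_to_triples(record):
    """Convert one (hypothetical) CRIS publication record into triples.

    Entries without a persistent identifier are skipped, since both ends
    of a statement should refer to a PID.
    """
    triples = []
    doi = record.get("doi")
    if not doi:
        return triples
    work = f"https://doi.org/{doi}"
    for author in record.get("authors", []):
        orcid = author.get("orcid")
        if not orcid:
            continue  # no PID for this author, so no statement is made
        person = f"https://orcid.org/{orcid}"
        triples.append((person, "ex:authorOf", work))
        ror = author.get("ror")
        if ror:
            triples.append((person, "ex:affiliatedWith", f"https://ror.org/{ror}"))
    return triples


# Placeholder record; none of these identifiers are real.
sample = {
    "doi": "10.1234/example.1",
    "authors": [
        {"orcid": "0000-0000-0000-0001", "ror": "00example1"},
        {"name": "Author without an ORCID"},
    ],
}
sample_triples = record_to_triples(sample)
```

Skipping entries that lack an identifier is only one possible design choice; a pilot could equally decide to log them for later enrichment.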

There are various tools for the combined task of managing, querying and visualising the PID graph. At the moment, the project group is experimenting with GraphDB lite, but others deserve attention.

Once the statements are gathered into a data graph, new connections can be inferred from the statements collected from the different Dutch universities.
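One simple kind of inference can be sketched as follows: if two universities each state that one of their researchers authored the same DOI, the merged graph implies a co-authorship link that neither source states on its own. The `ex:` predicates and the identifiers are again placeholders, and a graph database would normally express this as a rule or query rather than in Python.

```python
from itertools import combinations


def infer_coauthors(triples):
    """Derive ex:coAuthorOf statements from shared ex:authorOf objects."""
    authors_by_work = {}
    for subject, predicate, obj in triples:
        if predicate == "ex:authorOf":
            authors_by_work.setdefault(obj, set()).add(subject)
    inferred = set()
    for authors in authors_by_work.values():
        # Sort so that each pair is emitted once, in a stable order.
        for a, b in combinations(sorted(authors), 2):
            inferred.add((a, "ex:coAuthorOf", b))
    return inferred


work = "https://doi.org/10.1234/example.1"
university_a = [("https://orcid.org/0000-0000-0000-0001", "ex:authorOf", work)]
university_b = [("https://orcid.org/0000-0000-0000-0002", "ex:authorOf", work)]

# Neither source alone yields a link; the merged statements do.
merged_links = infer_coauthors(university_a + university_b)
```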

Two use cases have been proposed by the group to demonstrate the added value of a national approach:

1. Disambiguation: link instances of the same author, institution, or publication with different names - and distinguish between instances of unique authors, institutions, or publications with similar names.

2. ‘Single version of the truth’: provide a holistic view of research-related metadata in a consistent and non-redundant form, eliminating discrepancies between institutional versions.
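Both use cases can be illustrated with a small sketch that groups name variants under the identifier they were recorded with. The records, names and ORCID values below are invented, and the sketch assumes every record already carries an ORCID, which real CRIS data will not always do.

```python
from collections import defaultdict


def names_by_pid(records):
    """Group every recorded name variant under its persistent identifier."""
    names = defaultdict(set)
    for rec in records:
        names[rec["orcid"]].add(rec["name"])
    return dict(names)


def discrepancies(records):
    """Return identifiers recorded under more than one name."""
    return {pid: variants
            for pid, variants in names_by_pid(records).items()
            if len(variants) > 1}


# Invented example records from two hypothetical sources.
records = [
    {"source": "University A", "orcid": "0000-0000-0000-0001", "name": "J. de Vries"},
    {"source": "University B", "orcid": "0000-0000-0000-0001", "name": "Jan de Vries"},
    {"source": "University B", "orcid": "0000-0000-0000-0002", "name": "J. de Vries"},
]

grouped = names_by_pid(records)
conflicts = discrepancies(records)
```

Here the two spellings recorded under the first identifier are linked to one author, while the identical name recorded under the second identifier stays a separate person. The `conflicts` result lists where the institutional versions disagree and would need reconciling.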

Together, these two use cases provide a more reliable and complete foundation, upon which we can develop the PID graph.

These are obvious examples. The informal project group working on this will experiment with others. They are making use of test datasets from the CRIS systems of Erasmus, VU Amsterdam, Delft and Groningen.

If the business case is approved at the UNL meeting in December, the project will develop into a formal pilot and continue with the experimentation outlined above.

Members of the informal project group

Maurice Vanderfeesten (VU Amsterdam)
Nick Veenstra (RUG)
Jeffrey Sweeney (Rotterdam School of Management)
Tung Tung Chan (Erasmus)
Clifford Tatum (SURF)
Gül Akcaova (SURF)
Darco Jansen (UNL)
John Doove (SURF)
Wim Hugo (DANS)
Armand Guicherit (TU Delft)
Alastair Dunning (TU Delft)

Alastair Dunning

Technische Universiteit Delft Head, Research Services
