Validating Data via IDS and SHACL. Do Both Ways Lead to Rome?

Richard L. Zijdeman, International Institute of Social History
Rick J. Mourits, International Institute of Social History

The Intermediate Data Structure (IDS) is a data dissemination format that allows researchers to easily exchange data from large (longitudinal) databases. IDS provides a shared vocabulary and data model. In addition, IDS offers various software tools to transpose data into IDS and to check whether the underlying data meet the required predefined format. The Semantic Web (Linked Open Data) is similar to IDS in the sense that it provides vocabularies and data models to structure data in a coherent way across multiple data sources. However, unlike IDS, the Semantic Web provides many vocabularies and competing data models. The Shapes Constraint Language (SHACL) provides a way to constrain Semantic Web data to specific vocabularies and data models. In this paper we examine whether we can use SHACL to align Semantic Web data to IDS-formatted data. Since the Semantic Web is expanding and increasingly being used (e.g. Wikidata, DBpedia), a successful application of SHACL to mimic IDS would mean that data on the Semantic Web could be aligned to historical longitudinal databases. It would also allow IDS to be represented on the Semantic Web, increasing its use and popularity. Specifically, it could help to relate contemporary social science data to historical socio-economic and demographic data, enhancing opportunities for comparative research over long periods of time.

No extended abstract or paper available

