Automatic extraction of shapes using sheXer

Abstract

A growing number of projects are built on knowledge graphs and SPARQL endpoints. These endpoints are queried by end users or used to feed many different kinds of applications. Shape languages, such as ShEx and SHACL, have emerged to guide the evolution of these graphs and to validate their expected topology. However, authoring shapes for an existing knowledge graph is a time-consuming task, and it becomes more challenging when dealing with sources that may be maintained by heterogeneous agents. In this paper, we present sheXer, a system that extracts shapes by mining the graph structure. We offer sheXer as a free Python library capable of producing both ShEx and SHACL content. Compared to other automatic shape extractors, sheXer includes novel features such as shape inter-linkage and the ability to process large real-world datasets. We analyze its features and its performance limitations through several experiments on the English chapter of DBpedia.

Publication
Knowledge-Based Systems
Date