Approaches to measure class importance in Knowledge Graphs

Abstract

The number, size, complexity, and importance of Knowledge Graphs (KGs) have increased during the last decade. Many communities have chosen to publish their datasets following Linked Data principles, which favors the integration of this information with other sources published using the same principles and technologies. Such a scenario calls for the development of Linked Data summarization techniques. The concept of a class is one of the core elements used to define the ontologies that underpin most existing KGs. Moreover, a class is an excellent tool for referring to an abstract idea that groups many individuals (or instances) in the context of a given KG, which is handy when producing summaries of its content. Rankings of class importance are a powerful summarization tool that can be used both to obtain a high-level overview of the content of a given KG and to prioritize many different actions over the data (data quality checking, visualization, relevance for search engines…). In this paper, we analyze existing techniques for measuring class importance and propose a novel approach called ClassRank. We compare the class usage in SPARQL logs of different KGs with the importance rankings produced by the evaluated approaches. Then, we discuss the strengths and weaknesses of the evaluated techniques. Our experiments suggest that ClassRank outperforms state-of-the-art approaches for measuring class importance.

Publication
PLOS One
Date