Benchmarking Knowledge Graphs on the Web

5 years ago by Dr. rer. nat. Michael Röder

The growing importance of knowledge graphs means that much effort is being invested in improving the systems that process them. This effort not only aims to add new functionality to the existing set of Semantic Web tools; developers and researchers within the community also focus on enhancing the stability and scalability of Linked Data systems. However, while new functionality can be tested easily, evaluating an improvement in a system’s performance can be more difficult. This is where benchmarks play a key role, since they enable a comparable and repeatable evaluation of a system’s performance.

Benchmarks offer different advantages to the different stakeholders of a system. A customer who has to choose a system for a certain task can use a benchmark to determine which of the available systems best fits the customer’s use case. For the developer of a system, a benchmark can give insights into the advantages and disadvantages of the system’s implementation, and it can show how the system’s performance compares to that of other systems. Apart from that, a benchmark can help to structure a field, e.g., by defining a common API for systems that try to fulfill the same task. For researchers, benchmarks can point towards directions that need further research to enhance existing solutions.

Since benchmarks play such an important role, it is crucial to be aware of the benchmarks that already exist. For this purpose, we created an overview of the benchmarks available for knowledge-graph processing systems. Our paper is available on arxiv.org. It covers all steps of the Linked Data lifecycle and shows how each step can be benchmarked and which benchmarks already exist. Additionally, it briefly describes several existing benchmarking frameworks that focus on knowledge graphs. Feedback on this paper is very welcome and should be sent to Michael Röder.