Production-ready project (September 2014 - Ongoing)

General Entity Annotation Benchmark Framework

About the project

GERBIL is a general benchmarking system for Linked Data. It was originally built for entity annotation systems and is based on the BAT-Framework. GERBIL offers an easy-to-use web-based platform for the agile comparison of annotators using multiple datasets and uniform measuring approaches. To add a tool to GERBIL, the end user only has to provide the URL of a REST interface to their tool that abides by a given specification. The GERBIL platform then integrates the tool and benchmarks it against user-specified datasets automatically.

If you want to know more, please have a look at our paper about GERBIL in the Semantic Web Journal.
We also used GERBIL for a benchmarking system for question answering.
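
As an illustration of what "uniform measuring approaches" can mean in practice, the following minimal sketch computes micro- and macro-averaged precision, recall and F1 over a set of annotated documents. It is not GERBIL's implementation. The `Annotation` class, the function names, the example URIs, the exact-match rule (same span and same URI) and the choice to score 0.0 when a denominator is zero are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One entity mention: character span plus the linked URI (hypothetical type)."""
    start: int
    length: int
    uri: str


def count_matches(gold, predicted):
    """Return (true positives, false positives, false negatives) for one document.

    Assumption: two annotations match only if span and URI are identical.
    """
    gold, predicted = set(gold), set(predicted)
    return len(gold & predicted), len(predicted - gold), len(gold - predicted)


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1; a zero denominator yields 0.0 (sketch convention)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def micro_average(documents):
    """Sum the counts over all documents first, then compute the measures once."""
    tp = fp = fn = 0
    for gold, predicted in documents:
        d_tp, d_fp, d_fn = count_matches(gold, predicted)
        tp, fp, fn = tp + d_tp, fp + d_fp, fn + d_fn
    return precision_recall_f1(tp, fp, fn)


def macro_average(documents):
    """Compute the measures per document, then average them over the documents."""
    scores = [precision_recall_f1(*count_matches(g, p)) for g, p in documents]
    if not scores:
        return 0.0, 0.0, 0.0
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))


# Two toy documents as (gold annotations, annotator output) pairs.
a = Annotation(0, 5, "http://example.org/A")
b = Annotation(10, 4, "http://example.org/B")
c = Annotation(0, 6, "http://example.org/C")
d = Annotation(20, 3, "http://example.org/D")
documents = [([a, b], [a]), ([c], [c, d])]

micro = micro_average(documents)
macro = macro_average(documents)
print("micro P/R/F1:", micro)
print("macro P/R/F1:", macro)
```

Micro-averaging weights every annotation equally, so long documents dominate the score. Macro-averaging weights every document equally. Reporting both makes it easier to see whether an annotator's score is driven by a few large documents.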

Available Annotators

	BAT-FRAMEWORK	GERBIL 1.0.0	GERBIL 1.2.5	Experiment	Paper
AGDISTIS	(✔)	✔	✔	D2KB	Link
AIDA	✔	✔	✔	A2KB	Link
Babelfy		✔	✔	A2KB	Link
CETUS			✔	OKE Task 2	Link
CETUS (FOX)			✔	OKE Task 2	Link
DBpedia Spotlight	✔	✔	✔	A2KB	Link
Dexter		✔	✔	A2KB	Link
DoSeR (*)			✔	D2KB	Link
entityclassifier.eu NER			✔	A2KB	Link
FOX			✔	OKE Task 1	Link
FRED			✔	OKE Task 1	Link
FREME NER			✔	OKE Task 1	Link
KEA		✔	✔	A2KB	Link
NERD-ML		✔	✔	A2KB	Link
NERFGUN			✔	D2KB	Link
OpenTapioca			✔	A2KB	Link
PBOH			✔	D2KB	Link
TagMe 2	✔	✔	✔	A2KB	Link
WAT		✔	✔	A2KB	Link
xLisa			✔	A2KB	Link

(*) Annotator isn't available any more

Supported Experiments for each Annotator

The following table lists the annotators that are currently available and the experiment types they support. Note that some of the A2KB annotators support the D2KB experiment by offering a dedicated API method. Other A2KB annotators can be chosen for a D2KB experiment as well, as described in the wiki. However, since the comparison might not be fair, we marked these annotators with (✔) in the table. The same is done for Entity Typing.

	A2KB, C2KB, Entity Recognition	D2KB	Entity Typing	OKE TASK 1	OKE TASK 2	RT2KB	RE
AGDISTIS		✔
AIDA	✔	(✔)
Babelfy	✔	✔
CETUS					✔
CETUS (FOX)					✔
DBpedia Spotlight	✔	✔	✔	✔		✔
Dexter	✔	(✔)
DoSeR (*)		✔
entityclassifier.eu NER	✔	(✔)
FOX	✔	(✔)	(✔)	✔		✔	✔
FRED	✔	(✔)	(✔)	✔		✔
FREME NER	✔	✔	✔	✔		✔
KEA	✔	✔
NERD-ML	✔	(✔)
NERFGUN		✔
OpenTapioca	✔	✔
PBOH		✔
TagMe 2	✔	(✔)
WAT	✔	✔
xLisa	✔	(✔)

(*) Annotator isn't available any more
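
The distinction between ✔ and (✔) in the table above can be captured in a small lookup structure. The sketch below encodes four rows of the table and filters annotators by experiment type, optionally including those marked (✔). The names `Support`, `SUPPORT` and `annotators_for` are hypothetical and chosen for this example only; they are not part of GERBIL.

```python
from enum import Enum


class Support(Enum):
    """How an annotator supports an experiment type (hypothetical encoding)."""
    NATIVE = "✔"      # supported directly
    INDIRECT = "(✔)"  # can be chosen, but the comparison might not be fair


# A few rows copied from the table above; the remaining rows follow the same pattern.
SUPPORT = {
    "AGDISTIS": {"D2KB": Support.NATIVE},
    "AIDA": {"A2KB": Support.NATIVE, "D2KB": Support.INDIRECT},
    "FOX": {
        "A2KB": Support.NATIVE,
        "D2KB": Support.INDIRECT,
        "Entity Typing": Support.INDIRECT,
        "OKE Task 1": Support.NATIVE,
        "RT2KB": Support.NATIVE,
        "RE": Support.NATIVE,
    },
    "WAT": {"A2KB": Support.NATIVE, "D2KB": Support.NATIVE},
}


def annotators_for(experiment, include_indirect=False):
    """Return the sorted names of annotators that support the given experiment type."""
    names = []
    for name, experiments in SUPPORT.items():
        level = experiments.get(experiment)
        if level is Support.NATIVE or (include_indirect and level is Support.INDIRECT):
            names.append(name)
    return sorted(names)


print(annotators_for("D2KB"))
print(annotators_for("D2KB", include_indirect=True))
```

Keeping the two support levels separate lets a caller decide whether a comparison that mixes native and indirect D2KB annotators is acceptable for the experiment at hand.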

Available Datasets

The following table lists the datasets that are currently available and the experiment types they support.

	A2KB, C2KB, D2KB, Entity Recognition	Entity Typing	OKE TASK 1	OKE TASK 2	RT2KB	RE	Paper
ACE2004	✔						Link
AIDA/CoNLL-Complete	✔						Link
AIDA/CoNLL-Test A	✔						Link
AIDA/CoNLL-Test B	✔						Link
AIDA/CoNLL-Training	✔						Link
AQUAINT	✔						-
CoNLL2003							Link
DBpediaSpotlight	✔						Link
Derczynski	✔						Link
ERD2014	✔						Link
GERDAQ-Dev	✔						Link
GERDAQ-Test	✔						Link
GERDAQ-TrainingA	✔						Link
GERDAQ-TrainingB	✔						Link
IITB	✔						Link
KORE50	✔						Link
MSNBC	✔						Link
Microposts2013-Test		✔			✔		Link
Microposts2013-Train		✔			✔		Link
Microposts2014-Test	✔						Link
Microposts2014-Train	✔						Link
Microposts2015-Test	✔						Link
Microposts2015-Train	✔						Link
Microposts2016-Test	✔						Link
Microposts2016-Train	✔						Link
N3-RSS-500	✔						Link
N3-Reuters-128	✔						Link
OKE 2015 Task 1	✔	✔	✔		✔		Link
OKE 2015 Task 2				✔			Link
OKE 2016 Task 1	✔	✔	✔		✔		Link
OKE 2016 Task 2				✔			Link
OKE 2017 Task 1	✔						Link
OKE 2017 Task 2	✔						Link
OKE 2017 Task 3	✔	✔	✔		✔		Link
OKE 2018 Task 1	✔						Link
OKE 2018 Task 2	✔						Link
OKE 2018 Task 3						✔	Link
OKE 2018 Task 4	✔					✔	Link
Ritter	✔	✔			✔		Link
Senseval 2	✔						Link
Senseval 3	✔						Link
UMBC-Test	✔	✔			✔		Link
UMBC-Train	✔	✔			✔		Link
WSDM 2012	✔						Link

Long term stability

The idea of GERBIL emerged in September 2014, when a couple of articles released at the same time each claimed to be state-of-the-art. Those approaches were not easily comparable because of their heterogeneous set-ups, datasets and evaluation metrics. Thus, we decided to build GERBIL and extend the BAT-Framework to lower the barrier for people who are not able to write source code.

GERBIL is now more than 3 years old and has hosted more than 50,000 experiments. It is currently hosted at the research and development unit of the University Leipzig Computation Center and at Paderborn University, which keep daily backups so that experiment results remain citable in the long term.

The survey data from our paper can be found at GERBIL's GitHub repository.

Contributors

The main developer of the project is Michael Röder.

We thank Ricardo Usbeck for the initial creation of the project and the development of the main idea. We also thank Lixi Conrads for the large amount of development work that they invested in the project.

Other people who contributed to the project are (in alphabetical order):

Ciro Baron (University Leipzig, Germany)
Lukas Blübaum (DICE group, Germany)
Andreas Both (R&D, Unister GmbH, Germany)
Martin Brümmer (University Leipzig, Germany)
Diego Ceccarelli (University of Pisa, Italy)
Marco Cornolti (University of Pisa, Italy)
Didier Cherix (R&D, Unister GmbH, Germany)
Bernd Eickmann (R&D, Unister GmbH, Germany)
Paolo Ferragina (University of Pisa, Italy)
Christiane Lemke (R&D, Unister GmbH, Germany)
Andrea Moro (Sapienza University of Rome, Italy)
Roberto Navigli (Sapienza University of Rome, Italy)
Francesco Piccinno (University of Pisa, Italy)
Giuseppe Rizzo (EURECOM, France)
Harald Sack (HPI Potsdam, Germany)
René Speck (DICE group, Germany)
Nikit Srivastava (DICE group, Germany)
Raphaël Troncy (EURECOM, France)
Jörg Waitelonis (HPI Potsdam, Germany)
Lars Wesemann (R&D, Unister GmbH, Germany)

We also thank all the contributors on GitHub.

Maintainer

Michael Röder

Staff

Hardik Shetty
Neha Pokharel
Rishikesh Yadav

See also

GERBIL KE
GERBIL QA
GERBIL KBC

Publications

A General Benchmarking Framework for Text Generation

By Diego Moussallem, Paramjot Kaur, Thiago Ferreira, Chris van der Lee, Anastasia Shimorina, Felix Conrads, Michael Röder, René Speck, Claire Gardent, Simon Mille, Nikolai Ilinykh, Axel-Cyrille Ngonga Ngomo

Proceedings of the 3rd International Workshop on Natural Language Generation from the Semantic Web (WebNLG+), 2020, #inproceedings

Benchmarking Question Answering Systems

By Ricardo Usbeck, Michael Röder, Michael Hoffmann, Felix Conrads, Jonathan Huthmann, Axel-Cyrille Ngonga Ngomo, Christian Demmler, Christina Unger

Semantic Web, 2019, #article

GERBIL - Benchmarking Named Entity Recognition and Linking consistently

By Michael Röder, Ricardo Usbeck, Axel-Cyrille Ngonga Ngomo

Semantic Web, 2018, #article

The Scalable Question Answering Over Linked Data (SQA) Challenge 2018

By Giulio Napolitano, Ricardo Usbeck, Axel-Cyrille Ngonga Ngomo

Semantic Web Challenges - 5th SemWebEval Challenge at ESWC 2018, Heraklion, Greece, June 3-7, 2018, Revised Selected Papers, 2018, #inproceedings

Techreport for GERBIL 1.2.2 - V1

By Michael Röder, Ricardo Usbeck, Axel-Cyrille Ngonga Ngomo

2016, #techreport

GERBIL -- General Entity Annotation Benchmark Framework

By Ricardo Usbeck, Michael Röder, Axel-Cyrille Ngonga Ngomo, Ciro Baron, Andreas Both, Martin Brümmer, Diego Ceccarelli, Marco Cornolti, Didier Cherix, Bernd Eickmann, Paolo Ferragina, Christiane Lemke, Andrea Moro, Roberto Navigli, Francesco Piccinno, Giuseppe Rizzo, Harald Sack, René Speck, Raphaël Troncy, Jörg Waitelonis, Lars Wesemann

24th WWW conference, 2015, #inproceedings

Developing a Sustainable Platform for Entity Annotation Benchmarks

Evaluating Entity Annotators Using GERBIL
