
How to add a new Annotator

MichaelRoeder edited this page Nov 6, 2014 · 16 revisions

At the moment, there are two ways to add a wrapper for a new annotator to GERBIL (solutions A and B in step 2).

1. Find the correct category

First, you need to find the correct category for your annotator, following the classification in the paper by Cornolti et al. [1]:

  • C2W - The annotator adds a list of concepts to a given text (without their position inside the text)
  • Rc2W - The same as C2W but ranks the concepts by their importance for the text
  • Sc2W - The same as C2W but adds a score to every concept
  • D2W - The annotator gets a text with marked named entities and links these named entities to Wikipedia
  • A2W - The annotator gets a text, searches for named entities and links them to Wikipedia
  • Sa2W - The same as A2W but every named entity gets a score

[1] http://dl.acm.org/citation.cfm?id=2488411 .
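
The distinctions drawn in the list above can be summarised in a small lookup sketch. The enum below is purely illustrative and is not part of GERBIL or the BAT-Framework: the name AnnotatorCategory, its four flags and the choose helper are assumptions made for this example. It only encodes what the list states (whether positions are reported, whether the input text is already marked, and whether the results are ranked or scored), with Sc2W denoting the scored variant of C2W.

```java
/**
 * Illustrative summary of the six categories.
 * NOT a GERBIL or BAT-Framework class; all names here are hypothetical.
 */
enum AnnotatorCategory {
    // flags: positions, inputMarked, ranked, scored
    C2W(false, false, false, false),
    RC2W(false, false, true, false),
    SC2W(false, false, false, true),
    D2W(true, true, false, false),
    A2W(true, false, false, false),
    SA2W(true, false, false, true);

    private final boolean positions;   // results carry positions inside the text
    private final boolean inputMarked; // named entities are already marked in the input
    private final boolean ranked;      // results are ranked by importance
    private final boolean scored;      // every result carries a score

    AnnotatorCategory(boolean positions, boolean inputMarked,
                      boolean ranked, boolean scored) {
        this.positions = positions;
        this.inputMarked = inputMarked;
        this.ranked = ranked;
        this.scored = scored;
    }

    /** Returns the category whose flags match exactly, or throws if none does. */
    static AnnotatorCategory choose(boolean positions, boolean inputMarked,
                                    boolean ranked, boolean scored) {
        for (AnnotatorCategory c : values()) {
            if (c.positions == positions && c.inputMarked == inputMarked
                    && c.ranked == ranked && c.scored == scored) {
                return c;
            }
        }
        throw new IllegalArgumentException("No category matches this combination");
    }

    public static void main(String[] args) {
        // An annotator that finds named entities itself and scores each of them:
        System.out.println(choose(true, false, false, true)); // prints SA2W
    }
}
```

Combinations that the list does not describe (for example, a ranked annotator that also reports positions) deliberately raise an exception instead of guessing a category.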

2. Implementing an Adapter

First, you need to implement an adapter that is used to communicate with the annotator.

Solution A: a BAT-Framework adapter (deprecated)

This is the "old" way to add an annotator. You have to write an adapter for your system implementing its category, e.g., 'Sa2WSystem'. This implementation requires a closer look at the BAT-Framework itself. A good starting point is to examine the existing adapters, such as the ones for Spotlight or AGDISTIS.

Note: Using this solution requires you to perform step 3.

Solution B: a NIF based web service

For this solution, you implement a simple NIF-based web service that acts as a wrapper of the annotator. The GERBIL system communicates with this web service.

First, you have to get the gerbil.nif.transfer library, which can be downloaded from http://139.18.2.164/mroeder/gerbil/gerbil.nif.transfer-0.0.1.zip . If you are using Maven and have extracted the zip file, you can install it locally using

mvn install:install-file -Dfile=target/gerbil.nif.transfer-0.0.1.jar -Dpackaging=jar -Djavadoc=target/gerbil.nif.transfer-0.0.1-javadoc.jar -Dsources=target/gerbil.nif.transfer-0.0.1-sources.jar -DpomFile=pom.xml

The GERBIL project has a branch named SpotWrapNifWS4Test, which can be used as an example of such a NIF-based web service. From this example, only the class org.aksw.gerbil.ws4test.SpotlightResource has to be copied and adapted in the following way.

        // ... this is only the parsing of an incoming document
        Document document;
        try {
            document = parser.getDocumentFromNIFReader(inputReader);
        } catch (Exception e) {
            LOGGER.error("Exception while reading request.", e);
            return "";
        }
        // If your system is only for entity linking, the document object
        // should already contain a list of markings
        List<Marking> markings = document.getMarkings();
        String text = document.getText();

        // Now we have the text and a list of markings (this could be
        // empty or contain Span objects which would mark the named
        // entities inside the text) and could call your system for
        // performing the entity linking task...

        // ... as a result, a list of NamedEntity or ScoredNamedEntity objects
        // should be created for the A2W or Sa2W tasks respectively. For
        // C2W, Rc2W or Sc2W you should create a list of Annotations or
        // ScoredAnnotations
        List<Marking> entities = new ArrayList<Marking>(markings.size());
        entities.add(new NamedEntity( ... ));

        // ... this new list is added to the document and the document is
        // sent back to GERBIL
        document.setMarkings(entities);
        String nifDocument = creator.getDocumentAsNIFString(document);
        return nifDocument;
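
The linking step that the excerpt above leaves open ("call your system") can be pictured with a self-contained toy example. This is only a sketch under stated assumptions: MarkedSpan, LinkedEntity and the dictionary lookup are hypothetical stand-ins invented for this illustration, not classes of the gerbil.nif.transfer library, and the Wikipedia URLs are sample data. In a real adapter, the spans would come from document.getMarkings() and the results would be turned into NamedEntity objects as shown in the excerpt.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-alone sketch of a linking step; none of these types are GERBIL classes. */
class LinkingSketch {

    /** Hypothetical stand-in for a marked span: start offset and length in the text. */
    static final class MarkedSpan {
        final int start;
        final int length;

        MarkedSpan(int start, int length) {
            this.start = start;
            this.length = length;
        }
    }

    /** Hypothetical stand-in for a linked entity: a span plus the URI it was linked to. */
    static final class LinkedEntity {
        final int start;
        final int length;
        final String uri;

        LinkedEntity(int start, int length, String uri) {
            this.start = start;
            this.length = length;
            this.uri = uri;
        }
    }

    /** Toy linker: looks up the surface form of every span in a fixed dictionary. */
    static List<LinkedEntity> link(String text, List<MarkedSpan> spans,
                                   Map<String, String> dictionary) {
        List<LinkedEntity> entities = new ArrayList<>(spans.size());
        for (MarkedSpan span : spans) {
            String surfaceForm = text.substring(span.start, span.start + span.length);
            String uri = dictionary.get(surfaceForm);
            if (uri != null) { // spans without a dictionary entry stay unlinked
                entities.add(new LinkedEntity(span.start, span.length, uri));
            }
        }
        return entities;
    }

    public static void main(String[] args) {
        String text = "Berlin is the capital of Germany.";

        // Sample dictionary standing in for the actual annotator
        Map<String, String> dictionary = new LinkedHashMap<>();
        dictionary.put("Berlin", "http://en.wikipedia.org/wiki/Berlin");
        dictionary.put("Germany", "http://en.wikipedia.org/wiki/Germany");

        List<MarkedSpan> spans = new ArrayList<>();
        spans.add(new MarkedSpan(0, 6));  // "Berlin"
        spans.add(new MarkedSpan(25, 7)); // "Germany"

        for (LinkedEntity entity : link(text, spans, dictionary)) {
            System.out.println(entity.start + "+" + entity.length + " -> " + entity.uri);
        }
    }
}
```

The sketch covers the case in which the spans are already given. An annotator that has to find the named entities itself would need an additional recognition step before the lookup.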

Once deployed, the web service can communicate with GERBIL without performing the third step. You simply have to enter its URL on the configuration screen of an experiment.

3. Adding the annotator permanently (optional)

If you chose solution B in the previous step, you do not need to perform this step, since you can already add your annotator as a NIF-based web service during the configuration of a GERBIL experiment. If you want to add it permanently, or if you chose solution A, you have to create an AnnotatorConfiguration. For this step you can take a look at the existing configurations of Spotlight or AGDISTIS.

Afterwards, you have to add your annotator configuration to the getInstance() method of the class org.aksw.gerbil.utils.AnnotatorMapping. We know that this solution is not ideal, and we plan to replace it with a new one in the coming weeks.