Skip to content

own-pt/rte-sick

Repository files navigation

https://wiki.cimec.unitn.it/tiki-index.php?page=CLIC
https://zenodo.org/record/2787612#.Xpc-Sy85T1A


The idea is to run the sentences A and B from the SICK.txt corpus
through FreeLing, and use its tokenization to feed SynaxNet to obtain
the UD dependencies.  We generate a CONLL file with the SynaxNet
output, augmented with the lemmas and senses obtained from FreeLing.

We needed to disable FreeLing's MWE module.

We also add the original POS tag from Freeling to the MISC field so we
can compare to SyntaxNet's.  The POS tag from Freeling is relevant to
understand the sense selected.

To reproduce:

1. Download SICK.txt from here:
   http://clic.cimec.unitn.it/composes/sick.html

2. Install Freeling 4.0 from:

http://nlp.cs.upc.edu/freeling/

3. Install SyntaxNet from: 

https://github.com/tensorflow/models/tree/master/syntaxnet

4. Download the English UD pre-trained model from:

https://github.com/tensorflow/models/blob/master/syntaxnet/universal.md

5. Download the SUMO knowledge base from:

https://github.com/ontologyportal/sumo

6. Fix the paths in env.sh, process-freeling.py, and parse.sh to point
   to your local installation paths.

7. Run the stats.sh script

See README in pairs/ and derivated/ for more information.

Releases

No releases published

Packages

No packages published

Languages