
# Hierarchical evaluation measures

Implementation of the hierarchical F-measure (hF), hierarchical precision (hP), hierarchical recall (hR), and exact precision. The script was developed for evaluating type quality in DBpedia.
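The hierarchical measures are conventionally computed by first augmenting each instance's type set with all ancestors of those types in the ontology, then micro-averaging set overlap across instances. The sketch below illustrates that convention; the function and variable names (`augment`, `ancestors`, etc.) are illustrative and are not taken from `computeHmeasures.py`, whose exact conventions (e.g. how instances missing from the prediction are counted) may differ.

```python
def augment(types, ancestors):
    """Extend a set of types with all ancestors of each type in the hierarchy."""
    extended = set(types)
    for t in types:
        extended |= ancestors.get(t, set())
    return extended

def hierarchical_scores(gold, predicted, ancestors):
    """Micro-averaged hP, hR, hF over instances present in both datasets.

    gold, predicted: dict mapping instance -> set of type names
    ancestors: dict mapping type name -> set of its ancestor types
    """
    correct = pred_total = gold_total = 0
    for instance, gold_types in gold.items():
        if instance not in predicted:
            continue
        g = augment(gold_types, ancestors)
        p = augment(predicted[instance], ancestors)
        correct += len(g & p)      # overlap of augmented type sets
        pred_total += len(p)
        gold_total += len(g)
    hp = correct / pred_total
    hr = correct / gold_total
    hf = 2 * hp * hr / (hp + hr)   # harmonic mean of hP and hR
    return hp, hr, hf
```

For example, with the hierarchy `{"Dog": {"Animal"}, "Cat": {"Animal"}}`, a gold type `Dog` and a predicted type `Cat` share only the ancestor `Animal`, giving hP = hR = hF = 0.5.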

# Example execution

```
python computeHmeasures.py "en.lhd.core.2014.nt" "dbpedia_2014.owl" "gs3-toDBpedia2014.nt" "en.lhd.core.gs3.log"
```

# Input files - gold standard datasets

The datasets are described in our JWS paper.

Additional details can be found at http://ner.vse.cz/datasets/linkedhypernyms/

# Output

```
reading gs
reading predicted
finished reading input datasets
total instances in groundtruth:1033.0
total instances in intersection of groundtruth and prediction:402.0
hP:0.864357864358
hR:0.370553665326
hF:0.518726997186
Precision (exact):0.654228855721
```
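The reported hF is the harmonic mean of the reported hP and hR, which can be checked directly from the output above:

```python
# Recompute hF from the hP and hR values printed by the script.
hP = 0.864357864358
hR = 0.370553665326
hF = 2 * hP * hR / (hP + hR)  # harmonic mean, close to the reported 0.518726997186
```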