This repository contains the code for the following paper:
Improving Aspect Term Extraction with Bidirectional Dependency Tree Representation. Huaishao Luo, Tianrui Li, Bing Liu, Bin Wang and Herwig Unger. TASLP, 2019, 27(7):1201-1212.
This paper proposes a bidirectional dependency tree network to extract dependency structure features from the given sentences to improve the performance of aspect term extraction (ATE).
- python 2.7
- tensorflow==1.2.0
pip install -r requirements.txt
python main.py --do_preprocess
python main.py --do_train --do_evaluate --lr 0.001 --batch_size 20 --nepochs 100
See main.py for more training arguments.
Here is an example that briefly introduces the structural information the model needs to organize during processing. This information is complex because a dependency tree can have an arbitrary number of layers, and each node can have an arbitrary number of dependents. Whether we work bottom-up or top-down, we must first compute the nodes that lie at the bottom of the layers.
Sentence
Speaking_0 of_1 the_2 browser_3 ,_4 it_5 too_6 has_7 problems_8 ._9
Bottom-up
[8, 3, 5, 7, 0, 2, 6, 9, 4, 1]
[u',', u'.', u'it', u'of', u'problems', u'the', u'too', u'browser', u'Speaking', u'has']
[[], [], [], [], [], [], [], [3, 5], [7], [8, 0, 2, 6, 4, 1]]
[[], [], [], [], [], [], [], [u'case', u'det'], [u'nmod'], [u'advcl', u'punct', u'nsubj', u'advmod', u'dobj', u'punct']]
[0, 0, 0, 0, 0, 0, 0, 2, 1, 6]
[4, 9, 5, 1, 8, 2, 6, 3, 0, 7]
1st: ids in the new sequence, used to map back to the original sequence
2nd: new sequence used to model bottom-up dependency information
3rd: ids of the dependent tokens of each token (new sequence)
4th: dependency relations of the dependent tokens of each token (new sequence)
5th: number of dependent tokens of each token (new sequence)
6th: ids in the original sequence, used to map to the new sequence
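The six bottom-up lines can be reproduced from the per-token heads and relations read off the example. The following is a minimal sketch, not the repository's implementation: the names `heads`, `rels`, and `bottom_up` are hypothetical, and the ordering rule (sort by subtree height, break ties by token string) is an assumption that happens to match the sequence shown above. Any ordering in which every dependent precedes its head would serve the same purpose.

```python
# Hypothetical sketch: names and the tie-breaking rule are assumptions,
# not taken from the repository's code.
tokens = ["Speaking", "of", "the", "browser", ",", "it", "too", "has", "problems", "."]
# Head (original id) of each token, read off the example; -1 marks the root.
heads = [7, 3, 3, 0, 7, 7, 7, -1, 7, 7]
rels = ["advcl", "case", "det", "nmod", "punct", "nsubj", "advmod", "ROOT", "dobj", "punct"]


def bottom_up(tokens, heads, rels):
    n = len(tokens)
    # children[h] lists the dependents of token h in original order.
    children = [[] for _ in range(n)]
    for i, h in enumerate(heads):
        if h >= 0:
            children[h].append(i)

    def height(i):
        # Leaves have height 0; a head sits one level above its tallest dependent.
        if not children[i]:
            return 0
        return 1 + max(height(c) for c in children[i])

    # Assumed ordering: by height, ties broken by token string, so that
    # every dependent comes before its head.
    order = sorted(range(n), key=lambda i: (height(i), tokens[i]))  # 6th line
    inverse = [0] * n                                               # 1st line
    for new_id, old_id in enumerate(order):
        inverse[old_id] = new_id
    new_tokens = [tokens[i] for i in order]                         # 2nd line
    dep_ids = [[inverse[c] for c in children[i]] for i in order]    # 3rd line
    dep_rels = [[rels[c] for c in children[i]] for i in order]      # 4th line
    dep_num = [len(d) for d in dep_ids]                             # 5th line
    return inverse, new_tokens, dep_ids, dep_rels, dep_num, order


inverse, new_tokens, dep_ids, dep_rels, dep_num, order = bottom_up(tokens, heads, rels)
# Indexing the new sequence with the 1st line restores the original order.
restored = [new_tokens[j] for j in inverse]
```

The 1st and 6th lines are inverse permutations of each other, which is why indexing the new sequence with the 1st line recovers the original sentence.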
Top-down
[1, 6, 4, 2, 9, 7, 3, 0, 5, 8]
[u'has', u'Speaking', u'browser', u'too', u'the', u'problems', u'of', u'it', u'.', u',']
[[], [0], [1], [0], [2], [0], [2], [0], [0], [0]]
[[u'I_ROOT'], [u'I_advcl'], [u'I_nmod'], [u'I_advmod'], [u'I_det'], [u'I_dobj'], [u'I_case'], [u'I_nsubj'], [u'I_punct'], [u'I_punct']]
[0, 1, 1, 1, 1, 1, 1, 1, 1, 1]
[7, 0, 3, 6, 2, 8, 1, 5, 9, 4]
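Each line has the same meaning as in the bottom-up case, except that the 3rd to 5th lines describe the head of each token rather than its dependents, and the relation labels carry an I_ prefix.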
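In this example the top-down sequence is exactly the bottom-up sequence reversed, so every head precedes its dependents. The sketch below rebuilds the six top-down lines under that observation. It is illustrative only: `top_down`, `heads`, `rels`, and `bottom_up_order` are hypothetical names, and whether the repository derives the order by reversal is an assumption.

```python
# Hypothetical sketch: names are assumptions, not taken from the repository's code.
tokens = ["Speaking", "of", "the", "browser", ",", "it", "too", "has", "problems", "."]
# Head (original id) of each token, read off the example; -1 marks the root.
heads = [7, 3, 3, 0, 7, 7, 7, -1, 7, 7]
rels = ["advcl", "case", "det", "nmod", "punct", "nsubj", "advmod", "ROOT", "dobj", "punct"]
# 6th line of the bottom-up example (original ids in bottom-up order).
bottom_up_order = [4, 9, 5, 1, 8, 2, 6, 3, 0, 7]


def top_down(tokens, heads, rels, bottom_up_order):
    n = len(tokens)
    # Reversing a dependents-first order yields a heads-first order.
    order = bottom_up_order[::-1]                                   # 6th line
    inverse = [0] * n                                               # 1st line
    for new_id, old_id in enumerate(order):
        inverse[old_id] = new_id
    new_tokens = [tokens[i] for i in order]                         # 2nd line
    # Each token has at most one head; the root has none.
    head_ids = [[inverse[heads[i]]] if heads[i] >= 0 else [] for i in order]  # 3rd line
    head_rels = [["I_" + rels[i]] for i in order]                   # 4th line
    head_num = [len(h) for h in head_ids]                           # 5th line
    return inverse, new_tokens, head_ids, head_rels, head_num, order


inverse, new_tokens, head_ids, head_rels, head_num, order = top_down(
    tokens, heads, rels, bottom_up_order)
```

Note that the root keeps its I_ROOT label in the 4th line even though its id list is empty and its count is 0, as in the example above.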
Dependency tree (original ids shown in red)
If the paper or this repository is helpful, please cite as:
@Article{Luo2019,
author = {Huaishao Luo and Tianrui Li and Bing Liu and Bin Wang and Herwig Unger},
title = {Improving Aspect Term Extraction with Bidirectional Dependency Tree Representation},
journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
year = {2019},
volume = {27},
number = {7},
pages = {1201-1212},
doi = {10.1109/TASLP.2019.2913094}
}