This repository contains the unified Chinese discourse dependency dataset described in the paper "Unifying Discourse Resources with Dependency Framework". The dataset currently comprises SciCDTB, developed by Peking University, and SU-CDTB_{dep}, converted from the CDTB corpus developed by Soochow University.
If you find our data useful in your research, please consider citing:
@article{cheng2021unifying,
title={Unifying Discourse Resources with Dependency Framework},
author={Cheng, Yi and Li, Sujian and Li, Yueyuan},
journal={arXiv preprint arXiv:2101.00167},
year={2021}
}
Soochow University has authorized us to release SU-CDTB_{dep}. If you use this dataset, please also cite the original CDTB paper, "Building Chinese Discourse Corpus with Connective-driven Dependency Tree Structure":
@inproceedings{li-etal-2014-building,
title = "Building {C}hinese Discourse Corpus with Connective-driven Dependency Tree Structure",
author = "Li, Yancui and Feng, Wenhe and Sun, Jing and Kong, Fang and Zhou, Guodong",
booktitle = "Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing ({EMNLP})",
year = "2014",
publisher = "Association for Computational Linguistics",
doi = "10.3115/v1/D14-1224",
pages = "2105--2114",
}
To obtain our converted HIT-CDTB_{dep} discourse dependency corpus, please first obtain authorization to use HIT-CDTB.