The two packages below need to be installed manually.
- For TorchFly, you need to install apex first.
# Fork of apex modified to fix an error caused by the CUDA version
git clone https://github.com/qywu/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
- You need to install OpenCC (https://github.com/BYVoid/OpenCC) for text pre-processing.
sudo apt-get install opencc
pip install opencc-python-reimplemented
For the remaining packages, use:
pip install -r requirements.txt
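After the installation steps, a quick check that the manually installed packages are visible to Python can save a failed training run. The snippet below is a minimal sketch, not part of this repository. The import names `apex`, `opencc`, and `torchfly` are assumptions based on the package names, and `find_missing` is a hypothetical helper.

```python
import importlib.util

# Import names are assumptions; the README only gives the install commands.
REQUIRED_MODULES = ["apex", "opencc", "torchfly"]


def find_missing(modules):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All manually installed packages were found.")
```

`find_spec` only looks the modules up without importing them, so the check stays fast and does not need a GPU.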
You can download our processed version and extract it under /data. [Google Drive]
Or you can download the PHED and Headline Generation datasets separately and process them with the code under /notebooks.
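For the first option (the processed download), extraction could be scripted roughly as follows. This is a sketch under assumptions: the archive name `processed_data.zip` is hypothetical, `/data` is taken to mean the `data` directory in the repository root, and `extract_dataset` is an illustrative helper rather than code from this project.

```python
import shutil
from pathlib import Path


def extract_dataset(archive_path, data_dir="data"):
    """Unpack a downloaded archive into data_dir and return the extracted entry names."""
    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)
    # unpack_archive infers the format (zip, tar, tar.gz, ...) from the file extension
    shutil.unpack_archive(str(archive_path), str(target))
    return sorted(p.name for p in target.iterdir())


# Example call (hypothetical file name, adjust to the file you downloaded):
# extract_dataset("processed_data.zip", "data")
```

Listing the extracted entries afterwards makes it easy to confirm that the files ended up where the notebooks expect them.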
See /notebooks/Train PAS.ipynb
We will try to upload the model to Hugging Face's custom model repository later. For now, you can try the Google Colab example: https://colab.research.google.com/drive/1cvBSt2uF7hYL1feDGt0dkCxIeaVXQs5x