JiuZhang

This is the official PyTorch implementation for the paper:

JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding

Overview

We propose JiuZhang, which is built on the Transformer architecture and consists of a shared Transformer encoder, a decoder for understanding tasks (the $U$-decoder), and a decoder for generation tasks (the $G$-decoder). We also design a curriculum pre-training approach that improves the model's understanding of mathematical knowledge and logic by progressing from basic to advanced courses.
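
As a rough illustration, the shared-encoder/dual-decoder layout can be sketched in PyTorch as follows. This is a minimal conceptual sketch under our own assumptions (layer sizes are placeholders, and positional encodings and attention masks are omitted), not the released implementation:

import torch.nn as nn

class JiuZhangSketch(nn.Module):
    # Conceptual sketch only: one shared encoder feeding two task-specific decoders.
    def __init__(self, vocab_size=21128, d_model=768, nhead=12, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)    # shared encoder
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        # nn.TransformerDecoder deep-copies the layer, so the two decoders do not share weights.
        self.u_decoder = nn.TransformerDecoder(dec_layer, num_layers)  # understanding tasks
        self.g_decoder = nn.TransformerDecoder(dec_layer, num_layers)  # generation tasks
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids, task="u"):
        memory = self.encoder(self.embed(src_ids))  # encode once, shared by both decoders
        decoder = self.u_decoder if task == "u" else self.g_decoder
        return self.lm_head(decoder(self.embed(tgt_ids), memory))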

Requirements

torch==1.10.0
transformers==4.10.0
datasets==1.11.0
jieba
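
For example, the pinned versions can be installed with pip:

pip install torch==1.10.0 transformers==4.10.0 datasets==1.11.0 jieba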

Dataset

The datasets cannot be shared at the moment for commercial reasons.

Curriculum Pre-Training

Base Model

Please download the initial model from https://huggingface.co/fnlp/cpt-base.
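
If git-lfs is available, one way to fetch the checkpoint into a local directory (which you can later pass as model_name_or_path) is:

git lfs install
git clone https://huggingface.co/fnlp/cpt-base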

Scripts

We provide the training scripts for the three courses as stages 1, 2, and 3, respectively. You can run pre-training on a single GPU with:

bash scripts/stage_{1 or 2 or 3}.sh

or run distributed data parallel pre-training on multiple GPUs with:

bash scripts/stage_{1 or 2 or 3}_ddp.sh
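
For reference, torch 1.10 launches distributed data parallel training through torch.distributed.launch. A hypothetical invocation (the entry-point script and arguments below are placeholders; the real command is in the stage_*_ddp.sh scripts) would look like:

python -m torch.distributed.launch --nproc_per_node=4 run_pretrain.py --model_name_or_path cpt-base --data_path data/pretrain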

Arguments

You can check the official Hugging Face Transformers documentation for details about the standard training arguments. We explain the repository-specific arguments here.

  • model_name_or_path - Directory of the model checkpoint used for weight initialization. Point this to the base model you downloaded above.
  • data_path - Your pre-processed training data, saved in the datasets library's Dataset format. We store the problem statement and the answer process under the 'content' and 'analysis' keys, respectively.
  • add_token_path - Your corpus may contain important tokens that the pre-trained model's tokenizer cannot split correctly, such as mathematical symbols. This argument lets you add them to the vocabulary; their embeddings are trained from scratch. A sketch covering both data arguments follows this list.
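
To make the two data-related arguments concrete, here is a minimal sketch (the file paths and example text are placeholders, not part of this repository):

from datasets import Dataset
from transformers import BertTokenizer

# Toy pre-training data; pass the saved directory via data_path.
# 'content' holds the problem statement, 'analysis' the answer process.
examples = {
    "content": ["已知 $x^2-5x+6=0$，求 $x$。"],  # "given x^2-5x+6=0, solve for x"
    "analysis": ["因式分解得 $(x-2)(x-3)=0$，故 $x=2$ 或 $x=3$。"],  # factor and solve
}
Dataset.from_dict(examples).save_to_disk("data/pretrain")

# Extra math tokens, one per line, in the file given to add_token_path.
# CPT's model card loads its tokenizer with BertTokenizer.
tokenizer = BertTokenizer.from_pretrained("cpt-base")
new_tokens = [line.strip() for line in open("math_tokens.txt", encoding="utf-8")]
tokenizer.add_tokens(new_tokens)
# The training code must then call model.resize_token_embeddings(len(tokenizer))
# so the new tokens get fresh, trainable embeddings.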

Citation

Please consider citing our paper if you use our code.

@inproceedings{zhao2022jiuzhang,
  title={JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding},
  author={Zhao, Wayne Xin and Zhou, Kun and Gong, Zheng and Zhang, Beichen and Zhou, Yuanhang and Sha, Jing and Chen, Zhigang and Wang, Shijin and Liu, Cong and Wen, Ji-Rong},
  booktitle={Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  pages={4571--4581},
  year={2022}
}
