This is the official PyTorch implementation for the paper:
JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding
We propose JiuZhang, which is developed based on the Transformer architecture and consists of a shared Transformer encoder, a decoder for the understanding tasks (U-decoder), and a decoder for the generation tasks (G-decoder).
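For intuition, here is a minimal sketch of that layout in plain PyTorch. It is not the repo's actual implementation; the class name, layer counts, and `forward` signature are illustrative assumptions only.

```python
# Illustrative sketch only: a shared encoder feeding two task-specific
# decoders, mirroring the U-decoder / G-decoder split described above.
# All names and hyperparameters here are assumptions, not the repo's code.
import torch
import torch.nn as nn

class SharedEncoderTwoDecoders(nn.Module):
    def __init__(self, d_model=768, nhead=12, enc_layers=6, dec_layers=6):
        super().__init__()
        self.shared_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead), num_layers=enc_layers)
        # U-decoder: used for the understanding tasks
        self.u_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead), num_layers=dec_layers)
        # G-decoder: used for the generation tasks
        self.g_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead), num_layers=dec_layers)

    def forward(self, src, tgt, generate=False):
        memory = self.shared_encoder(src)  # shared representation
        decoder = self.g_decoder if generate else self.u_decoder
        return decoder(tgt, memory)

# Smoke test with random embeddings of shape (seq_len, batch, d_model)
model = SharedEncoderTwoDecoders()
out = model(torch.randn(16, 2, 768), torch.randn(16, 2, 768), generate=True)
print(out.shape)  # torch.Size([16, 2, 768])
```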
torch==1.10.0
transformers==4.10.0
datasets==1.11.0
jieba
The datasets cannot be shared at the moment for commercial reasons.
Please download the initial model from https://huggingface.co/fnlp/cpt-base.
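As a quick sanity check, the snippet below downloads the checkpoint to a local directory (so it can be passed as `model_name_or_path`) and loads its tokenizer. It assumes the `huggingface_hub` package is installed; per the CPT model card, the checkpoint ships a BERT-style Chinese vocabulary.

```python
# Sketch: fetch the CPT-base checkpoint and verify the tokenizer loads.
# Assumes `huggingface_hub` is installed alongside `transformers`.
from huggingface_hub import snapshot_download
from transformers import BertTokenizer

local_dir = snapshot_download(repo_id="fnlp/cpt-base")  # cached local copy

# Per the CPT model card, the tokenizer is a BERT-style Chinese tokenizer.
tokenizer = BertTokenizer.from_pretrained(local_dir)
print(tokenizer.tokenize("求函数 f(x) 的最小值"))
```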
We provide the training scripts for the three curriculum courses in stages 1, 2, and 3, respectively. You can run pre-training on a single GPU with:
bash scripts/stage_{1 or 2 or 3}.sh
or run distributed data parallel pre-training on multiple GPUs with:
bash scripts/stage_{1 or 2 or 3}_ddp.sh
You can find more details about the training arguments in the official Hugging Face documentation. We explain some special arguments here:
- model_name_or_path - Directory of the model checkpoint used for weight initialization. Put your downloaded base model here.
- data_path - Your pre-processed training data, saved in the Hugging Face `Dataset` format. We store the problem statement and the answer analysis under the 'content' and 'analysis' keys (see the sketch after this list).
- add_token_path - Your corpus may contain important tokens that the pre-trained model's tokenizer cannot split correctly, such as mathematical symbols. You can add them to the vocabulary with this argument; their embeddings are trained from scratch.
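A minimal sketch of preparing both inputs is shown below. The example text, file paths, and token list are placeholders, and the one-token-per-line format for the added-token file is an assumption, not something this repo specifies.

```python
# Sketch: build a toy corpus in the expected `datasets` format with
# 'content' / 'analysis' keys, plus a plain-text added-token list.
# All paths and example strings here are placeholders.
from datasets import Dataset

examples = {
    "content": ["已知函数 f(x)=x^2-2x，求 f(x) 的最小值。"],  # problem statement
    "analysis": ["f(x)=(x-1)^2-1，故最小值为 -1。"],          # answer analysis
}
Dataset.from_dict(examples).save_to_disk("data/pretrain_corpus")  # --data_path

# Assumed format: one new token per line; these tokens are added to the
# vocabulary and their embeddings are trained from scratch.
with open("data/add_tokens.txt", "w", encoding="utf-8") as f:     # --add_token_path
    f.write("\n".join(["≌", "∠", "⊥"]))
```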
Please consider citing our paper if you use our code:
@inproceedings{zhao2022jiuzhang,
  title={JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding},
  author={Zhao, Wayne Xin and Zhou, Kun and Gong, Zheng and Zhang, Beichen and Zhou, Yuanhang and Sha, Jing and Chen, Zhigang and Wang, Shijin and Liu, Cong and Wen, Ji-Rong},
  booktitle={Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  pages={4571--4581},
  year={2022}
}