This repository shares the code and data of our work Multilingual Contrastive Decoding via Language-Agnostic Layers Skipping.
In this work, we identify a critical problem with the recent amateur-free contrastive decoding method, DoLa, when it is applied to non-English languages. We then propose a better amateur-free contrastive decoding approach, Skip Layer, which achieves significant performance improvements on both English and multilingual reasoning benchmarks.
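Background for readers unfamiliar with amateur-free contrastive decoding: the "expert" next-token distribution from the final layer is contrasted against an "amateur" distribution read out from the model itself, e.g. by early-exiting at an intermediate layer (as in DoLa) or by skipping selected layers (as in Skip Layer). The minimal sketch below illustrates this general idea only; it is not this repository's implementation, and the function name, the fixed early_layer index, and the plausibility threshold alpha are illustrative assumptions. It assumes hidden_states is the tuple returned by a transformers model called with output_hidden_states=True and lm_head is that model's output projection.

import math
import torch.nn.functional as F

def contrastive_next_token_scores(hidden_states, lm_head, early_layer=16, alpha=0.1):
    """Contrast the final-layer prediction with an early-exit 'amateur' prediction."""
    # Expert distribution: logits projected from the final layer at the last position.
    expert_logits = lm_head(hidden_states[-1][:, -1, :])
    # Amateur distribution: logits projected from an intermediate layer, i.e. the
    # prediction the model would make if the upper layers were skipped.
    # (In practice the model's final normalization is usually applied first.)
    amateur_logits = lm_head(hidden_states[early_layer][:, -1, :])

    expert_logp = F.log_softmax(expert_logits, dim=-1)
    amateur_logp = F.log_softmax(amateur_logits, dim=-1)

    # Adaptive plausibility constraint: only keep tokens the expert already
    # considers reasonably likely, then score by the expert-amateur contrast.
    plausible = expert_logp >= expert_logp.max(dim=-1, keepdim=True).values + math.log(alpha)
    scores = expert_logp - amateur_logp
    return scores.masked_fill(~plausible, float('-inf'))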
Our code is based on the transformers library. Install the dependencies before running the code:
pip install -r requirements.txt
We provide a simple integration with the transformers library. LLaMA, Mistral, and other similar models should be supported. To load a model with a contrastive decoding algorithm, you can use the following code:
from utils.setup_models import setup_model
from transformers import AutoTokenizer, LlamaForCausalLM

tokenizer = AutoTokenizer.from_pretrained('path/to/llama')
model = setup_model(
    algorithm='sl-h',            # ['direct', 'vanilla', 'dola', 'sl-h', 'sl-d']
    model_dir='path/to/llama',
    # prefix=...,                # required by 'sl-d'
    # amateur_model_dir=...,     # required by 'vanilla'
    model_cls=LlamaForCausalLM,  # required by 'sl-h' and 'sl-d'
)
# For the 'vanilla' algorithm, duplicate the input for the expert and amateur models:
# model.generate(**tokenizer(['hello'] * 2, return_tensors='pt'))
# For the other algorithms, use it as a regular LLaMA model:
model.generate(**tokenizer(['hello'], return_tensors='pt'))
You can find more details about the integration in utils/setup_models.py.
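For reference, here is a hypothetical end-to-end example for the 'vanilla' algorithm, which contrasts the expert model with a separate amateur model. The model paths, prompt, and generation settings are placeholders, and the exact behavior of the wrapper may differ slightly from this sketch.

from transformers import AutoTokenizer
from utils.setup_models import setup_model

tokenizer = AutoTokenizer.from_pretrained('path/to/llama')
model = setup_model(
    algorithm='vanilla',
    model_dir='path/to/llama',                # expert model
    amateur_model_dir='path/to/small-llama',  # required by 'vanilla'
)

# 'vanilla' runs the expert and the amateur on the same prompt, so the input
# is duplicated in the batch.
inputs = tokenizer(['Question: What is 12 * 7?'] * 2, return_tensors='pt')
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))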
To replicate the experiments conducted in our paper, run the following command:
bash run.sh
Note: Results can differ due to numerical precision and the evaluation environment. To get results closer to those reported in our paper, you can enable FP32 inference (see ENABLE_FP32 in run.sh).
The prediction results will be located in eval/outputs. The accuracy of the results can be computed with the script compute_accuracy.py.
If you find this repository helpful, feel free to cite our paper:
@article{zhu2024multilingual,
  title={Multilingual Contrastive Decoding via Language-Agnostic Layers Skipping},
  author={Zhu, Wenhao and Liu, Sizhe and Huang, Shujian and She, Shuaijie and Wendler, Chris and Chen, Jiajun},
  journal={arXiv preprint arXiv:2407.10795},
  year={2024}
}