Template and example code for the paper: Towards Cross-Lingual Generalization of Translation Gender Bias (ACM FAccT 2021)
All files are plain txt in UTF-8 encoding, and each word/sentence is separated by a newline ('\n'); a loading sketch follows the file list below.
word_list:
- occupation word list used in template (187 words): EN, KR, TL
- adjective word list used in template (62 words): EN
- noun word list used in template (68 words): KR, TL
template: sentences given to translators
- EN
- KR
- TL
gold_standard: reference sentences compared with the output sentences
- DE
- EN
- PT
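Since every file is plain UTF-8 text with one item per line, a simple line-based reader is enough. The sketch below illustrates this; the paths are placeholders, not the exact repo layout.

```python
# Hypothetical loader for the newline-separated word lists / templates.
# The file paths are placeholders; substitute the actual repo paths.
def load_lines(path):
    """Return the non-empty lines of a UTF-8 file, one item per line."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

occupations_en = load_lines("word_list/occupation_EN.txt")  # assumed path
templates_en = load_lines("template/EN.txt")                # assumed path
print(len(occupations_en), len(templates_en))
```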
Both the bleu and bertScore evaluations were run on Linux with Python 3.6+.
bertScore:
You can find our example on Google Colab.
Note
- a GPU is usually necessary.
- the maximum length is limited to 510 tokens (512 after adding [CLS]/[SEP]), since we use bert-base-multilingual-cased as the default model.
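A minimal sketch of the BERTScore computation, assuming the bert_score Python package and placeholder file names (output.txt for system output, gold.txt for the gold standard); the Colab example above is the authoritative version.

```python
# Hedged sketch: the file names are placeholders, not the repo's actual paths.
from bert_score import score

def load_lines(path):
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

candidates = load_lines("output.txt")  # translated sentences (assumed name)
references = load_lines("gold.txt")    # gold_standard sentences (assumed name)

# bert-base-multilingual-cased is the default model noted above; inputs are
# capped at 510 subword tokens (512 with [CLS]/[SEP]).
P, R, F1 = score(
    candidates,
    references,
    model_type="bert-base-multilingual-cased",
    verbose=True,
)
print(f"mean BERTScore F1: {F1.mean().item():.4f}")
```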
bleu:
Our bleu evaluation example can be found here
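A minimal corpus-BLEU sketch, assuming sacrebleu and the same placeholder file names as above; the repo's own example, linked above, is the reference version.

```python
# Hedged sketch: sacrebleu and the file names are assumptions.
import sacrebleu

def load_lines(path):
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

hypotheses = load_lines("output.txt")  # translated sentences (assumed name)
references = load_lines("gold.txt")    # reference sentences (assumed name)

# sacrebleu expects a list of reference streams (one list per reference set).
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU: {bleu.score:.2f}")
```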
*Won Ik Cho
*Jiwon Kim
Jaeyeong Yang
Nam Soo Kim
*: equal contribution
If you find this repo useful, please cite this:
@inproceedings{10.1145/3442188.3445907,
  author = {Cho, Won Ik and Kim, Jiwon and Yang, Jaeyeong and Kim, Nam Soo},
  title  = {Towards Cross-Lingual Generalization of Translation Gender Bias},
  year   = {2021},
  url    = {https://doi.org/10.1145/3442188.3445907},
}