NLLB-200 Distilled-350M_en2ko

The NLLB-200 model showed outstanding translation performance and helped address the problems of low-resource languages. Even so, the 600M model, let alone the models with more than 1B parameters, is still hard to run without sufficient computing resources. So I built a much smaller model that specializes in translating English to Korean. It can also run on a CPU (no mixed precision, no quantization). [우당탕탕 개발일지]

Model

Model: based on NLLB-200 600M
- Parameters: 350,537,728 (350M)
- Encoder layers: 12 -> 3
- Decoder layers: 12 -> 3
- FFN dimension: 4096 (same)
- Embed dimension: 1024 (same)
- Vocab size: 256206 (same)
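
The parameter count listed above can be reproduced with simple arithmetic. The sketch below assumes the standard NLLB/M2M100 transformer layout as implemented in Hugging Face Transformers: one embedding matrix shared by the encoder, the decoder, and the output projection, a bias on every linear layer, and fixed sinusoidal positions with no learned weights. The helper names are illustrative and not part of this repository.

```python
def linear(n_in: int, n_out: int) -> int:
    """Weights plus bias of a fully connected layer."""
    return n_in * n_out + n_out


def layer_norm(d_model: int) -> int:
    """Scale and shift vectors of a LayerNorm."""
    return 2 * d_model


def attention(d_model: int) -> int:
    """Query, key, value and output projections."""
    return 4 * linear(d_model, d_model)


def feed_forward(d_model: int, d_ffn: int) -> int:
    """Two-layer position-wise feed-forward network."""
    return linear(d_model, d_ffn) + linear(d_ffn, d_model)


def count_parameters(encoder_layers: int, decoder_layers: int,
                     d_model: int = 1024, d_ffn: int = 4096,
                     vocab_size: int = 256206) -> int:
    # Assumption: a single embedding matrix is shared (tied) across
    # encoder input, decoder input and the output projection.
    embedding = vocab_size * d_model
    # Encoder layer: self-attention + FFN, each with its own LayerNorm.
    encoder_layer = (attention(d_model) + feed_forward(d_model, d_ffn)
                     + 2 * layer_norm(d_model))
    # Decoder layer: self-attention + cross-attention + FFN, three LayerNorms.
    decoder_layer = (2 * attention(d_model) + feed_forward(d_model, d_ffn)
                     + 3 * layer_norm(d_model))
    # One final LayerNorm after the encoder stack and one after the decoder stack.
    final_norms = 2 * layer_norm(d_model)
    return (embedding + encoder_layers * encoder_layer
            + decoder_layers * decoder_layer + final_norms)


print(f"{count_parameters(3, 3):,}")  # 350,537,728
```

Most of the total (262,354,944 parameters) sits in the embedding matrix, which is why cutting the layers from 12 to 3 does not shrink the model in the same proportion.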

Data

Training Data: NLLB dataset
- created from the metadata for mined bitext released by Meta AI
- 15M en-ko parallel dataset
- mined with the stopes library
- encoded with LASER3
Evaluation Data: Flores-200 dataset

Metric

CPU: Intel(R) Xeon(R) CPU @ 2.20GHz (16 cores)
GPU: NVIDIA L4 24GB

Model	#Params	chrF(++)	GPU inference time (s)	CPU inference time (s)
NLLB-200 3.3B	3.3B	34.3	0.98	4.65
NLLB-200 1.3B	1.3B	32.1	0.89	2.46
NLLB-200 600M	600M	32.0	0.43	1.52
NLLB-200 350M (ours)	350M	24.6	0.24	1.43
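
The table does not say how inference time was measured. One way to obtain comparable per-sentence numbers is sketched below. `mean_latency` and the `translate` callable are hypothetical names, and the stand-in function should be replaced with a real call to the model.

```python
import statistics
import time
from typing import Callable, Sequence


def mean_latency(translate: Callable[[str], str],
                 sentences: Sequence[str],
                 warmup: int = 1) -> float:
    """Return the mean wall-clock time in seconds per translated sentence."""
    # Warm-up calls are not timed; the first call often pays one-off costs
    # such as lazy initialisation or memory allocation.
    for sentence in sentences[:warmup]:
        translate(sentence)

    timings = []
    for sentence in sentences:
        start = time.perf_counter()
        translate(sentence)
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


def fake_translate(text: str) -> str:
    """Stand-in for a real model call (assumption, for illustration only)."""
    time.sleep(0.01)
    return text


print(f"{mean_latency(fake_translate, ['Hello.', 'How are you?']):.3f} s")
```

When timing on a GPU, the result should be moved to the CPU or the device synchronised inside `translate`, since asynchronous execution may otherwise make the measured time look shorter than it is.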

Usage

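from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# forced_bos_token_id makes generation start with the target-language token
model = AutoModelForSeq2SeqLM.from_pretrained('dhtocks/nllb-200-distilled-350M_en-ko', forced_bos_token_id=256098)
tokenizer = AutoTokenizer.from_pretrained('dhtocks/nllb-200-distilled-350M_en-ko', src_lang='eng_Latn', tgt_lang='kor_Hang')

# Replace [YOUR_INPUT] with the English sentence to translate
inputs = tokenizer('[YOUR_INPUT]', return_tensors="pt")
output = model.generate(**inputs)
# skip_special_tokens drops the language tag and end-of-sentence token from the printed text
print(tokenizer.decode(output[0], skip_special_tokens=True))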

License

CC-BY-NC 4.0

Citation

@misc{,
  title={NLLB-200 distilled_350M_en-ko},
  author={Saechan Oh},
  year={2024}
}
