
What could cause widely varying inference time when using pre-trained opus-mt-en-fr model with python transformers library? #80

Open
shandou opened this issue Sep 8, 2022 · 2 comments

Comments


shandou commented Sep 8, 2022

I have been testing pre-trained Opus-MT models ported to the transformers library for a Python implementation. Specifically, I am using opus-mt-en-fr for English-to-French translation, and the tokenizer and translation model are loaded via MarianTokenizer and MarianMTModel, similar to the code examples shown here on huggingface. Strangely, for the same pre-trained model translating the same English input on an identical machine, I have observed anywhere between 80+ ms and a whopping 4 s per translation (example input = "kiwi strawberry").

I wonder if anyone has observed similar behaviour, and what could cause such a wide variation? Thank you very much!
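For reference, a minimal sketch along the lines of what I am timing (the exact script is not shown here, so details such as the timing loop are approximate):

```python
import time

from transformers import MarianMTModel, MarianTokenizer

# Load the pre-trained English-to-French Opus-MT model ported to transformers.
model_name = "Helsinki-NLP/opus-mt-en-fr"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

text = "kiwi strawberry"

# Time several repeated translations of the same input.
for _ in range(5):
    start = time.perf_counter()
    batch = tokenizer([text], return_tensors="pt")
    generated = model.generate(**batch)
    translation = tokenizer.batch_decode(generated, skip_special_tokens=True)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{translation[0]!r} in {elapsed_ms:.1f} ms")
```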

@jorgtied
Member

Maybe asking people at huggingface and the transformers git repo would help?

@artyomboyko

Good afternoon. Hypothetically, could CPU or GPU load have affected the performance of the model? Have you tried monitoring the load on the hardware while taking the measurements?
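A rough way to check the CPU side of that (a sketch assuming psutil is available; GPU utilization would need a separate tool such as nvidia-smi):

```python
import threading
import time

import psutil

# Sample system-wide CPU utilization in the background while a translation runs.
samples = []
stop = threading.Event()

def sample_cpu():
    while not stop.is_set():
        samples.append(psutil.cpu_percent(interval=0.1))

monitor = threading.Thread(target=sample_cpu)
monitor.start()

start = time.perf_counter()
# ... run the translation here, e.g. model.generate(**batch) from the snippet above ...
elapsed = time.perf_counter() - start

stop.set()
monitor.join()
if samples:
    print(f"translation took {elapsed:.3f} s, peak CPU load {max(samples):.0f}%")
```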
