[Feature Request/Help] BLEURT model -> PyTorch #224

adamwlev · 2020-05-30T18:30:40Z

Hi, I am interested in porting Google Research's new BLEURT learned metric to PyTorch (because I wish to do something experimental with language generation and backpropping through BLEURT). I noticed that you guys don't have it yet, so I am partly just asking whether you plan to add it (@thomwolf said on Twitter that you want to).

I had a go at manually using the checkpoint that they publish, which includes the weights. The architecture seems to match the out-of-the-box BertModel in transformers exactly, with just a single linear layer on top of the CLS embedding. I loaded all the weights into the PyTorch model, but I am not able to get the same numbers as the BLEURT package's Python API. Here is the colab notebook where I tried: https://colab.research.google.com/drive/1Bfced531EvQP_CpFvxwxNl25Pj6ptylY?usp=sharing . If you have any pointers on what might be going wrong, that would be much appreciated!

Thank you muchly!

manikbhandari · 2020-10-06T16:40:12Z

Is there any update on this?

Thanks!

ohmeow · 2020-12-27T21:31:38Z

Hitting this error when using bleurt with PyTorch ...

UnrecognizedFlagError: Unknown command line flag 'f'

... and I'm assuming that's because it was built for TF specifically. Is there a way to use this metric in PyTorch?
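For readers hitting the same error: a likely cause (not confirmed anywhere in this thread) is that the flag parser used by the TensorFlow BLEURT code reads `sys.argv`, and a notebook kernel is typically started with a `-f <connection file>` argument that the parser does not recognise. A minimal sketch of the usual workaround, assuming that is the cause, is to trim `sys.argv` before loading the metric. The helper name and the example argv below are hypothetical.

```python
import sys


def strip_notebook_args(argv):
    """Return a copy of argv that keeps only the program name.

    Assumption: the extra arguments (e.g. '-f', '<connection file>') were
    added by the notebook kernel and are not needed by the metric.
    """
    return list(argv[:1])


# Hypothetical argv as a notebook kernel might set it.
example_argv = ["ipykernel_launcher.py", "-f", "/tmp/kernel-1234.json"]
cleaned = strip_notebook_args(example_argv)

# In a notebook, one would apply this before loading the BLEURT metric:
# sys.argv = strip_notebook_args(sys.argv)
```

This only sidesteps the flag parsing; the metric itself still runs on TensorFlow.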

yjernite · 2021-01-04T09:53:32Z

We currently provide a wrapper on the TensorFlow implementation: https://huggingface.co/metrics/bleurt

We have long-term plans to better handle model-based metrics, but they probably won't be implemented right away.

@adamwlev it would still be cool to add the BLEURT checkpoints to the transformers repo if you're interested, but that would best be discussed there :)

closing for now

LoraIpsum · 2021-09-02T15:02:17Z

Hi there. We ran into the same problem this year (converting BLEURT to PyTorch) and, thanks to @adamwlev, found his colab notebook, which didn't work but served as a good starting point. In the end, we made it work with just two simple conceptual fixes:

1. Transposing the 'kernel' layers instead of the 'dense' ones when copying params from the original model;
2. Using pooler_output as cls_state in the forward function of the BleurtModel class.

Plus a few minor syntactic fixes for the outdated parts. The result is still not exactly the same, but it is very close to the expected one (1.0483 vs 1.0474).

Find the fixed version here (fixes are commented): https://colab.research.google.com/drive/1KsCUkFW45d5_ROSv2aHtXgeBa2Z98r03?usp=sharing
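As a rough illustration of why the two fixes matter (this is not the notebook's code): a TensorFlow dense layer stores its kernel as [in, out] and computes x @ kernel + bias, while a PyTorch nn.Linear stores its weight as [out, in] and computes x @ weight.T + bias, so kernels have to be transposed when copied. Likewise, BertModel's pooler_output is tanh(dense(CLS hidden state)), not the raw CLS hidden state. The sketch below uses plain Python lists in place of tensors so it runs without any framework; all function names and numbers are made up for the example.

```python
import math


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(row) for row in zip(*matrix)]


def tf_dense(x, kernel, bias):
    """TF-style dense layer: y = x @ kernel + bias, kernel shaped [in, out]."""
    return [
        sum(x[i] * kernel[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]


def torch_linear(x, weight, bias):
    """PyTorch-style linear layer: y = x @ weight.T + bias, weight shaped [out, in]."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def pooled_cls(cls_hidden, pooler_weight, pooler_bias):
    """BERT-style pooling: tanh of a linear layer applied to the CLS hidden state."""
    return [math.tanh(v) for v in torch_linear(cls_hidden, pooler_weight, pooler_bias)]


# Fix 1: a TF kernel only reproduces the same output after transposition.
kernel = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # [in=2, out=3], made-up values
bias = [0.5, -0.5, 0.0]
x = [1.0, -1.0]
weight = transpose(kernel)  # [out=3, in=2]
assert tf_dense(x, kernel, bias) == torch_linear(x, weight, bias)

# Fix 2: the regression head reads the pooled CLS vector, not the raw one.
pooler_weight = [[0.1, 0.2], [0.3, 0.4]]  # hypothetical [hidden, hidden] weights
pooler_bias = [0.0, 0.0]
head_weight = [[1.0, 1.0]]  # hypothetical single-output head
head_bias = [0.0]
pooled = pooled_cls(x, pooler_weight, pooler_bias)
score = torch_linear(pooled, head_weight, head_bias)[0]
```

In a real conversion the same two points apply to the actual checkpoint tensors: transpose each copied kernel, and feed the pooled output to the final linear layer.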

lucadiliello · 2023-01-19T15:46:58Z

I created a new model based on transformers that can load all BLEURT checkpoints released so far: https://github.com/lucadiliello/bleurt-pytorch

vaiibhavgupta · 2023-08-26T17:38:48Z

@LoraIpsum Thanks for sharing your work here. However, I'm unable to reproduce the results, which is strange since you were able to. FYI, I am trying to convert a fine-tuned BLEURT to PyTorch. Any suggestions on how I can reproduce the results?

adamwlev changed the title ~~[Feature Request/Help]~~ [Feature Request/Help] BLEURT model -> PyTorch May 30, 2020

yjernite self-assigned this Jun 2, 2020

thomwolf added the enhancement New feature or request label Jun 20, 2020

yjernite closed this as completed Jan 4, 2021
