TF-OPT attention mask fixes #25238

Merged: 3 commits into main from tf_opt_fixes on Sep 6, 2023

Conversation

Rocketknight1 (Member)

With apologies for the delay, this PR should hopefully resolve the issues in #24637. @abb128, can you try installing from this branch and verify that it fixes your issue? You can install with:

pip install --upgrade git+https://github.com/huggingface/transformers.git@tf_opt_fixes

Fixes #24637
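
Once installed, a quick way to exercise the failing path (a minimal sketch based on the linked issue's title, not the exact reproduction from #24637): generation with TFOPTForCausalLM reuses cached past_key_values on every step after the first, which is where the attention mask size mismatch was raised.

    from transformers import AutoTokenizer, TFOPTForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
    model = TFOPTForCausalLM.from_pretrained("facebook/opt-125m")

    inputs = tokenizer("Hello, my name is", return_tensors="tf")
    # Each generation step after the first feeds cached past_key_values
    # back in, so the decoder's mask-length bookkeeping gets exercised.
    output = model.generate(**inputs, max_new_tokens=10)
    print(tokenizer.decode(output[0], skip_special_tokens=True))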

HuggingFaceDocBuilderDev commented Aug 1, 2023

The documentation is not available anymore as the PR was closed or merged.

Rocketknight1 (Member, Author)

No response, but we should probably merge anyway. Pinging @amyeroberts for core maintainer review!

amyeroberts (Collaborator) left a comment

Thanks for fixing this!

Just a question about the checks in _prepare_decoder_attention_mask and the values its inputs can take.

src/transformers/models/opt/modeling_tf_opt.py (outdated):

    _, seq_length = input_shape
    tf.debugging.assert_equal(
        seq_length + past_key_values_length,
        shape_list(attention_mask)[1],
amyeroberts (Collaborator)

Is this check robust? From the diff it looks like attention_mask can be None

Rocketknight1 (Member, Author)

Yes! The TFOPTDecoder layer checks for None attention masks and replaces them with tf.ones. That happens before _prepare_decoder_attention_mask is called. The earlier code had an if attention_mask is not None branch that was just always taken as a result.
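
In outline, the ordering looks like this (a minimal sketch with hypothetical names, not the actual modeling_tf_opt.py source):

    import tensorflow as tf

    def prepare_decoder_attention_mask_sketch(attention_mask, input_shape, past_key_values_length):
        # By the time this runs, attention_mask is guaranteed to be a tensor.
        _, seq_length = input_shape
        tf.debugging.assert_equal(
            seq_length + past_key_values_length,
            tf.shape(attention_mask)[1],
            message="Mask should cover seq_length + past_key_values_length",
        )
        # ... build and combine the causal mask here ...

    def decoder_forward_sketch(input_ids, attention_mask=None, past_key_values_length=0):
        batch_size, seq_length = input_ids.shape
        if attention_mask is None:
            # The decoder substitutes an all-ones mask for a missing one,
            # covering both the new tokens and any cached past positions.
            attention_mask = tf.ones((batch_size, seq_length + past_key_values_length))
        prepare_decoder_attention_mask_sketch(
            attention_mask, (batch_size, seq_length), past_key_values_length
        )

    # No mask supplied and two cached positions: the assertion passes.
    decoder_forward_sketch(tf.constant([[5, 6, 7]]), past_key_values_length=2)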

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Rocketknight1 (Member, Author)

@amyeroberts Sorry for the delay; I lost track of this one!

amyeroberts (Collaborator) left a comment

Thanks for fixing!

Rocketknight1 merged commit 842e99f into main on Sep 6, 2023
3 checks passed
Rocketknight1 deleted the tf_opt_fixes branch on September 6, 2023 at 12:37
parambharat pushed a commit to parambharat/transformers that referenced this pull request on Sep 26, 2023
blbadger pushed a commit to blbadger/transformers that referenced this pull request on Nov 8, 2023
EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request on Nov 18, 2023

Each carried the same squashed commit message:

    * stash commit
    * More OPT updates
    * Update src/transformers/models/opt/modeling_tf_opt.py

    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Successfully merging this pull request may close these issues.

TFOPTForCausalLM Attention mask size mismatch exception (#24637)