limit "t" and correct prev non blank so that task=search works #69

Open · wants to merge 1 commit into base: master
10 changes: 7 additions & 3 deletions — common/models/transducer/transducer_fullsum.py

@@ -229,11 +229,15 @@ def make(self, encoder: LayerRef):
blank_idx = self.ctx.blank_idx

rec_decoder = {
-    "am0": {"class": "gather_nd", "from": _base(encoder), "position": "prev:t"},  # [B,D]
+    "t_": {"class": "eval", "from": ["prev:t", "enc_seq_len"], "eval": 'tf.minimum(source(0), source(1)-1)'},
+    "am0": {"class": "gather_nd", "from": _base(encoder), "position": "t_"},  # [B,D]
"am": {"class": "copy", "from": "am0" if search else "data:source"},

"prev_output_wo_b": {
"class": "masked_computation", "unit": {"class": "copy", "initial_output": 0},
"from": "prev:output_", "mask": "prev:output_emit", "initial_output": 0},
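The clamp introduced by the `t_` layer can be illustrated outside RETURNN: at the end of search, `prev:t` may point one past the last encoder frame, and gathering at that position is out of range, so the index is clipped to the last valid frame. A minimal eager-mode sketch (the encoder tensor, lengths, and positions below are made-up values, not from the actual config):

```python
import tensorflow as tf

# Hypothetical encoder output: batch of 2, T=5 frames, D=3 features.
enc = tf.random.normal([2, 5, 3])
enc_seq_len = tf.constant([5, 4])  # valid frames per batch entry

# prev:t may already point one past the last valid frame during search.
prev_t = tf.constant([5, 3])

# Same clamp as the "t_" eval layer: tf.minimum(source(0), source(1) - 1).
t_ = tf.minimum(prev_t, enc_seq_len - 1)  # -> [4, 3], always a valid index

# Same effect as the "am0" gather_nd layer: one frame per batch entry.
am0 = tf.gather(enc, t_, batch_dims=1)    # shape [2, 3]
print(t_.numpy())  # [4 3]
```

Without the clamp, `tf.gather` with index 5 into a length-5 axis would fail on CPU with the same kind of out-of-range error as reported below.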
Member:
I don't understand. Why is this needed? Esp in search, this should have no effect.

Contributor (author):
prev:output_ doesn't guarantee a non-blank label during search. Both are sparse, but it messes up the embedding that happens in slow_rnn.

I get something like this:

```
TensorFlow exception: indices[0] = 1056 is not in [0, 1056)
         [[node output/rec/slow_rnn/masked/input_embed/linear/embedding_lookup (defined at /Users/mikel/setups/rt4/returnn/returnn/tf/layers/basic.py:1468) ]]

Errors may have originated from an input operation.
Input Source operations connected to node output/rec/slow_rnn/masked/input_embed/linear/embedding_lookup:
 output/rec/slow_rnn/masked/input_embed/linear/py_print_1/Identity (defined at /Users/mikel/setups/rt4/returnn/returnn/tf/util/basic.py:6245)
```
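The out-of-range index in this log is exactly what happens when the blank index reaches an embedding lookup: with 1056 real labels, index 1056 is one past the valid range. A minimal eager-mode sketch of the same failure (hypothetical embedding matrix; on CPU the lookup raises eagerly, whereas on GPU `tf.gather` would silently return zeros instead):

```python
import tensorflow as tf

num_labels = 1056                        # matches the range [0, 1056) in the log above
emb = tf.random.normal([num_labels, 8])  # hypothetical embedding matrix

ok = tf.nn.embedding_lookup(emb, tf.constant([0, 17]))  # valid indices work
try:
    # An index equal to num_labels (e.g. the blank index leaking into the
    # label embedding) is out of range; on CPU this raises eagerly.
    tf.nn.embedding_lookup(emb, tf.constant([num_labels]))
    out_of_range = False
except tf.errors.InvalidArgumentError:
    out_of_range = True  # same failure mode as the log above
```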

Member:
Hm, this is strange. Haven't we used it always like this in the other configs as well? Why did the problem never occur? Also, I have used exactly this config, and it did not occur for me. How can that be?

What TensorFlow version do you use?

Also, maybe we should fix MaskedComputationLayer instead? This can only happen for slow_rnn frames which will actually not be used (due to the masking). It does not really matter what we calculate in those masked-out frames. We could simply fix the input for the masked-out frames.

But first I want to understand better why this happens now and not before, and not for me.

Contributor (author) — @jotix16, Jun 1, 2021:
> Haven't we used it always like this in the other configs as well?

It looks the same in other configs, I don't get it either.

> What TensorFlow version do you use?

2.4.1

> This can only happen for frames for slow_rnn which will actually not be used (due to the masking)

Exactly.

> But first I want to understand better why this happens now and not before, and not for me.

Here are my logs.

"prev_out_non_blank": {
-    "class": "reinterpret_data", "from": "prev:output_", "set_sparse_dim": target.get_num_classes()},
+    "class": "reinterpret_data", "from": "prev_output_wo_b", "set_sparse_dim": target.get_num_classes()},

"slow_rnn": self.slow_rnn.make(
prev_sparse_label_nb="prev_out_non_blank",
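The effect of the `prev_output_wo_b` masked_computation layer feeding this hunk can be mimicked in plain Python: whenever no label was emitted (a blank was produced), the previously emitted non-blank label is held, so the sparse input to the embedding never carries the blank index. An illustrative sketch with made-up labels (`BLANK` is a hypothetical blank index, not taken from the config):

```python
BLANK = 1056  # hypothetical blank index, one past the last real label

def last_non_blank(outputs, initial=0):
    """Per step, the most recently emitted non-blank label —
    like the masked_computation copy with mask=output_emit."""
    held, result = initial, []
    for y in outputs:
        if y != BLANK:       # emit step: update the held label
            held = y
        result.append(held)  # blank step: keep the previous value
    return result

print(last_non_blank([7, BLANK, BLANK, 3, BLANK]))  # [7, 7, 7, 3, 3]
```

This is why the lookup error above disappears: the held value is always a real label (or the initial 0), never 1056.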
@@ -252,7 +256,7 @@ def make(self, encoder: LayerRef):

"output": {
"class": 'choice',
-    'target': target.key,  # note: wrong! but this is ignored both in full-sum training and in search
+    'target': target.key if train else None,
'beam_size': beam_size,
'from': "output_log_prob_wb", "input_type": "log_prob",
"initial_output": 0,
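The changed line only supplies the target key when training; during search there is no ground-truth target, so `None` is passed instead of a wrong key. A minimal sketch of how such a layer dict could be built (the function name is hypothetical; `train`, the target key, and `beam_size` are assumed to be in scope, as in the config above):

```python
def make_choice_layer(train: bool, target_key: str, beam_size: int) -> dict:
    # Mirrors the "output" layer above: the target key is only
    # meaningful in training; in search it is dropped.
    return {
        "class": "choice",
        "target": target_key if train else None,
        "beam_size": beam_size,
        "from": "output_log_prob_wb",
        "input_type": "log_prob",
        "initial_output": 0,
    }

print(make_choice_layer(False, "classes", 12)["target"])  # None
```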