The TF variable name scopes in RETURNN are determined by the layer names and layer hierarchy.
In PyTorch, the model variable names are determined by the module hierarchy (i.e. the attrib names).
When we use the module concept here to define some model, the mapping to RETURNN layers might not always yield the same variable name scopes.
Consider some code like this:
if cfg_option:
    y = mod(x)
else:
    with Ctx():
        y = mod(x)
Maybe Ctx is Cond (#24) or Loop (#16).
Depending on how Ctx works, the absolute layer name of mod might differ depending on whether cfg_option is enabled.
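For example, here is a small sketch of how the mismatch could show up when inspecting the two checkpoints (the checkpoint file names and the concrete variable names "mod/W" vs. "loop/mod/W" are only assumptions for illustration):

import tensorflow as tf

# Checkpoint saved with cfg_option enabled: mod maps to a top-level layer,
# so we would expect variables like "mod/W" and "mod/b".
print(tf.train.list_variables("model_with_option.ckpt"))

# Checkpoint saved with cfg_option disabled: mod lives inside the Ctx layer
# (e.g. a RecLayer named "loop"), so the same parameters would show up as
# "loop/mod/W" and "loop/mod/b", and restoring from the first checkpoint
# would fail because the variable names do not match.
print(tf.train.list_variables("model_without_option.ckpt"))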
Originally, I thought this would not be a problem.
However, when you save the model checkpoint with cfg_option disabled, and then later want to load the model with cfg_option enabled, I think the user would expect this to work. And this requires that the variable names match.
So much for the problem.
On possible solutions:
I think it is not possible in general to always create the RETURNN layer hierarchy such that the names match. Depending on what Ctx is, mod needs to be wrapped inside another layer (e.g. a RecLayer). If mod is some Linear instance, this would yield different variable names.
One potential solution would be to allow defining a custom TF name (variable) scope for a layer. Then in the second case, the RecLayer could specify that it does not consume any new TF name scope (i.e. it stays flat), and the variable names would match again; see the sketch below.
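As a rough sketch on the RETURNN net dict level (the "name_scope" layer option used here is an assumed API, just to illustrate the idea, not something that necessarily exists in this form):

network = {
    "loop": {
        "class": "rec",
        "from": "data",
        # Assumed option: an empty name scope means the RecLayer does not add
        # its own TF name scope, so sublayer variables are created as e.g.
        # "mod/W" instead of "loop/mod/W", matching the non-wrapped case.
        "name_scope": "",
        "unit": {
            "mod": {"class": "linear", "n_out": 512, "from": "data:source"},
            "output": {"class": "copy", "from": "mod"},
        },
    },
}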