Consistent variable name scopes for modules #25

Closed
albertz opened this issue Aug 10, 2021 · 0 comments

albertz (Member) commented Aug 10, 2021

The TF variable name scopes in RETURNN are determined by the layer names and layer hierarchy.

In PyTorch, the model variable names are determined by the module hierarchy (i.e. the attrib names).
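For illustration, a minimal PyTorch sketch: the parameter names follow the attribute path through the module tree, independent of where or how the module is called.

import torch

class Model(torch.nn.Module):
  def __init__(self):
    super().__init__()
    self.mod = torch.nn.Linear(4, 4)  # the attrib name "mod" determines the param names

  def forward(self, x):
    return self.mod(x)

print([name for name, _ in Model().named_parameters()])
# -> ['mod.weight', 'mod.bias'], derived purely from the module hierarchy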

When we use the module concept here to define some model, the mapping to RETURNN layers might not always yield the same variable name scopes.
Consider some code like this:

# cfg_option is some config flag, mod some module instance (e.g. Linear), x some tensor.
if cfg_option:
  y = mod(x)
else:
  with Ctx():
    y = mod(x)

Maybe Ctx is Cond (#24) or Loop (#16).
Depending on how Ctx works, the absolute layer name of mod might differ depending on whether cfg_option is enabled.

Originally, I thought this would not be a problem.
However, when you save a model checkpoint with cfg_option disabled and later want to load it with cfg_option enabled, I think the user would expect this to work, and this requires the variable names to match.

So much for the problem. Now on to possible solutions:

I think it is not possible in general to always create the RETURNN layer hierarchy such that it matches. Depending on what Ctx is, mod needs to be wrapped inside another layer (e.g. a RecLayer). If mod is some Linear instance, this would yield different variable names.
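For illustration, the resulting TF variable names could look like this (hypothetical names, assuming Ctx is Loop and maps to a RecLayer called "loop"):

# cfg_option enabled:   mod/W:0, mod/b:0
# cfg_option disabled:  loop/rec/mod/W:0, loop/rec/mod/b:0
# A checkpoint saved with one setting then cannot be loaded with the other.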

One potential solution is to allow defining a custom TF name (variable) scope for a layer. Then, in the second case, the RecLayer could specify that it does not consume any new TF name scope (i.e. it stays flat), and the variable names would match.
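A rough sketch of how this could look in a RETURNN net dict. The name_scope option and its exact semantics are assumptions here (see also the RETURNN commit referenced below):

network = {
  "loop": {
    "class": "rec", "from": "data",
    "name_scope": "",  # assumed semantics: do not open a new TF name scope
    "unit": {
      "mod": {"class": "linear", "n_out": 128, "from": "data:source"},
      "output": {"class": "copy", "from": "mod"},
    },
  },
}

With such an option, the params of mod could get the same names in both the wrapped and the unwrapped case.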

@albertz albertz added this to the first-release milestone Oct 20, 2021
This was referenced Oct 20, 2021
albertz added a commit to rwth-i6/returnn that referenced this issue Oct 30, 2021
Can be used for layers
and also subnetworks (including RecLayer).

Can also be used as a method for param sharing.

Also for:
rwth-i6/returnn_common#25