Does xFormers support weight tying? #169

Closed
erip opened this issue Jan 3, 2022 · 2 comments

Comments

@erip
Contributor

erip commented Jan 3, 2022

❓ Questions and Help

Tying weights between encoder and decoder embedding layers is often useful for convergence and task performance. Is there a mechanism for sharing weights between the encoder and decoder?

@blefaudeux
Contributor

Good question, thanks @erip! As of now there are basically two sides to xFormers: one is a zoo of parts (at different levels of abstraction), the other is the factories, which combine these parts programmatically.

If you take the first path (building the model yourself from the parts), then tying weights is certainly possible, just as you would with any other model. The factory doesn't support it right now, but support could certainly be built around it. Thoughts, all?
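For the "build it yourself" path, a minimal sketch of tying embedding weights in plain PyTorch might look like the following. It does not use any xFormers API; `TinyEncoderDecoder`, `encoder_embedding`, and `decoder_embedding` are hypothetical names chosen for illustration, and the encoder and decoder stacks themselves are left out.

```python
import torch
import torch.nn as nn


class TinyEncoderDecoder(nn.Module):
    """Hypothetical toy model; only the embedding layers are shown."""

    def __init__(self, vocab_size: int = 100, dim: int = 16):
        super().__init__()
        self.encoder_embedding = nn.Embedding(vocab_size, dim)
        self.decoder_embedding = nn.Embedding(vocab_size, dim)
        # Tie the weights: both modules now hold the very same Parameter,
        # so any update to one embedding table is seen by the other.
        self.decoder_embedding.weight = self.encoder_embedding.weight

    def forward(self, src_tokens: torch.Tensor, tgt_tokens: torch.Tensor):
        # A real model would feed these into encoder/decoder blocks.
        return self.encoder_embedding(src_tokens), self.decoder_embedding(tgt_tokens)


model = TinyEncoderDecoder()
src = torch.randint(0, 100, (2, 5))
tgt = torch.randint(0, 100, (2, 7))
src_emb, tgt_emb = model(src, tgt)

# The two layers share one tensor, and parameters() lists it only once.
tied = model.encoder_embedding.weight is model.decoder_embedding.weight
num_param_tensors = len(list(model.parameters()))

# Gradients from both the encoder and decoder paths accumulate in that tensor.
(src_emb.sum() + tgt_emb.sum()).backward()
grad_shape = tuple(model.encoder_embedding.weight.grad.shape)
```

Because the tied table is a single `nn.Parameter`, an optimizer built from `model.parameters()` updates it once per step, with contributions from both sides.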

blefaudeux added a commit that referenced this issue Jan 3, 2022
blefaudeux added a commit that referenced this issue Jan 5, 2022
* tentative implementation of #169

* added unit testing

* Improve on the doc
@erip
Contributor Author

erip commented Jan 5, 2022

Closed!
