Does xFormers support weight tying? #169
Comments
Good question, thanks @erip! As of now there are basically two sides to xFormers: one is a parts zoo (at different altitudes), the other is the factories, which combine these parts programmatically. If you take the first path (building the model yourself from the parts), then tying weights is certainly possible, as you would do with any other model. It's not supported by the factory right now, but support could certainly be added there. Thoughts, all?
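For the "build the model yourself" path, here is a minimal sketch of what tying could look like in plain PyTorch. It deliberately uses no xFormers API: `TiedEmbeddingSeq2Seq`, `encoder_embed`, `decoder_embed`, and `out_proj` are hypothetical names made up for illustration, and the mean-pooled "memory" is only a stand-in for real encoder and decoder blocks.

```python
import torch
import torch.nn as nn


class TiedEmbeddingSeq2Seq(nn.Module):
    """Hypothetical toy model (not an xFormers class) with tied embeddings."""

    def __init__(self, vocab_size: int = 1000, dim: int = 64):
        super().__init__()
        self.encoder_embed = nn.Embedding(vocab_size, dim)
        self.decoder_embed = nn.Embedding(vocab_size, dim)
        # Tie the weights: both modules now hold the very same Parameter,
        # so gradients from the encoder and the decoder accumulate into it.
        self.decoder_embed.weight = self.encoder_embed.weight
        self.out_proj = nn.Linear(dim, vocab_size, bias=False)

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        # Stand-in for real encoder/decoder blocks: pool the source
        # embeddings and add them to every target position.
        memory = self.encoder_embed(src).mean(dim=1, keepdim=True)
        hidden = self.decoder_embed(tgt) + memory
        return self.out_proj(hidden)


model = TiedEmbeddingSeq2Seq()
src = torch.randint(0, 1000, (2, 7))  # (batch, source length)
tgt = torch.randint(0, 1000, (2, 5))  # (batch, target length)
logits = model(src, tgt)              # (batch, target length, vocab)
```

Because the two embedding modules share one `Parameter`, `model.parameters()` yields it only once, so an optimizer built from it updates the shared tensor a single time per step.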
Closed!
❓ Questions and Help
Tying weights between encoder and decoder embedding layers is often useful for convergence and task performance. Is there a mechanism for sharing weights between the encoder and decoder?