There is a potential bug in how the alpha optimizer is initialized in MTSAC. During `__init__`, the optimizer is constructed with the `log_alpha` tensor repeated once per task (a hedged sketch of the pattern is shown below).
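For concreteness, here is a minimal sketch of that kind of initialization. The variable names, the use of `torch.optim.Adam`, and the values are illustrative assumptions, not the exact garage source:

```python
import torch

# Hypothetical stand-ins for the MTSAC attributes (names and values are
# assumptions, not the exact garage source).
num_tasks = 10
policy_lr = 3e-4
log_alpha = torch.zeros(num_tasks, requires_grad=True)

# The reported pattern: the same log_alpha tensor appears num_tasks times in
# the optimizer's parameter list, i.e. one param_group with duplicate entries.
alpha_optimizer = torch.optim.Adam([log_alpha] * num_tasks, lr=policy_lr)
```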
Since `log_alpha` is a tensor, what is being passed to the optimizer is the same tensor multiple times. I don't think that is the intended behavior. In the `to()` function below, it is overridden with the correct initialization for the optimizer.

PyTorch recognizes that the parameters are duplicates:

UserWarning: optimizer contains a parameter group with duplicate parameters; in future, this will cause an error; see github.com/pytorch/pytorch/issues/40967 for more information.
But as github.com/pytorch/pytorch/issues/40967 details, the net effect is that the `log_alpha` tensor gets updated `num_task` times at each step, since all copies belong to the same param_group. A quick test can show that (see the sketch below).
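To illustrate, here is a self-contained stand-in for such a quick test (a sketch, not the reporter's actual test); it uses plain SGD so the arithmetic is easy to check by hand:

```python
import torch

# Minimal sketch: compare one optimizer step for a parameter registered once
# with the same step for an identical parameter registered three times.
lr = 0.1
num_copies = 3

single = torch.zeros(1, requires_grad=True)
duplicated = torch.zeros(1, requires_grad=True)

opt_single = torch.optim.SGD([single], lr=lr)
# Constructing this emits the "duplicate parameters" UserWarning quoted above.
opt_duplicated = torch.optim.SGD([duplicated] * num_copies, lr=lr)

# Give both parameters an identical gradient of 1.0.
single.sum().backward()
duplicated.sum().backward()

opt_single.step()
opt_duplicated.step()

print(single.item())      # -0.1: one update of -lr * grad
print(duplicated.item())  # -0.3: the same tensor is updated num_copies times
```

With the duplicated registration, the parameter moves `num_copies` times as far in a single step, which matches the behavior described in the linked PyTorch issue and, in MTSAC, amounts to an effective alpha learning rate scaled by the number of tasks.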
Thanks for reporting this. I suppose the overall effect is that the alpha learning rate is essentially multiplied by the number of tasks being trained. This should be about as simple as replacing that one line of code, so we should definitely fix this.
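For what it's worth, a hedged sketch of what that one-line change could look like, reusing the illustrative names from the first snippet above (not the actual garage patch):

```python
import torch

# Hypothetical fix (illustrative names, not the actual garage patch): register
# the log_alpha tensor with the optimizer exactly once, matching what the to()
# method described above already does.
num_tasks = 10
policy_lr = 3e-4
log_alpha = torch.zeros(num_tasks, requires_grad=True)

alpha_optimizer = torch.optim.Adam([log_alpha], lr=policy_lr)  # no duplicates
```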