I have several questions about your bestrq code, and I would appreciate any clarifications.
Why do you use mask_emb? The paper mentions that when masking the input, the masked frames are replaced with white noise. In your implementation, you instead replace the masked frames with a learnable tensor called mask_emb.
What is the intuition behind this choice? Does it have something to do with the <MASK> token from BERT?
Does this approach work better than just adding random noise?
Why do you use regularization on the input?
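To make the two masking strategies being compared concrete, here is a minimal numpy sketch (shapes and statistics are illustrative, not taken from the repo): the paper's white-noise replacement versus a shared replacement vector standing in for the learnable `mask_emb` (which would be an `nn.Parameter` in the actual PyTorch code).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy batch of log-mel frames: (time, feature_dim). Shapes are illustrative.
frames = rng.normal(size=(100, 80)).astype(np.float32)
mask = rng.random(100) < 0.15          # mask roughly 15% of the frames

# Strategy A (paper): replace masked frames with white noise
# (standard normal here as a stand-in for noise matched to feature stats).
noised = frames.copy()
noised[mask] = rng.normal(size=(int(mask.sum()), 80)).astype(np.float32)

# Strategy B (this repo): replace every masked frame with one shared
# vector, standing in for the learnable mask_emb parameter.
mask_emb = np.zeros(80, dtype=np.float32)
embedded = frames.copy()
embedded[mask] = mask_emb

# Unmasked frames are untouched under both strategies.
assert np.allclose(noised[~mask], frames[~mask])
assert np.allclose(embedded[~mask], frames[~mask])
```

The practical difference: strategy B gives the model a single, recognizable "this frame is masked" signal (like BERT's `<MASK>` token), while strategy A forces it to treat masked positions as unpredictable noise.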
@SatenHarutyunyan
1. We found that using random noise works better than mask_emb, but we haven't had time to submit a PR yet.
2. L2 regularization works better than no L2 regularization, and an L2 penalty is a commonly used way to avoid overfitting.
Thanks for answers.
2) Do I understand correctly that you apply the regularization directly to the spectrogram (the input), so there are no parameters being regularized? Won't the gradient of the regularization term be zero with respect to the model weights?
Using L2 regularization on the input can make the model less likely to over-rely on certain input features (such as the mask noise), thereby improving its generalization performance.
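The gradient question above can be checked directly. A minimal sketch with a hypothetical scalar model (not code from the repo): an L2 term on a raw, parameter-free input adds a constant to the loss, so parameter gradients are unchanged. The penalty only has an effect if the penalized tensor itself depends on learnable parameters, e.g. if the masked input contains the learnable mask_emb.

```python
import numpy as np

# Toy linear model: base loss (w*x - y)^2, plus an L2 penalty lam*x^2
# applied to the *input* x rather than to the weight w.
x, y, lam, w = 2.0, 0.3, 0.1, 0.5

def loss(w, with_penalty):
    base = (w * x - y) ** 2
    return base + lam * x ** 2 if with_penalty else base

# Finite-difference gradient of the loss with respect to the weight w.
def grad_w(w, with_penalty, eps=1e-6):
    return (loss(w + eps, with_penalty) - loss(w - eps, with_penalty)) / (2 * eps)

# Since x carries no parameters, lam*x^2 is a constant with respect to w:
# the gradient is the same with and without the input penalty.
assert abs(grad_w(w, True) - grad_w(w, False)) < 1e-6
```

If instead the penalized tensor is `frames_with_mask_emb` (where mask_emb is learnable), the penalty does produce a nonzero gradient for mask_emb, pulling its magnitude down; that is consistent with the stated goal of keeping the model from over-relying on the mask signal.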
Thank you for the implementation!