
Question on memory consumption for CRD loss when the dataset is very large #40

Open
TMaysGGS opened this issue Mar 3, 2021 · 3 comments

Comments

@TMaysGGS

TMaysGGS commented Mar 3, 2021

Hi,

Thank you for your great work, which helps me a lot.

I want to ask about the CRD contrast memory. In the class ContrastMemory, two buffers are created as random tensors, each of shape (number of data points, number of features). Assuming the number of features is 128, these buffers become really huge when training on a large dataset such as Glint360K. I actually tried to use CRD for my face recognition project, where the dataset contains 17,091,657 pictures, and the buffers blow up GPU memory so badly that there is no room left for training.

Could you tell me whether I am understanding this part correctly, and if so, whether there is any solution to this problem? Thanks.
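For concreteness, a quick back-of-the-envelope estimate of the buffer size (a sketch assuming float32 storage and the dataset size mentioned above; the (n_data, feat_dim) shape follows the description in this thread):

```python
# Rough memory estimate for the two CRD ContrastMemory buffers.
# Assumption: values are stored as float32 (4 bytes each).
n_data = 17_091_657      # images in the dataset mentioned above
feat_dim = 128           # feature dimension
bytes_per_float = 4      # float32
n_buffers = 2            # one memory bank per network

total_bytes = n_data * feat_dim * bytes_per_float * n_buffers
print(f"{total_bytes / 1024**3:.1f} GiB")  # roughly 16.3 GiB
```

So the memory banks alone would occupy on the order of 16 GiB before any activations or model weights, which matches the out-of-memory behavior described.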

@Xinxinatg

Hey TMays, I am running into the same issue here. Have you been able to solve it?

@TMaysGGS
Author

TMaysGGS commented Sep 1, 2021

> Hey TMays, I am running into the same issue here. Have you been able to solve it?

Sorry, not yet. Since the original distillation method is effective enough, I am not adding any extra loss to my training for now.

@HobbitLong
Owner

Hi @TMaysGGS, sorry for the late reply; maybe you have already figured it out. But in case you are interested, there are two solutions:

  • you can use the momentum encoder trick from the MoCo paper, in which case you only need a fixed-length queue of negatives instead of a full-dataset memory bank.
  • if you can maintain a large batch size, you can compute the contrastive loss directly within each batch, without any memory buffer at all.
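To illustrate the first option, here is a minimal, hypothetical sketch of the fixed-length queue idea from MoCo (pure Python for clarity; the name `FeatureQueue` is my own, and in practice the stored vectors would be momentum-encoder outputs kept on the GPU):

```python
class FeatureQueue:
    """Fixed-length FIFO queue of negative features, in the spirit of MoCo.

    Instead of a (n_data, feat_dim) memory bank, only the most recent
    queue_len feature vectors are kept and used as negatives, so memory
    scales with queue_len rather than with the dataset size.
    """

    def __init__(self, queue_len):
        self.queue_len = queue_len
        self.buf = []  # stored feature vectors, newest last

    def enqueue(self, feats):
        """Add a batch of feature vectors, evicting the oldest entries."""
        self.buf.extend(feats)
        self.buf = self.buf[-self.queue_len:]

    def negatives(self):
        """Return the current pool of negative features."""
        return list(self.buf)


# Tiny usage example with 1-dimensional "features":
q = FeatureQueue(queue_len=4)
q.enqueue([[1.0], [2.0]])
q.enqueue([[3.0], [4.0], [5.0]])
print(q.negatives())  # [[2.0], [3.0], [4.0], [5.0]]
```

With a queue length of, say, 65,536, the negative pool costs 65,536 × 128 floats regardless of whether the dataset has 17 thousand or 17 million images.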
