Clarifications Needed: Test Performance, Batch Size, and Initial Weights #3

Closed
Amordia opened this issue May 28, 2024 · 1 comment

@Amordia

Amordia commented May 28, 2024

Dear Authors,

Thank you for making your work and code available to the community. We have encountered a few issues while reproducing the results from the paper using the provided repository. We would appreciate your guidance on the following points:

  1. Test Performance Discrepancy:

    • When using the provided pre-trained weights and running the test script, we observed that the performance is consistently lower by one point compared to the results reported in the paper. Could you please provide insights into any additional steps or configurations that might be necessary to achieve the reported performance?
  2. Batch Size:

    • In the paper, the batch size is mentioned as 8, while the default batch size in the provided code is set to 16. Could you please confirm which batch size was used for the results reported in the paper? Additionally, if there were any specific reasons for this discrepancy, we would appreciate an explanation.
  3. Initialization of Weights:

    • It is not clear from the provided documentation whether the initial weights in the training process are randomly initialized or if they use pre-trained weights. Could you please clarify the weight initialization strategy used in your experiments?

Any assistance or clarification on these points would be highly appreciated. Thank you for your time and support.

Best regards,
Amordia

@Xu3XiWang
Owner

Xu3XiWang commented Jun 20, 2024

  1. Sorry, but I don't know the reason for the test performance discrepancy. It may be due to a difference in GPUs; we tested the model on an RTX 3090.
  2. We trained the model with different batch sizes and found that batch size does not significantly affect performance. If GPU memory allows, we recommend a larger batch size for faster training.
  3. We initialized from the checkpoints pre-trained on ImageNet with MAE, available at this url, which is the same as CounTR.
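
For readers reproducing point 3, here is a minimal sketch of how a pre-trained checkpoint is commonly matched against a model before loading. It is not taken from this repository. The helper `select_compatible_weights`, the `fake` stand-in, and every parameter name below (including `density_head.weight`) are hypothetical and only illustrate the idea: layers present in both the checkpoint and the model take the pre-trained values, and everything else stays randomly initialized.

```python
from types import SimpleNamespace


def select_compatible_weights(checkpoint_state, model_state):
    """Split checkpoint entries into those the model can load and those it cannot.

    Both arguments map parameter names to objects exposing a `.shape`
    attribute (torch tensors do; the demo below uses simple stand-ins).
    """
    compatible, skipped = {}, []
    for name, value in checkpoint_state.items():
        target = model_state.get(name)
        # Load an entry only if the model has a parameter with the same name and shape.
        if target is not None and tuple(target.shape) == tuple(value.shape):
            compatible[name] = value
        else:
            skipped.append(name)
    return compatible, skipped


def fake(*shape):
    """Stand-in for a tensor so the sketch runs without PyTorch installed."""
    return SimpleNamespace(shape=shape)


# Illustrative names and shapes only (assumptions, not read from a real checkpoint).
checkpoint_state = {
    "patch_embed.proj.weight": fake(768, 3, 16, 16),
    "blocks.0.attn.qkv.weight": fake(2304, 768),
    "decoder_pred.weight": fake(768, 512),  # pre-training-only head, absent in the model
}
model_state = {
    "patch_embed.proj.weight": fake(768, 3, 16, 16),
    "blocks.0.attn.qkv.weight": fake(2304, 768),
    "density_head.weight": fake(1, 256, 1, 1),  # hypothetical task head, stays randomly initialized
}

compatible, skipped = select_compatible_weights(checkpoint_state, model_state)
print(sorted(compatible))  # encoder weights that would be loaded
print(skipped)             # checkpoint entries with no matching model parameter
```

With PyTorch, the same idea would typically be applied to the dictionary returned by `torch.load(path, map_location="cpu")` and passed to `model.load_state_dict(compatible, strict=False)`. Where the weights sit inside the checkpoint file (for example under a `"model"` key) is an assumption to verify against the actual file.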
