Test Accuracy is different #3

Open
kaijieshi7 opened this issue May 27, 2023 · 10 comments

Comments

@kaijieshi7

Hi, I trained QPIC on hico-sg1 (HICO-DET-SG split 1), and the test mAP is 25 at epoch 6. But in the paper, it's 22.08.
[image]

@KentaroTakemoto
Collaborator

Thank you very much for trying out HICO-DET-SG yourself!

As we mentioned in our response to Issue #2, we updated HICO-DET-SG after DistShift 2022. In our own evaluation of QPIC on the current HICO-DET-SG split 1, the test mAP was 24.53, which is similar to yours.

For more details, please refer to the revised version of the paper, which will be submitted to arXiv this week.

@kaijieshi7
Author

Ok, thank you very much.

@kaijieshi7
Author

Wait, when you post the update, can you mention it here?

@KentaroTakemoto
Collaborator

Sure! Thank you for your advice.

@kaijieshi7
Author

Hi, my best test mAP on SG1 is around 27%. These are my hyperparameters:
[image: hyperparameters]

Here is the mAP curve:
[image: mAP curve]

You mentioned yours is around 24. Is that because you did not use a pretrained model, or because your batch size per iteration is 8?

@KentaroTakemoto
Collaborator

KentaroTakemoto commented Jun 2, 2023

Hi, thank you very much for your patience and sharing your results!

The revised version is now available on arXiv!

Regarding your last question, I think the main reasons are the base model and the batch size.
In our experiments, we used ResNet-101 as the base model and a batch size of 16, because the original QPIC paper reported that these settings achieved the highest mAP on the HICO dataset.

@kaijieshi7
Author

OK, I used a pretrained ResNet-50 and got 0.271067686, 0.260221244, and 0.3193315 on the three HICO-DET-SG splits, which is totally different from the results in your new arXiv version. I do not know what happened.

@KentaroTakemoto
Collaborator

We used ResNet-101 and batch size 16 because the original QPIC paper recommended them as the best settings. However, in out-of-distribution tests in general, many factors affect performance differently than in in-distribution tests. Could you try the following two experiments?

  1. Train and test QPIC with ResNet-101 and batch size 16 on HICO-DET-SG. (Carry out the experiments in exactly the same setting as ours, in your environment.)

  2. Train and test QPIC with ResNet-50 and batch size 8 on the original HICO-DET.
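
One way to keep the two suggested runs apart is to write them down as explicit configurations. This is only a bookkeeping sketch: `RunConfig`, its field names, and `SUGGESTED_RUNS` are hypothetical and do not correspond to QPIC's actual command-line options. The batch sizes are copied as stated above, without assuming whether they are per GPU or total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical record of one experiment setting (not a QPIC API)."""
    dataset: str
    backbone: str
    batch_size: int  # as stated in the thread; per-GPU vs. total is not specified


# The two experiments suggested above, in the same order.
SUGGESTED_RUNS = [
    RunConfig(dataset="HICO-DET-SG", backbone="ResNet-101", batch_size=16),
    RunConfig(dataset="HICO-DET", backbone="ResNet-50", batch_size=8),
]

for run in SUGGESTED_RUNS:
    print(f"{run.dataset}: backbone={run.backbone}, batch size={run.batch_size}")
```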

@kaijieshi7
Author

The batch-size parameter is 8, and I used two GPUs, so it's 16 images per iteration in my experiment.
I will do the experiment later.
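
The arithmetic above can be sketched as follows, assuming the batch-size value applies to each GPU under data-parallel training, as described in this comment. The function name and its parameters are hypothetical and are not part of QPIC.

```python
def effective_batch_size(per_gpu_batch_size: int, num_gpus: int) -> int:
    """Images processed per iteration when every GPU gets its own batch.

    Assumption: the batch-size setting is per GPU (data-parallel training).
    """
    if per_gpu_batch_size < 1 or num_gpus < 1:
        raise ValueError("batch size and GPU count must be positive")
    return per_gpu_batch_size * num_gpus


# Setting described in this comment: batch size 8 on two GPUs.
print(effective_batch_size(per_gpu_batch_size=8, num_gpus=2))
```

Under this assumption, a per-GPU batch size of 8 on two GPUs matches a total of 16 images per iteration.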

@KentaroTakemoto
Collaborator

Thank you very much!

The conditions of our experiments are detailed in Subsection 4.2 in our preprint. If you find any other differences, please let us know.
