
Running instruction #1

Closed

Epi-NMT opened this issue Oct 10, 2019 · 13 comments

Epi-NMT commented Oct 10, 2019

Thanks for your implementation.

I am trying to reproduce the results on the PACS dataset.
I have put the train/test/val.txt files in the sourceonly/(art_painting/c/p/s) folders, as required.
I tried some commands:

  1. bash pasc.sh: shows unrecognized arguments: -test 0
  2. python alex_run_rep.py: shows No such file or directory: 'weights/pacs_sketch.npy'
  3. python alex_run_baseline.py: shows 'MNISTcnn' object has no attribute 'loss' (why does the PACS file involve MNIST?)
  4. python alex_run_top.py: No such file or directory: 'representations/photo_train_rep.npy'
  5. python alex_run_top_adv.py: No such file or directory: 'representations/sketch_train_rep.npy'

Could you write instructions on how to reproduce the results from the paper on the PACS dataset?

Thank you very much.


HaohanWang commented Oct 10, 2019

Hi, sorry for the confusion. We tried to keep the scripts exactly as they were when we submitted the paper, to guarantee that every number is replicable, so they are a little messy.

Here are some clearer instructions.

  1. The main procedure:

    1. We first use alex_run_baseline.py to extract representations; it loads a pretrained weight file called "bvlc_alexnet.npy" (a standard pretrained weight) and generates multiple files in representations/.
    2. Then we run alex_run_top.py, which uses the generated representations to train the top layer that makes the prediction.
  2. Other details for replication:

    1. Feel free to use the files here to continue directly from my extracted representations, so all you need to run is alex_run_top.py. We suggest this to avoid accumulated numeric differences due to tensorflow updates or hardware variation.
    2. To run the experiment for a different domain, manually update line 105 of alex_run_top.py; the choices are "photo", "sketch", "cartoon", or "art". The only hyperparameter we tuned is the learning rate at line 135; the choices are roughly 1e-3, 1e-4, or 1e-5. I cannot remember the exact learning-rate choices, but this should not really be an issue because:
      (a) I believe any choice will give you competitive performance.
      (b) each run takes only around 10 minutes, so some tuning effort should not be a big concern.
  3. Other related issues:

    1. We do not need the .sh file; it was included by mistake. I will delete it later.
    2. We no longer need alex_run_rep.py or weights/pacs_sketch.npy.
    3. MNISTcnn is just the name of the model, suggesting that the top layer has roughly the same architecture as in our MNIST experiment.
    4. I just checked, and I do not see the "no attribute loss" issue when I run it. Also, the loss is defined at lines 192 and 217 of alex_cnn_top.py.
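As a concrete illustration, the two-step procedure above could be driven by a small shell loop over the target domains. This is only a dry-run sketch that prints the commands rather than executing them, and the -domain and -lr flag names are assumptions; check the argparse definitions in each script before running anything:

```shell
# Dry-run sketch of the two-step pipeline: for each target domain, extract
# representations with the base AlexNet, then train the top layer on them.
# The -domain and -lr flag names are guesses, not the scripts' real flags.
cmds=""
for domain in photo sketch cartoon art; do
    # Step 1: loads bvlc_alexnet.npy and writes files into representations/
    cmds="$cmds
python alex_run_baseline.py -domain $domain"
    # Step 2: trains the prediction (top) layer on the saved representations
    cmds="$cmds
python alex_run_top.py -domain $domain -lr 1e-4"
done
printf '%s\n' "$cmds"
```

Collecting the commands into a string before printing keeps the sketch trivially inspectable; dropping the `echo`/string collection and running the two `python` lines directly would be the real loop.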

Thank you for raising this issue; feel free to let me know how it goes.


Epi-NMT commented Oct 10, 2019

I downloaded your extracted representations and followed your instructions; it works now.

Thank you very much.

Epi-NMT closed this as completed Oct 10, 2019

Epi-NMT commented Oct 20, 2019

I tried to extract the representations for VLCS, but got stuck.

Can you check line 131, "model.loss", in PACS/alex_run_baseline.py?

In alex_cnn_baseline.py, the class MNISTcnn has no loss attribute.

@HaohanWang

Hi, I think it was here at Line 188.

Sorry that I wasn't clear before, but this was only the base AlexNet used to extract the representations, so the hex_flag should be set to False when running it. (-hex 0)

Please try it and feel free to let me know if there are further issues.
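Spelled out as a command, the extraction step would look something like this (a sketch only; any flags beyond -hex are omitted here because they are not shown in this thread):

```shell
# Representation extraction must run the plain base AlexNet, so HEX is
# disabled via -hex 0; with HEX on, model.loss is not defined on this path.
hex_flag=0
cmd="python alex_run_baseline.py -hex $hex_flag"
echo "$cmd"
```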


Epi-NMT commented Oct 22, 2019

Thanks for your help.
I can run alex_run_baseline and extract representations now.
But I am still confused about the number of epochs in alex_run_baseline and alex_run_top,
because I don't see 25000 epochs in the paper.


HaohanWang commented Oct 22, 2019

In our experiments, we used 10 epochs for the representation extraction and 100 epochs for the top layer. On a modern GPU, this process should take roughly 600 seconds.
(The number 25000 was set by a co-author when she tried to replicate other reported results; we later noticed this much faster heuristic. I have updated the script to avoid further confusion.)
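In script form, those epoch budgets amount to something like the following (the -e flag name is an assumption; the actual argument name in the scripts may differ):

```shell
# Epoch budget used in our runs: 10 for representation extraction,
# 100 for the top layer. The -e flag name is assumed, not verified.
base_epochs=10
top_epochs=100
echo "python alex_run_baseline.py -e $base_epochs"
echo "python alex_run_top.py -e $top_epochs"
```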


Epi-NMT commented Oct 22, 2019

Yes, that is what is reported in the paper and what I used. The result shocked me.

Basically, for each target domain (photo, art, cartoon, sketch), I ran alex_run_baseline.py with 10 epochs.
Then for each target domain, I ran alex_run_top.py with 100 epochs.

The "Best Validated Model Prediction Accuracy" values I got are as follows:
photo: 89.92;
art: 74.51;
cartoon: 79.99;
sketch: 74.51;

The same situation occurs when I run on the VLCS dataset.
Did I do something wrong in the procedure?
If not, it may be because of the tensorflow and numpy updates you mentioned before.

@HaohanWang

If these are accuracies, then these results are much better than what we reported in the paper, and probably even better than the current state of the art.

Maybe some double-check suggestions:

  • Did you tune the learning rates? Back then I felt that a good choice of learning rate would do much better, but we limited our tuning process.
  • For each target domain, both the "base" and the "top" need to be run. (But this should already be the case in our script.)
  • Nowadays people also use ResNet on this data set; did you accidentally replace the AlexNet with ResNet?

Feel free to let me know how it goes.


Epi-NMT commented Oct 23, 2019

These results were produced with learning rate 1e-4; I also tried other learning rates such as 5e-5 and 1e-5, which was just a matter of 10 more epochs.

I think my training procedure was right: run the "base" for each target domain, generating 12x4 .npy files in ./representations/, and then run the "top" for each domain.

The underlying network is still AlexNet.


I redid it this morning, and the 4 results for "top" were all over 95%. It is as if the testing data were also used for training.
I think there must be something wrong. Given the crazy accuracy, could it have anything to do with the sample lists in the train/test/val.txt files?
Anyway, I will delete everything and try again from the beginning later today.

@HaohanWang

Yeah, maybe it's also related to how you run it.
We run it through the command line, and every time it's a different session, so the weights of different target domains are not shared.
Could it be that some IDEs or ipython run these in the same session, so the weights from the previous target domain are re-used and testing samples leak?
Or did you accidentally turn on -l (load params), so that it loads the parameters of the previous run?

Basically, every run should generate the same results; if the accuracy gets higher and higher, it looks like the weights are being re-used.
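One way to guarantee fresh sessions is to launch each target domain as its own short-lived python process from the shell, so nothing persists between runs. A dry-run sketch (the -domain flag name is an assumption):

```shell
# Each printed line would be executed as a separate `python` process, so
# TensorFlow graphs/weights from one target domain can never leak into the
# next run (unlike repeated runs inside one IDE or ipython session).
runs=""
for domain in photo sketch cartoon art; do
    runs="$runs
python alex_run_top.py -domain $domain"
done
printf '%s\n' "$runs"
```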


Epi-NMT commented Oct 25, 2019

I ran it the same way as you do.
I deleted everything and reran the experiment.
I trained it around 5 times.

Each time, I ran "alex_run_baseline" with a different target-domain order (1: P A C S, 2: S A C P, ...), then "alex_run_top" for each domain, and removed everything in "./representations" after obtaining the 4 accuracies.

But I got weird accuracies as well:
photo: 82.3; art: 64.7; cartoon: 68.7; sketch: 75.89.

  1. The training order does not affect the result; these results are very stable.
  2. photo cannot go over 83 (DeepAll is over 86).
  3. sketch is never lower than 74 (only 70 with ResNet in other DG works; a much better result produced by AlexNet makes no sense).
  4. Whether the action is 'store_true' or 'store_false' for the "--load_params" arg in "alex_run_baseline.py" does not affect the result.

If the parameters were re-used, the result should be affected by the training order, but it isn't.

Maybe you can provide me with your .txt files; if we are running the same code, that seems to be the last remaining uncertainty.

@HaohanWang

Sure, I just sent them through email.


Epi-NMT commented Oct 28, 2019

Thank you, I found out where the mistake was, and I have already extended the experiment to VLCS.
