Evaluation #11

Open

Ha0Tang opened this issue Nov 25, 2020 · 13 comments

Comments

Ha0Tang commented Nov 25, 2020

Hi, can you give instructions on how to evaluate the model with the three metrics you used? Thanks.

ennauata (Owner) commented Nov 25, 2020

See more details in Section 3 [here] and Sections 1-2 [here].

The realism score comes from a user study, compatibility from the graph edit distance (see evaluate_parallel.py), and diversity from FID (see compute_FID.py; I recommend using the implementation [here]).
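For compatibility, here is a minimal sketch of the idea, assuming the predicted and ground-truth bubble diagrams are available as networkx graphs (an illustration of the metric, not the exact code in evaluate_parallel.py):

import networkx as nx

def compatibility(G_pred, G_true):
    # Graph edit distance between the generated layout's graph and the
    # ground-truth input graph; lower is better, 0 means a perfect match.
    # A node_match callback can be passed to require matching room types.
    return nx.graph_edit_distance(G_pred, G_true)

For diversity, FID is computed between a folder of rendered ground-truth layouts and a folder of rendered generated layouts, for example with an off-the-shelf package such as pytorch-fid (the folder names are placeholders):

python -m pytorch_fid ./FID/real ./FID/fake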

Ha0Tang (Author) commented Nov 25, 2020

Thank you so much:)

Ha0Tang closed this as completed Nov 25, 2020
Ha0Tang (Author) commented Nov 27, 2020

@ennauata after running python compute_FID.py, I got two folders: the fake folder containing 50,000 images and the real folder containing only 5,000 images. Does this look correct?

Ha0Tang reopened this Nov 27, 2020
ennauata (Owner) commented:

This sounds correct, @Ha0Tang. One way to compute FID would be to generate one sample for each graph (5k fake) and compare it against the corresponding GT (5k real). Another way is to generate multiple samples for each graph (say 10 x 5k = 50k fake) and compare them against the GT (5k real). The latter is a bit trickier, but it is probably the best we can do for measuring diversity over the same graphs, because we have only one GT for each graph.
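For reference, a rough sketch of the second protocol (multiple samples per graph vs. one GT per graph). generate_layout, render, gt_layout, test_graphs, and the folder layout below are placeholders for illustration, not the repo's actual API:

import os

num_variations = 10                        # 10 x 5k = 50k fake samples
os.makedirs('./FID/real', exist_ok=True)
os.makedirs('./FID/fake', exist_ok=True)
for i, graph in enumerate(test_graphs):    # ~5k held-out graphs (placeholder)
    # one rendered ground-truth layout per graph
    render(gt_layout(graph)).save('./FID/real/{}.png'.format(i))
    # several generated layouts per graph
    for k in range(num_variations):
        render(generate_layout(graph)).save('./FID/fake/{}_{}.png'.format(i, k))
# FID is then computed between ./FID/real (5k images) and ./FID/fake (50k images)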

Ha0Tang (Author) commented Nov 29, 2020

I see, @ennauata. Now I have another question: do we need to train 5 models to evaluate the 5 different groups? Specifically,

  • To evaluate 1-3, we need to train a model on 4-6, 7-9, 10-12, and 13+;
  • To evaluate 4-6, we need to train a model on 1-3, 7-9, 10-12, and 13+;
  • To evaluate 7-9, we need to train a model on 1-3, 4-6, 10-12, and 13+;
  • To evaluate 10-12, we need to train a model on 1-3, 4-6, 7-9, and 13+;
  • To evaluate 13+, we need to train a model on 1-3, 4-6, 7-9, and 10-12.

Is this correct?

ennauata (Owner) commented:

Yes, one for targeting each group.
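For reference, a minimal sketch of this leave-one-group-out split by room count, assuming each floorplan record carries its number of rooms (the data structure and field names are placeholders, not the repo's loader):

groups = {'1-3': range(1, 4), '4-6': range(4, 7), '7-9': range(7, 10),
          '10-12': range(10, 13), '13+': range(13, 1000)}

def split_by_group(floorplans, target):
    # train on every group except the target; evaluate only on the target group
    train = [f for f in floorplans if f['num_rooms'] not in groups[target]]
    test = [f for f in floorplans if f['num_rooms'] in groups[target]]
    return train, test

# e.g., hold out the 4-6 group for evaluation and train on the rest
train_set, test_set = split_by_group(floorplans, target='4-6')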

Ha0Tang (Author) commented Nov 30, 2020

@ennauata thanks. Now I have two more questions. What does --num_variations mean? And how do you select an image to compare with other methods in Fig. 6 of your paper, since you generate 4 images for each input graph?

ennauata (Owner) commented Nov 30, 2020

--num_variations is the number of samples to generate per input graph. You can control num_variations to create an image like Figure 6. To select specific samples, you can filter out the undesired ones; a quick way to do that is:

# keep only the graphs you want to show (the indices and the loop below are
# just an example of where the filter would go in the generation loop)
target_graphs = [1, 3, 4]
for g, graph in enumerate(graphs):
    if g not in target_graphs:
        continue
    ...
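For example, assuming --num_variations is an argparse flag of the generation script mentioned later in this thread (the exact invocation is an assumption; check the script's options):

python variation_bbs_with_target_graph_segments_suppl.py --num_variations 4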


 

Ha0Tang (Author) commented Nov 30, 2020

Thanks again. Can you share the generated results of the other baselines (i.e., CNN-only, GCN, Ashual et al., and Johnson et al.) with me? My email is hao.tang@unitn.it; that way, I can directly compare against these methods in my paper. Thanks a lot.

Ha0Tang (Author) commented Dec 8, 2020

@ennauata when I run python evaluate_parallel.py, I get the following error:

Traceback (most recent call last):
  File "evaluate_parallel.py", line 239, in <module>
    run_parallel(graphs)
  File "evaluate_parallel.py", line 225, in run_parallel
    results += Parallel(n_jobs=num_cores)(delayed(processInput)(G_pred, G_true, _id) for G_pred, G_true, _id in graphs[lower:upper])
  File "/home/ht/anaconda3/envs/pytorch/lib/python3.6/site-packages/joblib/parallel.py", line 1061, in __call__
    self.retrieve()
  File "/home/ht/anaconda3/envs/pytorch/lib/python3.6/site-packages/joblib/parallel.py", line 940, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/ht/anaconda3/envs/pytorch/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 542, in wrap_future_result
    return future.result(timeout=timeout)
  File "/home/ht/anaconda3/envs/pytorch/lib/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/home/ht/anaconda3/envs/pytorch/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
RuntimeError: can't start new thread

Any ideas to fix this bug?

ennauata (Owner) commented Dec 8, 2020

@Ha0Tang, this is a workaround I found for computing the metrics in parallel. My knowledge in this area is limited, and I am not sure why this code is not working on your machine. I would recommend removing the parallel computation to make it work, or finding some Python library for parallel computation that does work on your machine.
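For example, a serial fallback for the joblib call, assuming the same processInput helper and the (G_pred, G_true, _id) tuples that evaluate_parallel.py already builds:

# compute the metric sequentially instead of via joblib.Parallel
results = []
for G_pred, G_true, _id in graphs[lower:upper]:
    results.append(processInput(G_pred, G_true, _id))

Alternatively, keeping joblib but passing n_jobs=1 to Parallel also runs the jobs sequentially, which may be enough to avoid the thread-creation error.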

spencer0shaw commented:

Hello! When I try to run evaluate_parallel.py and variation_bbs_with_target_graph_segments_suppl.py, the output is blank even when using the model you provided. Why is that? Thank you!

mikrocosmoss commented:

Hello! Can you also share the generated results of the other baselines (i.e., CNN-only, GCN, Ashual et al., and Johnson et al.) with me? My email is 1256829298@qq.com. Thanks!
