fvd compute question #10
Hello, here is my understanding. The handling is indeed inconsistent, and that is a genuine potential problem. However, metrics such as FVD may simply be computed at the resolution at which each dataset is commonly used. In this paper, evaluation matches the resolution at which the data is loaded: 64 for SMMNIST, 64 for KTH, 64 for BAIR, and 128 for Cityscapes. I observed that the code does not rescale the videos after loading when computing FVD, SSIM, or PSNR; only LPIPS is computed after rescaling to 128, and 128 matches only Cityscapes. The paper reports LPIPS only for Cityscapes, yet judging from the files the authors packaged in the checkpoints, LPIPS was in fact computed for all datasets. Perhaps because the code rescales to 128, the SMMNIST, KTH, and BAIR LPIPS numbers were not reported in the paper. It is also worth noting that the example in the official LPIPS README uses 64 instead. So I believe the evaluation resolution is tied to the resolution at which each dataset is commonly used.
mcvd-pytorch/configs/kth64_big.yml Line 57 in 451da2e
mcvd-pytorch/configs/cityscapes.yml Line 58 in 451da2e
mcvd-pytorch/datasets/kth_convert.py Lines 35 to 56 in 226a3fd
mcvd-pytorch/runners/ncsn_runner.py Lines 1918 to 1953 in 226a3fd
mcvd-pytorch/runners/ncsn_runner.py Lines 1759 to 1774 in 226a3fd
mcvd-pytorch/runners/ncsn_runner.py Lines 1427 to 1430 in 226a3fd
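To make the 64 → 128 LPIPS step concrete: since 128 is exactly twice 64, the rescale can be sketched in plain NumPy with nearest-neighbor replication. This is only an illustration (the function name `upsample_2x` is mine, and the repo presumably uses a framework resize op such as bilinear interpolation rather than this):

```python
import numpy as np

def upsample_2x(frames):
    """Nearest-neighbor 2x upsample: (T, H, W, C) -> (T, 2H, 2W, C).

    A stand-in for the 64 -> 128 rescale applied before LPIPS;
    each pixel is replicated into a 2x2 block.
    """
    return frames.repeat(2, axis=1).repeat(2, axis=2)

video = np.random.rand(10, 64, 64, 3).astype(np.float32)
print(upsample_2x(video).shape)  # (10, 128, 128, 3)
```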
In StyleGAN-V, they resize the input images to 128x128 to compute the FVD metric.
But the official FVD metric uses 224x224 as input.
What input size does your work use? It feels like everyone handles this differently. Thank you!
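Since 224 is not an integer multiple of 64 or 128, resizing for an I3D-style FVD input requires interpolation rather than pixel replication. A minimal bilinear resize in plain NumPy (the function name and shapes are illustrative; real pipelines typically use framework ops such as torch's `interpolate`):

```python
import numpy as np

def resize_bilinear(frame, out_h, out_w):
    """Bilinear resize of a single (H, W, C) frame to (out_h, out_w, C)."""
    h, w = frame.shape[:2]
    # Sample positions in the source grid for each output pixel
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    # Fractional weights, broadcast over (out_h, out_w, C)
    wy = (ys - y0)[:, None, None]
    wx = (xs - x0)[None, :, None]
    top = frame[y0][:, x0] * (1 - wx) + frame[y0][:, x1] * wx
    bot = frame[y1][:, x0] * (1 - wx) + frame[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

frame = np.random.rand(64, 64, 3)
print(resize_bilinear(frame, 224, 224).shape)  # (224, 224, 3)
```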