Outputs of inception network seem to be batch-size dependent #43
Comments
Thanks for this report! Is there any insight into the nature of this issue? Of course, it is good that conversion to float64 fixes it, but IMO it cannot be considered a clear-cut fix. What would the actual fix look like? Does it happen on all GPUs/cuDNN versions, or is it a narrower issue?
I've tried it with CUDA 11.8 and PyTorch 1.13 running on an NVIDIA 3090.
I have checked in a way to specify the double type in all feature extractors, and also checked in a separate test suite to track down discrepancies arising from batch size. Thanks for the report! 0.4.0 release coming soon.
Thanks for solving this!
As mentioned in Lightning-AI/torchmetrics#1620, the inception network seems to produce different results depending on the batch size.
Something I tried that seems to solve this issue is to run the network in float64.
This batch-size dependence leads to considerable differences in FID. As mentioned in Lightning-AI/torchmetrics#1620, the bias in FID appears to be larger for small batch sizes. If we compute FID between two uniformly sampled distributions with 1000 points each, we get an FID of 1.9 with a batch size of 1000, but an FID of 10 with a batch size of 2. Since both sets are sampled from the same distribution, the FID should be as close to zero as possible.
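A minimal sketch of how the batch-size dependence can be reproduced; this uses torchvision's pretrained InceptionV3 as an assumption (torch-fidelity ships its own ported weights, so exact numbers will differ), but the float32 effect across batchings is the same kind of thing:

```python
import torch
import torchvision

torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"

# Pretrained InceptionV3 in eval mode; in eval mode the forward pass returns only the logits.
model = torchvision.models.inception_v3(weights="IMAGENET1K_V1").eval().to(device)
images = torch.rand(64, 3, 299, 299, device=device)  # dummy inputs in [0, 1]

def run_in_batches(batch_size, dtype=torch.float32):
    m = model.to(dtype)
    chunks = []
    with torch.no_grad():
        for i in range(0, images.shape[0], batch_size):
            chunks.append(m(images[i:i + batch_size].to(dtype)))
    return torch.cat(chunks)

# float32: the same inputs give slightly different outputs depending on how they are batched
# (the report above was on GPU with cuDNN; on CPU the discrepancy may be much smaller).
print((run_in_batches(64) - run_in_batches(2)).abs().max())

# float64: the discrepancy essentially vanishes
print((run_in_batches(64, torch.float64) - run_in_batches(2, torch.float64)).abs().max())
```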
Possible Fix
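A hedged sketch of what the float64 workaround could look like for a generic PyTorch feature extractor; the function name `extract_features_fp64` and its signature are illustrative assumptions, not the torch-fidelity API:

```python
import torch

def extract_features_fp64(model: torch.nn.Module, images: torch.Tensor, batch_size: int) -> torch.Tensor:
    """Run the feature extractor in double precision so outputs no longer depend on batching."""
    model = model.double().eval()
    chunks = []
    with torch.no_grad():
        for start in range(0, images.shape[0], batch_size):
            chunks.append(model(images[start:start + batch_size].double()))
    # Keeping the features in float64 also lets the FID statistics (mean/covariance)
    # be computed in double precision, which is the safer option.
    return torch.cat(chunks)
```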