Non-zero FID value for two exactly same distributions #283

Open
tazwar22 opened this issue Dec 15, 2021 · 3 comments

@tazwar22

Describe the bug
Running the FID computation on two distributions that are exactly the same leads to non-zero values. For example, if I use the 10,000 examples of the CIFAR-10 test set as one distribution and the same set again as the other distribution, I end up with a non-trivial value of ~7.x.
Another example can be obtained by repeating the above with Normal(150, 8) distributions (no particular reason for the parameters). The FID value in this case is once again non-trivial (~0.98). I have tested these cases with another PyTorch implementation of FID (mseitzer) and obtained values in the e-12 range (which makes more sense).

To Reproduce
Steps to reproduce the behavior: (Normal Distribution example)

  1. dist1 = np.random.normal(20, 8.0, size=(10000, 32, 32, 3))
  2. dist2 = np.random.normal(20, 8.0, size=(10000, 32, 32, 3))
  3. piq.FID()(dist1, dist2)
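A runnable version of the steps above (a minimal sketch, assuming piq.FID is given 2-D torch feature tensors, so the raw arrays are flattened to (N, features) here):

import numpy as np
import torch
import piq

# Two independent draws from the same Normal(20, 8) distribution.
dist1 = np.random.normal(20, 8.0, size=(10000, 32, 32, 3))
dist2 = np.random.normal(20, 8.0, size=(10000, 32, 32, 3))

# Flatten to (N, features) and convert to torch tensors; piq.FID is fed
# feature matrices here rather than raw image batches (assumption).
feats1 = torch.tensor(dist1.reshape(10000, -1), dtype=torch.float64)
feats2 = torch.tensor(dist2.reshape(10000, -1), dtype=torch.float64)

# Expected to be approximately zero, but comes out clearly non-zero.
print(piq.FID()(feats1, feats2))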

Expected behavior
The return values for such cases should be approximately zero. (See the note about the other PyTorch implementation above.)
With the mseitzer implementation, I get a value of -6.53e-13 for the Normal vs Normal case described above.

Additional context
I noticed another issue #277 describing the exact same behavior, but since it was closed, I wanted to highlight the discrepancy again here.

@tazwar22 tazwar22 added the bug Something isn't working label Dec 15, 2021
@zakajd zakajd self-assigned this Dec 15, 2021
@zakajd (Collaborator) commented Dec 15, 2021

Thanks for raising a bug report! I'll investigate this issue.

@zakajd (Collaborator) commented Dec 24, 2021

@tazwar22 I tried to reproduce the behaviour and found that both the PIQ and mseitzer implementations consistently predict large values for similar high-dimensional distributions.

# !pip install piq pytorch-fid

import piq
import torch
import numpy as np

# Code from github.com/mseitzer/pytorch-fid
from pytorch_fid.fid_score import calculate_frechet_distance

dist1_np = np.random.normal(150, 8.0, size=(100000, 500))
dist2_np = np.random.normal(150, 8.0, size=(100000, 500))

dist1_np_mu = np.mean(dist1_np, axis=0)
dist1_np_sigma = np.cov(dist1_np, rowvar=False)

dist2_np_mu = np.mean(dist2_np, axis=0)
dist2_np_sigma = np.cov(dist2_np, rowvar=False)

mseitzer_output = calculate_frechet_distance(dist1_np_mu, dist1_np_sigma, dist2_np_mu, dist2_np_sigma)
print(f'{mseitzer_output:0.4f}')

dist1_pt = torch.tensor(dist1_np)
dist2_pt = torch.tensor(dist2_np)
piq_output = piq.FID()(dist1_pt, dist2_pt)
print(piq_output)

>>> 81.0782
>>> tensor(81.0783, dtype=torch.float64)

You mentioned getting -6.53e-13 for the Normal vs Normal case, which does happen when passing the exact same values:

mseitzer_output_same = calculate_frechet_distance(dist1_np_mu, dist1_np_sigma, dist1_np_mu, dist1_np_sigma)
# print(f'{mseitzer_output_same:0.4f}')
print(mseitzer_output_same)

piq_output_same = piq.FID()(dist1_pt, dist1_pt)
print(piq_output_same)

>>> -1.3096723705530167e-10
>>> tensor(0.0002, dtype=torch.float64)

You can notice that in this case the PIQ value is significantly larger in scale, but still very close to zero.
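
For context: the residual comes from the matrix square root term in the Frechet distance, FID = ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * (sigma1 @ sigma2)^(1/2)). With identical statistics the analytic value is exactly zero, so whatever is returned is floating-point error from computing that square root, and the two implementations presumably compute it differently (scipy.linalg.sqrtm vs. an iterative approximation in PIQ), which would explain the difference in scale. A minimal sketch of the closed-form computation, reusing dist1_np_mu / dist1_np_sigma from the snippet above:

import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    # ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * (sigma1 @ sigma2)^(1/2))
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    covmean = covmean.real  # drop tiny imaginary parts caused by rounding
    return diff @ diff + np.trace(sigma1 + sigma2 - 2 * covmean)

# Identical statistics: the analytic value is 0, anything else is
# accumulated numerical error in the matrix square root.
print(frechet_distance(dist1_np_mu, dist1_np_sigma, dist1_np_mu, dist1_np_sigma))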

@snk4tr (Contributor) commented Dec 29, 2021

@zakajd please add a comment to the code of the FID metric with a description of the possibly counterintuitive behaviour.
