Non-zero FID value for two exactly same distributions #283

Open
tazwar22 opened this issue Dec 15, 2021 · 3 comments

@tazwar22

Describe the bug
Running the FID computation on two distributions that are exactly the same leads to non-zero values. For example, if I use the 10,000 examples of the CIFAR-10 test set as one distribution and the same set again as the other distribution, I end up with a non-trivial value of ~7.x.
Another example can be obtained by repeating the above with Normal(150, 8) distributions (no particular reason for the parameters). The FID value in this case is once again non-trivial (~0.98). I have tested these cases with another PyTorch implementation of FID (mseitzer) and obtained values in the e-12 range (which makes more sense).

To Reproduce
Steps to reproduce the behavior: (Normal Distribution example)

  1. dist1 = np.random.normal(20, 8.0, size=(10000, 32, 32, 3))
  2. dist2 = np.random.normal(20, 8.0, size=(10000, 32, 32, 3))
  3. piq.FID()(dist1, dist2)
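A runnable version of the steps above (a minimal sketch, assuming piq.FID is given 2-D torch feature tensors, so the raw arrays are flattened to (N, features) here):

import numpy as np
import torch
import piq

# Two independent draws from the same Normal(20, 8) distribution.
dist1 = np.random.normal(20, 8.0, size=(10000, 32, 32, 3))
dist2 = np.random.normal(20, 8.0, size=(10000, 32, 32, 3))

# Flatten to (N, features) and convert to torch tensors; piq.FID is fed
# feature matrices here rather than raw image batches (assumption).
feats1 = torch.tensor(dist1.reshape(10000, -1), dtype=torch.float64)
feats2 = torch.tensor(dist2.reshape(10000, -1), dtype=torch.float64)

# Expected to be approximately zero, but comes out clearly non-zero.
print(piq.FID()(feats1, feats2))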

Expected behavior
The return values for such cases should be approximately zero. (See the note about the other PyTorch implementation above.)
With the mseitzer implementation, I get a value of -6.53e-13 for the Normal vs Normal case described above.

Additional context
I noticed another issue #277 describing the exact same behavior, but since it was closed, I wanted to highlight the discrepancy again here.

@tazwar22 tazwar22 added the bug Something isn't working label Dec 15, 2021
@zakajd zakajd self-assigned this Dec 15, 2021
@zakajd (Collaborator) commented Dec 15, 2021

Thanks for raising a bug report! I'll investigate this issue.

@zakajd (Collaborator) commented Dec 24, 2021

@tazwar22 I tried to reproduce the behaviour and found that both the PIQ and mseitzer implementations consistently predict large values for similar high-dimensional distributions.

# !pip install piq pytorch-fid

import piq
import torch
import numpy as np

# Code from github.com/mseitzer/pytorch-fid
from pytorch_fid.fid_score import calculate_frechet_distance

dist1_np = np.random.normal(150, 8.0, size=(100000, 500))
dist2_np = np.random.normal(150, 8.0, size=(100000, 500))

dist1_np_mu = np.mean(dist1_np, axis=0)
dist1_np_sigma = np.cov(dist1_np, rowvar=False)

dist2_np_mu = np.mean(dist2_np, axis=0)
dist2_np_sigma = np.cov(dist2_np, rowvar=False)

mseitzer_output = calculate_frechet_distance(dist1_np_mu, dist1_np_sigma, dist2_np_mu, dist2_np_sigma)
print(f'{mseitzer_output:0.4f}')

dist1_pt = torch.tensor(dist1_np)
dist2_pt = torch.tensor(dist2_np)
piq_output = piq.FID()(dist1_pt, dist2_pt)
print(piq_output)

>>> 81.0782
>>> tensor(81.0783, dtype=torch.float64)

You mentioned getting -6.53e-13 for the Normal vs Normal case, which does happen when passing the exact same values:

mseitzer_output_same = calculate_frechet_distance(dist1_np_mu, dist1_np_sigma, dist1_np_mu, dist1_np_sigma)
# print(f'{mseitzer_output_same:0.4f}')
print(mseitzer_output_same)

piq_output_same = piq.FID()(dist1_pt, dist1_pt)
print(piq_output_same)

>>> -1.3096723705530167e-10
>>> tensor(0.0002, dtype=torch.float64)

You can notice that in this case the PIQ value is significantly larger in scale, but still very close to zero.
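
For context: the residual comes from the matrix square root term in the Frechet distance, FID = ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * (sigma1 @ sigma2)^(1/2)). With identical statistics the analytic value is exactly zero, so whatever is returned is floating-point error from computing that square root, and the two implementations presumably compute it differently (scipy.linalg.sqrtm vs. an iterative approximation in PIQ), which would explain the difference in scale. A minimal sketch of the closed-form computation, reusing dist1_np_mu / dist1_np_sigma from the snippet above:

import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    # ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * (sigma1 @ sigma2)^(1/2))
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    covmean = covmean.real  # drop tiny imaginary parts caused by rounding
    return diff @ diff + np.trace(sigma1 + sigma2 - 2 * covmean)

# Identical statistics: the analytic value is 0, anything else is
# accumulated numerical error in the matrix square root.
print(frechet_distance(dist1_np_mu, dist1_np_sigma, dist1_np_mu, dist1_np_sigma))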

@snk4tr (Contributor) commented Dec 29, 2021

@zakajd please add a comment to the code of the FID metric with a description of the possibly counterintuitive behaviour.
