Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Student T tests fail on GPU [bug] #2227

Closed
lsgos opened this issue Dec 18, 2019 · 1 comment · Fixed by #2228
Closed

Student T tests fail on GPU [bug] #2227

lsgos opened this issue Dec 18, 2019 · 1 comment · Fixed by #2228
Assignees
Labels

Comments

@lsgos
Copy link
Contributor

lsgos commented Dec 18, 2019

sample_shape = torch.Size([])

    def rsample(self, sample_shape=torch.Size()):
        shape = self._extended_shape(sample_shape)
        X = torch.empty(shape, dtype=self.df.dtype, device=self.df.device).normal_()
        Z = self._chi2.rsample(sample_shape)
        Y = X * torch.rsqrt(Z / self.df).unsqueeze(-1)
>       return self.loc + self.scale_tril.matmul(Y.unsqueeze(-1)).squeeze(-1)
E       RuntimeError: Expected object of device type cuda but got device type cpu for argument #2 'mat2' in call to _th_mm

pyro/distributions/multivariate_studentt.py:74: RuntimeError

This issue was discussed in #2226 - running make test on the dev branch errors out for me if running on a machine with cuda. I am guessing this hasn't shown up in the CI because it uses a cpu only machine.

I think this bug is pretty simple - it happens because, as we can see in the above snippet, y inherits its device from self.df, and in the fixture, self.df is set to a scalar value. This is not converted into a tensor by the tensors_default_to context manager, and so isn't sent to the gpu.

I fixed this in #2226 by changing the fixture, but @fritzo suggested that it might suggest a missing coercion rather than a change to the fixture, so that change in the PR was reverted and I am opening this issue instead.

@fritzo fritzo added the bug label Dec 18, 2019
@fritzo
Copy link
Member

fritzo commented Dec 18, 2019

We really do want to test with Python scalars as inputs. This appears to be a bug in MultivariateStudentT which incorrectly coerces df without examining the device of other tenors:

df, = broadcast_all(df)  # this line is buggy

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants