Incomplete download of MNIST data #1875
Comments
I think the observations library is trying to download the files from http://yann.lecun.com/exdb/mnist/. Is this an intermittent failure, or does it fail on multiple retries? I do not have a better solution for this other than us moving away from observations. @null-a - I think only the AIR example is using observations. Do we have any other sources for the multi-MNIST dataset? The dataset generation code seems straightforward, so if not, we can write our own data loader for this. Also see #1871.
It seemed to fail on multiple retries. And IIRC the same error came up outside of docker yesterday. Looks like the download might be coming from here... pyro/pyro/contrib/examples/util.py Line 15 in 835aa31
The original output seemed to indicate that the download was successful, with the load attempt then failing because |
The AIR example does not use |
I'm not sure what's going on here, this is working consistently for me at the moment. (
@neerajprad I'm not aware of a canonical version of this, everyone seems to roll their own from MNIST, and of those I don't know of any that are hosted in a suitable way.
That's right. In my original implementation I just generated the data on demand from the MNIST data available via pytorch, only later was this switched over to |
This time I saved the initial error output, which might be more helpful:
$ make build pyro_branch=dev pytorch_branch=release python_version=3.6
$ make run pyro_branch=dev pytorch_branch=release python_version=3.6
$ pip install numpy==1.15.0 # prevent binary warnings that are escalated into errors #1873
$ curl -O https://storage.googleapis.com/cvdf-datasets/mnist/train-images-idx3-ubyte.gz
$ echo $?
0
$ make test-examples > make_text_examples.txt 2>&1
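A zero exit status from curl shows only that the transfer finished, not that the cached archive is intact. As a rough sketch for checking a downloaded `.gz` file, the snippet below reads the archive to the end and reports whether decompression succeeds. The helper name `gzip_is_complete` is hypothetical and is not part of pyro or observations; it uses only the Python standard library.

```python
import gzip
import os
import zlib


def gzip_is_complete(path, chunk_size=1 << 20):
    """Return True if the gzip file at `path` decompresses to the end without error.

    Hypothetical helper for illustration; not part of pyro or observations.
    """
    try:
        # An empty file decompresses to nothing without raising, so reject it explicitly.
        if os.path.getsize(path) == 0:
            return False
        with gzip.open(path, "rb") as f:
            # Read in chunks so large archives are never held in memory at once.
            while f.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError: missing file or not a gzip file; EOFError: truncated stream;
        # zlib.error: corrupted compressed data.
        return False
    return True
```

For example, `gzip_is_complete("train-images-idx3-ubyte.gz")` could be run against the file fetched by the curl command above before rerunning the tests.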
Not blocked by this but I'm reluctant to add more environment workarounds. For now I'll remove As an aside, I've seen it work well to run the CI tests inside the docker container(s). IIRC, we factored out a base container that is rebuilt infrequently, and on each push, rebuilt the application container with |
Thanks, @null-a, that will be great! The observations library is deprecated in favor of tensorflow datasets which doesn't have multi MNIST at the moment, and it might be best to avoid adding tensorflow as a dependency anyways. I think we can just reinstate your original dataset creation utility in |
I wonder if the original issue @mattwescott is running into is a race condition that occurs when trying to fetch a data set from multiple processes? Trying this locally, if I clear out the directory in which the VAE example caches data ( |
@null-a : I think you are right, and that is exactly what is happening in this case since we have multiple tests for each example (there are two for AIR). This probably leads to a race condition when two subprocesses are trying to download to the same directory. I think we should just change our EDIT: Running with xdist halves the amount of time it takes to run |
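One common way to make a shared download cache robust to the kind of race described above is to write each download to a unique temporary file in the cache directory and then rename it into place atomically. The sketch below illustrates that idea with the Python standard library only. The name `fetch_once` and its `opener` parameter are hypothetical, and this is not the change that was made in pyro or observations.

```python
import os
import shutil
import tempfile
import urllib.request


def fetch_once(url, dest, opener=urllib.request.urlopen):
    """Download `url` to `dest` without ever exposing a partial file at `dest`.

    Hypothetical helper for illustration. `opener` is any callable that takes
    a URL and returns a readable binary file object usable as a context manager.
    """
    if os.path.exists(dest):
        return dest  # a complete copy is already cached
    dest_dir = os.path.dirname(os.path.abspath(dest))
    os.makedirs(dest_dir, exist_ok=True)
    # Each process writes to its own uniquely named temporary file.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp, opener(url) as response:
            shutil.copyfileobj(response, tmp)
        # os.replace is atomic when source and target are on the same filesystem,
        # so readers see either no file or a complete one.
        os.replace(tmp_path, dest)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return dest
```

With this approach two test subprocesses may still both download the file, but whichever finishes last simply replaces one complete copy with another. Avoiding the duplicate download as well would need an additional lock, which this sketch leaves out.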
Description
make test-examples fails to fully download the MNIST data dependency for the AIR example.
Details
On osx==10.14.1 and docker==18.09.2
Output
Resulting files in .data