Error FailToDownloadPythonDependencies on mnist example #82

Closed
mwbryant opened this issue Nov 7, 2022 · 6 comments
Labels
dataset Related to `burn-dataset`

Comments

@mwbryant

mwbryant commented Nov 7, 2022

Ubuntu 20.04 LTS

On my machine when running the mnist example I get this message:

thread 'main' panicked at 'called Result::unwrap() on an Err value: FailToDownloadPythonDependencies("pillow, numpy | error: No such file or directory (os error 2)")', burn-dataset/src/source/huggingface/mnist.rs:40:14
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

I tracked this down to https://github.com/burn-rs/burn/blob/0110ccc96471a404479c5e67801f22309bcf7682/burn-dataset/src/source/huggingface/downloader.rs#L196, which calls the command python, whereas I believe it needs python3.

The fix for me was to apt install python-is-python3
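As an alternative to aliasing python system-wide, the downloader could probe for an interpreter itself. The following is only a minimal sketch of that idea, not the code in downloader.rs: the function name find_interpreter and the candidate order are assumptions made for illustration.

```rust
use std::process::{Command, Stdio};

/// Hypothetical helper (not part of burn-dataset): returns the first
/// candidate command that can be spawned and exits successfully when
/// invoked with `--version`.
fn find_interpreter(candidates: &[&str]) -> Option<String> {
    for name in candidates {
        let works = Command::new(name)
            .arg("--version")
            // Discard the version banner; only the exit status matters.
            .stdout(Stdio::null())
            .stderr(Stdio::null())
            .status()
            // A spawn failure (e.g. os error 2, command not found) counts as "not usable".
            .map(|status| status.success())
            .unwrap_or(false);
        if works {
            return Some((*name).to_string());
        }
    }
    None
}
```

Calling find_interpreter(&["python3", "python"]) would prefer python3 and fall back to python, and a None result could be turned into an error message that tells the user to install Python 3.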

I then had problems with pandas requiring a higher version of numpy. The fix for that was python -m pip install --upgrade numpy

I believe both these are not problems with the library code but with the installation instructions (of which there appear to be none) and might be worth mentioning when those are created.

@nathanielsimard
Member

Totally, this is due to an integration with huggingface that can load any dataset from the hub. It requires Python and some libraries. Everything is supposed to be automatic, but nothing is documented for the case where something goes wrong. Also, nothing is in place to create a virtual environment just for burn.

@nathanielsimard nathanielsimard added the dataset Related to `burn-dataset` label Nov 7, 2022
@hscspring

hscspring commented Nov 11, 2022

@mwbryant yes, python3 is needed, not only for the downloader but also for installing the deps. Specifically, lines 118 and 196 in burn-dataset/src/source/huggingface/downloader.rs should be changed to point to your Python. I think it's better to use Python >= 3.7.

Actually, I think that might not be necessary; a command line flag such as /path/to/dataset/ might be better.
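A rough sketch of what such a flag could look like, assuming a hypothetical --dataset-dir option that points at an already downloaded dataset so the Python step can be skipped. Neither the flag name nor the function exists in burn; both are made up for illustration.

```rust
use std::path::PathBuf;

/// Hypothetical helper: looks for `--dataset-dir <path>` in the argument
/// list and returns the path that follows it, if any.
fn dataset_dir_from_args(args: &[String]) -> Option<PathBuf> {
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if arg.as_str() == "--dataset-dir" {
            // The value is the next argument; a trailing flag yields None.
            return iter.next().map(PathBuf::from);
        }
    }
    None
}
```

An example binary could pass std::env::args().collect::<Vec<String>>() to this function and only fall back to the downloader when it returns None.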

@John0x

John0x commented Nov 17, 2022

I think burn would benefit from ditching Python altogether. I'm experiencing issues with it as well. Not having Python involved was the primary reason for my interest in this project :(
I have Python 3.11.0, but it's failing to build pyarrow.

@nathanielsimard
Member

Python is only used to download datasets. Once that's done, Rust is used to load and process everything data-related. I agree that it would be nice to avoid Python for the default examples.

However, having an integration with huggingface datasets is a plus: it gives easy access to thousands of datasets through a reasonable Rust API that is easy to maintain.

@antimora
Collaborator

I had the same problem on a Mac. I believe the general problem is that we should call pip3 instead of pip. Most systems now use pip3 unless someone has explicitly set up aliases, for example with apt install python-is-python3.

Also we should add some system checks in the dataset package.
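One way to sidestep the pip versus pip3 naming question is to invoke pip through the interpreter with -m pip, as in the python -m pip install --upgrade numpy command earlier in this thread. The sketch below assumes that approach; the function names are hypothetical and this is not the fix that was merged.

```rust
use std::process::Command;

/// Hypothetical helper: builds the arguments for `<interpreter> -m pip install <packages>`.
fn pip_install_args(packages: &[&str]) -> Vec<String> {
    let mut args = vec!["-m".to_string(), "pip".to_string(), "install".to_string()];
    args.extend(packages.iter().map(|p| p.to_string()));
    args
}

/// Hypothetical system check: runs pip through the given interpreter and
/// returns a readable error instead of panicking on `unwrap()`.
fn install_packages(interpreter: &str, packages: &[&str]) -> Result<(), String> {
    let status = Command::new(interpreter)
        .args(pip_install_args(packages))
        .status()
        .map_err(|err| {
            format!("could not run `{interpreter}`: {err}. Is Python 3 installed and on PATH?")
        })?;
    if status.success() {
        Ok(())
    } else {
        Err(format!("`{interpreter} -m pip install` exited with {status}"))
    }
}
```

With this shape, a missing interpreter surfaces as an error that names the command that was tried, rather than only the bare os error 2.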

@antimora
Collaborator

antimora commented Apr 2, 2023

This is fixed by #185

@antimora antimora closed this as completed Apr 2, 2023
5 participants