error run keras in docker #207

Closed
uwesterr opened this issue Dec 2, 2017 · 25 comments

Comments

@uwesterr

uwesterr commented Dec 2, 2017

running keras in docker on Ubuntu 16.04, using the docker image https://hub.docker.com/r/rocker/verse/.
after installing keras as described on https://keras.rstudio.com, when issuing the following command

> model <- keras_model_sequential()

i receive the error message

Error in initialize_python(required_module, use_environment) :
Python shared library '/usr/lib/libpython2.7.so' not found, Python bindings not loaded.

i would appreciate advice on what i can do, or a pointer to a dockerfile which avoids this problem.
thanks for your effort to make keras available for R, makes a big difference for me

@jjallaire
Member

I can reproduce this as well with just this code:

library(reticulate)
py_config()

@eddelbuettel I saw there was a /usr/lib/python2.7 directory however found no libpython2.7.so (nor any other .so file) there. Is there a Python shared library in rocker and if so where is it? (tried the normal stuff to find it but no luck).

@jjallaire
Member

I discovered that the following additional apt-get dependencies are necessary to run tensorflow/keras in rocker:

sudo apt-get install libpython2.7 python-pip python-virtualenv

@eddelbuettel
Contributor

Hm, pip to install and a virtual env is required?

When you say rocker which of our dozen+ images did you have in mind? r-base? It would only have the minimal Python support due to the already-included Python from the basic distro container it builds on.

Now we had good luck and adoption with the two containers for your two key products. Shall we look into a 'deep learning' rocker container with tf and keras?

@uwesterr
Author

uwesterr commented Dec 2, 2017

i use rocker/verse. i think quite a few people would love a docker image for deep learning including tf and keras. since it is not too easy to get gpu support running on Ubuntu (well i am not what you would call a software genius) and it is quite easy to get nvidia-docker up and running, it would be great if an image existed with

  • RStudio
  • keras package with tf backend
  • gpu support

i tried to build such an image but failed...

@uwesterr
Author

uwesterr commented Dec 2, 2017

@jjallaire
trying the following
uwe@uwe-System-Product-Name:~$ docker exec -it 71c72ed1935b bash
root@71c72ed1935b:/# sudo apt-get install libpython2.7 python-pip python-virtualenv
ends up with

root@71c72ed1935b:/# sudo apt-get install libpython2.7 python-pip python-virtualenv
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Note, selecting 'libpython2.7-minimal' for regex 'libpython2.7'
Note, selecting 'libpython2.7-stdlib' for regex 'libpython2.7'
Package python-virtualenv is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source

E: Unable to locate package python-pip
E: Package 'python-virtualenv' has no installation candidate

@jjallaire
Member

Yes I think a deep learning rocker container would be awesome! (esp. w/ the CUDA stuff pre-configured).

The python-virtualenv is required because the install_tensorflow() and install_keras() functions both create a virtual environment for the installation of Python packages (so they and their dependencies don't cross with whatever other Python setup is there).

That said, calling install_tensorflow() and install_keras() is actually not necessary, as in a pre-baked container they could be installed at the system level (e.g. this is what we do for the shinyapps.io containers: https://github.com/rstudio/shinyapps-package-dependencies/blob/master/packages/tensorflow/install).

Note that we still need libpython2.7 for reticulate (as would any package that attempts to interact with the Python interpreter as a shared library).
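
A quick way to confirm whether the shared library reticulate asks for is actually present is to search for it on disk. The sketch below is a minimal check, assuming a Debian-style layout where the library lives somewhere under /usr/lib; the function name find_libpython is made up for illustration and is not part of any tool.

```shell
# find_libpython [ROOT]: print every libpython2.7 shared object found under ROOT.
# ROOT defaults to /usr/lib (an assumption about where the distro puts it).
find_libpython() {
    find "${1:-/usr/lib}" -name 'libpython2.7.so*' 2>/dev/null
}

# Example (inside the container):
#   find_libpython      # no output suggests the shared library is not installed
```

An empty result here would be consistent with the "Python shared library not found" error at the top of this thread.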

@jjallaire
Member

I had to do sudo apt-get update in order to get those packages recognized.

@eddelbuettel
Contributor

Yes, apt and friends do not know when the indices are old. So mentally always do

apt-get update && apt-get install  ...

or whichever command you need.
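
Combined with the package list from earlier in the thread, that advice looks roughly like the sketch below. It assumes a root shell in a Debian/Ubuntu-based container (so no sudo). The APT variable is a hypothetical hook, added only so the function can be dry-run with APT=echo.

```shell
# install_python_deps: refresh the package index, then install the three
# packages named earlier in this thread.
# APT is a hypothetical override (not an apt feature) used for dry runs.
install_python_deps() {
    "${APT:-apt-get}" update &&
    "${APT:-apt-get}" install -y libpython2.7 python-pip python-virtualenv
}

# Dry run, prints the two argument lists without touching the system:
#   APT=echo install_python_deps
```

Chaining with && means the install step is skipped if the index refresh fails, which avoids installing from a stale index.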

@uwesterr
Author

uwesterr commented Dec 2, 2017

with

library(keras)
install_keras()

i get

> model <- keras_model_sequential()
Using TensorFlow backend.

with

library(keras)
install_keras()
install_keras(tensorflow = "gpu")

i get the following error

model <- keras_model_sequential()
Using TensorFlow backend.
Error: ImportError: Traceback (most recent call last):
  File "/home/rstudio/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/rstudio/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/rstudio/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above

so one step further but not done yet. any idea what i can do?

@eddelbuettel
Contributor

Make sure cublas gets installed?

edd@brad:~$ apt-cache show libcublas8.0
Package: libcublas8.0
Priority: extra
Section: multiverse/libs
Installed-Size: 40717
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Original-Maintainer: Debian NVIDIA Maintainers <pkg-nvidia-devel@lists.alioth.debian.org>
Architecture: amd64
Source: nvidia-cuda-toolkit
Version: 8.0.44-3
Depends: libc6 (>= 2.3.3), libgcc1 (>= 1:3.0), libstdc++6 (>= 4.1.1)
Filename: pool/multiverse/n/nvidia-cuda-toolkit/libcublas8.0_8.0.44-3_amd64.deb
Size: 20726770
MD5sum: a683828861ee7402cee7090ab6b333dc
SHA1: 5dc8822d778853d1035b96aeb0d2a4ca1ab9d462
SHA256: a2eea40c47efbe2d1f96f5737fc74eede1c7fa1a57b4ee6377e3e1437fa1aa9d
Description-en: NVIDIA cuBLAS Library
 The Compute Unified Device Architecture (CUDA) enables NVIDIA
 graphics processing units (GPUs) to be used for massively parallel
 general purpose computation.
 .
 The cuBLAS library is an implementation of BLAS (Basic Linear Algebra
 Subprograms) on top of the NVIDIA CUDA runtime. It allows the user to access
 the computational resources of NVIDIA Graphics Processing Unit (GPU), but
 does not auto-parallelize across multiple GPUs.
 .
 This package contains the cuBLAS runtime library.
Description-md5: 5d0c77d8f2c8429e53892a3a70d407c4
Multi-Arch: same
Homepage: http://www.nvidia.com/CUDA
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Origin: Ubuntu

edd@brad:~$

(That is on my 17.04 laptop)

@uwesterr
Author

uwesterr commented Dec 2, 2017

installed cublas with
wget http://de.archive.ubuntu.com/ubuntu/pool/multiverse/n/nvidia-cuda-toolkit/libcublas8.0_8.0.44-3_amd64.deb
and
dpkg -i libcublas8.0_8.0.44-3_amd64.deb
i got


root@71c72ed1935b:/# apt-cache show libcublas8.0
Package: libcublas8.0
Status: install ok installed
Priority: extra
Section: non-free/libs
Installed-Size: 40717
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Architecture: amd64
Multi-Arch: same
Source: nvidia-cuda-toolkit
Version: 8.0.44-3
Depends: libc6 (>= 2.3.3), libgcc1 (>= 1:3.0), libstdc++6 (>= 4.1.1)
Description: NVIDIA cuBLAS Library
 The Compute Unified Device Architecture (CUDA) enables NVIDIA
 graphics processing units (GPUs) to be used for massively parallel
 general purpose computation.
 .
 The cuBLAS library is an implementation of BLAS (Basic Linear Algebra
 Subprograms) on top of the NVIDIA CUDA runtime. It allows the user to access
 the computational resources of NVIDIA Graphics Processing Unit (GPU), but
 does not auto-parallelize across multiple GPUs.
 .
 This package contains the cuBLAS runtime library.
Description-md5: 5d0c77d8f2c8429e53892a3a70d407c4
Original-Maintainer: Debian NVIDIA Maintainers <pkg-nvidia-devel@lists.alioth.debian.org>
Homepage: http://www.nvidia.com/CUDA

and now get

> model <- keras_model_sequential()
Using TensorFlow backend.
Error: ImportError: Traceback (most recent call last):
  File "/home/rstudio/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/rstudio/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/rstudio/.virtualenvs/r-tensorflow/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libcusolver.so.8.0: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
abo

hmm, i guess now i need to install libcusolver.so.8.0, right?

thanks for your help!

@jjallaire
Member

The people who maintain the rocker images aren't warranting that tensorflow will work out of the box. In this case I think you need to keep googling for the missing dependencies and then installing them manually (e.g. https://www.google.com/search?q=debian+libcusolver.so.8.0)
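
One way to make that loop less tedious is to pull the missing library name straight out of the traceback before searching for it. A minimal sketch, assuming the error line has the same shape as the ImportError lines quoted earlier in this thread; missing_lib is a made-up helper name.

```shell
# missing_lib: read a Python traceback on stdin and print the shared library
# named in a "cannot open shared object file" ImportError line, if there is one.
missing_lib() {
    sed -n 's/^ImportError: \(lib[^:]*\.so[^:]*\): cannot open shared object file.*/\1/p'
}

# Example:
#   echo 'ImportError: libcusolver.so.8.0: cannot open shared object file: No such file or directory' | missing_lib
#   # prints libcusolver.so.8.0
```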

@eddelbuettel
Contributor

And if you do have a list get in touch so that we may incorporate this.

@uwesterr
Author

uwesterr commented Dec 3, 2017

installed cublas with
wget http://de.archive.ubuntu.com/ubuntu/pool/multiverse/n/nvidia-cuda-toolkit/libcublas8.0_8.0.44-3_amd64.deb
and
dpkg -i libcublas8.0_8.0.44-3_amd64.deb
then installed libcusolver

wget http://ftp.fau.de/ubuntu/pool/multiverse/n/nvidia-cuda-toolkit/libcusolver8.0_8.0.61-3_amd64.deb
dpkg -i libcusolver8.0_8.0.61-3_amd64.deb 

get ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory

installed libcudart

wget http://ftp.fau.de/ubuntu/pool/multiverse/n/nvidia-cuda-toolkit/libcudart8.0_8.0.61-3_amd64.deb
dpkg -i libcudart8.0_8.0.61-3_amd64.deb

get ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

and there i got stuck, could not find a way to install libcuda
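
The three packages installed so far follow the usual Debian naming pattern, where the package name is the soname with the ".so." part dropped (libcublas.so.8.0 ships in libcublas8.0). The sketch below encodes just that pattern as a hypothetical helper. It only produces a search term, not a guaranteed package name, and it ignores the hyphenated form Debian uses when the library name itself ends in a digit. Note that libcuda.so.1 normally ships with the NVIDIA driver rather than with nvidia-cuda-toolkit, which may be why no matching package turns up next to the others.

```shell
# soname_to_pkg SONAME: guess the Debian-style package name for a shared library
# by dropping the ".so." separator, e.g. libcublas.so.8.0 -> libcublas8.0.
# Hypothetical helper; treat the result as a search term only.
soname_to_pkg() {
    printf '%s\n' "$1" | sed 's/\.so\.//'
}

# Example:
#   apt-cache search "$(soname_to_pkg libcusolver.so.8.0)"
```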

@jjallaire
Member

If you are attempting a GPU installation of TensorFlow on Ubuntu you should follow the documentation posted on the TensorFlow for R website to ensure you have the CUDA dependencies correctly installed and configured: https://tensorflow.rstudio.com/tools/installation_gpu.html#ubuntu

@uwesterr
Author

uwesterr commented Dec 3, 2017

i thought they should be all available since i use nvidia-docker, am i wrong?

@uwesterr
Author

uwesterr commented Dec 3, 2017

just to make sure i really ran nvidia-docker, i reran the installation
nvidia-docker run -d -p 8787:8787 -e ROOT=TRUE rocker/verse

uwe@uwe-System-Product-Name:~/keras/docker$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS                    NAMES
a2baf71e31ef        rocker/verse        "/init"             23 seconds ago      Up 22 seconds       0.0.0.0:8787->8787/tcp   practical_joliot

start bash shell in container
nvidia-docker exec -it a2baf71e31ef bash
open rstudio in browser
http://localhost:8787/

install keras

devtools::install_github("rstudio/keras")
  namespace ‘reticulate’ 1.3.1 is being loaded, but >= 1.3.1.9001 is required
ERROR: lazy loading failed for package ‘keras’
* removing ‘/usr/local/lib/R/site-library/keras’
Installation failed: Command failed (1)

install reticulate
devtools::install_github("rstudio/reticulate")
and then again

devtools::install_github("rstudio/keras")
* DONE (keras)
library(keras)
install_keras(tensorflow = "gpu")
Error: Prerequisites for installing TensorFlow not available.

Please install the following Python packages before proceeding: pip, virtualenv

sudo apt-get update
apt-get install python-pip
pip install virtualenv # RUN in bash extra shell, not within Rstudio shell

install_keras(tensorflow = "gpu")
Installation complete.
Restarting R session...
library(keras)
model <- keras_model_sequential()
ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory

same error message as last time. next step will be to use the keras docker https://github.com/fchollet/keras/tree/master/docker and install RStudio there, i will let you know how it goes

@jjallaire
Member

nvidia-docker, as I understand it, simply makes the NVIDIA GPUs available to the container. It doesn't do software installation or configuration, as that is entirely application and library version specific.
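
To see what a given container actually provides, one option is to ask the dynamic loader which of the libraries TensorFlow complained about are visible. A minimal sketch, assuming the usual `ldconfig -p` output format (one library per line, soname followed by a space); check_libs is a hypothetical helper name.

```shell
# check_libs SONAME...: read `ldconfig -p`-style output on stdin and report,
# for each requested soname, whether the loader cache lists it.
check_libs() {
    cache=$(cat)
    for lib in "$@"; do
        # Trailing space keeps libcuda.so.1 from matching e.g. libcuda.so.10
        if printf '%s\n' "$cache" | grep -q -F -- "$lib "; then
            echo "found   $lib"
        else
            echo "missing $lib"
        fi
    done
}

# Example (inside the container):
#   ldconfig -p | check_libs libcuda.so.1 libcublas.so.8.0 libcusolver.so.8.0 libcudart.so.8.0
```

Running this once up front lists all missing libraries together, instead of discovering them one ImportError at a time.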

@uwesterr
Author

uwesterr commented Dec 3, 2017

@jjallaire , i will try tomorrow to install the CUDA dependencies. my approach to run from the keras docker did not work either:
nvidia-docker run -it --user root -v /home/uwe/keras/docker/data:/data --env KERAS_BACKEND=tensorflow keras bash
apt-get install nano
install RStudio following instructions from
https://www.r-bloggers.com/installing-rstudio-server-on-ubuntu-server/

nano /etc/apt/sources.list
add
deb http://cran.univ-paris1.fr/bin/linux/ubuntu trusty/

then in shell

gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
gpg -a --export E084DAB9 | apt-key add -
apt-get update
apt-get install r-base r-base-dev

apt-get update
Installing RStudio Server

follow RStudio instructions https://www.rstudio.com/products/rstudio/download-server/

apt-get install gdebi-core
wget https://download2.rstudio.org/rstudio-server-1.1.383-amd64.deb
gdebi rstudio-server-1.1.383-amd64.deb 

and then i get an error i could not resolve:


RStudio Server
 RStudio is a set of integrated tools designed to help you be more productive with R. It includes a console, syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, and workspace management.
Do you want to install the software package? [y/N]:y
Get:1 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 sudo amd64 1.8.16-0ubuntu1.5 [390 kB]                                                        
Get:2 http://archive.ubuntu.com/ubuntu xenial/main amd64 psmisc amd64 22.21-2.1build1 [48.0 kB]                                                               
Fetched 438 kB in 0s (0 B/s)                                                                                                                                  
debconf: delaying package configuration, since apt-utils is not installed
debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package sudo.
(Reading database ... 25598 files and directories currently installed.)
Preparing to unpack .../sudo_1.8.16-0ubuntu1.5_amd64.deb ...
Unpacking sudo (1.8.16-0ubuntu1.5) ...
Selecting previously unselected package psmisc.
Preparing to unpack .../psmisc_22.21-2.1build1_amd64.deb ...
Unpacking psmisc (22.21-2.1build1) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up sudo (1.8.16-0ubuntu1.5) ...
Setting up psmisc (22.21-2.1build1) ...
Selecting previously unselected package rstudio-server.
(Reading database ... 25671 files and directories currently installed.)
Preparing to unpack rstudio-server-1.1.383-amd64.deb ...
Unpacking rstudio-server (1.1.383) ...
Setting up rstudio-server (1.1.383) ...
groupadd: group 'rstudio-server' already exists
rsession: no process found
root@140a39416592:/src# rserver[14832]: ERROR system error 10 (No child processes); OCCURRED AT: rstudio::core::Error rstudio::server::app_armor::enforceRestricted() /home/ubuntu/rstudio/src/cpp/server/ServerAppArmor.cpp:90; LOGGED FROM: int main(int, char* const*) /home/ubuntu/rstudio/src/cpp/server/ServerMain.cpp:513 

that is as much as i can take for a weekend. i will continue tomorrow, starting from rocker/verse and adding the CUDA dependencies, correctly installed and configured: https://tensorflow.rstudio.com/tools/installation_gpu.html#ubuntu.
thanks for your support

@uwesterr
Author

uwesterr commented Dec 4, 2017

tried to install the tensorflow prerequisites in the container rocker/verse
seems like this task is beyond me
nvidia-docker run --name rstudio-keras-gpu -d -p 8788:8787 -e ROOT=TRUE rocker/verse

start bash shell in container
nvidia-docker exec -it 2216974bb353 bash

install tensorflow prerequisites
https://tensorflow.rstudio.com/installation_gpu.html#prerequisites
but first
apt-get update
First, install the CUDA Toolkit v8.0:
sudo apt-get install linux-headers-$(uname -r)
doesn't work
https://unix.stackexchange.com/questions/328655/cant-install-linux-headers-kali-linux/332966
The package linux-headers-4.6.0-kali1-amd64 is no longer available on the regularly kali-linux repository, it should be upgraded to the 4.8.x version.
List the available linux-headers and linux-image through apt-cache search :
apt-cache search linux-headers
Then install the correct package, e.g. (this is an example, it depends on the output of the previous command):

apt-get install linux-headers-amd64
apt-get install gnupg2 

wget -qO - http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub | sudo apt-key add -

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
apt-get update
apt-get install cuda

i get

root@2216974bb353:/# apt-get install cuda
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 cuda : Depends: cuda-9-0 (>= 9.0.176) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

@uwesterr uwesterr closed this as completed Dec 4, 2017
@uwesterr uwesterr reopened this Dec 4, 2017
@uwesterr
Author

uwesterr commented Dec 4, 2017

Try to install cuDNN

wget  http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1404/x86_64/libcudnn6_6.0.21-1+cuda8.0_amd64.deb
dpkg -i libcudnn6_6.0.21-1+cuda8.0_amd64.deb
apt-get install libcupti-dev # did not work 
root@2216974bb353:/# apt-get install libcupti-dev
Reading package lists... Done
Building dependency tree       
Reading state information... Done
E: Unable to locate package libcupti-dev
wget http://ftp.fau.de/ubuntu/pool/multiverse/n/nvidia-cuda-toolkit/libcupti-dev_8.0.61-3_amd64.deb
dpkg -i libcupti-dev_8.0.61-3_amd64.deb
Selecting previously unselected package libcupti-dev:amd64.
(Reading database ... 75381 files and directories currently installed.)
Preparing to unpack libcupti-dev_8.0.61-3_amd64.deb ...
Unpacking libcupti-dev:amd64 (8.0.61-3) ...
dpkg: dependency problems prevent configuration of libcupti-dev:amd64:
 libcupti-dev:amd64 depends on libcupti8.0 (= 8.0.61-3); however:
  Package libcupti8.0 is not installed.

dpkg: error processing package libcupti-dev:amd64 (--install):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 libcupti-dev:amd64

looks like the next dead end.
any suggestions on what i am missing out?
thanks and a great start into the week

@uwesterr
Author

uwesterr commented Dec 4, 2017

at long last i got RStudio running with keras in GPU mode
i followed a blog
Kubernetes Minikube #4 – R/RStudio Server with Tensorflow on (external) GPU

https://stefanopicozzi.blog/2017/11/17/kubernetes-minikube-4-r-rstudio-server-with-tensorflow-on-external-gpu/

wget https://raw.githubusercontent.com/StefanoPicozzi/MLOps/master/tensorflow/Dockerfile
docker build -t stefanopicozzi/tf-rstudio .
docker images | grep tf-rstudio
  1. Basic NVIDIA Docker R Test
    Now check that the container is correctly configured for R/GPU integration by performing this basic smoke test.
    nvidia-docker run -it -p 8787:8787 stefanopicozzi/tf-rstudio /bin/bash
    R # starts R, the following commands are all in the r command line of the bash shell
install.packages("gpuR")

install.packages('Rcpp')
install.packages('devtools')
devtools::install_github("rstudio/reticulate")
devtools::install_github("rstudio/keras")
library(keras)
install_keras(tensorflow = "gpu")
install.packages('readr')
install.packages('tokenizers')

 source("https://raw.githubusercontent.com/rstudio/keras/master/vignettes/examples/lstm_text_generation.R", echo=TRUE)

that worked, so stop R and start RStudio
quit()
try to start rstudio

sudo rstudio-server start
root@a9f12bf264bc:/notebooks# rstudio-server start
root@a9f12bf264bc:/notebooks# rserver[4094]: ERROR system error 98 (Address already in use); OCCURRED AT: rstudio::core::Error rstudio::core::http::initTcpIpAcceptor(rstudio::core::http::SocketAcceptorService<rstudio_boost::asio::ip::tcp>&, const string&, const string&) /home/ubuntu/rstudio/src/cpp/core/include/core/http/TcpIpSocketUtils.hpp:103; LOGGED FROM: int main(int, char* const*) /home/ubuntu/rstudio/src/cpp/server/ServerMain.cpp:445
could start RStudio in browser
http://localhost:8787/

install_keras(tensorflow = "gpu")
> model <- keras_model_sequential()
Using TensorFlow backend.

so it looks like it is working.
are you interested in the image at all? i could upload to dockerhub if you want.
thanks again for your help, i still think that an official docker from rstudio for DL with keras and GPU support would be appreciated very much by quite a lot of people.

@jjallaire
Member

@eddelbuettel Would you take a pull request for another rocker derived image as described here?

@eddelbuettel
Contributor

eddelbuettel commented Dec 4, 2017

I think we could do that, and as discussed, it would make a lot of sense. I am trying to remember if I have a box with NVidia somewhere to actually test this.

This (from my standard desktop) should do, right? It is not a dedicated gpu card but driven by a multi-display nvidia.

.....@.......:~$ lsmod | grep nvidia
nvidia_uvm            671744  0
nvidia_drm             45056  1
nvidia_modeset        843776  7 nvidia_drm
nvidia              13004800  626 nvidia_modeset,nvidia_uvm
drm_kms_helper        151552  1 nvidia_drm
drm                   352256  4 nvidia_drm,drm_kms_helper
.....@.......:~$

@uwesterr
Author

uwesterr commented Dec 4, 2017

issue closed, effort to create DL docker was started, see rocker-org/rocker#273
