[CUDA] upgrade opencv in stable diffusion demo (#22470)
### Description
(1) Upgrade opencv
(2) Add some comments about onnxruntime-gpu installation

### Motivation and Context
opencv-python was locked to an older version, which has security vulnerabilities: see #22445 for more info.
tianleiwu authored Oct 22, 2024
1 parent c1f7485 commit 8a04ab4
Showing 7 changed files with 47 additions and 76 deletions.
Original file line number Diff line number Diff line change
@@ -40,9 +40,8 @@ docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:24.04-p
```

#### Build onnxruntime from source
The cuDNN in the container might not be compatible with official onnxruntime-gpu package, it is recommended to build from source instead.
This step is optional. Please look at [install onnxruntime-gpu](https://onnxruntime.ai/docs/install/#python-installs) if you do not want to build from source.

After launching the docker, you can build and install onnxruntime-gpu wheel like the following.
```
export CUDACXX=/usr/local/cuda/bin/nvcc
git config --global --add safe.directory '*'
@@ -60,9 +59,17 @@ If the GPU is not A100, change `CMAKE_CUDA_ARCHITECTURES=80` in the command line
If your machine has less than 64GB of memory, replace `--parallel` with `--parallel 4 --nvcc_threads 1` to avoid running out of memory.

#### Install required packages
First, remove older version of opencv to avoid error like `module 'cv2.dnn' has no attribute 'DictValue'`:
```
pip uninstall -y $(pip list --format=freeze | grep opencv)
rm -rf /usr/local/lib/python3.10/dist-packages/cv2/
apt-get update
DEBIAN_FRONTEND="noninteractive" apt-get install --yes python3-opencv
```
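
To confirm which OpenCV distributions pip still sees after the cleanup above, a check along the following lines may help. This is a minimal sketch using only the Python standard library; the helper name `find_opencv_distributions` is hypothetical and not part of the demo scripts, and a system package such as `python3-opencv` may not register pip metadata at all.

```python
from importlib import metadata


def find_opencv_distributions():
    """Return sorted (name, version) pairs for installed distributions
    whose name contains 'opencv' (hypothetical helper, not part of the demo)."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name and "opencv" in name.lower():
            found.add((name, str(dist.version)))
    return sorted(found)


if __name__ == "__main__":
    # More than one entry (for example opencv-python next to
    # opencv-python-headless) suggests that conflicting cv2 builds
    # may still be present in the environment.
    for name, version in find_opencv_distributions():
        print(f"{name}=={version}")
```

An empty result only means that no pip-visible OpenCV distribution is installed.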

```
cd /workspace/onnxruntime/python/tools/transformers/models/stable_diffusion
python3 -m pip install -r requirements-cuda12.txt
python3 -m pip install -r requirements/cuda12/requirements.txt
python3 -m pip install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com
```

@@ -136,15 +143,18 @@ conda activate py310

### Setup Environment (CUDA) without docker

First, we need to install CUDA 11.8 or 12.1, [cuDNN](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html) 8.5 or above, and [TensorRT 8.6.1](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html) on the machine.
First, we need to install CUDA 11.8 or 12.x, [cuDNN](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html), and [TensorRT](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html) on the machine.

The version of cuDNN can be found in https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements.
The version of TensorRT can be found in https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html#requirements.

#### CUDA 11.8:

In the Conda environment, install PyTorch 2.1 or above, and other required packages like the following:
In the Conda environment, install PyTorch 2.1 up to 2.3.1, and other required packages like the following:
```
pip install torch --index-url https://download.pytorch.org/whl/cu118
pip install "torch>=2.1,<2.4" --index-url https://download.pytorch.org/whl/cu118
pip install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com
pip install -r requirements-cuda11.txt
pip install -r requirements/cuda11/requirements.txt
```
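
The supported PyTorch range for CUDA 11.8 (2.1 up to 2.3.1, that is `>=2.1,<2.4`) can be checked after installation with a small helper. The sketch below is an illustration only: `torch_version_in_range` is a hypothetical name, and it compares just the major and minor components of the version string.

```python
import re


def torch_version_in_range(version, lower=(2, 1), upper=(2, 4)):
    """Return True if lower <= (major, minor) < upper.

    Accepts strings such as '2.3.1' or '2.3.1+cu118'; only the leading
    major.minor part is inspected (an assumption of this sketch).
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    return lower <= major_minor < upper
```

With PyTorch installed, `torch_version_in_range(torch.__version__)` should return `True` for a build that fits the CUDA 11.8 requirements.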

For Windows, install nvtx like the following:
@@ -157,77 +167,40 @@ We cannot directly `pip install tensorrt` for CUDA 11. Follow https://github.com
For Windows, pip install the tensorrt wheel in the downloaded TensorRT zip file instead. Like `pip install tensorrt-8.6.1.6.windows10.x86_64.cuda-11.8\tensorrt-8.6.1.6\python\tensorrt-8.6.1-cp310-none-win_amd64.whl`.

#### CUDA 12.*:
The official package of onnxruntime-gpu 1.16.* is built for CUDA 11.8. To use CUDA 12.*, you will need [build onnxruntime from source](https://onnxruntime.ai/docs/build/inferencing.html).

```
git clone --recursive https://github.com/Microsoft/onnxruntime.git
cd onnxruntime
pip install cmake
pip install -r requirements-dev.txt
```
Follow [example script for A100 in Ubuntu](https://github.com/microsoft/onnxruntime/blob/26a7b63716e3125bfe35fe3663ba10d2d7322628/build_release.sh)
or [example script for RTX 4090 in Windows](https://github.com/microsoft/onnxruntime/blob/8df5f4e0df1f3b9ceeb0f1f2561b09727ace9b37/build_trt.cmd) to build and install onnxruntime-gpu wheel.

Then install other python packages like the following:
The official package of onnxruntime-gpu 1.19.x is built for CUDA 12.x. You can install it and other python packages like the following:
```
pip install torch --index-url https://download.pytorch.org/whl/cu121
pip install onnxruntime-gpu
pip install torch --index-url https://download.pytorch.org/whl/cu124
pip install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com
pip install -r requirements-cuda12.txt
pip install -r requirements/cuda12/requirements.txt
```
Finally, `pip install tensorrt` for Linux. For Windows, pip install the tensorrt wheel in the downloaded TensorRT zip file instead.
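
After installation, it can be useful to confirm that the installed onnxruntime build actually exposes the CUDA execution provider. The following is a minimal sketch: `onnxruntime.get_available_providers()` is the real API, while the helper names `available_providers` and `has_cuda_provider` are hypothetical.

```python
def available_providers():
    """Return the execution providers reported by onnxruntime,
    or an empty list when onnxruntime is not installed."""
    try:
        import onnxruntime as ort
    except ImportError:
        return []
    return ort.get_available_providers()


def has_cuda_provider(providers):
    """Return True if the CUDA execution provider is in the given list."""
    return "CUDAExecutionProvider" in providers


if __name__ == "__main__":
    providers = available_providers()
    print("Available providers:", providers)
    print("CUDA provider present:", has_cuda_provider(providers))
```

If the CUDA provider is missing even though onnxruntime-gpu is installed, a CPU-only `onnxruntime` package in the same environment is one possible cause worth ruling out.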

### Setup Environment (ROCm)

It is recommended that the users run the model with ROCm 5.4 or newer and Python 3.10.
It is recommended to run the model with ROCm 6.2 or newer and Python 3.10. To install ROCm 6.x, follow https://rocmdocs.amd.com/projects/install-on-linux/en/latest/install/quick-start.html
Note that Windows is not supported for ROCm at the moment.

```
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-5.4/torch-1.12.1%2Brocm5.4-cp38-cp38-linux_x86_64.whl
pip install torch-1.12.1+rocm5.4-cp38-cp38-linux_x86_64.whl
pip install -r requirements-rocm.txt
pip install -r requirements/rocm/requirements.txt
```

AMD GPU version of PyTorch can be installed from [pytorch.org](https://pytorch.org/get-started/locally/) or [AMD Radeon repo](https://repo.radeon.com/rocm/manylinux/rocm-rel-5.4/).
AMD GPU version of PyTorch can be installed from [pytorch.org](https://pytorch.org/get-started/locally/) or [AMD Radeon repo](https://repo.radeon.com/rocm/manylinux/rocm-rel-6.2.3/).

#### Install onnxruntime-rocm

Here is an example to build onnxruntime from source with Rocm 5.4.2 in Ubuntu 20.04, and install the wheel.

(1) Install [ROCm 5.4.2](https://docs.amd.com/bundle/ROCm-Installation-Guide-v5.4.2/page/How_to_Install_ROCm.html). Note that the version is also used in PyTorch 2.0 ROCm package.

(2) Install some tools used in build:
```
sudo apt-get update
sudo apt-get install -y --no-install-recommends \
wget \
zip \
ca-certificates \
build-essential \
curl \
libcurl4-openssl-dev \
libssl-dev \
python3-dev
pip install numpy packaging "wheel>=0.35.1"
wget --quiet https://github.com/Kitware/CMake/releases/download/v3.26.3/cmake-3.26.3-linux-x86_64.tar.gz
tar zxf cmake-3.26.3-linux-x86_64.tar.gz
export PATH=${PWD}/cmake-3.26.3-linux-x86_64/bin:${PATH}
```

(3) Build and Install ONNX Runtime
One option is to install prebuilt wheel from https://repo.radeon.com/rocm/manylinux like:
```
git clone https://github.com/microsoft/onnxruntime
cd onnxruntime
sh build.sh --config Release --use_rocm --rocm_home /opt/rocm --rocm_version 5.4.2 --build_wheel
pip install build/Linux/Release/dist/*.whl
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.2.3/onnxruntime_rocm-1.18.0-cp310-cp310-linux_x86_64.whl
pip install onnxruntime_rocm-1.18.0-cp310-cp310-linux_x86_64.whl
```

You can also follow the [official docs](https://onnxruntime.ai/docs/build/eps.html#amd-rocm) to build with docker.
If you want to use the latest version of onnxruntime, you can build from source with ROCm 6.x following https://onnxruntime.ai/docs/build/eps.html#amd-rocm.
When the build is finished, you can install the wheel: `pip install build/Linux/Release/dist/*.whl`.
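
A prebuilt wheel such as `onnxruntime_rocm-1.18.0-cp310-cp310-linux_x86_64.whl` only installs on the matching Python version (here CPython 3.10). The sketch below shows one way to check a wheel file name against the running interpreter before downloading it. The helper names are hypothetical, and it assumes a single CPython tag such as `cp310` in the file name.

```python
import sys


def wheel_python_tag(filename):
    """Extract the Python tag (for example 'cp310') from a wheel file name."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Wheel file names end with {python tag}-{abi tag}-{platform tag}.
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return parts[-3]


def matches_interpreter(filename, version_info=sys.version_info):
    """Return True if the wheel's CPython tag matches the given interpreter version."""
    expected = f"cp{version_info[0]}{version_info[1]}"
    return wheel_python_tag(filename) == expected
```

For example, `matches_interpreter("onnxruntime_rocm-1.18.0-cp310-cp310-linux_x86_64.whl")` is true only when run under Python 3.10.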

### Export ONNX pipeline
This step will export stable diffusion 1.5 to ONNX model in float32 using script from diffusers.

It is recommended to use PyTorch 1.12.1 or 1.13.1 in this step. Using PyTorch 2.0 will encounter issue in exporting onnx.

```
curl https://raw.githubusercontent.com/huggingface/diffusers/v0.15.1/scripts/convert_stable_diffusion_checkpoint_to_onnx.py > convert_sd_onnx.py
python convert_sd_onnx.py --model_path runwayml/stable-diffusion-v1-5 --output_path ./sd_v1_5/fp32

This file was deleted.

Original file line number Diff line number Diff line change
@@ -1,22 +1,22 @@
-r requirements.txt
-r ../requirements.txt

# For CUDA 12.*, you will need build onnxruntime-gpu from source and install the wheel. See README.md for detail.
# See https://onnxruntime.ai/docs/install/#python-installs for installation. The latest one in pypi is for cuda 12.
# onnxruntime-gpu>=1.16.2

py3nvml

# The version of cuda-python shall be compatible with installed CUDA version.
# For demo of TensorRT execution provider and TensorRT.
cuda-python>=12.1.0
cuda-python==11.8.0

# For windows, cuda-python need the following
pywin32; platform_system == "Windows"

# For windows, run `conda install -c conda-forge nvtx` instead
nvtx; platform_system != "Windows"

# Please install PyTorch 2.1 or above for 12.1 using one of the following commands:
# pip3 install torch --index-url https://download.pytorch.org/whl/cu121
# Please install PyTorch >=2.1 and <2.4 for CUDA 11.8 like the following:
# pip install torch==2.3.1 --index-url https://download.pytorch.org/whl/cu118

# Run the following command to install some extra packages for onnx graph optimization for TensorRT manually.
# pip3 install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com
Original file line number Diff line number Diff line change
@@ -1,22 +1,21 @@
-r requirements.txt
-r ../requirements.txt

# Official onnxruntime-gpu 1.16.1 is built with CUDA 11.8.
onnxruntime-gpu>=1.16.2
onnxruntime-gpu>=1.19.2

py3nvml

# The version of cuda-python shall be compatible with installed CUDA version.
# For demo of TensorRT execution provider and TensorRT.
cuda-python==11.8.0
cuda-python>=12.1.0

# For windows, cuda-python need the following
pywin32; platform_system == "Windows"

# For windows, run `conda install -c conda-forge nvtx` instead
nvtx; platform_system != "Windows"

# Please install PyTorch 2.1 or above for CUDA 11.8 using one of the following commands:
# pip3 install torch --index-url https://download.pytorch.org/whl/cu118
# Please install PyTorch 2.4 or above using one of the following commands:
# pip3 install torch --index-url https://download.pytorch.org/whl/cu124

# Run the following command to install some extra packages for onnx graph optimization for TensorRT manually.
# pip3 install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com
Original file line number Diff line number Diff line change
@@ -15,6 +15,4 @@ controlnet_aux==0.0.9
optimum==1.20.0
safetensors
invisible_watermark
# newer version of opencv-python might encounter module 'cv2.dnn' has no attribute 'DictValue' error
opencv-python==4.8.0.74
opencv-python-headless==4.8.0.74
opencv-python-headless
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
-r ../requirements.txt
# Install onnxruntime-rocm that is built from source (https://onnxruntime.ai/docs/build/eps.html#amd-rocm)
Original file line number Diff line number Diff line change
@@ -200,11 +200,15 @@ stages:
nvcr.io/nvidia/pytorch:22.11-py3 \
bash -c ' \
set -ex; \
pip uninstall -y $(pip list --format=freeze | grep opencv); \
rm -rf /usr/local/lib/python3.8/dist-packages/cv2/; \
apt-get update; \
DEBIAN_FRONTEND="noninteractive" apt-get install --yes python3-opencv; \
python3 --version; \
python3 -m pip install --upgrade pip; \
python3 -m pip install /Release/*.whl; \
pushd /workspace/onnxruntime/python/tools/transformers/models/stable_diffusion; \
python3 -m pip install -r requirements-cuda11.txt; \
python3 -m pip install -r requirements/cuda11/requirements.txt; \
python3 -m pip install --upgrade polygraphy onnx-graphsurgeon ; \
echo Generate an image guided by a text prompt; \
python3 demo_txt2img.py --framework-model-dir /model_cache --seed 1 --deterministic "astronaut riding a horse on mars" ; \
