[CUDA] upgrade opencv in stable diffusion demo #22470

Merged
merged 3 commits into from
Oct 22, 2024
Original file line number Diff line number Diff line change
@@ -40,9 +40,8 @@
```

#### Build onnxruntime from source
The cuDNN in the container might not be compatible with the official onnxruntime-gpu package, so building from source is recommended.
This step is optional. Please look at [install onnxruntime-gpu](https://onnxruntime.ai/docs/install/#python-installs) if you do not want to build from source.

After launching the Docker container, you can build and install the onnxruntime-gpu wheel as follows.
```
export CUDACXX=/usr/local/cuda/bin/nvcc
git config --global --add safe.directory '*'
@@ -60,9 +59,17 @@
If your machine has less than 64GB of memory, replace `--parallel` with `--parallel 4 --nvcc_threads 1` to avoid running out of memory.
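
As a rough illustration of the memory rule above, the following Python sketch picks the build flags from the amount of physical memory. The helper names `total_memory_gb` and `parallel_flags` are hypothetical and not part of onnxruntime. The 64GB threshold and the flags come from the sentence above, and the memory probe assumes a Linux system where `os.sysconf` exposes `SC_PAGE_SIZE` and `SC_PHYS_PAGES`.

```python
import os
from typing import List


def total_memory_gb() -> float:
    """Return physical memory in GiB (assumes Linux sysconf names)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    return page_size * page_count / (1024 ** 3)


def parallel_flags(memory_gb: float) -> List[str]:
    """Hypothetical helper: choose build flags from the available memory."""
    if memory_gb < 64:
        # Limit parallelism so the CUDA compile does not exhaust memory.
        return ["--parallel", "4", "--nvcc_threads", "1"]
    return ["--parallel"]
```

Calling `" ".join(parallel_flags(total_memory_gb()))` would then give the flag string to paste into the build command.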

#### Install required packages
First, remove older versions of OpenCV to avoid errors like `module 'cv2.dnn' has no attribute 'DictValue'`:
```
pip uninstall -y $(pip list --format=freeze | grep opencv)
rm -rf /usr/local/lib/python3.10/dist-packages/cv2/
apt-get update
DEBIAN_FRONTEND="noninteractive" apt-get install --yes python3-opencv
```
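
To confirm that the cleanup above worked, a small Python check along the following lines may help. This is only a sketch: `installed_opencv_packages` and `check_cv2_import` are hypothetical helper names, and a system package such as `python3-opencv` might not show up in the pip-style listing even when `cv2` imports correctly.

```python
import importlib
from importlib import metadata
from typing import List


def installed_opencv_packages() -> List[str]:
    """List installed distributions whose name contains 'opencv'."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") or ""
        if "opencv" in name.lower():
            names.append(f"{name}=={dist.version}")
    return sorted(names)


def check_cv2_import() -> str:
    """Try to import cv2 and report the outcome as a short status string."""
    try:
        cv2 = importlib.import_module("cv2")
    except ImportError:
        return "cv2 is not installed"
    except AttributeError as error:
        # Mixed OpenCV installs can fail here, for example with
        # "module 'cv2.dnn' has no attribute 'DictValue'".
        return f"cv2 import failed: {error}"
    version = getattr(cv2, "__version__", "unknown")
    return f"cv2 {version} imported successfully"
```

More than one entry from `installed_opencv_packages()` suggests that several OpenCV builds are still installed side by side.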

```
cd /workspace/onnxruntime/python/tools/transformers/models/stable_diffusion
python3 -m pip install -r requirements-cuda12.txt
python3 -m pip install -r requirements/cuda12/requirements.txt
python3 -m pip install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com
```

@@ -136,15 +143,18 @@

### Setup Environment (CUDA) without docker

First, we need to install CUDA 11.8 or 12.1, [cuDNN](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html) 8.5 or above, and [TensorRT 8.6.1](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html) on the machine.
First, we need to install CUDA 11.8 or 12.x, [cuDNN](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html), and [TensorRT](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html) on the machine.

The required version of cuDNN is listed at https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements.

The required version of TensorRT is listed at https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html#requirements.
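
To compare an existing environment against those requirement tables, the versions visible from Python can be collected with a sketch like the one below. `installed_versions` is a hypothetical helper. It assumes PyTorch and the `tensorrt` wheel are the sources of truth, so it reports the CUDA and cuDNN versions PyTorch was built with, which may differ from system-wide installs.

```python
import importlib
from typing import Dict, Optional


def installed_versions() -> Dict[str, Optional[str]]:
    """Collect CUDA, cuDNN and TensorRT versions visible from Python."""
    versions: Dict[str, Optional[str]] = {
        "cuda": None,
        "cudnn": None,
        "tensorrt": None,
    }
    try:
        torch = importlib.import_module("torch")
        # torch.version.cuda is None for CPU-only builds of PyTorch.
        versions["cuda"] = torch.version.cuda
        # cudnn.version() returns an integer, or None when cuDNN is unavailable.
        cudnn = torch.backends.cudnn.version()
        versions["cudnn"] = str(cudnn) if cudnn is not None else None
    except ImportError:
        pass
    try:
        tensorrt = importlib.import_module("tensorrt")
        versions["tensorrt"] = tensorrt.__version__
    except ImportError:
        pass
    return versions
```

A `None` entry means the corresponding package was not found or was built without that component.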

#### CUDA 11.8:

In the Conda environment, install PyTorch 2.1 or above, and other required packages like the following:
In the Conda environment, install PyTorch 2.1 up to 2.3.1, and other required packages like the following:
```
pip install torch --index-url https://download.pytorch.org/whl/cu118
pip install "torch>=2.1,<2.4" --index-url https://download.pytorch.org/whl/cu118
pip install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com
pip install -r requirements-cuda11.txt
pip install -r requirements/cuda11/requirements.txt
```
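
The PyTorch constraint for CUDA 11.8 (2.1 up to 2.3.1, that is `>=2.1,<2.4`) can be checked after installation with a sketch like this. The helper names are hypothetical, and the parser only looks at the major and minor numbers, which is enough for this range.

```python
import re
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.3.1+cu118'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def torch_ok_for_cuda118(version: str) -> bool:
    """True when the version satisfies >=2.1,<2.4."""
    return (2, 1) <= parse_major_minor(version) < (2, 4)
```

With PyTorch installed, `torch_ok_for_cuda118(torch.__version__)` would perform the check.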

For Windows, install nvtx like the following:
@@ -157,77 +167,40 @@
For Windows, pip install the tensorrt wheel from the downloaded TensorRT zip file instead, for example `pip install tensorrt-8.6.1.6.windows10.x86_64.cuda-11.8\tensorrt-8.6.1.6\python\tensorrt-8.6.1-cp310-none-win_amd64.whl`.

#### CUDA 12.*:
The official package of onnxruntime-gpu 1.16.* is built for CUDA 11.8. To use CUDA 12.*, you will need to [build onnxruntime from source](https://onnxruntime.ai/docs/build/inferencing.html).

```
git clone --recursive https://github.com/Microsoft/onnxruntime.git
cd onnxruntime
pip install cmake
pip install -r requirements-dev.txt
```
Follow [example script for A100 in Ubuntu](https://github.com/microsoft/onnxruntime/blob/26a7b63716e3125bfe35fe3663ba10d2d7322628/build_release.sh)
or [example script for RTX 4090 in Windows](https://github.com/microsoft/onnxruntime/blob/8df5f4e0df1f3b9ceeb0f1f2561b09727ace9b37/build_trt.cmd) to build and install onnxruntime-gpu wheel.

Then install other python packages like the following:
The official package of onnxruntime-gpu 1.19.x is built for CUDA 12.x. You can install it and other python packages like the following:
```
pip install torch --index-url https://download.pytorch.org/whl/cu121
pip install onnxruntime-gpu
pip install torch --index-url https://download.pytorch.org/whl/cu124
pip install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com
pip install -r requirements-cuda12.txt
pip install -r requirements/cuda12/requirements.txt
```
Finally, `pip install tensorrt` for Linux. For Windows, pip install the tensorrt wheel in the downloaded TensorRT zip file instead.
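
Once the packages are installed, it can be useful to confirm that onnxruntime reports the expected execution providers. The sketch below uses `onnxruntime.get_available_providers()`. The helper name `missing_providers` is hypothetical, and a provider being listed only shows that it was built into the package, not that its CUDA or TensorRT libraries load successfully.

```python
import importlib
from typing import List


def missing_providers(required: List[str]) -> List[str]:
    """Return the required execution providers that onnxruntime does not report."""
    try:
        ort = importlib.import_module("onnxruntime")
    except ImportError:
        # Without onnxruntime, every required provider is missing.
        return list(required)
    available = ort.get_available_providers()
    return [name for name in required if name not in available]
```

For this demo, `missing_providers(["CUDAExecutionProvider", "TensorrtExecutionProvider"])` should return an empty list.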

### Setup Environment (ROCm)

It is recommended that the users run the model with ROCm 5.4 or newer and Python 3.10.
It is recommended that users run the model with ROCm 6.2 or newer and Python 3.10. To install ROCm 6.x, follow the quick-start guide: https://rocmdocs.amd.com/projects/install-on-linux/en/latest/install/quick-start.html
Note that Windows is not supported for ROCm at the moment.

```
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-5.4/torch-1.12.1%2Brocm5.4-cp38-cp38-linux_x86_64.whl
pip install torch-1.12.1+rocm5.4-cp38-cp38-linux_x86_64.whl
pip install -r requirements-rocm.txt
pip install -r requirements/rocm/requirements.txt
```

AMD GPU version of PyTorch can be installed from [pytorch.org](https://pytorch.org/get-started/locally/) or [AMD Radeon repo](https://repo.radeon.com/rocm/manylinux/rocm-rel-5.4/).
AMD GPU version of PyTorch can be installed from [pytorch.org](https://pytorch.org/get-started/locally/) or [AMD Radeon repo](https://repo.radeon.com/rocm/manylinux/rocm-rel-6.2.3/).

#### Install onnxruntime-rocm

Here is an example of building onnxruntime from source with ROCm 5.4.2 on Ubuntu 20.04 and installing the wheel.

(1) Install [ROCm 5.4.2](https://docs.amd.com/bundle/ROCm-Installation-Guide-v5.4.2/page/How_to_Install_ROCm.html). Note that the version is also used in PyTorch 2.0 ROCm package.

(2) Install some tools used in build:
```
sudo apt-get update
sudo apt-get install -y --no-install-recommends \
wget \
zip \
ca-certificates \
build-essential \
curl \
libcurl4-openssl-dev \
libssl-dev \
python3-dev
pip install numpy packaging "wheel>=0.35.1"
wget --quiet https://github.com/Kitware/CMake/releases/download/v3.26.3/cmake-3.26.3-linux-x86_64.tar.gz
tar zxf cmake-3.26.3-linux-x86_64.tar.gz
export PATH=${PWD}/cmake-3.26.3-linux-x86_64/bin:${PATH}
```

(3) Build and Install ONNX Runtime
One option is to install a prebuilt wheel from https://repo.radeon.com/rocm/manylinux, like:
```
git clone https://github.com/microsoft/onnxruntime
cd onnxruntime
sh build.sh --config Release --use_rocm --rocm_home /opt/rocm --rocm_version 5.4.2 --build_wheel
pip install build/Linux/Release/dist/*.whl
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.2.3/onnxruntime_rocm-1.18.0-cp310-cp310-linux_x86_64.whl
pip install onnxruntime_rocm-1.18.0-cp310-cp310-linux_x86_64.whl
```
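
The prebuilt wheel above carries the tag `cp310`, which matches the recommended Python 3.10. A sketch for checking a wheel file name against the running interpreter follows. The helper names are hypothetical, and the comparison assumes a CPython interpreter and a standard wheel file name.

```python
import sys


def wheel_python_tag(filename: str) -> str:
    """Extract the Python tag (such as 'cp310') from a wheel file name."""
    stem = filename[: -len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    if len(parts) < 5:
        raise ValueError(f"not a wheel file name: {filename!r}")
    # Layout: name-version[-build]-python-abi-platform.
    # The Python tag is therefore the third field from the end.
    return parts[-3]


def matches_interpreter(filename: str) -> bool:
    """True when the wheel's Python tag matches the running CPython version."""
    current = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return wheel_python_tag(filename) == current
```

Running `matches_interpreter("onnxruntime_rocm-1.18.0-cp310-cp310-linux_x86_64.whl")` before `pip install` shows whether the wheel was built for the active Python version.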

You can also follow the [official docs](https://onnxruntime.ai/docs/build/eps.html#amd-rocm) to build with docker.
If you want to use the latest version of onnxruntime, you can build from source with ROCm 6.x by following https://onnxruntime.ai/docs/build/eps.html#amd-rocm.
When the build is finished, you can install the wheel: `pip install build/Linux/Release/dist/*.whl`.

### Export ONNX pipeline
This step exports Stable Diffusion 1.5 to an ONNX model in float32 using a script from diffusers.

It is recommended to use PyTorch 1.12.1 or 1.13.1 in this step. Using PyTorch 2.0 will cause issues when exporting to ONNX.

```
curl https://raw.githubusercontent.com/huggingface/diffusers/v0.15.1/scripts/convert_stable_diffusion_checkpoint_to_onnx.py > convert_sd_onnx.py
python convert_sd_onnx.py --model_path runwayml/stable-diffusion-v1-5 --output_path ./sd_v1_5/fp32

This file was deleted.

Original file line number Diff line number Diff line change
@@ -1,22 +1,22 @@
-r requirements.txt
-r ../requirements.txt

# For CUDA 12.*, you will need to build onnxruntime-gpu from source and install the wheel. See README.md for details.
# See https://onnxruntime.ai/docs/install/#python-installs for installation. The latest one in pypi is for cuda 12.
# onnxruntime-gpu>=1.16.2

py3nvml

# The version of cuda-python shall be compatible with installed CUDA version.
# For demo of TensorRT execution provider and TensorRT.
cuda-python>=12.1.0
cuda-python==11.8.0

# For Windows, cuda-python needs the following
pywin32; platform_system == "Windows"

# For windows, run `conda install -c conda-forge nvtx` instead
nvtx; platform_system != "Windows"

# Please install PyTorch 2.1 or above for 12.1 using one of the following commands:
# pip3 install torch --index-url https://download.pytorch.org/whl/cu121
# Please install PyTorch >=2.1 and <2.4 for CUDA 11.8 like the following:
# pip install torch==2.3.1 --index-url https://download.pytorch.org/whl/cu118

# Run the following command to install some extra packages for onnx graph optimization for TensorRT manually.
# pip3 install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com
Original file line number Diff line number Diff line change
@@ -1,22 +1,21 @@
-r requirements.txt
-r ../requirements.txt

# Official onnxruntime-gpu 1.16.1 is built with CUDA 11.8.
onnxruntime-gpu>=1.16.2
onnxruntime-gpu>=1.19.2

py3nvml

# The version of cuda-python shall be compatible with installed CUDA version.
# For demo of TensorRT execution provider and TensorRT.
cuda-python==11.8.0
cuda-python>=12.1.0

# For Windows, cuda-python needs the following
pywin32; platform_system == "Windows"

# For windows, run `conda install -c conda-forge nvtx` instead
nvtx; platform_system != "Windows"

# Please install PyTorch 2.1 or above for CUDA 11.8 using one of the following commands:
# pip3 install torch --index-url https://download.pytorch.org/whl/cu118
# Please install PyTorch 2.4 or above using one of the following commands:
# pip3 install torch --index-url https://download.pytorch.org/whl/cu124

# Run the following command to install some extra packages for onnx graph optimization for TensorRT manually.
# pip3 install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com
Original file line number Diff line number Diff line change
@@ -15,6 +15,4 @@ controlnet_aux==0.0.9
optimum==1.20.0
safetensors
invisible_watermark
# newer versions of opencv-python might encounter a module 'cv2.dnn' has no attribute 'DictValue' error
opencv-python==4.8.0.74
opencv-python-headless==4.8.0.74
opencv-python-headless
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
-r ../requirements.txt
# Install onnxruntime-rocm that is built from source (https://onnxruntime.ai/docs/build/eps.html#amd-rocm)
Original file line number Diff line number Diff line change
@@ -200,11 +200,15 @@ stages:
nvcr.io/nvidia/pytorch:22.11-py3 \
bash -c ' \
set -ex; \
pip uninstall -y $(pip list --format=freeze | grep opencv); \
rm -rf /usr/local/lib/python3.8/dist-packages/cv2/; \
apt-get update; \
DEBIAN_FRONTEND="noninteractive" apt-get install --yes python3-opencv; \
python3 --version; \
python3 -m pip install --upgrade pip; \
python3 -m pip install /Release/*.whl; \
pushd /workspace/onnxruntime/python/tools/transformers/models/stable_diffusion; \
python3 -m pip install -r requirements-cuda11.txt; \
python3 -m pip install -r requirements/cuda11/requirements.txt; \
python3 -m pip install --upgrade polygraphy onnx-graphsurgeon ; \
echo Generate an image guided by a text prompt; \
python3 demo_txt2img.py --framework-model-dir /model_cache --seed 1 --deterministic "astronaut riding a horse on mars" ; \