diff --git a/onnxruntime/python/tools/transformers/models/stable_diffusion/README.md b/onnxruntime/python/tools/transformers/models/stable_diffusion/README.md index 9c1c31626066d..edef0d3ee5453 100644 --- a/onnxruntime/python/tools/transformers/models/stable_diffusion/README.md +++ b/onnxruntime/python/tools/transformers/models/stable_diffusion/README.md @@ -40,9 +40,8 @@ docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:24.04-p ``` #### Build onnxruntime from source -The cuDNN in the container might not be compatible with official onnxruntime-gpu package, it is recommended to build from source instead. +This step is optional. Please look at [install onnxruntime-gpu](https://onnxruntime.ai/docs/install/#python-installs) if you do not want to build from source. -After launching the docker, you can build and install onnxruntime-gpu wheel like the following. ``` export CUDACXX=/usr/local/cuda/bin/nvcc git config --global --add safe.directory '*' @@ -60,9 +59,17 @@ If the GPU is not A100, change `CMAKE_CUDA_ARCHITECTURES=80` in the command line If your machine has less than 64GB memory, replace `--parallel` by `--parallel 4 --nvcc_threads 1 ` to avoid out of memory. 
#### Install required packages +First, remove any older version of opencv to avoid errors like `module 'cv2.dnn' has no attribute 'DictValue'`: +``` +pip uninstall -y $(pip list --format=freeze | grep opencv) +rm -rf /usr/local/lib/python3.10/dist-packages/cv2/ +apt-get update +DEBIAN_FRONTEND="noninteractive" apt-get install --yes python3-opencv +``` + ``` cd /workspace/onnxruntime/python/tools/transformers/models/stable_diffusion -python3 -m pip install -r requirements-cuda12.txt +python3 -m pip install -r requirements/cuda12/requirements.txt python3 -m pip install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com ``` @@ -136,15 +143,18 @@ conda activate py310 ### Setup Environment (CUDA) without docker -First, we need install CUDA 11.8 or 12.1, [cuDNN](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html) 8.5 or above, and [TensorRT 8.6.1](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html) in the machine. +First, we need to install CUDA 11.8 or 12.x, [cuDNN](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html), and [TensorRT](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html) on the machine. + +The version of cuDNN can be found at https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements. +The version of TensorRT can be found at https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html#requirements. 
#### CUDA 11.8: -In the Conda environment, install PyTorch 2.1 or above, and other required packages like the following: +In the Conda environment, install PyTorch 2.1 up to 2.3.1, and other required packages like the following: ``` -pip install torch --index-url https://download.pytorch.org/whl/cu118 +pip install "torch>=2.1,<2.4" --index-url https://download.pytorch.org/whl/cu118 pip install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com -pip install -r requirements-cuda11.txt +pip install -r requirements/cuda11/requirements.txt ``` For Windows, install nvtx like the following: @@ -157,77 +167,40 @@ We cannot directly `pip install tensorrt` for CUDA 11. Follow https://github.com For Windows, pip install the tensorrt wheel in the downloaded TensorRT zip file instead. Like `pip install tensorrt-8.6.1.6.windows10.x86_64.cuda-11.8\tensorrt-8.6.1.6\python\tensorrt-8.6.1-cp310-none-win_amd64.whl`. #### CUDA 12.*: -The official package of onnxruntime-gpu 1.16.* is built for CUDA 11.8. To use CUDA 12.*, you will need [build onnxruntime from source](https://onnxruntime.ai/docs/build/inferencing.html). - -``` -git clone --recursive https://github.com/Microsoft/onnxruntime.git -cd onnxruntime -pip install cmake -pip install -r requirements-dev.txt -``` -Follow [example script for A100 in Ubuntu](https://github.com/microsoft/onnxruntime/blob/26a7b63716e3125bfe35fe3663ba10d2d7322628/build_release.sh) -or [example script for RTX 4090 in Windows](https://github.com/microsoft/onnxruntime/blob/8df5f4e0df1f3b9ceeb0f1f2561b09727ace9b37/build_trt.cmd) to build and install onnxruntime-gpu wheel. - -Then install other python packages like the following: +The official package of onnxruntime-gpu 1.19.x is built for CUDA 12.x. 
You can install it and other python packages like the following: ``` -pip install torch --index-url https://download.pytorch.org/whl/cu121 +pip install onnxruntime-gpu +pip install torch --index-url https://download.pytorch.org/whl/cu124 pip install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com -pip install -r requirements-cuda12.txt +pip install -r requirements/cuda12/requirements.txt ``` Finally, `pip install tensorrt` for Linux. For Windows, pip install the tensorrt wheel in the downloaded TensorRT zip file instead. ### Setup Environment (ROCm) -It is recommended that the users run the model with ROCm 5.4 or newer and Python 3.10. +It is recommended that the users run the model with ROCm 6.2 or newer and Python 3.10. You can follow the following to install ROCm 6.x: https://rocmdocs.amd.com/projects/install-on-linux/en/latest/install/quick-start.html Note that Windows is not supported for ROCm at the moment. ``` -wget https://repo.radeon.com/rocm/manylinux/rocm-rel-5.4/torch-1.12.1%2Brocm5.4-cp38-cp38-linux_x86_64.whl -pip install torch-1.12.1+rocm5.4-cp38-cp38-linux_x86_64.whl -pip install -r requirements-rocm.txt +pip install -r requirements/rocm/requirements.txt ``` -AMD GPU version of PyTorch can be installed from [pytorch.org](https://pytorch.org/get-started/locally/) or [AMD Radeon repo](https://repo.radeon.com/rocm/manylinux/rocm-rel-5.4/). +AMD GPU version of PyTorch can be installed from [pytorch.org](https://pytorch.org/get-started/locally/) or [AMD Radeon repo](https://repo.radeon.com/rocm/manylinux/rocm-rel-6.2.3/). #### Install onnxruntime-rocm -Here is an example to build onnxruntime from source with Rocm 5.4.2 in Ubuntu 20.04, and install the wheel. - -(1) Install [ROCm 5.4.2](https://docs.amd.com/bundle/ROCm-Installation-Guide-v5.4.2/page/How_to_Install_ROCm.html). Note that the version is also used in PyTorch 2.0 ROCm package. 
- -(2) Install some tools used in build: -``` -sudo apt-get update -sudo apt-get install -y --no-install-recommends \ - wget \ - zip \ - ca-certificates \ - build-essential \ - curl \ - libcurl4-openssl-dev \ - libssl-dev \ - python3-dev -pip install numpy packaging "wheel>=0.35.1" -wget --quiet https://github.com/Kitware/CMake/releases/download/v3.26.3/cmake-3.26.3-linux-x86_64.tar.gz -tar zxf cmake-3.26.3-linux-x86_64.tar.gz -export PATH=${PWD}/cmake-3.26.3-linux-x86_64/bin:${PATH} -``` - -(3) Build and Install ONNX Runtime +One option is to install a prebuilt wheel from https://repo.radeon.com/rocm/manylinux like: ``` -git clone https://github.com/microsoft/onnxruntime -cd onnxruntime -sh build.sh --config Release --use_rocm --rocm_home /opt/rocm --rocm_version 5.4.2 --build_wheel -pip install build/Linux/Release/dist/*.whl +wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.2.3/onnxruntime_rocm-1.18.0-cp310-cp310-linux_x86_64.whl +pip install onnxruntime_rocm-1.18.0-cp310-cp310-linux_x86_64.whl ``` -You can also follow the [official docs](https://onnxruntime.ai/docs/build/eps.html#amd-rocm) to build with docker. +If you want to use the latest version of onnxruntime, you can build from source with ROCm 6.x following https://onnxruntime.ai/docs/build/eps.html#amd-rocm. +When the build is finished, you can install the wheel: `pip install build/Linux/Release/dist/*.whl`. ### Export ONNX pipeline This step will export stable diffusion 1.5 to ONNX model in float32 using script from diffusers. -It is recommended to use PyTorch 1.12.1 or 1.13.1 in this step. Using PyTorch 2.0 will encounter issue in exporting onnx. 
- ``` curl https://raw.githubusercontent.com/huggingface/diffusers/v0.15.1/scripts/convert_stable_diffusion_checkpoint_to_onnx.py > convert_sd_onnx.py python convert_sd_onnx.py --model_path runwayml/stable-diffusion-v1-5 --output_path ./sd_v1_5/fp32 diff --git a/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements-rocm.txt b/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements-rocm.txt deleted file mode 100644 index c0a925e25b941..0000000000000 --- a/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements-rocm.txt +++ /dev/null @@ -1,5 +0,0 @@ --r requirements.txt -# Install onnxruntime-rocm or onnxruntime_training -# Build onnxruntime-rocm from source -# Directly install pre-built onnxruntime/onnxruntime-training rocm python package is not possible at the moment. -# TODO: update once we have public pre-built packages diff --git a/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements-cuda12.txt b/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements/cuda11/requirements.txt similarity index 64% rename from onnxruntime/python/tools/transformers/models/stable_diffusion/requirements-cuda12.txt rename to onnxruntime/python/tools/transformers/models/stable_diffusion/requirements/cuda11/requirements.txt index 4aa88cdf92309..bbc62ca4cbd18 100644 --- a/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements-cuda12.txt +++ b/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements/cuda11/requirements.txt @@ -1,13 +1,13 @@ --r requirements.txt +-r ../requirements.txt -# For CUDA 12.*, you will need build onnxruntime-gpu from source and install the wheel. See README.md for detail. +# See https://onnxruntime.ai/docs/install/#python-installs for installation. The latest one in pypi is for cuda 12. # onnxruntime-gpu>=1.16.2 py3nvml # The version of cuda-python shall be compatible with installed CUDA version. 
# For demo of TensorRT excution provider and TensortRT. -cuda-python>=12.1.0 +cuda-python==11.8.0 # For windows, cuda-python need the following pywin32; platform_system == "Windows" @@ -15,8 +15,8 @@ pywin32; platform_system == "Windows" # For windows, run `conda install -c conda-forge nvtx` instead nvtx; platform_system != "Windows" -# Please install PyTorch 2.1 or above for 12.1 using one of the following commands: -# pip3 install torch --index-url https://download.pytorch.org/whl/cu121 +# Please install PyTorch >=2.1 and <2.4 for CUDA 11.8 like the following: +# pip install torch==2.3.1 --index-url https://download.pytorch.org/whl/cu118 # Run the following command to install some extra packages for onnx graph optimization for TensorRT manually. # pip3 install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com diff --git a/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements-cuda11.txt b/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements/cuda12/requirements.txt similarity index 73% rename from onnxruntime/python/tools/transformers/models/stable_diffusion/requirements-cuda11.txt rename to onnxruntime/python/tools/transformers/models/stable_diffusion/requirements/cuda12/requirements.txt index dc6592fc2fa54..89562e920ac00 100644 --- a/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements-cuda11.txt +++ b/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements/cuda12/requirements.txt @@ -1,13 +1,12 @@ --r requirements.txt +-r ../requirements.txt -# Official onnxruntime-gpu 1.16.1 is built with CUDA 11.8. -onnxruntime-gpu>=1.16.2 +onnxruntime-gpu>=1.19.2 py3nvml # The version of cuda-python shall be compatible with installed CUDA version. # For demo of TensorRT excution provider and TensortRT. 
-cuda-python==11.8.0 +cuda-python>=12.1.0 # For windows, cuda-python need the following pywin32; platform_system == "Windows" @@ -15,8 +14,8 @@ pywin32; platform_system == "Windows" # For windows, run `conda install -c conda-forge nvtx` instead nvtx; platform_system != "Windows" -# Please install PyTorch 2.1 or above for CUDA 11.8 using one of the following commands: -# pip3 install torch --index-url https://download.pytorch.org/whl/cu118 +# Please install PyTorch 2.4 or above using one of the following commands: +# pip3 install torch --index-url https://download.pytorch.org/whl/cu124 # Run the following command to install some extra packages for onnx graph optimization for TensorRT manually. # pip3 install --upgrade polygraphy onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com diff --git a/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements.txt b/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements/requirements.txt similarity index 65% rename from onnxruntime/python/tools/transformers/models/stable_diffusion/requirements.txt rename to onnxruntime/python/tools/transformers/models/stable_diffusion/requirements/requirements.txt index 1857b366194ec..8c9f0ba0f21be 100644 --- a/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements.txt +++ b/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements/requirements.txt @@ -15,6 +15,4 @@ controlnet_aux==0.0.9 optimum==1.20.0 safetensors invisible_watermark -# newer version of opencv-python migth encounter module 'cv2.dnn' has no attribute 'DictValue' error -opencv-python==4.8.0.74 -opencv-python-headless==4.8.0.74 +opencv-python-headless diff --git a/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements/rocm/requirements.txt b/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements/rocm/requirements.txt new file mode 100644 index 0000000000000..21b100fb61f17 --- /dev/null +++ 
b/onnxruntime/python/tools/transformers/models/stable_diffusion/requirements/rocm/requirements.txt @@ -0,0 +1,2 @@ +-r ../requirements.txt +# Install onnxruntime-rocm that is built from source (https://onnxruntime.ai/docs/build/eps.html#amd-rocm) diff --git a/tools/ci_build/github/azure-pipelines/bigmodels-ci-pipeline.yml b/tools/ci_build/github/azure-pipelines/bigmodels-ci-pipeline.yml index ad763277c732e..3ee4375329069 100644 --- a/tools/ci_build/github/azure-pipelines/bigmodels-ci-pipeline.yml +++ b/tools/ci_build/github/azure-pipelines/bigmodels-ci-pipeline.yml @@ -200,11 +200,15 @@ stages: nvcr.io/nvidia/pytorch:22.11-py3 \ bash -c ' \ set -ex; \ + pip uninstall -y $(pip list --format=freeze | grep opencv); \ + rm -rf /usr/local/lib/python3.8/dist-packages/cv2/; \ + apt-get update; \ + DEBIAN_FRONTEND="noninteractive" apt-get install --yes python3-opencv; \ python3 --version; \ python3 -m pip install --upgrade pip; \ python3 -m pip install /Release/*.whl; \ pushd /workspace/onnxruntime/python/tools/transformers/models/stable_diffusion; \ - python3 -m pip install -r requirements-cuda11.txt; \ + python3 -m pip install -r requirements/cuda11/requirements.txt; \ python3 -m pip install --upgrade polygraphy onnx-graphsurgeon ; \ echo Generate an image guided by a text prompt; \ python3 demo_txt2img.py --framework-model-dir /model_cache --seed 1 --deterministic "astronaut riding a horse on mars" ; \
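
Note on the CUDA 11.8 install step in the README: a version specifier such as `torch>=2.1,<2.4` has to reach pip as a single argument. In a POSIX shell, unquoted `>` and `<` are treated as redirections, so the specifier needs quoting when typed on a command line. The sketch below is only an illustration and is not part of this change; `pip_install_command` is a hypothetical helper name. It builds the command as an argument list, which sidesteps shell parsing, and uses `shlex.join` to print the safely quoted shell form.

```python
import shlex


def pip_install_command(requirement, index_url=None):
    """Return the pip invocation as an argv list (hypothetical helper).

    Keeping the requirement as one list element means no shell ever
    interprets the '>' and '<' inside the version specifier.
    """
    args = ["python3", "-m", "pip", "install", requirement]
    if index_url is not None:
        args += ["--index-url", index_url]
    return args


cmd = pip_install_command(
    "torch>=2.1,<2.4",
    index_url="https://download.pytorch.org/whl/cu118",
)

# shlex.join renders the list as a shell command line, quoting where needed.
print(shlex.join(cmd))
# python3 -m pip install 'torch>=2.1,<2.4' --index-url https://download.pytorch.org/whl/cu118
```

The list form can be passed directly to `subprocess.run(cmd, check=True)`; the printed string is what one would paste into a terminal.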