Dockerize CI + Release builds #1234

powderluv · 2022-08-16T11:34:09Z

Gets both CI and Release builds integrated in one workflow.

Tested the callout with in-tree / out-of-tree and torch-mlir
release packages

TODO: add the correct CMake commands in the functions.
Mount ccache and pip cache as required

Output when building all the CI and Release builds in one go:

anush@denali ~/github/torch-mlir % TM_PACKAGES="out-of-tree in-tree torch-mlir" TM_TORCH_SRC=ON TM_PYTHON_VERSIONS="cp310-cp310" ./build_tools/python_deploy/build_linux_packages.sh

Setting torch-mlir Python Package version to: 0.0.1
Running on host
Launching docker image stellaraccident/manylinux2014_x86_64-bazel-5.1.0:latest
Outputting to /home/anush/github/torch-mlir/build_tools/python_deploy/wheelhouse
Setting torch-mlir Python Package version to: 0.0.1
Running in docker
Using python versions: cp310-cp310
******************** BUILDING PACKAGE out-of-tree ********************
:::: Python version Python 3.10.4
:::: Clean build dir out-of-tree ecp310-cp310
:::: Build out-of-tree Torch from source: ON
:::: Test out-of-tree
******************** BUILDING PACKAGE in-tree ********************
:::: Python version Python 3.10.4
:::: Clean build dir in-tree ecp310-cp310
:::: Build in-tree Torch from source: ON
:::: Test in-tree
******************** BUILDING PACKAGE torch-mlir ********************
:::: Python version Python 3.10.4
:::: Clean wheels torch_mlir cp310-cp310
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/nightly/cpu
Looking in links: https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html
Collecting torch
  Downloading https://download.pytorch.org/whl/nightly/cpu/torch-1.13.0.dev20220816%2Bcpu-cp310-cp310-linux_x86_64.whl (195.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 195.1/195.1 MB 9.9 MB/s eta 0:00:00
Collecting numpy
....
  adding 'torch_mlir-0.0.1.dist-info/WHEEL'
  adding 'torch_mlir-0.0.1.dist-info/top_level.txt'
  adding 'torch_mlir-0.0.1.dist-info/RECORD'
  removing build/bdist.linux-x86_64/wheel
  Building wheel for torch-mlir (pyproject.toml): finished with status 'done'
  Created wheel for torch-mlir: filename=torch_mlir-0.0.1-cp310-cp310-linux_x86_64.whl size=35159325 sha256=3f00551a9f426b8f58457a1306a143ac9f66865113dbaf2050b082e2ae543654
  Stored in directory: /root/.cache/pip/wheels/b9/1a/c9/fd0f4b77f13d9149c3c25bd33ce4a61807bcc222a8b04a5ea9
Successfully built torch-mlir
WARNING: You are using pip version 22.0.4; however, version 22.2.2 is available.
You should consider upgrading via the '/opt/python/cp310-cp310/bin/python -m pip install --upgrade pip' command.
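
The log above suggests the script handles each requested package in turn for the requested Python versions. A minimal sketch of that kind of dispatch follows. It is not the script's actual code: `build_package` and `run_all` are hypothetical names, the loop nesting and the default for `TM_PYTHON_VERSIONS` are assumptions, and the build step is only an `echo` placeholder.

```shell
#!/usr/bin/env bash
# Sketch only: the function bodies are placeholders, not the real build steps.
set -euo pipefail

TM_PACKAGES="${TM_PACKAGES:-torch-mlir out-of-tree in-tree}"
# Assumption: this default is illustrative only.
TM_PYTHON_VERSIONS="${TM_PYTHON_VERSIONS:-cp310-cp310}"

# Hypothetical per-package handler; a real script would invoke CMake or pip here.
build_package() {
  local package="$1" python_version="$2"
  echo ":::: Build ${package} for ${python_version}"
}

# Walk every requested package and Python version, rejecting unknown package names.
run_all() {
  local package python_version
  for package in ${TM_PACKAGES}; do
    echo "******************** BUILDING PACKAGE ${package} ********************"
    for python_version in ${TM_PYTHON_VERSIONS}; do
      case "${package}" in
        torch-mlir|out-of-tree|in-tree)
          build_package "${package}" "${python_version}"
          ;;
        *)
          echo "Unrecognized package '${package}'" >&2
          return 1
          ;;
      esac
    done
  done
}

run_all
```

Rejecting unknown package names early keeps a typo in `TM_PACKAGES` from silently building nothing.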

If we want to use Ubuntu 22.04 for the CI:

anush@denali ~/github/torch-mlir % TM_PACKAGES="out-of-tree in-tree" TM_TORCH_SRC=ON TM_PYTHON_VERSIONS="cp310-cp310" TM_DOCKER_IMAGE="ubuntu:22.04" ./build_tools/python_deploy/build_linux_packages.sh

Setting torch-mlir Python Package version to: 0.0.1
Running on host
Launching docker image ubuntu:22.04
Outputting to /home/anush/github/torch-mlir/build_tools/python_deploy/wheelhouse
Unable to find image 'ubuntu:22.04' locally
22.04: Pulling from library/ubuntu
d19f32bd9e41: Pull complete 
Digest: sha256:34fea4f31bf187bc915536831fd0afc9d214755bf700b5cdb1336c82516d154e
Status: Downloaded newer image for ubuntu:22.04
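
Both logs print "Running on host" followed by "Launching docker image ...", and the first log then prints "Running in docker", which suggests the script re-invokes itself inside the container. A dry-run sketch of that pattern is below. `TM_INSIDE_DOCKER` is a hypothetical marker variable and `/path/to/build_linux_packages.sh` is a placeholder; the real script may detect the container and mount paths differently.

```shell
#!/usr/bin/env bash
# Sketch only: prints the docker command instead of running it.
set -euo pipefail

TM_DOCKER_IMAGE="${TM_DOCKER_IMAGE:-stellaraccident/manylinux2014_x86_64-bazel-5.1.0:latest}"

# TM_INSIDE_DOCKER is a hypothetical marker; it is not taken from the real script.
main() {
  if [ "${TM_INSIDE_DOCKER:-0}" = "1" ]; then
    echo "Running in docker"
  else
    echo "Running on host"
    echo "Launching docker image ${TM_DOCKER_IMAGE}"
    # Dry run: show the command a host invocation could issue.
    echo docker run --rm -e TM_INSIDE_DOCKER=1 "${TM_DOCKER_IMAGE}" /path/to/build_linux_packages.sh
  fi
}

main
```

With this shape, overriding `TM_DOCKER_IMAGE` only changes which image the host branch launches; the in-container branch stays the same.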

powderluv · 2022-08-29T06:11:40Z

Gets both CI and Release builds integrated in one workflow.
Mount ccache and pip cache as required for fast iterative builds
Release docker builds currently still run with root permissions; a future
change should run them as the invoking user.

Some corner cases may remain, especially when switching
between build types.

Docker build TEST plan:

tl;dr:
Build everything: Releases (Python 3.8, 3.9, 3.10) and CIs.
TM_PACKAGES="torch-mlir out-of-tree in-tree" 2.57s user 2.49s system 0% cpu 30:33.11 total

Out of Tree + PyTorch binaries:

Fresh build (purged cache):
TM_PACKAGES="out-of-tree" 0.47s user 0.51s system 0% cpu 5:24.99 total

Incremental with ccache:
TM_PACKAGES="out-of-tree" 0.09s user 0.08s system 0% cpu 34.817 total

Out of Tree + PyTorch from source

Incremental
TM_PACKAGES="out-of-tree" TM_USE_PYTORCH_BINARY=OFF 1.58s user 1.81s system 2% cpu 1:59.61 total

In-Tree + PyTorch binaries:

Fresh build and tests: (purge ccache)
TM_PACKAGES="in-tree" 0.53s user 0.49s system 0% cpu 6:23.35 total

Fresh build, but with prior ccache
TM_PACKAGES="in-tree" 0.45s user 0.66s system 0% cpu 3:57.47 total

Incremental in-tree with all tests and regression tests
TM_PACKAGES="in-tree" 0.16s user 0.09s system 0% cpu 2:18.52 total

In-Tree + PyTorch from source

Fresh build and tests: (purge ccache)
TM_PACKAGES="in-tree" TM_USE_PYTORCH_BINARY=OFF 2.03s user 2.28s system 0% cpu 11:11.86 total

Fresh build, but with prior ccache
TM_PACKAGES="in-tree" TM_USE_PYTORCH_BINARY=OFF 1.58s user 1.88s system 1% cpu 4:53.15 total

Incremental in-tree with all tests and regression tests
TM_PACKAGES="in-tree" TM_USE_PYTORCH_BINARY=OFF 1.09s user 1.10s system 1% cpu 3:29.84 total

Incremental without tests
TM_PACKAGES="in-tree" TM_USE_PYTORCH_BINARY=OFF TM_SKIP_TESTS=ON 1.52s user 1.42s system 3% cpu 1:15.82 total

In-Tree + Out of Tree + PyTorch binaries:
TM_PACKAGES="out-of-tree in-tree" 0.25s user 0.18s system 0% cpu 3:01.91 total
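
The totals above are easier to compare once converted to seconds. A small sketch follows, assuming the figures are the "total" field of zsh's `time` output (m:ss.ss or plain seconds); `to_seconds` and `speedup` are illustrative helper names, not part of the PR.

```shell
#!/usr/bin/env bash
# Convert a "total" field (h:mm:ss, m:ss.ss, or ss.sss) to seconds.
to_seconds() {
  LC_ALL=C awk -v t="$1" 'BEGIN {
    n = split(t, parts, ":")
    secs = 0
    for (i = 1; i <= n; i++) secs = secs * 60 + parts[i]
    printf "%.2f\n", secs
  }'
}

# Ratio of two totals, e.g. fresh build time over incremental build time.
speedup() {
  LC_ALL=C awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
    'BEGIN { printf "%.1f\n", a / b }'
}

echo "out-of-tree fresh:       $(to_seconds 5:24.99) s"
echo "out-of-tree incremental: $(to_seconds 34.817) s"
echo "speedup with ccache:     $(speedup 5:24.99 34.817)x"
```

For the out-of-tree case with PyTorch binaries this works out to roughly a 9x reduction in wall-clock time between the fresh and the incremental run.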

To clear all artifacts:
rm -rf build build_oot llvm-build libtorch docker_venv externals/pytorch/build .ccache
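
The cleanup command can be wrapped so it reports what it would delete before deleting anything. The sketch below is one possible wrapper, not something shipped in this PR: `clean_artifacts` and its `--dry-run` flag are hypothetical, and the path list is copied from the command above.

```shell
#!/usr/bin/env bash
# Sketch only: remove the build artifacts listed above, relative to a repo root.
set -euo pipefail

# Paths taken from the cleanup command above.
TM_ARTIFACTS=(build build_oot llvm-build libtorch docker_venv externals/pytorch/build .ccache)

# Usage: clean_artifacts <repo_root> [--dry-run]
clean_artifacts() {
  local root="$1" mode="${2:-}" path
  for path in "${TM_ARTIFACTS[@]}"; do
    if [ -e "${root}/${path}" ]; then
      if [ "${mode}" = "--dry-run" ]; then
        echo "would remove ${root}/${path}"
      else
        # ${root:?} aborts if root is empty, so this never expands to "/<path>".
        rm -rf -- "${root:?}/${path}"
        echo "removed ${root}/${path}"
      fi
    fi
  done
}

# Harmless by default: only list what would be removed under the current directory.
clean_artifacts "$(pwd)" --dry-run
```

Running it without `--dry-run` from the repository root has the same effect as the `rm -rf` line, limited to the paths that actually exist.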

silvasean · 2022-08-29T21:15:43Z

This seems reasonable to me. @sjain-stanford ?

sjain-stanford

Thanks @powderluv . This looks great. I will give the "local" build+test flow a try later today (very excited!). The main request I have is - since we set out to "dockerize CI" - it'd be good to also see GHA CI workflows updated to use these docker flows. This will validate the requirements fully, and ensure any cache issues or other GHA issues can be addressed alongside this PR.

powderluv · 2022-08-29T21:39:57Z

Happy to add the GHA pieces in a follow-on, but I wanted to get the base functionality in first so we don't have a mega commit and it's easy to revert just the GHA changes if something goes haywire.

powderluv · 2022-08-30T14:02:34Z

I have also added the GHA workflows now in a follow on commit. It is currently running CI etc. #1313 and Release builds pass (https://github.com/llvm/torch-mlir/runs/8090506802?check_suite_focus=true).

sjain-stanford

LG(reat)TM. From my local testing, I can confirm that re-runs are blazing fast (utilize pip cache)! Left some minor comments to get this going.

I have also added the GHA workflows now in a follow on commit. It is currently running CI etc. #1313

It seems #1313 doesn't have the GHA workflows yet; I only see this PR replicated there. Could you PTAL? Again, thanks for working on the follow-on commit to validate the GHA workflows as well.

ashay

Sorry for jumping in late, but I just wanted to say Thanks for adding the instructions to docs/development.md! Having never quite figured out how to get Docker working, I was concerned that I'd never be able to figure out how to make use of this change, but the setup instructions are very helpful.

ashay · 2022-08-31T13:21:11Z

build_tools/python_deploy/build_linux_packages.sh

+# Location to store Release wheels
+TM_OUTPUT_DIR="${TM_OUTPUT_DIR:-${this_dir}/wheelhouse}"
+# What "packages to build"
+TM_PACKAGES="${TM_PACKAGES:-torch-mlir out-of-tree in-tree}"


I think this change (building all three packages) is causing a timeout in the Release build [https://github.com/llvm/torch-mlir/runs/8105971333?check_suite_focus=true].
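
The `${VAR:-default}` expansion in the diff above means that a job which does not set `TM_PACKAGES` gets all three packages. The sketch below shows the behavior in isolation; `resolve_packages` is a hypothetical helper, and setting `TM_PACKAGES="torch-mlir"` for a release-only job is a suggestion, not something this PR does.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the ${VAR:-default} pattern from the diff above.
set -euo pipefail

resolve_packages() {
  # Use the caller's TM_PACKAGES when set and non-empty, else default to all three.
  echo "${TM_PACKAGES:-torch-mlir out-of-tree in-tree}"
}

unset TM_PACKAGES
echo "default:    $(resolve_packages)"

TM_PACKAGES="torch-mlir"
echo "overridden: $(resolve_packages)"
```

Because `:-` also treats an empty value as unset, passing `TM_PACKAGES=""` still builds everything.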

powderluv · 2022-08-31T13:30:27Z

Yeah, I noticed a timeout yesterday too, and a rerun ran faster. There was no functional change for the release build, but if you noticed anything that got added that could affect it, please let me know.

powderluv · 2022-08-31T16:10:02Z

Actually, looking at the code some more, we did change the docker settings to use --ipc=host and bumped the ulimit so the tests can pass. So it is possible that the host this VM is running on is slow w.r.t. IPC etc.

Opened #1322 to investigate
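
For reference, the two docker settings mentioned above correspond to the `--ipc` and `--ulimit` options of `docker run`. The sketch below only assembles and prints such a command line. The limit type (`nofile`), its value, and the `TM_NOFILE_LIMIT` variable are placeholders, since the comment does not say which ulimit was raised or to what.

```shell
#!/usr/bin/env bash
# Sketch only: assembles a docker command line and prints it without running docker.
set -euo pipefail

TM_DOCKER_IMAGE="${TM_DOCKER_IMAGE:-ubuntu:22.04}"

docker_args() {
  local args=(run --rm)
  # Share the host IPC namespace, as mentioned in the comment above.
  args+=(--ipc=host)
  # Assumption: limit type and value are placeholders, not the values used in the PR.
  args+=(--ulimit "nofile=${TM_NOFILE_LIMIT:-32768}:${TM_NOFILE_LIMIT:-32768}")
  args+=("${TM_DOCKER_IMAGE}")
  printf '%s\n' "${args[*]}"
}

echo "docker $(docker_args)"
```

Printing the command first makes it easy to rerun the Release build with and without `--ipc=host` when checking whether that flag affects the timeout.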

This was referenced Aug 16, 2022

Dockerize CI (1/N) #1225

Closed

Use Docker for Bazel Build #1232

Closed

powderluv force-pushed the docker-ci branch from ee26b8b to d21138f Compare August 29, 2022 06:04

powderluv changed the title from "WIP: Dockerize CI + Release builds" to "Dockerize CI + Release builds" Aug 29, 2022

powderluv requested review from silvasean, ashay, antoniojkim, sjain-stanford and makslevental August 29, 2022 06:07

This was referenced Aug 29, 2022

Unclear how to do a "really clean" build #1293

Closed

Build Linux releases on big managed runners iree-org/iree#10126

Merged

sjain-stanford reviewed Aug 29, 2022

docs/development.md

sjain-stanford reviewed Aug 29, 2022

build_tools/python_deploy/build_linux_packages.sh (outdated)

sjain-stanford reviewed Aug 29, 2022

build_tools/python_deploy/build_linux_packages.sh

powderluv force-pushed the docker-ci branch 2 times, most recently from 2055b25 to 6a8a345 Compare August 30, 2022 14:02

powderluv requested a review from sjain-stanford August 30, 2022 14:03

sjain-stanford approved these changes Aug 30, 2022

powderluv force-pushed the docker-ci branch from 6a8a345 to b0570d7 Compare August 30, 2022 17:02

powderluv merged commit 9f061ea into llvm:main Aug 30, 2022

powderluv deleted the docker-ci branch August 30, 2022 18:07

powderluv added a commit that referenced this pull request Aug 30, 2022

Move CIs to use docker builds

6f85281

Now that #1234 has landed and anyone can run CI / Release builds locally move GHA to use the same flow.

powderluv mentioned this pull request Aug 30, 2022

Move CIs to use docker builds #1316

Merged

ashay reviewed Aug 31, 2022

powderluv added a commit that referenced this pull request Sep 3, 2022

Move CIs to use docker builds (#1316)

e6528f7

* Move CIs to use docker builds. Now that #1234 has landed and anyone can run CI / Release builds locally, move GHA to use the same flow.
* update names
* Update comments

tanyokwok mentioned this pull request Sep 21, 2022

features/bladedisc rebase 20220830 pai-disc/torch-mlir#20

Closed

qedawkins pushed a commit to nod-ai/torch-mlir that referenced this pull request Oct 3, 2022

removed file (llvm#1234)

8a29010

Signed-off-by: Alexandre Eichenberger <alexe@us.ibm.com>
