[CI] Properly handle sccache file permission in Linux GPU workflow #4967

Closed
qiao-bo opened this issue May 12, 2022 · 4 comments

@qiao-bo
Contributor

qiao-bo commented May 12, 2022

At the moment, we tar the checked-out code from our GPU Linux job and copy it into the running docker container:

tar -cf - ../${{ github.event.repository.name }} --mode u=+rwx,g=+rwx,o=+rwx --owner 1000 --group 1000 | docker cp - taichi_build:/home/dev/

This was implemented some time ago for sccache handling (#3559). We need a better way to handle this because this step introduces two major problems in our CI.

  1. The specified UID and GID must match the buildbot's execution user. The value 1000 specified here is typically the one assigned to the first user created on a system such as Ubuntu. In other words, if a buildbot has two users and we use the second one to execute the docker container, the permissions in the file system will not match, and errors such as "cannot write to metadata.tcb" will be reported.
  2. This leaves our GPU Linux CI dirty, because we implicitly rely on the checked-out code of the Linux job, which on our self-hosted runner is located at github-actions/_work/taichi/taichi. This folder is not deleted between runs. The problem is now analogous to the one in the Windows workflow.
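
For the first problem, one possible direction (a minimal sketch, not what the workflow currently does) is to take the owner and group from the user that executes the step instead of hard-coding 1000. The helper name pack_with_current_ids is hypothetical, GNU tar is assumed, and whether matching the runner's IDs is sufficient still depends on which user the container itself runs as.

```shell
#!/usr/bin/env bash
# Sketch only: pack a source tree with the invoking user's numeric UID/GID
# instead of a hard-coded 1000. Assumes GNU tar.
set -euo pipefail

# Hypothetical helper, not part of the existing Taichi workflow.
pack_with_current_ids() {
  local src_dir="$1"
  local uid gid
  uid="$(id -u)"   # numeric UID of the user running this step
  gid="$(id -g)"   # numeric primary GID of that user
  # Same permission mode as the original command; the archive goes to stdout.
  # --numeric-owner stores the IDs as numbers rather than user/group names.
  tar -cf - \
    --mode u=+rwx,g=+rwx,o=+rwx \
    --owner "$uid" --group "$gid" --numeric-owner \
    -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# Intended use in the workflow step (container name and target path are
# taken from the command above):
#   pack_with_current_ids "../<repository name>" | docker cp - taichi_build:/home/dev/
```

This only removes the assumption that the executing user has ID 1000; it does not address the second problem of the checkout directory persisting between runs.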

Any suggestions are welcome!

@taichi-ci-bot taichi-ci-bot moved this to Untriaged in Taichi Lang May 12, 2022
@qiao-bo
Contributor Author

qiao-bo commented May 12, 2022

CC @ailzhang @frostming @rexwangcc

@lin-hitonami any idea what the alternatives for handling sccache are?

@neozhaoliang neozhaoliang moved this from Untriaged to Todo in Taichi Lang May 13, 2022
@ailzhang ailzhang added the ci label Jun 15, 2022
@ailzhang
Contributor

cc: @feisuzhu fyi this might be an interesting issue for you - a great start to clean up our docker-related mess :P

@frostming
Collaborator

Maybe I will get the time to restart #4776; right now it has some conflicts with the permissions on the self-hosted runners.

@feisuzhu feisuzhu self-assigned this Jun 23, 2022
@feisuzhu
Contributor

Issue is irrelevant now.

@github-project-automation github-project-automation bot moved this from Todo to Done in Taichi Lang May 12, 2023