Self-hosted runner with Docker step creates files that trip up the checkout step #434
Comments
@j3parker thanks for reporting this. I am also facing the same issue.
# Validate not sudo
user_id=`id -u`
if [ $user_id -eq 0 -a -z "$RUNNER_ALLOW_RUNASROOT" ]; then
    echo "Must not run interactively with sudo"
    exit 1
fi
EDIT: @j3parker I think you can simply achieve this by exporting the variable |
If you are going to use container based actions I would recommend you use a job container as well. Mixing container and host environments does not work very well. The other option is to add a step to change permissions of files after the container action runs. |
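The "change permissions after the container action runs" option could look roughly like the sketch below. It is only an illustration, not something the runner provides: it assumes Docker is available on the host and that the alpine image can be pulled, and both the function name reset_workspace_owner and the DOCKER_CMD override (handy for dry runs) are hypothetical.

```shell
#!/bin/sh
# Sketch: give ownership of a workspace back to the invoking (runner) user
# by running chown as root inside a throwaway container.
# Assumptions: Docker is installed and the `alpine` image is pullable.
# DOCKER_CMD is a hypothetical override, e.g. DOCKER_CMD=echo for a dry run.
reset_workspace_owner() {
    workspace="$1"
    # `id -u` and `id -g` are expanded on the host, so the container chowns
    # everything to the host user that calls this function.
    ${DOCKER_CMD:-docker} run --rm \
        -v "$workspace:/workspace" \
        alpine chown -R "$(id -u):$(id -g)" /workspace
}
```

A final workflow step could call something like reset_workspace_owner "$GITHUB_WORKSPACE" so that the next checkout is able to remove the files.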
Also, I don't think we can say must run runner as root since many folks run as systemd service and I don't think that will allow us to run as root. |
@karancode the reason that's an environment variable is that there are many scenarios (such as running as a service) where it won't work, because the service ends up running as a user (the user configured or one specified). But there are some scenarios (like putting the runner in a container) where you want or need to explicitly run as root, and that's why the override is there. We should formalize run-as-root more and document it better. We need to keep this open and think a bit more about what the right solution is. If you want to run as root for now and you don't run the runner as a systemd service, then make sure that however you launch it, it calls runsvc.sh and not run.sh, so it doesn't exit on updates. |
@chrispat @bryanmacfarlane Thanks. I also faced the issue with run.sh exiting on updates(observed with |
Cool - when you say job containers are you referring to this? I hadn't seen that before... I'll definitely try that!
Thanks! It looks like If |
Another option here is that when running inside a container that has the user as The downside to this is that details of the actions runner environment are starting to leak into the workflows of your projects, which is less than ideal in larger companies. |
This is hitting me as well. At first it was easy to avoid, but now that we have more and more actions it is getting harder. Is there a way to have the runner clean up the work directory after the workflow is finished? If the runner isn't running as root, then it probably can't delete the directory, but it would be able to if the deletion ran in a container. Maybe there is a way to turn on a cleanup job that runs after the workflow is complete and deletes the files. If that job runs in a container, it should have access to delete everything. This only works if you have Docker installed, but that is fine because I don't think the issue happens without Docker. I guess I could add this to all of my workflows, but that doesn't seem ideal; adding it at the runner level would be cleaner. Just an idea, not sure if it is a good one, but I figured I would add it here and see what people think. |
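As a rough sketch of that cleanup idea, the function below empties a workspace from inside a container, where the process runs as root and can remove root-owned files. It is not a runner feature: it assumes Docker and a pullable alpine image, and the names clean_workspace_in_container and DOCKER_CMD are hypothetical.

```shell
#!/bin/sh
# Sketch: delete everything inside a workspace from within a container,
# so files created by root in earlier container steps can be removed.
# Assumptions: Docker is installed and the `alpine` image is pullable.
# DOCKER_CMD is a hypothetical override, e.g. DOCKER_CMD=echo for a dry run.
clean_workspace_in_container() {
    workspace="$1"
    # -mindepth 1 keeps the mounted directory itself and removes its contents.
    ${DOCKER_CMD:-docker} run --rm \
        -v "$workspace:/workspace" \
        alpine find /workspace -mindepth 1 -delete
}
```

Run as the last step of a workflow (or from a scheduled job on the runner host), this leaves an empty workspace for the next checkout, at the cost of losing the cached git checkout.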
For all my private actions, I ended up putting The other solution is to just run the runner as root by setting |
in your workflow file use this, e.g:
|
I suppose that will only work if you're using I don't believe an action has those concepts unless specifically written by the action. In which case, it would need to get updated in every action that uses Docker (extreme example). Unless there is a universal environment variable that masks files or sets file creation permissions. (I suppose I'm thinking something similar to |
Yes for Docker actions (private or public) - User will need to modify the corresponding The nice thing is that Concerning
|
I found this temporary workaround for Docker action: Fix for self-hosted runner P.S: I updated script and moved it to repository README.md |
You might not be able to run with sudo, but you can add user to the
|
Strange... I did this before and still gave me problems. I can try again in the future. Thanks! |
@jef You may try my solution. It is hacky, but it does work for me. |
@jef I rewrote my script. Now it is possible to set docker's user option via env vars. You can find more details here: https://github.com/xanantis/docker-file-ownership-fix#to-fix-all-those-above-for-self-hosted-runner |
Hey! I faced the same problem. Have you guys found an easy solution? |
This worked for me... I had to manually remove the conflicting files, and then adding the user ID of the user we use for automation did the trick. |
Setting the I have created an action that can "reset" permissions on the workspace directories and files that would trip up a subsequent run and break at checkout: https://github.com/peter-murray/reset-workspace-ownership-action It is still not a perfect solution, but it can be appended to the end of the workflow with minimal impact and overhead until a longer-term fix is available. |
Ran into this myself today due to running python in docker containers during a test step, creating root-owned pycache files. These files then break the next build when the runner attempts removal during the checkout step like OP. Are there any issues with running the service as root utilizing the [user] param for install? I'm hosting a runner on ubuntu 20.04.1 and running:
works well for me as a workaround for now. |
+1. This has also come up where users are using |
Uses a Docker image that is pulled from a container registry. Due to issue actions/runner#434 in GitHub Actions, there is no clean slate at the start of a run: actions do not use temporary workspaces but reuse the existing file system on the runner instead. This can lead to undefined behavior if no cleanup is executed before checkout. Also mind that there will be file system permission issues if the UID of the Docker container's user does not match the UID of the action runner's user.
This works around actions/runner#434
Thank you :-) This worked for me. |
Are people still having this issue? I've been able to fix my issue by doing what was suggested in #434 (comment) but it feels a bit gross to run the service as My issue is that after running a black formatting action (https://github.com/psf/black) there are files leftover in the
this causes subsequent actions to fail with the above message. |
Yep, still happens with a few different clients of mine. |
Instead of running the service as root, you can use ScribeMD/rootless-docker to run Docker in rootless mode. |
Not 100% sure this is relevant sorry but to work around what I think is a similar problem, I wrote a small script that dynamically creates a user inside the container with the ID that matches the user ID outside: https://stackoverflow.com/a/74330112 In that case the problem solved is: sharing inside files outside. It's ugly but surprisingly effective at solving what seems to be a docker design issue. |
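A simpler variant of the same idea, and not the script linked above, is to start the container with the host user's numeric IDs through Docker's --user option, so files written to the mounted workspace are owned by the runner user from the start. This is a sketch under assumptions: the function name run_as_host_user and the DOCKER_CMD override are hypothetical, and it only helps when the image works without a matching passwd entry and without root privileges.

```shell
#!/bin/sh
# Sketch: run a command in a container with the host user's UID and GID so
# files created in the mounted workspace are owned by that user.
# Assumptions: Docker is installed; the chosen image tolerates running as an
# arbitrary non-root UID. DOCKER_CMD is a hypothetical override for dry runs.
run_as_host_user() {
    workspace="$1"
    shift
    # Remaining arguments are the image name followed by the command to run.
    ${DOCKER_CMD:-docker} run --rm \
        --user "$(id -u):$(id -g)" \
        -v "$workspace:/workspace" \
        -w /workspace \
        "$@"
}
```

For example, run_as_host_user "$PWD" alpine touch out.txt would create out.txt in the current directory, owned by the calling user.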
If anyone is having the problem I mentioned above with |
Describe the bug
When using self-hosted runners, git checkouts are cached between runs (this is nice, because it greatly speeds up our builds).
However, if a docker-based step writes a file to the workspace it will (possibly) be owned by the root user. If the permissions don't give the (non-root) action runner user +w permission then a checkout step in a future workflow run will fail to remove this file. The first time, the error will look like this:
So git clean -ffdx tried to stat() this foo/ directory (created via a container in a previous build) but failed. It was then unable to remove the directory because it wasn't empty. It tried to fall back to rm -rf, which failed for the same reasons.
In future builds it goes straight to rm -rf because the .git folder did get cleaned up. It continues to fail in the same way for all future builds. Here's a screenshot:
To Reproduce
I've created a repo that reproduces the error: https://github.com/Brightspace/self-hosted-runner-permissions-issue-repro
Here's an example of a workflow failing: https://github.com/Brightspace/self-hosted-runner-permissions-issue-repro/runs/596011452?check_suite_focus=true
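To check whether a runner's workspace is already in this state, a small diagnostic along the following lines may help. It is only a sketch: list_foreign_files is a hypothetical helper name, and it reports ownership only, not missing write permissions.

```shell
#!/bin/sh
# Sketch: print every entry under a directory that is NOT owned by the
# current user (for example, root-owned files left by a container step).
list_foreign_files() {
    workspace="$1"
    # `! -user UID` matches entries whose owner differs from the given UID;
    # find treats a numeric argument as a user ID when no such name exists.
    find "$workspace" ! -user "$(id -u)" -print
}
```

Running list_foreign_files on the runner's work directory as the runner user prints nothing when every file is owned by that user; any path it does print is a candidate for the checkout failure described above.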
Expected behavior
I guess I'd expect all the files to be owned by the runner user... in a perfect world. Maybe that can be done with user namespace maps (documentation)? Not sure what that would entail though or if it makes sense for what the runner is doing.
I think this is not an issue with the checkout action because I don't think there is anything they could do about it - it'd impact other actions too, checkout was just the first one I hit the issue with.
Runner Version and Platform
Ubuntu 18.04, runner version 2.168.0
These are org-level runners but I imagine it's not specific to that.