Switching between local_repository and git_repository can result in the local_repository directory being removed #6879
Comments
Just verified this still happens with bazel 0.25.0
Added the platform label because of the statement
@aehlig In my case the local repository is not inside the workspace but in a sibling directory of the workspace. My local_repository(..) references the directory with an absolute path, in case that makes a difference.
I still have not managed to reproduce the issue. Can you provide more information, or even a simple reproduction script? I'm currently trying on a Mac Pro with bazel 0.25.2.
Unfortunately, I haven't been able to reproduce it outside of my main repository. I tried your script and had no luck there either. It's pretty easy for me to reproduce manually, so I could compile bazel from head and add some instrumentation. Do you have any ideas of where to start looking? One additional, potentially relevant piece of information is that I also modify ~/.bazelrc whenever I switch between the two 'modes'. The .bazelrc switch is a build define
Do I understand correctly that the interrupt is essential for this issue and that it does not occur when you allow the fetching to complete? Also, is the local repository already gone before the final local build (step 5 in your original post), or does the final build step do something to cause the issue?
I believe the interrupt is essential for the issue, I have never seen it without an interrupt. Here is a redacted log from when I just reproduced it. This was not using bazel head but 0.25.2.
Trying manually to switch back and forth, I've managed to reproduce it once in 25 minutes, so I still can't say I understand exactly what happens. We do not use strip_prefix.
Ok, did some more experimentation and I can reproduce it every time by just speeding up the whole thing
(update)
Watching the external project folder in
I don't know anything about bazel internals, but is there some deferred action queue or similar that knows that the git clone is incomplete and begins to delete from the directory (now a symlink) during the second command?
Bazel itself has no such queue used for external repositories; but it might be
Can you check between the two bazel runs which processes exist? In particular, it would be helpful to know if we some
Your intuition is indeed correct, there are a few git processes still running when the second bazel invocation starts. git clone https://... /private/var/tmp/_bazel_johanbjork/5bbfecb1113aee50602070b
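To make the "which processes exist between the two bazel runs" check repeatable, something like the following could be run between invocations. This is only a sketch: the helper names `find_git_processes` and `running_git_processes` are made up for illustration, it assumes a Unix-like system with a POSIX `ps`, and it is not part of Bazel.

```python
import subprocess


def find_git_processes(ps_output):
    """Return (pid, command) pairs for `ps` output lines whose program is git."""
    matches = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip the header row and any line that does not start with a PID.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, command = int(parts[0]), parts[1]
        # Compare only the program name, so "/usr/bin/git" matches and
        # helpers such as "git-remote-https" are included as well.
        name = command.split()[0].rsplit("/", 1)[-1]
        if name == "git" or name.startswith("git-"):
            matches.append((pid, command))
    return matches


def running_git_processes():
    """Snapshot the process table with `ps` and keep only git processes."""
    result = subprocess.run(
        ["ps", "-A", "-o", "pid,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_git_processes(result.stdout)


if __name__ == "__main__":
    for pid, command in running_git_processes():
        print(pid, command)
```

If this prints anything after the interrupted build has returned to the shell, a clone is still running in the background.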
Looking, with that knowledge, again at the
A simple reproduction case to understand what's going on is the following.
A related, still pending, change is #8264, which tries to remove the uses of
@philwo, we discussed the possibility of process-wrapping the processes started from
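For readers unfamiliar with the process-wrapping idea, here is a rough illustration of the general technique on Unix: start the child in its own session so that it and everything it spawns share one process group, then signal the whole group on interrupt. This is not Bazel's implementation and not the change in #8264; `run_in_own_group` and `terminate_group` are hypothetical names, and the grace period is an arbitrary choice.

```python
import os
import signal
import subprocess


def terminate_group(proc, grace=5.0):
    """Send SIGTERM to the child's process group, then SIGKILL if the child lingers.

    Assumes `proc` was started with start_new_session=True, so that its PID is
    also its process group ID.
    """
    try:
        os.killpg(proc.pid, signal.SIGTERM)
    except ProcessLookupError:
        return  # the whole group is already gone
    try:
        # Only the direct child is awaited; grandchildren get the same signal.
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()


def run_in_own_group(cmd, timeout=None):
    """Run cmd in a new session and clean up its whole group if we are cut short."""
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except (KeyboardInterrupt, subprocess.TimeoutExpired):
        terminate_group(proc)
        raise
```

With a wrapper along these lines, a CTRL+C in the parent would not leave a clone running in the background, which is the precondition for the deletion seen here.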
This problem is present on all Unix-like platforms, and not specific to macOS.
Thank you for investigating, looking forward to a fix. Also easy to avoid now that I'm aware what's causing it.
Description of the problem / feature request:
When switching between local_repository and git_repository, pressing CTRL+C at a bad time during a bazel command started right after the local_repository -> git_repository switch can cause the entire local repository folder to be removed.
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
(sample output from my latest run
My guess is that if the clone is aborted at a bad time, then instead of removing the partly checked-out folder, it ends up removing the symlink that points to the local repository together with the contents of the symlink's target.
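To illustrate the hazard I'm guessing at, here is a small sketch of a cleanup that refuses to descend through a symlink. It is not what Bazel or git actually do; `remove_partial_checkout` is a made-up helper, and it does not close the race between the check and the removal, so it only shows the distinction between unlinking a link and deleting what it points to.

```python
import os
import shutil


def remove_partial_checkout(path):
    """Remove path without descending through a symlink.

    Returns "unlinked", "removed", or "missing" to say what was done.
    """
    # A trailing separator ("link/") makes the OS resolve the symlink,
    # so strip it before checking what the path itself is.
    path = path.rstrip(os.sep) or os.sep
    if os.path.islink(path):
        os.unlink(path)  # drops only the link; the target stays untouched
        return "unlinked"
    if os.path.isdir(path):
        # rmtree unlinks symlinks found inside the tree instead of following them.
        shutil.rmtree(path)
        return "removed"
    if os.path.exists(path):
        os.remove(path)
        return "removed"
    return "missing"
```

A cleanup that instead deletes children by path (for example `external_repo/BUILD`) would go straight through the symlink and empty the local repository.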
I believe this is a regression and it seems it started happening with 0.19, but I'm not able to confirm this.
What operating system are you running Bazel on?
This happens with bazel on Mac OS X; I have not been able to try to reproduce it on other platforms.
What's the output of bazel info release?
INFO: Invocation ID: 5e97dba3-24ec-4e37-bb69-e7d9a388db06
Build label: 0.20.0-homebrew
Build target: bazel-out/darwin-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Nov 30 20:40:12 2018 (1543610412)
Build timestamp: 1543610412
Build timestamp as int: 1543610412
Any other information, logs, or outputs that you want to share?