Reinstate the clean of vt.results_dir in NodeResolve #4051
Comments
If we're not appropriately cleaning results dirs before giving them to tasks, that should be fixed at a lower level... most likely here: pants/src/python/pants/invalidation/cache_manager.py Lines 171 to 196 in c5a2e6d
I think I agree with that. But this task should be restored in the meantime, unless that is coming quickly. And, that change would still leave
(ETA: Updated to actually add the final
I also have an additional It came with a patch that fixed it for his issue, but I am not sure it will be useful for everyone.
this is also now resolved with the default clean: #4137
* Remove clean from pants task setups.
Cleaning a workdir that might hold partial results was historically left to the task. But that became bugged when the workdirs were moved to a symlink, since the clean algorithm didn't have symlink awareness. The consequence of this was that calling clean=True would silently overwrite the symlink with a real dir and break the cache semantics. For resolves, the fresh workspace was still needed. I changed upstream Pants to clean workdirs by default, so any failure now cleans. pantsbuild/pants#4051 pantsbuild/pants#4137
* Fix linter and missed variables in the refactor (sapling split of c8a374715e5d7eb31eac507c70a3fe9db476af01)
The NodeResolve task used to clean the vt.results_dir but that was recently removed in 8517486.
The argument was that `safe_mkdir(results_dir, clean=True)` was redundant and that removing it stopped exceptions like:
Exception message: [Errno 1] Operation not permitted: '/Users/yic/workspace/source/.pants.d/resolve/node/252d64521cf9/<target.id>/current'
I think that this was a misdiagnosis, and that the clean should be reinstated. The purpose of that call wasn't the mkdir so much as the clean. I propose below that the exceptions are a consequence of the clean logic.
Motivation for reinstating the clean
The `clean` is actually useful if there has been a ctrl-C or an error during a resolve: the directory could be left in a halfway state that is best blown away. With a bad resolve, it is possibly vital to clean.
Proposed cause of the exceptions
The exceptions come from the times when the above scenario has occurred. For one reason or another, the vt is invalid and the `results_dir` had something in it.
In this case, the `safe_mkdir` call would correctly clean the directory and remake it. But the path being passed by the SimpleCodegen superclass is the `current` results_dir, which is expected to be a symlink. When `safe_mkdir` recreated the directory, it converted it from a symlink into a real dir, so the next run raises an exception when the cache_manager unsuccessfully attempts to unlink `current`.
Disclaimer
Does bad state get cleaned somewhere else along the pipeline? Not that I can see with the repro below.
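The symlink-to-directory conversion proposed above can be reproduced outside Pants. The following is a minimal sketch, assuming a clean that removes whatever sits at the path and then recreates it. `clean_and_recreate` and the directory names are hypothetical stand-ins for illustration, not Pants' `safe_mkdir` or its real layout.

```python
import os
import shutil
import tempfile


def clean_and_recreate(path):
    """Hypothetical stand-in for a clean that removes `path` and remakes it."""
    if os.path.islink(path):
        os.unlink(path)  # removes only the link, not the directory it points to
    elif os.path.isdir(path):
        shutil.rmtree(path)
    os.makedirs(path)  # always creates a real directory


def demo():
    root = tempfile.mkdtemp()
    try:
        real_dir = os.path.join(root, "real_results")  # stands in for the hashed results dir
        current = os.path.join(root, "current")  # stands in for the `current` symlink
        os.makedirs(real_dir)
        os.symlink(real_dir, current)

        # A partial result left behind by an interrupted resolve.
        open(os.path.join(current, "partial"), "w").close()

        was_link = os.path.islink(current)
        clean_and_recreate(current)
        is_link = os.path.islink(current)

        # A later attempt to unlink `current` now hits a real directory.
        try:
            os.unlink(current)
            unlink_failed = False
        except OSError:
            unlink_failed = True

        return was_link, is_link, unlink_failed
    finally:
        shutil.rmtree(root, ignore_errors=True)


if __name__ == "__main__":
    print(demo())
```

The exact error raised by the final `os.unlink` depends on the platform, but on POSIX systems it is a subclass of `OSError`, which is consistent with the `[Errno 1] Operation not permitted` reported above.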
Repro POC
On pants master, drop a trace in the invalidation block:
The repro is to touch a file under the `vt.results_dir` and clean with `safe_mkdir`. Then ctrl-C and re-run the resolve to see the exception. Here is mine:
Robust solutions
The ideal fix would probably be to restore the clean and do one or both of:
`results_dir` instead of the `current` symlink
NB: (2) is a bit sticky, since it raises the philosophical question of whether that allows deleting directories outside the buildroot. `safe_mkdir` will happily create directories outside the buildroot, although I do not believe it exercises that power. But creating a directory is not as scary as recursively deleting. My instinct would be to add a check to make sure that the path is under the `pants_workdir` if you went that route.
Quick Solutions
The first two are not ideal, since they only solve this situation for NodeResolve and leave other tasks to fend for themselves, but they are easy to patch in quickly.
One of:
1.
`safe_mkdir(fp, clean=True)` to only delete the contents instead of deleting the entire directory and recreating it. Possibly useful in other ways, since silently recreating means that it could also silently change the directory owner or permissions.
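The contents-only clean described above, combined with the workdir check suggested in the NB under Robust solutions, might be sketched as follows. This is an illustration only: `clean_contents` and its `workdir` parameter are hypothetical names, not an existing Pants API, and the sketch assumes Python 3 for `os.path.commonpath`.

```python
import os
import shutil
import tempfile


def clean_contents(path, workdir):
    """Empty the directory at `path` in place; hypothetical helper, not a Pants API.

    Symlinks are followed, so a `current`-style link keeps pointing at the same
    directory, and that directory keeps its owner and permissions.
    """
    real_path = os.path.realpath(path)
    real_workdir = os.path.realpath(workdir)
    # Refuse to delete anything that resolves outside the workdir.
    if os.path.commonpath([real_path, real_workdir]) != real_workdir:
        raise ValueError(
            "refusing to clean {}: not under {}".format(real_path, real_workdir)
        )
    if not os.path.isdir(real_path):
        os.makedirs(real_path)
        return
    for name in os.listdir(real_path):
        child = os.path.join(real_path, name)
        if os.path.isdir(child) and not os.path.islink(child):
            shutil.rmtree(child)
        else:
            os.unlink(child)  # files and symlinks: remove the entry only


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    real_dir = os.path.join(workdir, "real_results")
    current = os.path.join(workdir, "current")
    os.makedirs(os.path.join(real_dir, "node_modules"))
    os.symlink(real_dir, current)
    clean_contents(current, workdir)
    print(os.path.islink(current), os.listdir(real_dir))
    shutil.rmtree(workdir)
```

Because the directory itself is never removed, the symlink survives the clean and a later unlink of `current` still operates on a link rather than a real directory.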