Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(builtin): account for racy deletion of symlink in linker #2662

Merged
merged 1 commit into from
May 11, 2021

Conversation

JoostK
Copy link
Contributor

@JoostK JoostK commented May 7, 2021

On Windows, the linker has been prone to race conditions with respect to
symlinking packages into the node_modules tree, causing ENOENT errors
that causes the target to fail. Since the intention of symlinkWithUnlink
is to unlink the directory anyway, we can simply ignore ENOENT errors for
the directory that would be about to be unlinked.

PR Checklist

Please check if your PR fulfills the following requirements:

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)

PR Type

What kind of change does this PR introduce?

  • Bugfix
  • Feature (please, look at the "Scope of the project" section in the README.md file)
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • CI related changes
  • Documentation content changes
  • Other... Please describe:

What is the current behavior?

We've been having quite a lot of Windows-only flakes where the linker fails with an ENOENT error.
And example can be found in this angular/angular job.

What is the new behavior?

The ENOENT error that occurs due to a race condition no longer causes the linker to fail.

Does this PR introduce a breaking change?

  • Yes
  • No

Other information

There might continue to be a window for race conditions, but this change should help address the observed failures. Looking at the call site:

https://github.com/bazelbuild/rules_nodejs/blob/768d352d2fea8ce76e1d0cb5f0bde86c5f73c222/internal/linker/link_node_modules.ts#L653-L664

it becomes clear that isLeftoverDirectoryFromLinker is executed between obtaining the stats and actually running symlinkWithUnlink, and sinceisLeftoverDirectoryFromLinker performs a directory walk it may provide ample opportunity for the stats to get stale.

@JoostK
Copy link
Contributor Author

JoostK commented May 7, 2021

@devversion I hope this will address the linker failure we're seeing in angular/angular.

@JoostK JoostK force-pushed the linker/unlink_windows_race branch 2 times, most recently from af208a1 to 567d423 Compare May 7, 2021 22:30
@JoostK
Copy link
Contributor Author

JoostK commented May 7, 2021

This would unfortunately not fix this class of failure. Maybe we should be lenient in the case stats === null, as the intention of that function is to remove everything there is no harm done when skipping stats === null cases.

@JoostK JoostK force-pushed the linker/unlink_windows_race branch from 567d423 to 2f85780 Compare May 7, 2021 23:18
@devversion
Copy link
Contributor

@JoostK yeah, this might improve the odds of having successful builds. I'm specifically speaking of "odds" here because I think conceptually Windows and Bazel w/ concurrency will never work out 100% here. The reason is that multiple actions/tests operate on the same node_modules with multiple instances of the linker running. The linker can always break node_modules for an active test in favor of linking the node modules for a second test being spawned in parallel

@devversion
Copy link
Contributor

I think one idea to solve the concurrency for windows would be to just link the node_modules local to the actions being executed, and then setting the NODE_PATH. Surely just a thought.. I'm not sure if that's worth the effort, but that could avoid the race conditions in a reasonable way.

On Windows, the linker has been prone to race conditions with respect to
symlinking packages into the `node_modules` tree, causing ENOENT errors
that causes the target to fail. Since the intention of `symlinkWithUnlink`
and `deleteDirectory` is to unlink the directory anyway, we can simply
ignore ENOENT errors for the directory that would be about to be unlinked.
@alexeagle alexeagle merged commit e9a683d into bazel-contrib:stable May 11, 2021
twheys pushed a commit to twheys/rules_nodejs that referenced this pull request Jan 13, 2022
…ontrib#2662)

On Windows, the linker has been prone to race conditions with respect to
symlinking packages into the `node_modules` tree, causing ENOENT errors
that causes the target to fail. Since the intention of `symlinkWithUnlink`
and `deleteDirectory` is to unlink the directory anyway, we can simply
ignore ENOENT errors for the directory that would be about to be unlinked.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants