Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #1874 race condition #3561

Closed
wants to merge 1 commit into from
Closed

Fix #1874 race condition #3561

wants to merge 1 commit into from

Conversation

Vanuan
Copy link
Contributor

@Vanuan Vanuan commented May 12, 2017

Summary

Use file locking to fix the race condition when writing the cache

Test plan

Race conditions are tricky to reproduce, hence only manual testing.
Use some big modules that takes a long time to transform write to the filesystem, create multiple tests and run each test in a separate worker.

@Vanuan
Copy link
Contributor Author

Vanuan commented May 12, 2017

Don't forget to use clean cache when testing. Ideally, run this in a docker container which is destroyed on each run.

@Vanuan
Copy link
Contributor Author

Vanuan commented May 12, 2017

What is a supposed way to update yarn.lock? I did yarn add proper-lockfile.

@thymikee
Copy link
Collaborator

Because of the way Lerna works, you need to now reset your lockfile and run yarn install from project root

@Vanuan
Copy link
Contributor Author

Vanuan commented May 12, 2017

  {plugins: [...babelNodeOptions.plugins, 'transform-runtime']}
             ^^^

SyntaxError: Unexpected token ...

:( wrong node version?

@Vanuan
Copy link
Contributor Author

Vanuan commented May 12, 2017

Did yarn install, yarn.lock didn't appear. Is it expected?

@thymikee
Copy link
Collaborator

This is the diff I get, when I run yarn on your repo:

diff --git a/packages/jest-runtime/yarn.lock b/packages/jest-runtime/yarn.lock
index c0121df..4eaff68 100644
--- a/packages/jest-runtime/yarn.lock
+++ b/packages/jest-runtime/yarn.lock
@@ -654,6 +654,13 @@ private@^0.1.6:
   version "0.1.7"
   resolved "https://registry.yarnpkg.com/private/-/private-0.1.7.tgz#68ce5e8a1ef0a23bb570cc28537b5332aba63ef1"

+proper-lockfile@^2.0.1:
+  version "2.0.1"
+  resolved "https://registry.yarnpkg.com/proper-lockfile/-/proper-lockfile-2.0.1.tgz#159fb06193d32003f4b3691dd2ec1a634aa80d1d"
+  dependencies:
+    graceful-fs "^4.1.2"
+    retry "^0.10.0"
+
 randomatic@^1.1.3:
   version "1.1.6"
   resolved "https://registry.yarnpkg.com/randomatic/-/randomatic-1.1.6.tgz#110dcabff397e9dcff7c0789ccc0a49adf1ec5bb"
@@ -713,6 +720,10 @@ require-main-filename@^1.0.1:
   version "1.0.1"
   resolved "https://registry.yarnpkg.com/require-main-filename/-/require-main-filename-1.0.1.tgz#97f717b69d48784f5f526a6c5aa8ffdda055a4d1"

+retry@^0.10.0:
+  version "0.10.1"
+  resolved "https://registry.yarnpkg.com/retry/-/retry-0.10.1.tgz#e76388d217992c252750241d3d3956fed98d8ff4"
+
 "semver@2 || 3 || 4 || 5", semver@^5.3.0:
   version "5.3.0"
   resolved "https://registry.yarnpkg.com/semver/-/semver-5.3.0.tgz#9b2ce5d3de02d17c6012ad326aa6b4d0cf54f94f"

Use file locking to fix the race condition when writing the cache
@thymikee
Copy link
Collaborator

As for destructuring, it doesn't seem to work very well (despite babel transform used, dunno) on Node 4, so we use Array#concat instead

@Vanuan
Copy link
Contributor Author

Vanuan commented May 12, 2017

Thanks, after removing node_modules and rerunning from root it worked

@Vanuan
Copy link
Contributor Author

Vanuan commented May 12, 2017

All tests passed 🎉

One thing I'm not sure about is realpath. If cachePath contains symlinks locking might not work. The default value is /tmp/jest though so might be not as critical.

@thymikee
Copy link
Collaborator

Let's just wait for @cpojer or @DmitriiAbramov to review this :)

@codecov-io
Copy link

Codecov Report

Merging #3561 into master will decrease coverage by <.01%.
The diff coverage is 60%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3561      +/-   ##
==========================================
- Coverage   62.39%   62.39%   -0.01%     
==========================================
  Files         181      181              
  Lines        6646     6651       +5     
  Branches        6        6              
==========================================
+ Hits         4147     4150       +3     
- Misses       2496     2498       +2     
  Partials        3        3
Impacted Files Coverage Δ
packages/jest-runtime/src/ScriptTransformer.js 89.58% <60%> (-1.07%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9b157c3...7102c7d. Read the comment docs.

@aaronabramov
Copy link
Contributor

hey @Vanuan! how did you test this?
i tried to stress test the transform logic (asynchronously write 100000 files) and wasn't able to reproduce.
This is probably the most annoying bug that i was never able to reproduce :)
my thought were that some of the OS have some kind of locks already that prevent this from happening and some of them don't

@Vanuan
Copy link
Contributor Author

Vanuan commented May 12, 2017

I'm using a docker image based on alpine. Maybe there are some differences in musl library...

I've reviewed the coverage report and it looks like locking always fails :(
https://codecov.io/gh/facebook/jest/pull/3561/src/packages/jest-runtime/src/ScriptTransformer.js?before=packages/jest-runtime/src/ScriptTransformer.js#L349

Maybe I'm reading it wrong?

@aaronabramov
Copy link
Contributor

i don't think coverage will be collected for this part, because most of the time we run Jest in subprocesses without any instrumentation

@Vanuan
Copy link
Contributor Author

Vanuan commented May 12, 2017

Might also depend on the filesystem and flush policy.

@Vanuan
Copy link
Contributor Author

Vanuan commented May 12, 2017

Hm, another difference might be SSD vs HDD. Don't know how to make fs module slower. Maybe add some loops in the wrapper module?

@Vanuan
Copy link
Contributor Author

Vanuan commented May 13, 2017

How can I release my personal fork with this fix so that people in #1874 could verify? Or could you do that?

@bcruddy
Copy link

bcruddy commented May 13, 2017

@Vanuan if you put up a pull request anyone using the upstream/origin fork model can use the git cpr alias here to pull down the latest commit in the pull request to run locally - https://github.com/bcruddy/dotfile-manager/blob/master/home/.gitconfig

@Vanuan
Copy link
Contributor Author

Vanuan commented May 16, 2017

I've tried this in package.json:

-    "jest": "^20.0.1",
+    "jest": "Vanuan/jest#fix-race-condition",

Resulted in error:

Exit code: 1
Command: sh
Arguments: -c node ./scripts/postinstall.js && yarn run build --silent && (cd packages/eslint-plugin-jest && yarn link --silent) && yarn link eslint-plugin-jest --silent
Directory: /src/node_modules/jest
Output:
Setting up Jest's development environment...
$ cd /src/node_modules/jest
$ /src/node_modules/jest/node_modules/.bin/lerna bootstrap

/src/node_modules/jest/scripts/_runCommand.js:29
    throw error;
    ^

Error running command.
info Visit https://yarnpkg.com/en/docs/cli/install for documentation about this command.

@thymikee
Copy link
Collaborator

@Vanuan this won't work, because you have to build the project (run Jest code through Babel). What you need to do is to publish your npm package (change "name": "jest" to "name": "Vanuan/jest" or so).

@Vanuan
Copy link
Contributor Author

Vanuan commented May 16, 2017

@thymikee Yeah, that's what I'm asking. Is there a script? Is it yarn publish or something?

@thymikee
Copy link
Collaborator

Yarn or npm, whatever: https://docs.npmjs.com/cli/publish

@BYK
Copy link
Contributor

BYK commented Jul 20, 2017

@Vanuan anything we can to do help you make progress on this?

@cpojer
Copy link
Member

cpojer commented Jul 20, 2017

We have a fix that doesn't use lockfiles in this PR: #4088 which we think is going to resolve this problem in a safer way. Thank you so much for experimenting with a fix and sending us a pull request :)

@cpojer cpojer closed this Jul 20, 2017
@SimenB
Copy link
Member

SimenB commented Oct 25, 2017

Should we retry this PR, seeing as #4088 did not resolve the issue?

Related to #4444 as well, I guess.

write-file-atomic doesn't seem to work properly across processes

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Please note this issue tracker is not a help forum. We recommend using StackOverflow or our discord channel for questions.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 13, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants