Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jest-runtime: atomic cache write, and check validity of data #4088

Merged
merged 1 commit into from
Jul 25, 2017

Conversation

jeanlauliac
Copy link
Contributor

@jeanlauliac jeanlauliac commented Jul 20, 2017

This change tries to address what may be a cause for #1874, where I posted some details on the approach. By doing atomic writes we ensure there's no cache file ending up being a mix of two transformed code files, and we limit the concurrency issues of having a file read and written at the same time. A prior PR #3561 tries to address the problem using a lock, but locks bring additional complexity that this change tries to avoid (ex. deadlocks).

This change also adds a checksum, because there can be other processes still writing non-atomically to cache files. Additional, not all filesystems may support atomic renames (what atomic-write-file relies on).

This change incurs a slight performance cost that is the additional I/O call to rename the files. However I believe it is less costly than a lock solution. I don't think this should have much effect even on processing several thousands of files.

Test plan

It is quite challenging to test for concurrency issues, so this change relies on the knowledge that writeFileSync is not atomic and that corruption could have the observed effects. I rely on the existing automated testing to ensure that the caching behaviour and jest-runtime in general is working as expected.

@jeanlauliac
Copy link
Contributor Author

Tests are failing, I'll fix!

@jeanlauliac jeanlauliac requested a review from cpojer July 20, 2017 10:58
@jeanlauliac jeanlauliac force-pushed the atomic-cache-writes branch from fc68a1d to bef5880 Compare July 20, 2017 11:03
@cpojer cpojer mentioned this pull request Jul 20, 2017
@jeanlauliac jeanlauliac force-pushed the atomic-cache-writes branch from bef5880 to f102dba Compare July 20, 2017 15:29
Copy link
Member

@cpojer cpojer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving, pending the issues with CI.

@@ -43,6 +44,9 @@ const configToJsonMap = new Map();
// Cache regular expressions to test whether the file needs to be preprocessed
const ignoreCache: WeakMap<ProjectConfig, ?RegExp> = new WeakMap();

// To reset the cache for specific changesets (rather than package version).
const CACHE_BREAKER = '1';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CACHE_VERSION may be a better name

@jeanlauliac
Copy link
Contributor Author

jeanlauliac commented Jul 24, 2017

Looking into this today. Not sure why it fails CI still

@jeanlauliac jeanlauliac force-pushed the atomic-cache-writes branch from 9752fe4 to 91a5f31 Compare July 24, 2017 23:19
@jeanlauliac
Copy link
Contributor Author

I don't know how it's possible, but apparently adding write-file-atomic as a dependency affects other, unrelated dependencies, making the test to break because the coverage file is written a bit differently. Maybe it's an unsafe test in a sense it reads "private" data? I'm not familiar with the coverage-final.json file. @cpojer normally this version should now pass CI. But, I had to fix the unrelated test, do you have context about it by any chance?

@@ -29,7 +29,7 @@ it('maps code coverage against original source', () => {

// reduce absolute paths embedded in the coverage map to just filenames
Object.keys(coverageMap).forEach(filename => {
coverageMap[filename].path = path.basename(coverageMap[filename].path);
coverageMap[filename].data.path = path.basename(coverageMap[filename].data.path);
Copy link
Contributor Author

@jeanlauliac jeanlauliac Jul 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the shady change. It's like we were using a completely different version of the "coverage module", whichever it is; except that this changeset doesn't make any change related to coverage. See also how the snap file is pretty different.

@jeanlauliac
Copy link
Contributor Author

I changed my mind, I think my code genuinely breaks something. Trying to figure it out.

@jeanlauliac
Copy link
Contributor Author

jeanlauliac commented Jul 24, 2017

Got it, it's because we use return paths from inside the jest cache itself for the source maps. When code outside tries to read these files, naturally it fails because of the hash header. To fix this we can write hashes to different files. The problem is that it'll cause the number of files to double, something I'm not too sure is acceptable for some of our big repos.

EDIT: actually it's not acceptable because when reading source maps we should also check the hash.

@cpojer
Copy link
Member

cpojer commented Jul 25, 2017

What's the action plan there?

@jeanlauliac
Copy link
Contributor Author

Ideally, get rid of the code that return a path, and return a memory-based hash table instead. But, this will force us to publish a new major release of jest-runtime I think, because that'll be a breaking change.

To avoid a breaking change, we can just write the source map to a temporary file, but this will clutter more and more the temporary file and it never gets cleaned.

@cpojer
Copy link
Member

cpojer commented Jul 25, 2017

I'm fine with the breaking change. Since the transformer in jest-runtime includes the version number, it shouldn't really matter from any one version to the next what the transform cache looks like. Do you think you could work on this tomorrow so we can tag a release and unblock shipping a bunch of fixes? :)

@jeanlauliac
Copy link
Contributor Author

Yes, I'll ship that tomorrow. I'll reduce this PR's scope so that it only has the atomic cache write, that is urgent to ship soon, and I'll do the checksums and the source map fixes in another PR.

@jeanlauliac jeanlauliac force-pushed the atomic-cache-writes branch from 91a5f31 to c5f9768 Compare July 25, 2017 17:01
@jeanlauliac
Copy link
Contributor Author

jeanlauliac commented Jul 25, 2017

I figured out a way to do it without breaking change. The source maps will not be covered by the checksum verification, but to make them covered we'd have to change the model all across the stack I think, that'd take some more research and time. The transformed code itself is the most important thing that I hope this changeset gets right.

I'll land once CircleCI shows green.

@jeanlauliac jeanlauliac force-pushed the atomic-cache-writes branch from c5f9768 to edde90d Compare July 25, 2017 17:24
@codecov-io
Copy link

Codecov Report

Merging #4088 into master will increase coverage by 0.04%.
The diff coverage is 88.23%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4088      +/-   ##
==========================================
+ Coverage   60.32%   60.37%   +0.04%     
==========================================
  Files         195      195              
  Lines        6757     6768      +11     
  Branches        6        6              
==========================================
+ Hits         4076     4086      +10     
- Misses       2678     2679       +1     
  Partials        3        3
Impacted Files Coverage Δ
packages/jest-runtime/src/script_transformer.js 89.92% <88.23%> (+0.08%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ffd43f7...edde90d. Read the comment docs.

@jeanlauliac jeanlauliac merged commit 2607f2b into jestjs:master Jul 25, 2017
@screendriver
Copy link
Contributor

@jeanlauliac thank you for the great work! Any chance to make a hotfix patch release v20.0.5 for that? We are encountering flaky unit tests (#1874) all the time in our CI and can't really rely on Jest at the moment 😕

@thymikee
Copy link
Collaborator

@screendriver if I'm not mistaken, it should already be available under jest@test tag (there may be some breaking changes for you)

@jeanlauliac
Copy link
Contributor Author

@screendriver did you try to upgrade to jest@test, and did it work for you?

@screendriver
Copy link
Contributor

Unfortunately I could not test it until yet because it is not welcome to use unstable dependencies in our product. I try to test it in a separate branch and give you feedback about it.

@screendriver
Copy link
Contributor

Just a little update: I couldn't test it because ts-jest does not allow the test version as peerDependency. I opened an issue here kulshekhar/ts-jest#282

tushardhole pushed a commit to tushardhole/jest that referenced this pull request Aug 21, 2017
@gdborton
Copy link
Contributor

@cpojer Is there any eta on when a release will be cut that includes this fix?

@cpojer
Copy link
Member

cpojer commented Aug 29, 2017

You can use jest@test for now.

@adamdicarlo
Copy link

Thank you so much for this fix!

We have a project that was manifesting this issue extremely often on CodeShip under docker. It was behaving as if input source code from node_modules was truncated on a multiple of 4096-bytes boundary, but only on a first run (no transform cache). The problem only seemed to start happening once we introduced a custom transformer, so tracking it down to finally realizing "hey wait jest has multiple processes and it's acting like it's reading a partly written cache file..." was an odyssey. (Spent an entire day on it with a coworker.) We finally came to the issue tracker with the right search terms... such a relief.

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Please note this issue tracker is not a help forum. We recommend using StackOverflow or our discord channel for questions.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 13, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants