Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jest-diff: Replace diff with diff-sequences package #6961

Merged
merged 6 commits into from
Jan 11, 2019

Conversation

pedrottimark
Copy link
Contributor

@pedrottimark pedrottimark commented Sep 11, 2018

Summary

Fixes #1772 #5392 besser spät als nie :)

Much faster when number of changed lines is large: less memory and fewer iterations.

The baseline and improved versions report the same number of changed versus common lines.

This a breaking change from viewpoint of dependents like snapshot-diff or snapshot-diff-serializer package which use output of jest-diff as a test criterion instead of a report.

/cc @thymikee @wldcordeiro

Given received and expected values, the improved version can return different particular lines as inserted, deleted, or common than the baseline version (especially for swapped lines, see below).

In case you ask why different particular lines, because diff-sequences can often return 2 differences per bisection to minimize depth of recursive calls, it might find alternative path in edit graph.

The tests for snapshot-diff still pass.

For realistic differences of .js, .json, .md, and .snap files, only 12.2% = 1259 / 10306 had exactly the same changed output in offline comparison of baseline to improved versions of jest-diff in the 2000 commits going back in history from Jest 23.5.0.

By the way, Jest still has diff as a dependence:

yarn why diff
   - "_project_#mocha" depends on it
   - Hoisted from "_project_#mocha#diff"
   - Hoisted from "_project_#example-typescript#ts-jest#jest-config#jest-jasmine2#jest-diff#diff"

Test plan

Updated 6 snapshots for collapses big diffs to patch format and context tests:

Expected: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Received: [1, 2, 3, 4, 5, 6, 7, 8, 10, 9]

Differences in baseline version, that is, 10 is change, 9 is common:

+ 10
  9
- 10

Differences in improved version, that is, 9 is change, 10 is common:

- 9
  10
+ 9

Updated 4 snapshots in matchers.test.js of expect package:

Expected: [2, 1] or {2: "two", 1: "one"}
Received: [1, 2] or {1: "one", 2: "two"}

Differences in baseline version, that is, 1 is change, 2 is common:

+ 1
  2
- 1

Differences in improved version, that is, 2 is change, 1 is common:

- 2
  1
+ 2

@codecov-io
Copy link

codecov-io commented Sep 11, 2018

Codecov Report

Merging #6961 into master will increase coverage by 0.13%.
The diff coverage is 98.48%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #6961      +/-   ##
=========================================
+ Coverage   66.86%     67%   +0.13%     
=========================================
  Files         250     250              
  Lines       10460   10510      +50     
  Branches        3       3              
=========================================
+ Hits         6994    7042      +48     
- Misses       3465    3467       +2     
  Partials        1       1
Impacted Files Coverage Δ
packages/jest-diff/src/diff_strings.js 98.52% <98.48%> (-1.48%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0cd757b...dd30440. Read the comment docs.

@SimenB SimenB requested a review from cpojer September 11, 2018 21:09
@SimenB
Copy link
Member

SimenB commented Sep 11, 2018

Wooooooo! I've been waiting for this! 😀

This a breaking change from viewpoint of dependents like snapshot-diff or snapshot-diff-serializer package which use output of jest-diff as a test criterion instead of a report.

React also uses jest-diff, mind seeing if this breaks them as well? We're gearing up for a major now anyways (due to the Babel 7 change), so the fact it's breaking is in and of itself not an issue 🙂

CHANGELOG.md Outdated
@@ -1,5 +1,9 @@
## master

### Features

- `[jest-diff]` Replace `diff` with `diff-sequences` package ([#6961](https://github.com/facebook/jest/pull/6961))
Copy link
Member

@SimenB SimenB Sep 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add **Breaking** as well if it is? 🙂

@SimenB
Copy link
Member

SimenB commented Sep 11, 2018

Also, is this a pure performance change or is the visual output for users different? If the latter, mind adding some screenshots?

EDIT: nvm, I just red your OP more thoroughly, and you've included some example, which both makes perfect sense 🙂

@pedrottimark
Copy link
Contributor Author

pedrottimark commented Sep 11, 2018

Goal for this pull request is improve performance. A separate pull request #6917 explains in ExpectAPI.md how to rewrite assertions that now will run faster, but still output large irrelevant diffs.

The formatting remains the same, so most dependents which use jest-diff to report test failures, won’t notice the difference, pardon the pun.

@pedrottimark
Copy link
Contributor Author

To answer your question about React, here is what I did:

  1. clone repository
  2. yarn
  3. yarn link jest-diff
  4. yarn test passed
  5. yarn test-prod passed

@wldcordeiro
Copy link

Thanks for the heads up! I'll take this into account when I bump the dependencies.

Copy link
Collaborator

@thymikee thymikee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just had a quick glance, so I left some nits. Can you share any perf numbers for smaller and larger diffs? Does it affect overall perf of our suite for example?

aStart: number,
aEnd: number,
aLinesUn: Array<string>,
aLinesIn: Array<string>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change the variables names?
aStart -> start
aLinesUn -> linesUnchanged
etc

This comment is valid for all names like these (except when it makes sense where you have a,b related args)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will review naming and think how to improve comments here to make it less surprising.

The reason for a prefix in formatDelete and b prefix in formatInsert is to remind me assumption or convention that took me a while to understand:

  • differences in expected value are always formatted as deleted
  • differences in received value are always formatted as inserted

const fgDelete = chalk.green;
const fgInsert = chalk.red;
const fgUnchanged = chalk.dim; // common lines (even indentation same)
const fgInchanged = chalk.cyan; // common lines (only indentation different)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fgIndent maybe? Inchanged sounds weird

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace with
fgUnchanged fgCommon
fgInchanged fgIndent
bgUnchanged bgCommon
bgDefault bgInverse

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like those suggestions!

@SimenB
Copy link
Member

SimenB commented Sep 20, 2018

Can you share any perf numbers for smaller and larger diffs?

Might be a good idea to create a performance suite for this, similar to https://github.com/facebook/jest/tree/master/packages/pretty-format/perf

(although I would prefer it to be written in https://github.com/bestiejs/benchmark.js/ or something and not something home cooked)

@pedrottimark
Copy link
Contributor Author

@thymikee A performance suite in diff-sequences package is super suggestion.

The problem solved by replacing diff package is when Jest reports test failure and serialization of received value has huge number of inserted lines:

@SimenB Thank you for suggesting benchmark.js package for a performance suite.

I will convert perf test for pretty-format as warm up exercise. Your comment about “home cooked” is on target. I had to adapt it (offline) quite a bit to get useful feedback that correctness improvements did not cause performance regressions.

@SimenB
Copy link
Member

SimenB commented Sep 20, 2018

Got a PR using benchtable in another project I'm maintaining, you can look into that as an alternative to plain benchmark.js (it has a slightly easier API, and a prettier reporter)

@SimenB
Copy link
Member

SimenB commented Sep 24, 2018

Btw, #4619 is back on the table after this is merged, right?

CHANGELOG.md Outdated
@@ -1,5 +1,9 @@
## master

### Features

- Breaking Change: `[jest-diff]` Replace `diff` with `diff-sequences` package ([#6961](https://github.com/facebook/jest/pull/6961))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use the format:

- `[jest-diff]` [**BREAKING**] blablabla

@hisapy
Copy link

hisapy commented Oct 12, 2018

I'm having this problem #1772 (comment) ... do you think I can solve it after this PR is merged? .. or do you think I can install jest from the fork so I can experiment with it ?

@thymikee
Copy link
Collaborator

Feel free to play with it :)

@SimenB
Copy link
Member

SimenB commented Oct 12, 2018

@pedrottimark any news on that benchmark? :) I think we're good to land as soon as those are in place

Also, semi related: prettier/prettier#4816

@hisapy
Copy link

hisapy commented Oct 12, 2018

I played with this PR in my project and apparently it feels faster than 23.6.0, having snapshots > 1MB

@stephanschubert
Copy link

@hisapy Why not time jest ... it?

@hisapy
Copy link

hisapy commented Oct 14, 2018

@hisapy Why not time jest ... it?

That's a good idea. Too bad I didn't think of it before trying this fork. I was in a real hurry and I just wanted to link this to my project and get it working.

Just for the record, I first tried to yarn link from the root dir of jest but I got some errors. The correct procedure is to yarn link from package in the packages dir. And in this sense, at first I just linked jest and the first I noticed was that my test, instead of hanging for long minutes, failed faster, with the same memory heap error .... Luckily I had my debugger ON and I quickly noticed that my test was using the old jest-diff so I also linked the jest-diff package from this branch ... although I had the same error, I believe that it got to it faster.

Finally, after some checks I would've done before chasing the error in jest, I realized that my snapshot tests were failing because I had some snapshots with really huge sizes (35MB, 13MB being the largest) so I came with strategy of splitting the snapshots in multiple files and now the largest are at most ~1.4MB, which run just fine.

I hope I can fix some other stuff quickly so I can time this with fully green snapshots in my test suite.

@SimenB SimenB added this to the Jest 24 milestone Oct 16, 2018
@canemacchina
Copy link

@SimenB could this fix be integrated in jest before Jest 24 milestone?
the issue #1772 is quite blocking for me, and if there are no reason to wait for next milestone, I can't understand why this fix can't be released before...

@pedrottimark
Copy link
Contributor Author

I will work on performance benchmark over the week end.

@SimenB
Copy link
Member

SimenB commented Oct 18, 2018

@canemacchina we've already landed breaking changes in master (and this is a breaking change in and of itself), so this won't be released before Jest 24. Sorry!

@canemacchina
Copy link

Oh... Well, I'll wait!

Thanks anyway!

@lochness42
Copy link

This would be really appreciated addition as I'm having hanging tests with storyshots which is using snapshots under the hood.

@SimenB
Copy link
Member

SimenB commented Nov 13, 2018

@rubennorte could you run this diff on fb and see if either functionality or performance regresses?

@SimenB
Copy link
Member

SimenB commented Dec 26, 2018

@rickhanlonii maybe you can?

@thymikee
Copy link
Collaborator

@pedrottimark mind resolving conflicts?

@thymikee
Copy link
Collaborator

@rubennorte @rickhanlonii we'd appreciate testing this at FB scale. For now I feel like we're ready to merge it

@thymikee thymikee merged commit 60ca308 into jestjs:master Jan 11, 2019
@pedrottimark pedrottimark deleted the replace-diff branch January 11, 2019 22:20
@pedrottimark
Copy link
Contributor Author

Think of small changes to cause toEqual and toMatchSnapshot tests to fail and display diff.

captain-yossarian pushed a commit to captain-yossarian/jest that referenced this pull request Jul 18, 2019
* jest-diff: Replace diff with diff-sequences package

* Update CHANGELOG.md

* Add Breaking Change in CHANGELOG.md

* Rename 4 variables for chalk colors
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Please note this issue tracker is not a help forum. We recommend using StackOverflow or our discord channel for questions.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 12, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Jest stalls after comparing to complex objects
10 participants