Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Would it make sense to checkin Cypress e2e snapshots? #3491

Open
elliot-nelson opened this issue Sep 19, 2022 · 17 comments
Open

Would it make sense to checkin Cypress e2e snapshots? #3491

elliot-nelson opened this issue Sep 19, 2022 · 17 comments
Labels
Area: DevOps Discussion Discussion which may lead to separate issues or PRs Status: Pending Is not to be executed as it currently is Type: Question

Comments

@elliot-nelson
Copy link
Contributor

Help us help you!

After interacting with the e2e tests a little bit, I'm wondering whether you would want to checkin these snapshots.

Normally with these kinds of tests, I'd expect:

  • The PNGs in the snapshot folder are checked into the repo
  • Running e2e locally or in CI produces failures if a snapshot has changed
  • You can run an e2e command to update snapshots if they are correct
  • Pull requests would include any new or updated snapshot images, so you can see right in the PR preview what visual changes will be introduced.

It doesn't seem like that is what's happening right now. (But, maybe I'm misunderstanding what's happening, or there's a reason it's not currently doing this?)

@elliot-nelson elliot-nelson added the Type: Other Not an enhancement or a bug label Sep 19, 2022
@github-actions github-actions bot added the Status: Triage Needs to be verified, categorized, etc label Sep 19, 2022
@dwhelan
Copy link

dwhelan commented Sep 20, 2022

Hey @elliot-nelson have you tried yarn run e2e-upd? I have not but I suspect it might allow you to ... run an e2e command to update snapshots if they are correct.

And I agree with all your points and as I am new to mermaid I'm unsure of the current state or history of e2e tests. I have been nervous to use them because I am get errors on a fresh checkout - not sure if this is normal or not. But it gives me pause.

@aloisklink
Copy link
Member

You read our minds :) We had a discussion about this a few days ago, copied below from #3459 (comment):

Do you think we can keep the screenshots in the repo and run local verification instead of relying on applitools for PRs?

Two issues I see with this:

  • The cypress/snapshots folder is about 8 MB on my PC. If we start updating those snapshots, the size of our git history will get larger and larger over time.
    We could use something like Git LFS, but the problem is:

    • you need to install Git LFS separately
    • GitHub charges for Git LFS bandwidth, so it get's expensive very quickly (I don't think it's free for open-source)
  • Local verification will throw a lot of tiny rendering differences.

I've actually got a similar project that checks in snapshots in a git repo (remark-mermaid-dataurl), and it's a always a pain since the images are always a few percent different from my local PC compared to GitHub Actions.

To avoid those issues, the current plan is two have two different types of visual regression/snapshot testing done only on images generated by GitHub Actions:

  • Applitools, which is a great SaaS, but we only have limited credits per month, so we can't use it for every PR, so it's mostly going to be used before releases and on high-risk PRs (PR ci(e2e-applitols): add applitools CI action #3483).
  • Some custom form of GitHub Actions that will do something like the following for each PR (still on the todo-list):
    • Checkout develop branch
    • Generate baseline e2e snapshots
    • Checkout PR branch
    • Run tests and somehow report the difference in e2e snapshots (the cypress-image-snapshot-review-action project looks interesting for this, but it's not finished)

And I agree with all your points and as I am new to mermaid I'm unsure of the current state or history of e2e tests. I have been nervous to use them because I am get errors on a fresh checkout - not sure if this is normal or not. But it gives me pause.

Since PR #3459, e2e tests have been fixed so that they all pass! We've even got a CI action that runs e2e tests: https://github.com/mermaid-js/mermaid/actions/workflows/e2e.yml
There's no visual regression/snapshots diffs working in CI, but we've got a rough plan to fix them.

@elliot-nelson
Copy link
Contributor Author

Thanks for the detailed background!

I'll admit I was coming from a corporate perspective and had not considered cost 😓 .

There is cheaper storage available than LFS; for example, 50GB on S3 is $1.15/mo compared to GH LFS at $5/mo. Buuuut, LFS can magically show image diffs, which I think is a killer feature in a PR.

Your other point is well taken though: I think you really can't do PR-uploaded snapshots unless you can dockerize your cypress setup and run it all inside a fixed environment, so that every developer produces an identical snapshot.

(maybe a someday experiment!)

@dwhelan
Copy link

dwhelan commented Sep 20, 2022

Would it be help to offload some e2e tests into jest DOM tests?

@dwhelan
Copy link

dwhelan commented Sep 20, 2022

That is all awesome to hear!

@sidharthv96
Copy link
Member

Would it be help to offload some e2e tests into jest DOM tests?

This is something we should do if it speeds up the tests.
There is a vitest PR in the works too, if it makes anything easier.
We're already using vitest in Live editor for UI component testing.

@aloisklink
Copy link
Member

I think you really can't do PR-uploaded snapshots unless you can dockerize your cypress setup and run it all inside a fixed environment

Even Docker might not be enough. ARM computers (e.g. one of those new Macs) often calculate shadows/shading very slightly differently from x86_64 computers (and it'll probably affect LoongArch CPUs too). I honestly wouldn't be surprised if your computer's GPU might also affect things.

Hence why I feel like just letting CI handle it for you is so much easier 😉 It's probably also why all the visual regression SaaS companies can charge so much money.

Would it be help to offload some e2e tests into jest DOM tests?

100%. Jest (or vitest) unit tests are much much faster than Cypress.

Jest (or vitest) uses JSDom, which doesn't support the CSS/layouting Mermaid needs, so most of the Mermaid rendering code needs to tested in a real browser engine, but everything else ideally should also be unit tested.

@dwhelan
Copy link

dwhelan commented Sep 25, 2022

Updating this comment because vitest addresses this in a much better way.


I have what I think could be important. I was able to get jest unit tests working with d3. From what I can tell, this was not possible before so we ended up with a lot of e2e tests as d3 required mocking.

jest challenges

The show stopper is that jest checks each ES module loaded by our tests transitively. If any loaded package.json does not have a type value of module, jest will throw an exception. These checks pass for some dependencies like memoize but fail with d3. This blocks us from both testing d3 logic and using d3 in our tests.

jest patch script

I wrote a bypassEsmChecks script that bypasses the above ESM check logic. The script patches the jest code in shouldLoadAsEsm.js line 71 from:

return cachedLookup;

to

return false;

Here is the script:

#!/bin/bash
shouldLoadAsEsm="node_modules/jest-resolve/build/shouldLoadAsEsm.js"
line_number=87 # the source at line 71 ends up being at line 87 when distributed.
sed -i "''" "${line_number}s/return cachedLookup/return false/" ${shouldLoadAsEsm}

⇥ This patch unblocks us to test mermaid with jest and d3. ⇤

Using this I was able to fully unit test markers which I believe would obsolete 5 tests in cypress/integration/other/configuration.spec.js and would reduce the e2e snapshots accordingly.

@dwhelan
Copy link

dwhelan commented Sep 27, 2022

I had a look at the new vitest and I love it!

vitest looks like a great choice and it removes all the jest/ESM module loading issues that have been an obstacle AFAIK to unit testing.

So my comment above about a jest workaround is now mute. vitest solves this in a much more robust way.

@dwhelan
Copy link

dwhelan commented Sep 27, 2022

So, with vitest we now have the ability to move some tests from Cypress to vitest.

Here are some opportunities that I see:

  • focus e2e tests and reduce size of diagrams to the minimal to test (e.g.A[Christmas] -->|Get money| B(Go shopping) is present 62 times in cypress folder)
  • move rendering e2e tests to vitest image snapshots (does not solve overall size but tests should run about 100x faster)
  • migrate tests from image snapshots to vitest snapshots (e.g. validate SVG structurally)
  • update unit tests to assert using d3 (without mocking or snapshots)

@aloisklink
Copy link
Member

Here are some opportunities that I see:

I agree with of these 👍 Some comments I the image snapshot parts:

  • move rendering e2e tests to vitest image snapshots (does not solve overall size but tests should run about 100x faster)

Cypress e2e image snapshots actually use the same library as the recommended library for vitest image snapshots!

The only issue is that the latest version of cypress-image-snapshot uses an outdated version of jest-image-snapshot.

  • migrate tests from image snapshots to vitest snapshots (e.g. validate SVG structurally)

I tried to set this up with https://github.com/mermaid-js/mermaid-cli, and didn't have much success for two reasons:

  • Mermaid needs to calculate the width/size of elements to decide where to put elements, and so if you have a different font installed on your computer (e.g. Windows vs Linux), the generated SVG will have different pixel values.
  • The SVGs are pretty unoptimized (lots of repeated styles), so it's difficult to see changes. It also means that the SVGs are a similar size to PNGs (although SVGs compress a lot better)
  • Percy.io (visual regression platform used in mermaid-cli) doesn't yet support SVGs, so we need to convert SVGs to PNGs anyway (and that's not as easy as you might think, since Mermaid SVGs use embedded HTML).

@dwhelan
Copy link

dwhelan commented Oct 3, 2022

Cypress e2e image snapshots actually use the same library as the recommended library for vitest image snapshots!

Yes. The improvement I am foreseeing is not from the library but from running the tests via vitest rather than Cypress. I would like to experiment with this as I see Cypress tests taking ~ 1 second and vitest tests taking ~ 1 millisec. Having faster tests means developers would get faster feedback.

Mermaid needs to calculate the width/size of elements to decide where to put elements, and so if you have a different font installed on your computer (e.g. Windows vs Linux), the generated SVG will have different pixel values.

Good to know! Maybe fonts are why I can't get e2e tests to all pass on my iMac. It is complexity like this that motivates a decreasing reliance on image snapshots. My suggestion here was for data based snapshots (not image snapshots) where we could extract say the generated SVG as a data snapshot. So, lack of rendering support is not an issue.

I submitted my first PR to mermaid. It removes mocks and enables both d3 and dagre-d3 modules to be actually loaded and used. This allows you to assert against the DOM. While this does not provided a visual guarantee it does allow unit tests to go deeper taking the load off e2e tests. See marker2.spec.js for tests that check that the markers are correctly populated in the DOM.

@aloisklink
Copy link
Member

My suggestion here was for data based snapshots (not image snapshots) where we could extract say the generated SVG as a data snapshot. So, lack of rendering support is not an issue.

Generating the SVG still requires layout support, which JSDom does not yet support, see https://github.com/jsdom/jsdom#unimplemented-parts-of-the-web-platform (although I could be wrong! I last checked this a few years ago)

For example, Mermaid might decide to place boxes side-by-side with narrow fonts, while they might be on top of each other for wider fonts.

I'd love to see Mermaid render SVGs in Node.JS, though! Maybe it might be possible by combining JSDom and svgdom, and if we use a fixed embedded open-source font https://github.com/svgdotjs/svgdom#fonts.

I submitted my first PR to mermaid. It removes mocks and enables both d3 and dagre-d3 modules to be actually loaded and used.

Congrats 🚀 Removing mocks is always great (issues with mocks is a big reason why we switched from the faster jest to the slower vitest). I'm not really experienced with DOM/SVG stuff, but I want to learn, so if I get a bit of free time, I'll have a look!

@sidharthv96
Copy link
Member

Off topic, but it feels weird to read

faster jest to the slower vitest

when their home page looks like this
image

Can't shake the feeling that we're doing some config wrong.
Running with --no-isolate significantly boosts the performance, but makes the tests flaky.

@dwhelan
Copy link

dwhelan commented Oct 6, 2022

I have results from using vitest non-image snapshots. So in a snapshot the document.body is serialized as a string to a vitest snapshot file. I cloned classDiagram-v2.spec.js cypress test and adjusted it to run via vitest keeping the diagrams intact and ran the tests:

cypress: 667 KB, 93 secs
vitest: 452 KB, 0.6 secs

Snapshot File Size

The 32% snapshot file size reduction is ok but modest, as you predicted @aloisklink.

Further reduction is possible by selecting a subset of the document to snapshot and/or writing a custom snapshot serializer. I did not try either but both look straight forward and might be worth checking out.

A more promising diet plan might be to refactor some cypress tests into smaller, more focused tests. For instance, a test that just includes the shapes the diagram supports. This would reduce the storage consumed by duplicate drawing elements in our current e2e test suite snapshots. I think the benefits are deeper but this would be a wider discussion especially as I don't have all the context and experience that you all have :).

Test Speed

The 150x speed improvement is really nice!
These tests are fast enough to be run frequently/continuously by devs - hopefully catching bugs sooner in the dev flow.

Unit Tests

The main thing I learned from this is that by allowing actual DOM access in our tests we can write fast, understandable, focused tests without mocking. (To be clear, I am not dissing mocking. It has its place for sure.)

I think this will make it easier to write unit tests so we should expect that devs would be more inclined to write them.

And that would be good thing!

@dwhelan
Copy link

dwhelan commented Oct 6, 2022

Can't shake the feeling that we're doing some config wrong.
What's your gut feel @sidharthv96? Say more ... I would be happy to provide a fresh set of eyes on it.

@weedySeaDragon
Copy link
Contributor

Something to consider meanwhile:
Add some information to the contributing document to explain how to run e2e tests on their local machine. (New to cypress, I needed to figure out how to generate the 'correct' snapshots to compare my code to.)

Perhaps something along the lines of:

Update the current cypress snapshots with the develop branch. When you run the e2e (end-to-end) tests against your code in your branch, cypress will compare the screenshots generated with your code against these snapshots. Running this locally will ensure that things like diagrams are generated using the fonts on your system (instead of fonts used on the CI systems, for example).

Here's how to do it:

  1. Update the cypress snapshots by running the code in the develop branch:
    1. Check out the develop branch
    2. Start the dev server: pnpm dev
    3. Run cypress with this command to update the snapshots:
pnpm cypress --env updateSnapshots=true
  1. When you're ready to run the e2e test against your code:
    1. Check out your branch
    2. Run the e2e test: pnpm e2e No need to start the dev server because this script will do that for you.
    3. If any of the tests fail, you should look at the differences between what the develop branch generated and what your code generated under the snapshots/..[test filep path].../__diff_output directory.

@jgreywolf jgreywolf added Discussion Discussion which may lead to separate issues or PRs Status: Pending Is not to be executed as it currently is Type: Question and removed Status: Triage Needs to be verified, categorized, etc Type: Other Not an enhancement or a bug labels Jan 26, 2023
@sidharthv96 sidharthv96 linked a pull request Feb 19, 2023 that will close this issue
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: DevOps Discussion Discussion which may lead to separate issues or PRs Status: Pending Is not to be executed as it currently is Type: Question
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants