move and enable visual tests #9961
Conversation
Thank you for opening a new pull request. Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient. While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone. One or more of the following people are requested to review this:
Pull Request Test Coverage Report for Build 5522051947
💛 - Coveralls
Moved all of the failures to one spot, so there should be less friction for artifact creation and better readability. Looks like it might need an approval to run the tests now? Once the new artifact creation is tested in the pipeline we can revert, and it should be good to merge.
Thanks for looking at this. I think the Azure run actually just got lost in the webhooks aether - it happens sometimes. The "authorisation required" thing is about a couple of the tests that run on GitHub Actions, and won't actually be affected by these changes at all. Pushing a new commit to this branch should retrigger the webhook to get Azure going again.
It'll definitely be good to get these tests running again in CI. I haven't been paying attention through the process of developing this PR, but just to check: have you seen any flakiness at all in any of the tests? I don't know whether there are any stochastic elements to any of the visualisations, but hopefully not.
@staticmethod
def _image_path(image_name):
    return os.path.join(RESULTDIR, image_name)

@staticmethod
def _reference_path(image_name):
    return os.path.join(TEST_REFERENCE_PATH, image_name)
Minor: I'd roughly prefer these just to be inlined into the call sites, with the constants becoming pathlib.Path instances. It'd be the same or less typing, even at the point of call:

self._reference_path("my_image.png")   # current
TEST_REFERENCE_PATH / "my_image.png"   # alternate
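For concreteness, a minimal sketch of what this suggestion might look like, assuming the constants are defined at module level in the test file; the directory names here are illustrative, not the real layout:

from pathlib import Path

# Assumed locations; the real test module would define these relative to its own path.
RESULTDIR = Path("test") / "visual" / "results"
TEST_REFERENCE_PATH = Path("test") / "visual" / "references"

# At the call site the helper methods disappear entirely:
result = RESULTDIR / "my_image.png"
reference = TEST_REFERENCE_PATH / "my_image.png"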
Chose not to inline these for the moment, but I can see the reasoning.
ratio = self._similarity_ratio(self._image_path(fname), self._reference_path(fname), fname)
assert ratio == 1
If we're using this exact same form everywhere (and it looks like we are), perhaps we could factor it out into a def assertImageSimilarity(self, name: str, similarity: int) method? Looking at what self._similarity_ratio does, I'd expect some of its work to be about handling the assertion. We already effectively require that our test images and the references have the same file name, so it doesn't seem too disruptive to factor that out into the assertion method.

At the least, here and everywhere, please can we use unittest assertions rather than bare assert, so we can see the full error messages in the logs if there's a failure.
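A rough sketch of what that factored-out assertion might look like, assuming the existing path helpers stay in place; the body of _similarity_ratio isn't shown in this thread, so this is illustrative only:

def assertImageSimilarity(self, name, similarity):
    # Compare the generated image against its same-named reference and fail
    # with a descriptive unittest message rather than a bare assert.
    ratio = self._similarity_ratio(self._image_path(name), self._reference_path(name), name)
    self.assertGreaterEqual(ratio, similarity, f"{name} does not match its reference image")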
If you were aiming to have the assertions fire only after all the tests ran, I'd suggest using ddt to parametrise the multi-circuit tests so there's only one assertion, or using with self.subTest to make internally separate test contexts.
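As an illustration of the subTest option, a hypothetical multi-image test could look like this (the file names and loop are invented for the example):

def test_multiple_circuit_drawings(self):
    for fname in ("circuit_a.png", "circuit_b.png", "circuit_c.png"):
        # Each comparison runs in its own context, so one mismatch is
        # reported without hiding failures in the remaining images.
        with self.subTest(image=fname):
            ratio = self._similarity_ratio(self._image_path(fname), self._reference_path(fname), fname)
            self.assertEqual(ratio, 1)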
Changed to use the unittest assertEqual.

Left as an assert after the fact, after changing _similarity_ratio to _save_diff; I think it makes sense that the function saves the diff and returns how different it is.
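For readers following along, a sketch of what a _save_diff helper of this shape could look like; the PIL-based implementation and the FAILURE_DIFF_DIR constant here are assumptions, not the PR's actual code:

import os
from PIL import Image, ImageChops

def _save_diff(self, current_path, expected_path, image_name):
    # Write a visual diff alongside the other failure artifacts so CI can
    # publish it, then return a similarity ratio for the caller to assert on.
    current = Image.open(current_path).convert("RGB")
    expected = Image.open(expected_path).convert("RGB")
    diff = ImageChops.difference(current, expected)
    diff.save(os.path.join(FAILURE_DIFF_DIR, image_name))
    # Fraction of pixels that are identical; 1.0 means an exact match.
    histogram = diff.convert("L").histogram()
    return histogram[0] / sum(histogram)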
Thanks so much for sticking with this. The change from _similarity_ratio to _save_diff does help a lot with legibility, thanks. It's still maybe not ideal that the ratio is a magic number shared between two places, but it's not the end of the world.

I pushed up a couple of minor commits to tidy up a file permission and remove old documentation references to Binder (which were a bit useless anyway). Thanks for sticking through this to the end!
In the recently merged #9961 the visual comparison tests were re-added to run a visual diff of the visualization output against a known reference image. These improved tests give us missing coverage and have already found potential bugs in proposed changes, see: #10208 (comment)

However, the comparisons were looking for a 100% match between the reference image and the generated output. In an ideal world we'd be able to rely on this, but in practice exact comparisons of the images are quite flaky. This is because the actual visualization output is a function of not only our Python code in Qiskit, but also upstream Python dependencies (such as matplotlib, seaborn, etc.), and more importantly those libraries leverage many system libraries and packages. This includes the C libraries for image formats (e.g. libpng), but also things like fonts. A version bump in our CI image's package versions can cause subtle differences in the output of visualizations. The CI images are controlled by Microsoft/GitHub and aren't something we can depend on being constant forever (even if we used Docker for a base test image, the upstream images we would build on aren't 100% fixed either and could cause similar issues). We've had numerous issues with this in the past with image comparison tests (especially when running them cross-platform, which isn't the case here), and it's why we've oscillated between having them in the test suite and not throughout the development history of Qiskit.

This commit adds a 1% tolerance to the comparison ratio, so instead of needing a 100% match a 99% match is good enough. While this does technically reduce the coverage, and a potentially invalid image could slip in under that 1% difference, the chance of that happening is fairly low, especially weighed against the likelihood of a system change causing a CI blockage (which happened within one day of merging #9961). The only other option is to make these tests non-voting in the CI job, which would all but remove their utility because they would likely go stale (as the testing harness before this did). We may still end up making them non-voting anyway, despite the coverage gains, if we can't reliably run the tests in CI.
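In test terms, the tolerance change amounts to loosening the assertion threshold, roughly as in the following sketch (the constant name is invented for illustration):

# 99% of pixels matching is accepted as a pass, absorbing small rendering
# differences from font or library version bumps in the CI images.
SIMILARITY_TOLERANCE = 0.99

ratio = self._save_diff(self._image_path(fname), self._reference_path(fname), fname)
self.assertGreaterEqual(ratio, SIMILARITY_TOLERANCE, f"{fname} differs from its reference")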
Summary
Moved and enabled the visual tests
Details and comments
Need to verify artifact publishing occurs on failure/missing; may need to write some purposeful failures.
Fixes #9892