Skip to content

Commit

Permalink
Add 1% tolerance to visual tests
Browse files Browse the repository at this point in the history
In the recently merged Qiskit#9961 the visual comparison tests were re-added
to run a visual diff on the visualization output against a known
reference image. These improved tests provide us missing coverage and
have already found potential bugs in proposed changes, see:

Qiskit#10208 (comment)

However, the comparison were looking for a 100% match between the
reference image and the generated output. In an ideal world we'd be able
to rely on this. But in practice an exact comparisons of the images are
quite flaky. This is because the actual visualization output is a
function of not only our python code in Qiskit, but also upstream python
dependencies (such as matplotlib, seaborn, etc) and more importantly
those libraries leverage many system libraries and packages. This
includes the C libraries for image formats (e.g. libpng), but also
things like fonts. A version bump in our CI image's package
versions can cause subtle differences in the output of visualizations.
The CI images are controlled by Microsoft/Github and isn't something
we we can't really depend on being constant forever (even if we used
docker for a base test image the upstream images we would build off of
that aren't 100% fixed either and could cause similar issues). We've had
numerous issues with this in the past with image comparison tests
(especially when you start running them cross-platform which isn't the
case here) and is why we've oscillated between having them in the test
suite and not throughout the development history of Qiskit.

This commit adds a 1% tolerance to the comparison ratio so instead of
needing  a 100% match a 99% match is good enough. While this does
technically reduce the coverage and a potentially invalid image could
slip in that 1% difference, the chance of that happening are fairly
unlikely especially weighed against the likelihood of a system change
causing a CI blockage (which happened within one day of merging Qiskit#9961).
The only other option is to disable these tests as voting in the CI job,
which would all but remove their utility because they will likely go
stale (as the testing harness before this did). We may still end up
making them non-voting anyway despite the coverage gains if we can't
reliably run the tests in CI.
  • Loading branch information
mtreinish committed Jul 12, 2023
1 parent e2674ce commit 9b4747a
Show file tree
Hide file tree
Showing 2 changed files with 94 additions and 94 deletions.
Loading

0 comments on commit 9b4747a

Please sign in to comment.