
YTEP-0036.rst: converting from nose to pytest #9

Merged (8 commits) on Mar 19, 2021

Conversation

@jcoughlin11

This PR covers the addition of YTEP-0036.rst. This new YTEP discusses changing yt's testing framework from nose to pytest, restructuring the tests to remove yield tests, and using stored hashes of answer arrays instead of storing the entire answer arrays themselves. It also details how to run the tests under pytest and how to write new answer tests in the new framework.

… moving from stored arrays to stored array hashes.

@ngoldbaum ngoldbaum left a comment


Overall this is really nice, thank you for working on this. This will make it much easier for people to run answer tests locally. See my comment below about reducing boilerplate: I think we want to make it as easy as possible to write new answer tests, and it's worthwhile to think about insulating test authors from having to know how hashes should be computed (and possibly getting that wrong).

One of the nice aspects of the old answer testing system, especially for the plot answer tests, is that we had rich comparisons between the old answers and the new answers. For plot answer tests we even use matplotlib's image comparison machinery to generate a diff image, which people can use to help figure out what exactly is changing; this is very useful for debugging. For the GitHub CI, it also generates a little HTML document that you can download as a test artifact that includes the before, after, and diff images.

Is that maintained here? I would guess that since we'd only be storing the image hashes we wouldn't be able to do that anymore. It might be nice to think about ways to continue storing the images - we're already putting them in the repo as-is (although with this it might make sense to offload the data storage to a secondary repo that only stores the images so we don't continue increasing the size of the yt repository).

In general it would be nice to have more discussion in this document about the plot answer tests, since those are a little special and also got a lot of love during Google Summer of Code 2018.

The before/after comparison issue (e.g. a test fails and a developer wants to quickly understand what exactly changed) is less of a concern for the field values tests because we never really did anything to make it easy for developers to see exactly what changed between the new and old answers. That said, it might be nice to think about ways to ergonomically tell developers how to investigate failing answer tests, e.g. by somehow printing out a test script as well as git hashes to run the test script on to see the before/after values and before/after hashes explicitly. I could imagine some hacky ways to do this (e.g. by inspecting the source code for the test functions) but don't really have any concrete ideas - just a thought!


.. code-block:: bash

$ pytest
Member

Does this still only run the unit tests or does it run the answer tests too?

Member

Ah I see below it doesn't. Maybe note that here explicitly?

Author

Done.

for d in dobjs:
    hashes['field_values'][f][d] = utils.generate_hash(self.field_values_test(ds, f, d))
hashes = {'test_method1': hashes}
utils.handle_hashes(self.save_dir, self.answer_file, hashes, self.answer_store)
Member

It seems like a lot of this code is going to get repeated across all the answer tests in all the frontends. Might it be possible for the AnswerTest class to handle allocating the hashes dict, hashing the data, and storing the hashes for each test?
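One way the base class could absorb that boilerplate (a sketch only; the method names are illustrative and md5 is just an example digest, not necessarily what yt uses):

```python
import hashlib

class AnswerTest:
    """Sketch: the base class owns the hashes dict, computes digests,
    and stores them, so frontend tests never touch hashlib directly."""

    def __init__(self):
        self.hashes = {}

    @staticmethod
    def generate_hash(data: bytes) -> str:
        # Example digest choice; any stable hash would do.
        return hashlib.md5(data).hexdigest()

    def record(self, test_name, key, data: bytes):
        # Allocate the nested dict entry and store the hash in one place.
        self.hashes.setdefault(test_name, {})[key] = self.generate_hash(data)
```

A frontend test would then reduce to something like `self.record('field_values', (f, d), raw_bytes)`, with saving handled once at teardown.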

@matthewturk

@jcoughlin11 I agree with what @ngoldbaum said about the image tests in particular, but I wonder if we could also extend this somewhat to all of the tests. As it stands, we are just storing the hashes, but most of the hashing is computed in a central location. Would it be possible to supply arrays to that central location, and have an option that stores the array values (even just in raw binary!) as well as checks the hashes? That might help preserve both options.

It would also open up the opportunity for someone who is making changes and finding test errors to run the tests with that option (--store-values or something) and then load them in to manually compare in detail against the new tests that are failing.

@matthewturk

(Of course this could then be extended to save actual image files, etc etc.)

@munkm

munkm commented Oct 10, 2019

I've been thinking a bit more about this (specifically related to the image diffs and how nice the GSoC work that supports them is). I agree that we should do our best to preserve that functionality, and ideally make sure the upload feature for the diff and the new data is preserved in CI as well. I'm not exactly sure how much more difficult it would be to supply arrays to the central location, but what if we had an option that, if a test fails on the hash, then generates the diff and the image files? We could minimize storage needs by only generating these diffs when a test fails, but still make it easy to quickly identify issues with new features.

One downside: if this is written such that a test requires a rerun to generate the diffs, we could end up wasting a lot of resources running and re-running tests. But maybe that won't be an issue at all.
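The fail-then-diff control flow could be sketched like this (hypothetical names; `make_diff` stands in for e.g. matplotlib's image comparison machinery):

```python
import hashlib

def check_image(stored_hash, new_png_bytes, make_diff):
    """Compare hashes first; only generate the expensive before/after
    diff artifacts when the comparison fails, so passing runs stay cheap."""
    if hashlib.md5(new_png_bytes).hexdigest() == stored_hash:
        return True
    make_diff(new_png_bytes)  # only runs when the test fails
    return False
```

Whether this needs a rerun depends on where the diff generation is hooked in; if it happens inside the same test invocation, as above, no rerun is required.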

@ngoldbaum

One thing that hasn't come up yet that might be relevant for this discussion: GitHub Actions are a thing now, and they have generous free compute resources associated with them.

…ametrizing tests, no longer using OrderedDicts, and reworking the hashing fixture.
@munkm munkm requested a review from neutrinoceros February 26, 2021 19:48

@neutrinoceros neutrinoceros left a comment


This is excellent work. There's no question that I'm on board with the whole project, but I listed a handful of questions I would like addressed, as well as a couple of minor suggestions for improvements. Keep it up!

source/YTEPs/YTEP-0036.rst (outdated thread, resolved)
The second proposal of this YTEP is a general reorganization and streamlining for yt's answer testing framework.

Currently, many answer tests in yt generate large arrays of data. In order to facilitate comparison, these arrays need to be saved. Unfortunately, their size necessitates that these answers be stored separately from the main yt code base and makes answer comparisons slow.

Member

Suggested change:
Synchronising pull-request merges across two repos instead of one also slows down development itself and creates technical debt.

Member

@jcoughlin11 this comment is still not marked "outdated". Maybe it simply doesn't fit the doc anymore, in which case you can simply resolve this.

Member

Oh wait it still applies. You can reject it if you want, no hard feelings.

Author

I added your suggestion as the third bullet point here. I can make it its own paragraph if you think that would fit better.


Currently, the answer tests are mostly implemented as yield tests using nose. Pytest no longer supports yield tests because they resulted in multiple test functions being generated (one for each parameter combination), and these generated test functions did not properly support fixtures, as described `here <https://docs.pytest.org/en/latest/deprecations.html#yield-tests>`_.

The first step in reorganizing the tests is, therefore, to remove all yield tests. Each function that previously yielded a test now properly calls a test function with the desired parameters, where assertions are made. This makes the tests easier to read, as most of the boilerplate associated with yielding is gone (e.g., changing the ``__name__`` attribute of the yielded test class).
Member

Each function that previously yielded a test now properly calls a test function with the desired parameters

I'm confused; this sounds more like an intermediate step than a desirable final state: shouldn't test functions be self-contained? I believe that test generation is roughly equivalent to @pytest.mark.parametrize; is there something I'm missing?
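For reference, the equivalence alluded to here can be sketched as follows (the field names and check function are illustrative):

```python
import pytest

def check_field(field):
    # Stand-in for the real assertion logic.
    assert isinstance(field, str)

# Old nose style, for contrast:
#
#     def test_fields():
#         for field in ("density", "temperature"):
#             yield check_field, field
#
# New style: pytest generates one properly named test per parameter
# combination, and fixtures work normally inside each one.
@pytest.mark.parametrize("field", ["density", "temperature"])
def test_field(field):
    check_field(field)
```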


The first step in reorganizing the tests is, therefore, to remove all yield tests. Each function that previously yielded a test now properly calls a test function with the desired parameters, where assertions are made. This makes the tests easier to read, as most of the boilerplate associated with yielding is gone (e.g., changing the ``__name__`` attribute of the yielded test class).

The second step is to organize the answer tests into classes (e.g., all of the enzo answer tests go in a class called ``TestEnzo``). This not only provides a natural organizational structure within the code itself, but it also provides a simple way of ensuring that all related answer tests get saved to the same answer file. The class structure also facilitates easier running of specific tests, especially in the case where answer tests and unit tests share the same module.
Member

all related answer tests get saved to the same answer file

Why is this important?

The class structure also facilitates easier running of specific tests

I don't really get this last argument. pytest already offers several ways to select a subset of tests at runtime, based on names (pytest -k) or markers (pytest -m). Why do you think that's not enough?
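For reference, the two selection mechanisms mentioned look like this (the `answer_test` marker name is hypothetical and would need to be registered in the project's pytest configuration):

```shell
# Name-based selection: run only tests whose id matches the expression
pytest -k "TestEnzo and field_values"

# Marker-based selection (marker name is hypothetical)
pytest -m answer_test
```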


The second step is to organize the answer tests into classes (e.g., all of the enzo answer tests go in a class called ``TestEnzo``). This not only provides a natural organizational structure within the code itself, but it also provides a simple way of ensuring that all related answer tests get saved to the same answer file. The class structure also facilitates easier running of specific tests, especially in the case where answer tests and unit tests share the same module.

As an extension of the above, each answer test (e.g., ``FieldValuesTest``) that was previously a class of its own is now a method (e.g., ``field_values_test``) of the base ``AnswerTest`` class, which resides in ``yt/utilities/answer_testing/framework.py`` and takes the place of the ``AnswerTestingTest`` class. The use of this base class provides a central location for any new answer tests that might be added in the future. Every class that performs answer testing (e.g. ``TestEnzo``) inherits from ``AnswerTest``, thereby giving it access to the methods of ``AnswerTest``.
Member

I must be reading this wrong. Are you saying that those tests live in framework.py?

Upon reading a third time, I think the source of my confusion is the phrasing of the first sentence, specifically:

each answer test (e.g., FieldValuesTest)

I'm sure you mean something like "abstract answer test" here (I don't know of a better word, sorry). In any case, there's a distinction between generic/abstract tests and actual test classes/functions, right?
I think it should be clarified here and in the following.
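A minimal sketch of the layout described in the quoted paragraph (the dataset name and method bodies are placeholders, not real yt code):

```python
class AnswerTest:
    """Base class holding the individual answer-test methods; per the
    YTEP it lives in yt/utilities/answer_testing/framework.py."""

    def field_values_test(self, ds, field, obj):
        # Placeholder: would compute and return the field values to hash.
        return (ds, field, obj)

class TestEnzo(AnswerTest):
    """Frontend test class; inherits every answer-test method."""

    def test_toro1d(self):
        result = self.field_values_test("toro1d_placeholder", "density", None)
        assert result is not None
```

The distinction the reviewer asks about is visible here: `field_values_test` is a generic helper on the base class, while `test_toro1d` is an actual collected test.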

source/YTEPs/YTEP-0036.rst (two outdated threads, resolved)
# This is in yt/utilities/answer_testing/framework.py
class AnswerTest:
    # The other methods are here. New method goes below.
    def my_new_answer_test(self, test_param1, test_param2):
Member

What are the test discovery rules? I'm wondering whether my_new_answer_test is actually a "valid" test name according to pytest's default rules.

Member

The part of the doc this comment applied to has been removed, but my question hasn't been answered (unless the answer is in the new version, which I'm not looking at right now; in that case, please resolve this thread @jcoughlin11).

        # Do stuff
        return result

Any new frontend must have its tests registered and assigned an answer file. This should be done by updating ``yt_repo/tests/tests.yaml`` with:
Member

Hurray, that's a small doc PR we can do !

source/YTEPs/YTEP-0036.rst (outdated thread, resolved)
@jcoughlin11

@neutrinoceros Thanks for the comments! I think I addressed all of them (let me know if that's not the case)! I also significantly reworked the document to reflect the infrastructure changes I made since this YTEP was first written, and removed a lot of the material that felt like a re-hashing of the pytest documentation.

@neutrinoceros

I tried resolving as many threads as I could, @jcoughlin11, but I don't have time to read the whole new version again right now, so I'll need to come back to this later. Thank you for simplifying the document; I look forward to proofreading it again!


@neutrinoceros neutrinoceros left a comment


It looks even nicer than the last time I read it, well played!
I would like to add a small contribution that I think fits this very well and hasn't been discussed yet:
we have a lot of docstrings embedding example snippets suited for doctest, but we currently have no CI for those. Coincidentally, it is extremely easy to make pytest run these alongside normal tests
(see https://docs.pytest.org/en/stable/doctest.html).
I expect this is a third subproject that can fit this YTEP. It will likely not be straightforward, because many tests may be a little rotten, but I'm sure it's worth doing.
Can you add a short paragraph (2 to 3 sentences should be enough?) for this?
Otherwise, after my remaining points have been addressed, I'll be happy to approve this!


source/YTEPs/YTEP-0036.rst (outdated thread, resolved)
@matthewturk

So I think this is good to go. I'm approving! Anyone want to object to merging?

@munkm

munkm commented Mar 16, 2021

I'm confused: why are all the unrelated YTEPs touched in this PR? There should only be one file changed.

@matthewturk

@jcoughlin11 if you're OK with me pushing to your branch I can fix what @munkm mentions, I think.

@matthewturk

@munkm I think @jcoughlin11 fixed it with this

@neutrinoceros

So I think this is good to go. I'm approving! Anyone want to object to merging?

✋🏻 I would like my last suggestion answered, even if not implemented.

@jcoughlin11

@neutrinoceros Are you referring to your comment about doctest integration? If so, I added a section to the ytep called Doctest Integration that briefly covers this. Was there something else I was missing?

@neutrinoceros

@jcoughlin11 Sorry, my bad, I didn't see it (I was on my phone; it's harder to spot on a small screen).

Cool! I figure we don't need to write down that the command-line option --doctest-glob can be "adopted" in a config file, since that's not specific to this particular option.
However, could you also mention (or link to) this: https://docs.pytest.org/en/stable/doctest.html?highlight=doctest#doctest-namespace-fixture ? It seems like a pretty big deal to me, since it allows skipping >>> import yt in doctest docstrings, which I remember adding in some places just to make doctest happy.
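For reference, the fixture pattern from the linked pytest docs, adapted to yt (a sketch; it assumes yt is importable in the test environment, and the fixture name is arbitrary):

```python
# conftest.py
import pytest

@pytest.fixture(autouse=True)
def add_yt(doctest_namespace):
    # Inject yt into every collected doctest's namespace, so docstrings
    # can drop the leading ">>> import yt".
    import yt
    doctest_namespace["yt"] = yt
```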

@Xarthisius Xarthisius merged commit 686afa8 into yt-project:master Mar 19, 2021