
JSON reporting #131

Merged · 23 commits · Aug 19, 2022

Conversation

asmeurer

JSON reporting will be done with the pytest-json-report package. This produces a pretty usable JSON and makes it easy to extend it with custom metadata at test run time.
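
For illustration, attaching custom metadata at run time with the plugin's json_metadata fixture looks roughly like this (the test and key names here are made up, not taken from this PR):

# Rough sketch, not code from this PR: anything put into the json_metadata
# fixture ends up in that test's entry in the JSON report.
def test_example(json_metadata):
    json_metadata["namespace"] = "array_api"  # hypothetical key/value
    assert True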

The idea will be to take the JSON generated by pytest-json-report and translate it into something more usable by a reporting tool. To start with, I will focus on test_has_names, which has been reintroduced here, to list which spec names are or are not defined in the given package.

This just tests which names exist without checking anything else. It also doesn't use hypothesis, so it should work even if the library doesn't have proper hypothesis support (assuming the test suite can even run at all in that case, I'm not completely sure). That way it can be run even on libraries that fail the hypothesis sanity checks.
@@ -34,6 +34,10 @@
f for n, f in inspect.getmembers(array, predicate=inspect.isfunction)
if n != "__init__" # probably exists for Sphinx
]
array_attributes = [

How about array_properties?


And in any case, could you add this to __all__? Maybe this is just a weird byproduct of how I learnt Python, but I like using it to denote the internally-public API of a module.

honno left a comment

pytest-json-report looks nifty!


rgommers commented Jul 7, 2022

@asmeurer nice! For review/discussion it would be useful if you added an example of what the JSON output looks like (maybe in a gist?) if you have one.

asmeurer marked this pull request as ready for review August 2, 2022 22:27

asmeurer commented Aug 2, 2022

This is ready to be merged. I've extracted a lot of custom metadata from the tests into the report. For hypothesis errors, for now it only extracts the error message (string). I could look into trying to extract the actual data itself and JSON serializing it, but I think that will be harder, so for now I'm going to wait until it looks like we actually need that.

I have uploaded an example report for numpy.array_api here https://gist.github.com/asmeurer/70d9b05e91a67d6b035b9b5305e427f0. Note that the numpy.array_api report is 11 MB uncompressed. I also ran the tests with plain numpy, and the report size is 595 MB (I did not upload it). In general, many failures will balloon the file size because the full traceback is stored for each failure. However, they compress quite well (I'm not sure what the best compression to use for the web is, but a basic zip compression takes the numpy.array_api report down to 375 KB and the plain numpy report down to 3.4 MB).
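
(For reference, a basic compression pass over a report is just the following; the filenames are arbitrary.)

import gzip
import shutil

# Compress an uncompressed JSON report for storage or serving.
with open("report.json", "rb") as f_in, gzip.open("report.json.gz", "wb") as f_out:
    shutil.copyfileobj(f_in, f_out)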

If there is any other metadata you can think of that should be included, let me know.


asmeurer commented Aug 2, 2022

Something to keep in mind when doing any further development on the test suite is to make sure that any inputs to test functions are JSON-serializable. If they aren't, the to_json_serializable function in reporting.py should be updated. I've made it so that dataclasses and namedtuples are serialized as objects (dicts), and functions are serialized to a string of their name, but anything that is unrecognized will just be serialized as its repr.
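
A rough sketch of the behaviour described above (not the actual reporting.py implementation) would be:

import dataclasses
from types import FunctionType

def to_json_serializable(obj):
    # Sketch only: dataclasses/namedtuples -> dicts, functions -> their name,
    # containers recursed into, anything unrecognized -> its repr.
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        return {k: to_json_serializable(v) for k, v in dataclasses.asdict(obj).items()}
    if isinstance(obj, tuple) and hasattr(obj, "_fields"):  # namedtuple
        return {f: to_json_serializable(getattr(obj, f)) for f in obj._fields}
    if isinstance(obj, FunctionType):
        return obj.__name__
    if isinstance(obj, dict):
        return {str(k): to_json_serializable(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_json_serializable(v) for v in obj]
    if isinstance(obj, (str, int, float, bool)) or obj is None:
        return obj
    return repr(obj)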

honno self-requested a review August 3, 2022 08:58

rgommers commented Aug 3, 2022

In general, many failures will balloon the file size because the full traceback is stored for each failure.

Maybe using --tb=native is helpful? Those long tracebacks are usually unhelpful, and pytest has a terrible default there.
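
(For reference, combined with the plugin's flags that would be something like the following; the report filename is arbitrary.)

pytest --tb=native --json-report --json-report-file=report.json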

This reverts commit f356ffd.

It doesn't actually save that much space, so let's just use the default since
it has more information and it keeps the code simpler.

asmeurer commented Aug 3, 2022

OK, it turns out I was actually wrong here. The tracebacks do contribute some amount of data in general, but they're not actually the reason the plain numpy report is over half a gigabyte. That's apparently because of the warnings. There are over a million warnings emitted in the run, and they account for 98% of the JSON file size. Of these, 98% are the warning "np.bool is a deprecated alias for the builtin bool. To silence this warning, use bool by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.bool_ here." I guess it is getting emitted for every single hypothesis run.

Here's the full warning metadata that's repeated so many times:

{
    "message": "`np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.\nDeprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations",
    "category": "DeprecationWarning",
    "when": "runtest",
    "filename": "/Users/aaronmeurer/anaconda3/envs/array-apis/lib/python3.9/site-packages/hypothesis/extra/array_api.py",
    "lineno": 119
}

@honno, any idea why this warning would be omitted every single time hypothesis calls np.bool, instead of just once?

I did try switching from the default traceback type to native, but the savings are actually pretty small, so I'm going to revert it because the default type has more information, and because it keeps the code simpler.


honno commented Aug 4, 2022

@honno, any idea why this warning would be omitted every single time hypothesis calls np.bool, instead of just once?

(I assume you're testing the top-level numpy namespace)

If you meant emitted, I'd guess it's to do with the pytest reporting steps. Hypothesis/pytest might "merge" warnings together as it does with errors in the reporting stage, so pytest would only show one warning, but however we're capturing warnings here ignores that. Generally it makes sense that the warning is emitted many times, as np.bool is accessed many times due to the vast number of Hypothesis examples generated in a run of the whole test suite.


honno commented Aug 4, 2022

Note flaky CI failure is due to #125

honno left a comment

Like preferring array_attributes -> array_properties, personally I'd prefer test_has_names.py::test_has_names -> test_has_attributes.py::test_has_attributes.

conftest.py Outdated
@@ -7,8 +7,9 @@
from array_api_tests import _array_module as xp
from array_api_tests._array_module import _UndefinedStub

settings.register_profile("xp_default", deadline=800)
from reporting import pytest_metadata, add_extra_json_metadata # noqa

honno commented Aug 4, 2022

Could you add a small comment explaining what's being imported here and why? I assume it's doing some magic when imported. Oh, import reporting is importing the file you've introduced. I'd lean towards just deleting reporting.py and putting those fixtures in conftest.py, but I see value in not bloating conftest.py.

asmeurer replied

Yes, the names have to be defined here so that pytest will see them as fixtures. I decided to keep the reporting in a separate file to keep things clean, since it's already a good bit of code and could potentially be more if we decide to add more stuff to the report in the future.


asmeurer commented Aug 4, 2022

personally I'd prefer test_has_names.py::test_has_names -> test_has_attributes.py::test_has_attributes

Your suggestion is to have two separate files for this? I don't really see the value in that, especially since the file is so short and uses the same logic for names and attributes.


asmeurer commented Aug 4, 2022

My point with the warnings is that by default, warnings are only emitted once per warning:

>>> import numpy as np
>>> np.bool
<stdin>:1: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
<class 'bool'>
>>> np.bool
<class 'bool'>

But this exact same warning is emitted every single time. So somewhere something is changing the warnings filter to enable them always. I did see some warnings code in hypothesis but I wasn't sure if it was what was causing this. I guess it could also be pytest doing it somehow.
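
To illustrate (a minimal sketch, not from the test suite): with the default "default" action a given warning is only shown once per call site, whereas an "always"-style filter, which is presumably what the capture step ends up applying, records every emission.

import warnings

def f():
    warnings.warn("only shown once under the default filter", UserWarning)

f(); f()  # printed a single time: the "default" action dedupes per call site

# Recording every emission requires an "always"-style filter:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    f(); f()
print(len(caught))  # 2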


honno commented Aug 4, 2022

personally I'd prefer test_has_names.py::test_has_names -> test_has_attributes.py::test_has_attributes

Your suggestion is to have two separate files for this? I don't really see the value in that, especially since the file is so short and uses the same logic for names and attributes.

Ah I was being lazy heh, I meant rename test_has_names.py to test_has_attributes.py, and rename the test case test_has_names to test_has_attributes.


honno commented Aug 4, 2022

My point with the warnings is that by default, warnings are only emitted once per warning:

>>> import numpy as np
>>> np.bool
<stdin>:1: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
<class 'bool'>
>>> np.bool
<class 'bool'>

But this exact same warning is emitted every single time. So somewhere something is changing the warnings filter to enable them always. I did see some warnings code in hypothesis but I wasn't sure if it was what was causing this. I guess it could also be pytest doing it somehow.

Ooo, this feels like Numpy is doing some magic which pytest/Hypothesis isn't equipped for? Probably requires some hacking around to fix.


rgommers commented Aug 4, 2022

Ooo, this feels like Numpy is doing some magic which pytest/Hypothesis isn't equipped for? Probably requires some hacking around to fix.

It shouldn't do anything fancy; the one thing is filterwarnings = error in pytest.ini.


asmeurer commented Aug 4, 2022

NumPy's pytest.ini is irrelevant here. It looks like the pytest defaults are to store each warning invocation separately. Since the warning metadata doesn't even contain enough information to distinguish the different warnings, I think the simplest solution here is to post-process the JSON to deduplicate the warnings. (probably a lot of the stuff I'm doing here could be upstreamed to the pytest-json-report library)
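
A sketch of the kind of post-processing meant here (not the actual implementation; the field names follow the warning metadata shown above, the filenames and the added count field are hypothetical, and it assumes the warnings sit in a top-level "warnings" list):

import json

with open("report.json") as f:
    report = json.load(f)

# Collapse identical warnings, keeping a count of how often each occurred.
deduped = {}
for w in report.get("warnings", []):
    key = (w["message"], w["category"], w["filename"], w["lineno"])
    deduped.setdefault(key, dict(w, count=0))["count"] += 1
report["warnings"] = list(deduped.values())

with open("report-deduped.json", "w") as f:
    json.dump(report, f)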

This was causing the report file sizes to be huge, especially for plain numpy.

asmeurer commented Aug 4, 2022

OK, I deduplicated the warnings output. The plain numpy report is now 11 MB and numpy.array_api is 6.1 MB (both uncompressed). I've uploaded examples of each to the gist. I'd like to hear feedback from Athan on whether I need to try to reduce this file size any further. Again, these are uncompressed sizes, and they compress well (both are under 0.5 MB after basic zip compression).


honno commented Aug 5, 2022

Ooo, this feels like Numpy is doing some magic which pytest/Hypothesis isn't equipped for? Probably requires some hacking around to fix.

It shouldn't do anything fancy; the one thing is filterwarnings = error in pytest.ini.

Ah, my confusion here is with, say, the following:

# Using IPython
>>> from warnings import warn
>>> class foopy_factory:
...     @property
...     def bool(self):
...         warn("don't use this", DeprecationWarning)
...         return bool
... 
>>> foopy = foopy_factory()
>>> foopy.bool
DeprecationWarning: don't use this
bool
>>> foopy.bool
DeprecationWarning: don't use this  # warning is emitted again
bool
...
>>> np.bool
DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`...
bool
>>> np.bool
DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`...  # warning is emitted again
bool

the warning is emitted for each access of foopy.bool, but Aaron's example only shows it being emitted the first time.

>>> import numpy as np
>>> np.bool
<stdin>:1: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
<class 'bool'>
>>> np.bool
<class 'bool'>

I see now that the emit-once behaviour is seen using the default Python interpreter, but not using, say, IPython.

# Using python
>>> from warnings import warn
>>> class foopy_factory:
...     @property
...     def bool(self):
...         warn("don't use this", DeprecationWarning)
...         return bool
... 
>>> foopy = foopy_factory()
>>> foopy.bool
DeprecationWarning: don't use this
<class 'bool'>
>>> foopy.bool
<class 'bool'>  # no warning this time

Seems like Aaron's found a solution anywho, but it might be worth noting that environment things can affect this behaviour... or I might be missing something (does Python just catch duplicate warnings?), I'm pretty ignorant on this stuff heh.

@asmeurer

Need to have this merged so I can start testing it.

asmeurer merged commit 66ab89c into data-apis:master Aug 19, 2022

kgryte commented Aug 22, 2022

@asmeurer Looking through the sample JSON reports, they look fine to me. No need to reduce further, IMO. The raw files can be parsed and transformed for subsequent presentation on, e.g., a website. So I'm not too concerned regarding initial report file sizes, particularly as we can store them on disk in compressed format.


kgryte commented Aug 22, 2022

One thing we should consider, however, is versioning the JSON format. E.g., using JSON schema. This would allow versioning our backend server APIs to handle transformations of different versions of the JSON report.

In which case, adding a __version__ field to the JSON report, referring to the relevant schema version, could make sense.
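
For example, something along these lines (the version number and placement are purely illustrative):

{
    "__version__": "1.0.0",
    "tests": ["..."],
    "warnings": ["..."]
}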
