BLD: require numpy 2.0.0 (stable) at build time #4930
Conversation
I'm puzzled. I don't understand how it's possible that one of the tests is still failing after bumping.
Is it possible there's a NaN? I'm not sure how NaNs would impact the typical equality/hash comparisons.
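An aside for context (my example, not from the thread): NaN does poison plain equality checks, because it compares unequal to everything, including itself; numpy offers an `equal_nan` escape hatch for whole-array comparisons:

```python
import numpy as np

a = np.array([np.nan, 1.0])
print(a == a)                                # [False  True]: NaN never equals itself
print(np.array_equal(a, a))                  # False, because of the NaN
print(np.array_equal(a, a, equal_nan=True))  # True: NaNs at matching positions count as equal
```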
Good thinking! Indeed, hashing an array isn't a stable operation if it contains a NaN:

```python
In [8]: a = np.arange(10, dtype="float")

In [9]: a
Out[9]: array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [10]: a[0] = np.nan

In [11]: hash(tuple(a))
Out[11]: -5529612596851953840

In [12]: hash(tuple(a))
Out[12]: -5371981722103082231

In [13]: hash(tuple(a))
Out[13]: -7440010721638526525

In [14]: hash(tuple(a))
Out[14]: -4675353377044086749

In [15]: hash(tuple(a))
Out[15]: 1634418349654970603

In [16]: hash(tuple(a))
Out[16]: 5494725379304409846

In [17]: a = np.arange(10, dtype="float")

In [18]: hash(tuple(a))
Out[18]: -2040549277248155741

In [19]: hash(tuple(a))
Out[19]: -2040549277248155741

In [20]: hash(tuple(a))
Out[20]: -2040549277248155741

In [21]: hash(tuple(a))
Out[21]: -2040549277248155741

In [22]: hash(tuple(a))
Out[22]: -2040549277248155741

In [23]: hash(tuple(a))
Out[23]: -2040549277248155741
```

(I should point out that this little experiment was conducted with numpy 1.26)
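An editorial note on the likely mechanism (not from the thread, and assuming CPython 3.10 or newer): since bpo-43475, `hash(nan)` is derived from the NaN *object's* identity rather than a fixed constant, and `tuple(a)` builds fresh scalar objects on every call, so the tuple hash changes between calls as soon as the array contains a NaN. A minimal sketch:

```python
import numpy as np

# On Python >= 3.10, hash(nan) falls back to the object's identity,
# so two distinct NaN objects hash differently:
x, y = float("nan"), float("nan")
print(hash(x) == hash(x))  # True: same object, same hash
print(hash(x) == hash(y))  # False (in general): distinct objects

# tuple(a) creates new scalar objects on every call, so the tuple's
# hash is unstable whenever the array contains a NaN:
a = np.arange(10, dtype="float")
a[0] = np.nan
print(hash(tuple(a)) == hash(tuple(a)))  # False (in general)
```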
I really need to learn more about NaNs. I had no idea there was, officially, a 'payload': https://en.wikipedia.org/wiki/NaN
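For illustration (mine, not from the thread): in IEEE 754 binary64, a NaN is any value whose exponent bits are all ones and whose mantissa is nonzero; the mantissa bits below the quiet bit are the payload. A quick way to poke at it with numpy:

```python
import numpy as np

# Build a quiet NaN carrying the payload 42 in its low mantissa bits
# (0x7FF8... = exponent all ones, quiet bit set).
bits = np.array([0x7FF8000000000000 | 42], dtype=np.uint64)
x = bits.view(np.float64)

print(np.isnan(x[0]))                          # True
print(x.view(np.uint64)[0] & ((1 << 51) - 1))  # 42: the payload survives the round-trip
```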
oh interesting, it's not even the same tests that are failing -- still enzo, but the
Awesome, thank you very much. Feel free to push to this branch if you need to!
well I'm thoroughly confused. Manually comparing values didn't show any difference in grid values. In an environment with np2, store the answer tests for enzo:

and then immediately run a comparison in the same np2 environment,

and that test fails for me:
But this is only happening within nose -- I made a little manual answer test and couldn't reproduce the failure (script is here). Not even sure where to look now...

EDIT: on my desktop (Ubuntu), I don't have the above issue -- storing the enzo tests then immediately re-running passes as expected.
Interesting that you get

while the latest attempt on Jenkins has

So the hash isn't stable but it's not completely random either.
Just curious, how many attempts did you make? Was it one (un)lucky try?
Maybe there's an unstable ordering issue -- is it possible
wait, sorting can be unstable??? in what sense?
So I'm speculating. :) What I'm wondering is if
and then sorting by those indices could potentially result in different orderings. But I'm kind of grasping at straws here and I don't think it's the source of the error. I will note that we've struggled with divergence in the past -- it has been the source of differences (but reliable, not stochastic) as a result of order-of-operations, subtraction of tiny numbers from single precision, etc.
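To unpack "unstable" in the sorting sense (an editorial illustration, using `np.argsort` as a stand-in for whatever call is in play here): numpy's default sort kind, "quicksort" (really an introsort), is not stable, so the relative order of *equal* elements in the returned index array is unspecified; `kind="stable"` guarantees it:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 3, size=12).astype(float)  # lots of ties

i_default = np.argsort(a)                # default "quicksort": tie order unspecified
i_stable = np.argsort(a, kind="stable")  # ties keep their original order

# Both orderings sort the values identically...
assert np.array_equal(a[i_default], a[i_stable])
# ...but the index arrays themselves may differ wherever values tie.
print(np.array_equal(i_default, i_stable))
```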
Oh I missed that you were talking about
oh, and
Ah, but while I thought we used
is it? the docs say
Right --
Ooooow. To think I actually worked on this a couple of months back and apparently obliterated it from my memory...
well i don't have an answer but have found some suspicious behavior. I'm going to summarize it all here.

### how to reproduce failure

#### install yt, nose

Working from a fresh py3.10 environment:
For running nose with python 3.10, you need to edit nose a bit. The jenkins build does the following:
on a mac and with a pyenv-virtualenv called
#### running the tests

In an effort to match more closely what jenkins is doing, I hacked away at

### the failure

Here's the output of running
Initially the answer store does not exist -- it runs the tests and stores them. Then it immediately runs the tests again and fails with the exact same ACTUAL/DESIRED hashes in the failing test here.

### getting it to pass?

#### run again

If I immediately run tests again, with the answer store already existing, tests pass:
#### dataset state?

In trying to find the actual cause of the error, I came across a couple of ways to get the test to pass that are likely fixing a symptom and not the cause...

First, yt/yt/frontends/enzo/tests/test_outputs.py, lines 117 to 126 in 01db599:
If I edit that to instead pass the filename of the dataset, then tests pass initially.

Second, the actual calculation of the divergence is very sensitive to small changes in the value of grid spacing. Here's the relevant code: yt/yt/fields/vector_operations.py, lines 355 to 367 in 01db599.
If I change those
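To illustrate that sensitivity (my sketch, not the actual yt implementation): a central-difference divergence estimate divides neighbor differences by the local grid spacing, so even a one-ulp perturbation of `dx` (for instance from a round-trip through unit conversion) changes the result at the rounding level:

```python
import numpy as np

def divergence_1d(v, dx):
    # central difference: (v[i+1] - v[i-1]) / (2 * dx)
    return (v[2:] - v[:-2]) / (2.0 * dx)

rng = np.random.default_rng(42)
v = rng.random(16)

dx = 0.25
dx_perturbed = dx * (1.0 + np.finfo(np.float64).eps)  # one-ulp change

d1 = divergence_1d(v, dx)
d2 = divergence_1d(v, dx_perturbed)
print(np.max(np.abs(d1 - d2)))  # tiny but nonzero: enough to change a value-based hash
```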
no update on a fix, but looks like the failure isn't np 2 related. I think this behavior has been around a while and it is likely only being exposed now because of bumping the answers. Here's a test script:

```python
import yt
import numpy as np

yt.set_log_level(50)

def _get_vals():
    ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")
    g1 = ds.index.grids[0]
    vdiv_1 = g1['gas', 'velocity_divergence']
    return vdiv_1

if __name__ == "__main__":
    print(np.version.version)
    v1 = _get_vals()
    v2 = _get_vals()
    print(np.max(np.abs(v1 - v2)))  # 4.930380657631324e-32 1/s
    print(np.all(v1 == v2))  # False
    v3 = _get_vals()
    print(np.all(v1 == v3))  # False
    print(np.all(v2 == v3))  # True
```

The script above behaves the same for np>2 and np<2 (including when building yt with np<2 -- I went back as far as yt-4.1.4).
Ok -- kinda figured out why this is happening, but still not certain of the root cause. So the issue is related to unit registry state, within this block of yt/yt/fields/vector_operations.py (lines 355 to 367 in 01db599):
Those

Here's a standalone script that illustrates the bug:

```python
import yt

yt.set_log_level(50)

print("load ds 1\n")
ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")
print(f"Initial registry id on load 1: {id(ds.unit_registry)}")  # 139759425784224
dx = 1. * ds.index.grids[0]['index', 'dx'][0,0,0]
print(f"dx registry: {id(dx.units.registry)}")  # 139759425784224

print("\nload ds 2\n")
ds2 = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")
print(f"Initial registry id on load 2: {id(ds2.unit_registry)}")  # 140563377278064
dx0 = ds2.index.grids[0]['index', 'dx'][0,0,0]
print(f"dx0 registry: {id(dx0.units.registry)}")  # 140563377278064
dx = 1. * dx0
print(f"dx registry: {id(dx.units.registry)}")  # 139759425784224
```

for which you get:

```
load ds 1

Initial registry id on load 1: 139759425784224
dx registry: 139759425784224

load ds 2

Initial registry id on load 2: 140563377278064
dx0 registry: 140563377278064
dx registry: 139759425784224
```
That the unit registry of the final `dx` is the first dataset's registry (not `ds2`'s) is the suspicious part.

If I comment all the
Adding the following will print out the conversion factor that ends up being used:

```python
if "velocity_divergence" in str(field):
    print("_generate_fields")
    print(id(fd.units.registry))
    from unyt.array import _sanitize_units_convert

    new_units = _sanitize_units_convert(fi.units, fd.units.registry)
    (conv_factor, offset) = fd.units.get_conversion_factor(
        new_units, fd.dtype
    )
    print(f"conversion factor: {conv_factor}")
```

when you access
it hits this line twice; the first time through, the conversion factor is not exactly 1:

the second time through, it is exactly 1:
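For a sense of scale (my illustration, not output from this PR): multiplying by a conversion factor that differs from 1 by even one ulp is enough to change array values, and hence any value-based hash:

```python
import numpy as np

v = np.linspace(0.0, 1.0, 5)
conv = 1.0 + np.finfo(np.float64).eps  # a conversion factor that is "almost" 1

print(np.array_equal(v, v * conv))  # False: multiplying by ~1 still perturbs values
print(np.max(np.abs(v - v * conv)))  # ~2.2e-16
```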
and with the unit registry state issue, when you open the dataset again, the initial conversion factor is identically 1.0, so the final hash differs between the two cases.

### a quick fix?

turns out that if you change this line (yt/yt/fields/vector_operations.py, line 365 in 01db599)
to pass in the str repr of the unit:

```python
new_field = data.ds.arr(np.zeros(data[xn].shape, dtype=np.float64), str(f.units))
```

the unit-registry state issue is fixed (because it forces unyt to pull in the correct registry associated with
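To make the mechanism concrete (my sketch of unyt behavior, not code from this PR): a `unyt.Unit` object keeps a reference to the registry it was built with, so reusing a `Unit` created in one dataset's context can drag that dataset's registry along; passing the unit as a *string* forces it to be re-parsed against whatever registry the new context supplies:

```python
from unyt.unit_object import Unit
from unyt.unit_registry import UnitRegistry

reg1 = UnitRegistry()
reg2 = UnitRegistry()

u = Unit("cm", registry=reg1)
print(u.registry is reg1)  # True: the Unit remembers the registry it was built with

# Re-parsing the *string* against another registry attaches that registry instead
u2 = Unit(str(u), registry=reg2)
print(u2.registry is reg2)  # True
```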
If the str

But your fix seems safe anyway, so let's!
the hashes were just based on the array values though -- there are actual differences in the arrays. Here's a simpler example that illustrates the problem:

shows

in any case, the quick fix is probably the way to go -- i have to hop off right now though, may not get to it before next week.
ooow, I completely missed that. Yes, I think your fix is the way to go, so I'll push it now (with you as the commit author).
just pushed it now. I wasn't sure how to phrase the commit message so feel free to amend it.
force-pushed from a55609a to 73a1672
Ok, Jenkins is back to green, but just to be sure, let's double check that it's really stable: @yt-fido test this please
PR Summary
Numpy 2.0 just dropped on PyPI (🎉). This change is a pretext to make sure everything still runs smoothly. From the conversation on #4874, I expect I'll need to bump a couple of answer tests, but I expect everything else to stay green.
Closes #4874