Not all cache is reused when running a notebook the second time #37

Closed
bdestombe opened this issue Feb 25, 2022 · 4 comments

@bdestombe
Collaborator

When I run the Bergen model twice, not all cached data is reused, as can be seen below.

```
Intersecting oppwater with grid: 100%|██████████| 9/9 [00:02<00:00,  4.17it/s]
Loading gridded surface water data from cache.
INFO:nlmod.mdims.mlayers:get active cells (idomain) from bottom DataArray
INFO:nlmod.mdims.mgrid:get first active modellayer for each cell in idomain
INFO:nlmod.mdims.mlayers:using top and bottom from model layers dataset for modflow model
INFO:nlmod.mdims.mlayers:replace nan values for inactive layers with dummy value
INFO:nlmod.mdims.mlayers:add kh and kv from model layer dataset to modflow model
INFO:nlmod.mdims.mlayers:nan values at the northsea are filled using the bathymetry from jarkus
INFO:nlmod.cache:cache was created using different numpy array values, do not use cached data
INFO:nlmod.cache:cache was created using different dictionaries, do not use cached data
INFO:nlmod.mfpackages.mfpackages:creating modflow SIM, TDIS, GWF and IMS
INFO:nlmod.cache:caching data -> sea_model_ds.nc
INFO:nlmod.cache:cache was created using different numpy array values, do not use cached data
INFO:nlmod.cache:cache was created using different dictionaries, do not use cached data
INFO:nlmod.cache:caching data -> bathymetry_model_ds.nc
INFO:nlmod.mdims.mgrid:get first active modellayer for each cell in idomain
INFO:nlmod.cache:cache was created using different numpy array values, do not use cached data
INFO:nlmod.cache:cache was created using different dictionaries, do not use cached data
INFO:nlmod.mfpackages.mfpackages:creating modflow SIM, TDIS, GWF and IMS
INFO:nlmod.cache:caching data -> surface_water.nc
```

Could it be that the arrays contain NaN values, so that a different way of comparing two arrays should be used?
Is an optimization used to estimate certain values, so that there is always a small random error? Could we use np.isclose here?

Can't we easily write a test for this? Run a notebook twice and make sure nothing is downloaded from the internet the second time.

Best regards,
Bas

OnnoEbbens added a commit that referenced this issue Mar 21, 2022
@OnnoEbbens
Collaborator

In this case the cause was a very small difference in the botm numpy array (dtype np.float64) from the gridprops dictionary. On my computer the cache was created using the value -13.60151824536105, while the new value was -13.601518245360648. I think this comes from the preprocessing of the botm array, where NaN values are filled using interpolation.

Using np.allclose to compare the numpy arrays solves this issue.
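
As a minimal sketch of why the exact comparison fails and the tolerant one succeeds, the snippet below compares two small arrays built from the values quoted above. The arrays themselves (including the NaN entry and the 2.5) are made up for illustration and are not taken from nlmod; only the two botm values come from this thread.

```python
import numpy as np

# Illustrative arrays: the first element uses the two botm values quoted above,
# the NaN and 2.5 entries are assumptions added to show NaN handling.
cached = np.array([-13.60151824536105, np.nan, 2.5])
recomputed = np.array([-13.601518245360648, np.nan, 2.5])

# Exact comparison: fails on the tiny float difference (and on NaN != NaN).
exact_match = np.array_equal(cached, recomputed)

# Tolerant comparison with default tolerances: still fails, because NaN
# is not considered equal to NaN unless equal_nan=True is passed.
close_without_nan = np.allclose(cached, recomputed)

# Tolerant comparison that also treats NaNs at the same position as equal.
close_match = np.allclose(cached, recomputed, equal_nan=True)

print(exact_match, close_without_nan, close_match)
```

If the cached arrays can still contain NaN values at comparison time, `equal_nan=True` is worth considering; whether that applies to the arrays compared in nlmod is not stated in this thread.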

Creating tests for this is a good idea, although maybe not so easy. I will leave this issue open so we can decide later whether we want to do this.
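
One possible shape for such a test, sketched without any nlmod code: wrap an expensive step in a cache, call it twice with inputs that differ only by floating-point noise, and assert that the step was computed once. The names `make_cached`, `expensive_step` and `n_computed` are hypothetical and only stand in for the real cache functions, whose interface is not shown in this thread.

```python
import numpy as np


def make_cached(func):
    """Hypothetical in-memory cache: reuse the stored result when the
    new input array is numerically close to the one that was cached."""
    store = {}

    def wrapper(arr):
        if (
            "arr" in store
            and store["arr"].shape == arr.shape
            and np.allclose(store["arr"], arr, equal_nan=True)
        ):
            return store["result"]  # cache hit, nothing is recomputed
        wrapper.n_computed += 1  # count real computations
        store["arr"] = arr.copy()
        store["result"] = func(arr)
        return store["result"]

    wrapper.n_computed = 0
    return wrapper


def expensive_step(botm):
    # Stand-in for a slow download or preprocessing step (assumption).
    return botm * 2.0


cached_step = make_cached(expensive_step)

# Two runs whose inputs differ only by floating-point noise.
first = cached_step(np.array([-13.60151824536105, np.nan]))
second = cached_step(np.array([-13.601518245360648, np.nan]))

# The second run should have reused the cached result.
assert cached_step.n_computed == 1
assert second is first
```

A test against the real cache could follow the same pattern, counting downloads or recomputations instead of calls to a stand-in function.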

@bdestombe
Collaborator Author

@OnnoEbbens I think you fixed this, right?

And writing the tests for the cache functions still remains?

@OnnoEbbens
Collaborator

> And writing the tests for the cache functions still remains?

Correct!

@OnnoEbbens
Collaborator

Some tests were added for caching.
