Issues using xarrays in update step #3
Comments
I guess doing this every time step would indeed slow down the model. It also depends on whether the DataArray consists of dask arrays or of numpy arrays already in memory. I would not expect much overhead if the data is already in memory (more than a plain array slice, of course, but I did not write the BMI example to be extremely efficient). Do note that if the data is loaded lazily (dask arrays), this line will cause the file to be accessed. If a dask worker/thread dies, it isn't as easy to debug the issue since you don't get a clear stacktrace.
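A minimal sketch of the difference (the file and variable names here are placeholders): with dask-backed data every slice goes back to the file, while loading once keeps later indexing purely in memory.

```python
import xarray as xr

# Lazy: dask-backed data stays on disk; every indexing call in update()
# can trigger file access.
ds_lazy = xr.open_dataset("forcing.nc", chunks={"time": 100})
pr_step = ds_lazy["pr"].isel(time=0).values   # reads from the file here

# Eager: pull the variable into memory once (e.g. in initialize());
# subsequent indexing is plain numpy and never touches the file.
pr = xr.open_dataset("forcing.nc")["pr"].load()
pr_step = pr.isel(time=0).values              # in-memory slice
```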
Try changing this line to have this structure instead; it'll avoid the resource being locked.
Edit: this is actually already implemented as described.
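The suggested snippet itself is not shown above; as a rough sketch of that kind of restructuring (class and attribute names here are hypothetical, not the template's actual code), the idea would be to read the forcing into plain numpy arrays once during initialize, so that update never opens or holds the file:

```python
import xarray as xr

class BucketModelSketch:
    """Hypothetical BMI-style model; only the forcing handling is sketched."""

    def initialize(self, forcing_file: str) -> None:
        # Open, pull into memory, and close the file so nothing keeps it locked.
        with xr.open_dataset(forcing_file) as ds:
            self.precipitation = ds["pr"].values  # plain numpy array
        self.storage = 0.0
        self.timestep = 0

    def update(self) -> None:
        # No xarray or file access in the hot loop, just numpy indexing.
        self.storage += self.precipitation[self.timestep]
        self.timestep += 1
```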
Even then, I'm still having stability issues (on huge runs). I'll give it a try.
For context, I'm calling ...
Does the same issue still occur if you give each model its own forcing file? That could be a way to try to debug this. Or first do the initialize step sequentially, giving each model sufficient time to load the files before initializing the next model. Then the update step does not have any file IO. If that works OK, I would think it's probably related to file access being locked somehow. Perhaps you can work around that with locks: https://distributed.dask.org/en/latest/api.html#distributed.Lock
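A small sketch of the lock-based workaround (the lock name and helper function here are made up for illustration): every worker acquires the same named distributed.Lock before touching the shared forcing file, so only one process reads it at a time.

```python
import xarray as xr
from distributed import Client, Lock

client = Client()  # starts/connects to a dask.distributed scheduler

def read_forcing_step(path: str, step: int):
    # All workers use the same lock name, so access to the file is serialized.
    with Lock("forcing-file-lock"):
        with xr.open_dataset(path) as ds:
            return ds["pr"].isel(time=step).values

futures = [client.submit(read_forcing_step, "forcing.nc", i) for i in range(4)]
values = client.gather(futures)
```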
Another thing you could try is the ...
There is also the ... It seems to always lock on the NETCDF4 backend if you have the default.
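If the default lock is the culprit, one experiment (whether the lock keyword helps in this setup is an assumption; it is forwarded to the netcdf4 backend) is to open the file with the lock disabled, or to sidestep locking entirely by loading the data up front:

```python
import xarray as xr

# Disable the default HDF5 lock; only sensible if nothing writes to the file
# and the reads are not done concurrently from multiple threads.
ds = xr.open_dataset("forcing.nc", engine="netcdf4", lock=False)

# Or avoid the question entirely: load everything into memory and close the file.
ds_in_memory = xr.load_dataset("forcing.nc")
```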
Will give this a try.
I think it's this: when testing with multiple forcing files it works faster and (fingers crossed) doesn't seem to crash.
I think that in general it would be better to have separate forcing objects for separate models. We can continue the discussion here: Daafip/eWaterCycle-DA#35
When using this template as a starting point to implement my own model in eWaterCycle I followed this convention:
leakybucket-bmi/src/leakybucket/leakybucket_bmi.py, line 49 (commit df3a8ac)
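The referenced line is not reproduced here; the convention in question is roughly this pattern (a sketch, not the actual line from the file), where the forcing stays an xarray object and is re-indexed on every update:

```python
def update(self) -> None:
    # Each time step indexes the xarray forcing again, which for lazily
    # loaded data can mean repeated file access.
    pr_now = self.forcing["pr"].isel(time=self.timestep).values
    self.timestep += 1
```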
For a very large dataset, I understand this: you don't want to load everything into memory.
However, for smaller (conceptual) models, my testing shows that the repeated xarray calls cause slower code and even crashes.
I'm not sure of the best way to tackle this. Maybe a short comment or something like that?