Use FieldTimeSeries for analysis? #181
Just want to bump this and make sure it is read --- I think it's important to use |
I made the choice to stick with NetCDF because that's how the majority of users (based on the people I know, at least) process Oceananigans output, and the Oceananigans examples are already all written with JLD2 output (with, I think, only two exceptions). Once |
I know this isn't the place but, out of curiosity, can you explain the motivation behind building an in-house solution from scratch? DimensionalData has very nice capabilities that abstract away indices (much like xarray), with some nice visualizations. I definitely don't want to downplay the development of |
Suggesting that we use
We use It's not really important whether the "majority of users" interface with NetCDF (even if that were true, which I doubt). What matters more is our vision for next-generation ocean modeling software. My argument is that we can build something much more productive and powerful if we recognize that the problems of post-processing and online diagnostics are, in fact, exactly the same. The typical approach is to build different systems for those two things. If we build one system, we'll have a lot more with less code and less work. |
I think the priority should be to use That said, I don't think the output format is very crucial if we are working within Julia. NetCDF has the advantage of being useful outside of Julia, but this is moot for a Julia package (like Oceanostics). On the other hand JLD2 is more lightweight and portable (there are systems that don't even support NetCDF), so JLD2 is absolutely crucial. |
Gotcha, I wasn't aware of that distinction. Thanks for the clarification. I do like the idea of subtyping
I disagree. I think people come to the examples section of the docs as a starting point to write their own scripts. So providing them with something they're more likely to use is helpful. I don't oppose using JLD2 and NetCDF in different examples though. I just don't think we should switch all examples completely.
I agree with that too. The issue in my view is that xarray has a huge head start in solving these issues, so people are more likely to use it. |
But following this logic, we could never implement / demonstrate a new feature, because nobody would use it yet. I think we need to document the way we want things to be done. If users persist in doing things differently, we need to figure out why and make sure we work on features that are useful. |
Why is this an issue? I think in principle we can bring xarray features into |
I'm also a huge proponent of NetCDF for multiple reasons (even for fully Julian workflows) so I'm interested in making I'd also love some of the niceties of xarray, DimensionalData.jl, and Maybe we can work together on this, and resolve this issue in the process? CliMA/Oceananigans.jl#2652 may be the next step. Then I was thinking of opening a PR to modernize or clean up |
I think NetCDF has important applications, despite some downsides like worse performance and reduced portability. The most important deficiency is that calculations from the simulation cannot be reproduced when saving in NetCDF because Field cannot be reconstructed. I think this is a major risk of saving in NetCDF; even though it may be convenient to analyze with xarray, the price of ruling out precise analysis has a major scientific cost. It's a lot to pay for convenience. Performance and portability have to be solved elsewhere but we can do something about this reproducibility problem. We need to make it easier to rebuild native Oceananigans types from NetCDF data. This could be part of an effort to support As for DimensionalData / RasterStack, they seem convenient so it would be nice to leverage them. It seems like the right way to do this is to make |
I like this plan. Although CliMA/Oceananigans.jl#2652 is old enough that it might be preferable to start a new PR. And if that's true it might be preferable to start with modernizing the |
Yeah I don't think it'll be pretty, but we should be able to save enough metadata in the NetCDF file to do this.
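A minimal sketch of that idea, written in Python with a plain dict standing in for a NetCDF file's global attributes. The `GridSpec` type, the `grid_` attribute prefix, and both helper functions are hypothetical illustrations, not Oceananigans or NetCDF API; a real implementation would also need to store locations, boundary conditions, and anything else a `Field` depends on.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class GridSpec:
    """Hypothetical stand-in for the grid information a Field needs."""
    size: tuple       # number of cells in (x, y, z)
    extent: tuple     # domain length in (x, y, z)
    topology: tuple   # e.g. ("Periodic", "Periodic", "Bounded")


def grid_to_attrs(grid):
    # Flatten each tuple into a list stored under a prefixed attribute name,
    # the kind of simple value a file-level attribute can hold.
    return {f"grid_{name}": list(value) for name, value in asdict(grid).items()}


def grid_from_attrs(attrs):
    # Inverse of grid_to_attrs: strip the prefix and restore tuples.
    return GridSpec(**{key[len("grid_"):]: tuple(value)
                       for key, value in attrs.items()
                       if key.startswith("grid_")})


grid = GridSpec(size=(128, 128, 1), extent=(6.28, 6.28, 1.0),
                topology=("Periodic", "Periodic", "Flat"))
attrs = grid_to_attrs(grid)        # what would be written alongside the data
rebuilt = grid_from_attrs(attrs)   # what a reader would reconstruct
```

The point is only that the round trip is lossless for whatever is written out: if every piece of state needed to rebuild the native type is saved as an attribute, the reader can reconstruct it instead of working with bare arrays.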
I know this came up in the past but it sounded like it would be too much work and refactoring? I could be wrong. If we just want a few features then it could be easier to implement just those?
Sounds good! I don't know if we should open a new issue to discuss but yeah I do have my own list of |
I can't remember. But of course, there is always a trade-off between cost/work and benefit gained. If people are willing to absorb the huge penalty of going to xarray for analysis in order to get slicing then to me that suggests there is a pretty significant benefit. That assumes that the penalties are understood and rational decisions are made of course... |
I think it's a bit nicer to use our built-in solution, since it gels with plotting and enables working with AbstractOperations in post-processing:
Oceanostics.jl/docs/examples/two_dimensional_turbulence.jl, lines 105 to 107 in f7573d9
Also nice to be diligent about using it in docs, which encourages users to help contribute to developing its features.