Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Access Beam_groupX as subgroups of the Sonar group #567

Closed
b-reyes opened this issue Feb 24, 2022 · 9 comments
Closed

Access Beam_groupX as subgroups of the Sonar group #567

b-reyes opened this issue Feb 24, 2022 · 9 comments
Assignees
Labels
data format Anything about data format docs
Milestone

Comments

@b-reyes
Copy link
Contributor

b-reyes commented Feb 24, 2022

In issue #519 we are proposing to move the beam group to a sonar subgroup. If we make this change, we need to consider what is the best way for a user to easily understand this change and more importantly how we should code this. This spawns several questions:

  1. Say we have two beam groups: narrowband and broadband, then how should the user expect to access these groups? If given the EchoData object "ed", the intuitive way to access these groups is to do: "ed.sonar.narrowband" or "ed.sonar.broadband". The output here would then be an Xarray Dataset that is similar to the current beam group.

    Now, say the user wants to access the metadata of the sonar system, then they would do "ed.sonar". From a coding perspective, how could we accomplish this? It seems like "sonar" would have to be both a variable and a class at the same time.

  2. If we resolve how to code item 1., how can we make it such that a user knows what sonar subgroups are available within "ed.sonar"?

  3. Once this change is made, how do we reflect this change within the documentation? Currently, we have a very nice display of the EchoData object at the bottom of our documentation that is useful for new users. However, once we make this move, this depiction will no longer be accurate. Is it possible to maintain this portion of the documentation, and if so, how?

@leewujung
Copy link
Member

In general I think we should do it in a way that is consistent with the convention ver 1.

  1. Say we have two beam groups: narrowband and broadband, then how should the user expect to access these groups? If given the EchoData object "ed", the intuitive way to access these groups is to do: "ed.sonar.narrowband" or "ed.sonar.broadband". The output here would then be an Xarray Dataset that is similar to the current beam group.

For this I think we should have the access pattern ed.sonar.beam_group1 and ed.sonar.beam_group2 that is consistent with the convention, so new users understanding the convention will be able to use this easily. This will also allow us to open and process files that in theory can be written to netcdf directly from the instruments once manufacturer made that change.

Note that we do expect power users who will want to access the details of the data, so our code should support that.

  1. If we resolve how to code item 1., how can we make it such that a user knows what sonar subgroups are available within "ed.sonar"?

As we discussed today (and what @emiliom @gavinmacaulay and I discussed on Monday), in the Sonar group (ed.sonar) we can have a data variable recording which beam_group contain what type of data. In the form similar to the below:

# say we 2 beam groups as in your example

# coordinate of length 2
beam_group_sequence = ("beam_group1", "beam_group2")

# data variable with coordinate beam_group_sequence
beam_group_content = ("narrowband raw data", "broadband raw data encoded in power/angle samples")

Note that in this example the values in the coordinate variable are just strings and not xarray datasets. Values i the data variable are also just strings.

  1. Once this change is made, how do we reflect this change within the documentation? Currently, we have a very nice display of the EchoData object at the bottom of our documentation that is useful for new users. However, once we make this move, this depiction will no longer be accurate. Is it possible to maintain this portion of the documentation, and if so, how?

I think once we agree on item 1. and the underlying repr, since we now use jupyter book for generating documentation, it'll be a much easier job than before (when @emiliom did it in html!) to incorporate the new layout.

@leewujung leewujung added the data format Anything about data format label Feb 25, 2022
@leewujung leewujung added this to the 0.6.0 milestone Feb 25, 2022
@leewujung leewujung moved this to Todo in Echopype Feb 25, 2022
@leewujung
Copy link
Member

Note that we need this for v0.6.0 since the access pattern will change from the current ed.beam or ed.beam_power to ed.sonar.beam_groupX.

@leewujung leewujung changed the title User Experience with #519 Access Beam_groupX as subgroups of the Sonar group Feb 25, 2022
@leewujung leewujung added the docs label Feb 25, 2022
@emiliom
Copy link
Collaborator

emiliom commented Mar 3, 2022

Thanks for creating this issue and thinking through these ramifications, @b-reyes ! I finally had a chance to digest it. You raise excellent questions. As you probably saw, in PR #545 I sidestepped those questions and was sticking with the flat EchoData object accesors for each group for the time being.

I think the crux of the challenge is this:

Now, say the user wants to access the metadata of the sonar system, then they would do "ed.sonar". From a coding perspective, how could we accomplish this? It seems like "sonar" would have to be both a variable and a class at the same time.

In the current EchoData class we could get away with the flattened structure because there wasn't much of a hierarchy; the only one currently present is Platform/NMEA, and even that flattened presentation started out as a misunderstanding where NMEA was incorrectly placed at the root level like all other groups.

There is some progress in xarray with support for groups (see pydata/xarray#4118, though it's not clear to me where it's headed). I wonder if there's anything we can adapt from it. Or, more generally, whether we can make things a bit less simple by a "subgroup" dict property, such as ed.sonar.subgroup['beam_group1']. Though, I guess that doesn't solve the challenge you mention of "sonar" needing to be both a variable and a class. Hmm.

@emiliom
Copy link
Collaborator

emiliom commented Mar 29, 2022

@lsetiawan were you planning to advance your DataTree exploration soon? Or, more to the point, could you create an initial implementation and push it as a PR for us to discuss? 🙏 😍

BTW I'm realizing that the scope of this issue really goes beyond the sonar vs beam subgroup scope in the title. It actually spans the handling (accessing and referencing) of all SONAR-netCDF4 groups and subgroups. Maybe we should edit the title 🤔

@lsetiawan
Copy link
Member

Sure! I can definitely do that. I'll try implement it this Friday. 😄

@emiliom
Copy link
Collaborator

emiliom commented Mar 29, 2022

Thanks!!

@leewujung
Copy link
Member

Please see #606 -- I think there can be a glitch in using DataTree directly to open v0.5.x data, since the groups won't be in the right place for those Beam_groupX subgroups (but will be correct for the Platform/NMEA subgroup).

@b-reyes
Copy link
Contributor Author

b-reyes commented Apr 12, 2022

@leewujung and @emiliom I think we might be able to close this since @lsetiawan implemented DataTree for EchoData. His work seems to address all of the concerns above.

@emiliom
Copy link
Collaborator

emiliom commented Apr 12, 2022

Agreed. Thanks. I'll go ahead and close it.

@emiliom emiliom closed this as completed Apr 12, 2022
Repository owner moved this from Todo to Done in Echopype Apr 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
data format Anything about data format docs
Projects
Status: Done
Development

No branches or pull requests

4 participants