Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The future of analysis.py functions #146

Closed
freemansw1 opened this issue Jun 27, 2022 · 3 comments · Fixed by #207
Closed

The future of analysis.py functions #146

freemansw1 opened this issue Jun 27, 2022 · 3 comments · Fixed by #207
Labels
enhancement Addition of new features, or improved functionality of existing features In Review A PR is in review for this issue v2.x compatibility Compatibility issues with v1.x and v2.x.
Milestone

Comments

@freemansw1
Copy link
Member

With the discussion of docstrings for these functions in #138, I think it's time that we have a discussion about the future of the analysis functions in analysis.py more generally. I also want us to discuss plotting.py and centerofgravity.py, but those conversations can be in separate issues. While this issue will be focused on v1.x, I think what we take with us to v2.x and beyond is relevant here.

I would really welcome thoughts and comments from @mheikenfeld and @pjmarinescu on the history of these functions and any thoughts that you have on their longer-term future.

Here is a list of all functions in analysis.py with a short description of what they do, and my general thoughts on them:

  • cell_statistics_all

    def cell_statistics_all(

    • Runs cell_statistics across all cells passed in.
    • Exists in v2.0-dev
  • cell_statistics

    def cell_statistics(

    • Builds netcdf files of statistics of each cell. Not sure exactly what is output; need to run to double check
    • I think we should immediately depreciate this function.
    • Exists in v2.0-dev
  • cog_cell

    def cog_cell(

    • Calculates the (x,y,z) center of gravity for each cell
    • I think we should immediately depreciate this function, as it is very specific to clouds and very specific to liquid/frozen precip.
    • This functionality is also replicated in centerofgravity.py.
    • Removed from analysis.py in v2.0-dev.
  • lifetime_histogram

    def lifetime_histogram(

    • Calculates a histogram of cell lifetime
    • I am indifferent about this function, but it is something that is really easy for users to do. Maybe it is better to keep it around and have it work with xarray and pandas as we transition to being xaray?
    • Exists in v2.0-dev
  • haversine

    def haversine(lat1, lon1, lat2, lon2):

  • calculate_distance

    def calculate_distance(feature_1, feature_2, method_distance=None):

    • Calculates distance (in meters) between features, either using lat/lon or x/y
    • I think this is generally reasonable to keep around, but it's very specific with the projection coordinates.
    • Exists in v2.0-dev
  • calculate_velocity_individual

    def calculate_velocity_individual(feature_old, feature_new, method_distance=None):

  • calculate_velocity

    def calculate_velocity(track, method_distance=None):

    • Calculates all cell velocities
    • Exists in v2.0-dev
  • velocity_histogram

    def velocity_histogram(

    • Calculates a histogram of cell velocities
    • I feel similarly to this as I do to lifetime_histogram
    • Exists in v2.0-dev
  • calculate_nearestneighbordistance

    def calculate_nearestneighbordistance(features, method_distance=None):

    • Calculates the distance between each feature and its nearest neighbor.
    • I think this is fine in there, but why does it not also report what the nearest neighbor is?
    • Exists in v2.0-dev
  • nearestneighbordistance_histogram

    def nearestneighbordistance_histogram(

    • Same as the other histogram functions, and I feel similarly
    • Exists in v2.0-dev
  • calculate_areas_2Dlatlon

    def calculate_areas_2Dlatlon(_2Dlat_coord, _2Dlon_coord):

  • Calculates cell area using lat/lon

  • I think the implementation in v2.0-dev is better and we should probably migrate it to v1.x for 1.6 or earlier

  • calculate_areas

    def calculate_area(features, mask, method_area=None):

    • See discussion on calculate_areas_2Dlatlon
    • I am not opposed to keeping this, but it will be challenging to move this to a pure xarray implementation.
    • Exists in v2.0-dev
  • area_histogram

    def area_histogram(

    • See discussion on earlier histogram functions
    • Exists in v2.0-dev
  • histogram_cellwise

    def histogram_cellwise(

    • Produces a histogram of whatever cell value (using max/min/mean) over each cell
    • I am happy for this to remain in, but it will again be more challenging to move this to a pure xarray implementation.
    • Exists in v2.0-dev
  • histogram_featurewise

    def histogram_featurewise(Track, variable=None, bin_edges=None, density=False):

    • Same as histogram_cellwise, but along features rather than along cells.
    • Same comments as histogram_cellwise
    • Exists in v2.0-dev
  • calculate_overlap

    def calculate_overlap(

  • Calculates whether or not two cells overlap

  • I think this function needs a lot more explanation, but it could potentially be useful.

  • Exists in v2.0-dev

This was longer than I was expecting, so I've put a summary here:

Recommend Depreciation

  • cell_statistics_all
  • cell_statistics
  • cog_cell

Everything else could generally go either way, but we need more documentation on these functions in general. Perhaps a page or two in the documentation specifically on how to use these functions?

@freemansw1 freemansw1 added enhancement Addition of new features, or improved functionality of existing features v2.x compatibility Compatibility issues with v1.x and v2.x. labels Jun 27, 2022
@mheikenfeld
Copy link
Member

mheikenfeld commented Jun 28, 2022

Thanks for the nice overview of your understanding of the functions... I think most of your assessment is pretty similar to how I see it now, too.. I think quite a lot of it stemmed from the early days when we were taking existing code that I had written to wrap it up into the package and thus relatively specific to the way we were using the tracking.

Deprecating the more detailed things in here makes perfect sense. Some of it like the histograms could maybe picked up in some kind of tutorial notebook. Other things might be better placed under utils.

I guess for most of these functions I might be the only one who has really used them a bit more (would be good to hear if anyone else has, but I doubt it due to the lack of documentation...) and for me it would be completely fine to break backwards compatibility here at some point..

@mheikenfeld
Copy link
Member

mheikenfeld commented Jun 28, 2022

I am sitting down now to add to the missing docstring where it seems to make sense at this point (See discussion in #138), then we can still see what we do with those functions in the future.

@snilsn snilsn mentioned this issue Jul 6, 2022
11 tasks
freemansw1 added a commit to freemansw1/tobac that referenced this issue Nov 21, 2022
@freemansw1 freemansw1 linked a pull request Nov 21, 2022 that will close this issue
11 tasks
@freemansw1 freemansw1 added this to the Version 1.5 milestone Nov 29, 2022
@freemansw1 freemansw1 added the In Review A PR is in review for this issue label Nov 29, 2022
@freemansw1
Copy link
Member Author

Resolved with #207

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Addition of new features, or improved functionality of existing features In Review A PR is in review for this issue v2.x compatibility Compatibility issues with v1.x and v2.x.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants