Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: singledispatch in centrography to natively support geopandas objects #149

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

martinfleis
Copy link
Member

@martinfleis martinfleis commented Nov 14, 2024

Very much WIP as I just wanted to see how this actually works. Will continue at some point later on this.

One thing I wanted to figure out was to ensure that we also support numpy arrays of shapely objects and return a shapely object but it seems that singledispatch typing is a bit limited in this sense and there's no way of distinguishing between np.ndarray[np.float64] and np.ndarray[shapely.Geometry]. At least none of the attempts I made worked.

xref #135

TODO:

  • complete test suite
  • update docstrings

@ljwolf
Copy link
Member

ljwolf commented Nov 15, 2024

no way of distinguishing...

Yes, I think that's the case. singledispatch focuses on the type of the input, not the input types' dtype. I wonder if there's a way to do this with the fun.register like at the end of the docs for @singledispatch?. I think if ndarray[dtype] is recognized as a type, it should be possible to key the register on that and look it up in the register, but I've never dug into that.

Copy link

codecov bot commented Nov 15, 2024

Codecov Report

Attention: Patch coverage is 92.30769% with 12 lines in your changes missing coverage. Please review.

Project coverage is 74.5%. Comparing base (4ecafc6) to head (8bf4246).
Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
pointpats/centrography.py 92.3% 12 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##            main    #149     +/-   ##
=======================================
+ Coverage   69.7%   74.5%   +4.8%     
=======================================
  Files         12      12             
  Lines       1775    1904    +129     
=======================================
+ Hits        1237    1419    +182     
+ Misses       538     485     -53     
Files with missing lines Coverage Δ
pointpats/centrography.py 83.2% <92.3%> (+37.0%) ⬆️

@martinfleis
Copy link
Member Author

They even have examples of types within collections like the one below but I failed to make it work with numpy arrays. I'd say that is something for a potential follow-up :).

@fun.register(list)
def _(arg: list[int], verbose=False):
    if verbose:
        print("Enumerate this:")
    for i, elem in enumerate(arg):
        print(i, elem)

@ljwolf
Copy link
Member

ljwolf commented Nov 15, 2024

Would be interested in your thoughts about the design pattern generally... singledispatch is one of the things that sent me off to julia 😁

@knaaptime
Copy link
Member

i know they're interested over in spreg as well

@martinfleis
Copy link
Member Author

When you need different output types for different input types, it is kind of neat. Especially when you have a primary implementation either directly as one of the functions (e.g. for np.ndarray) or as a helper fn, and all the other functions do is decomposing the other type to an array, passing it through and casting the output to a proper type. When the output type does not change based on input type (like in case of dtot here), it loses part of the benefits but it is still quite readable and likely easy to maintain.

Let's see how messy the docstrings will become.

@ljwolf
Copy link
Member

ljwolf commented Nov 15, 2024

Super!

Imho it's very clean, and helps simplify a lot of the "from_x" usage patterns, since you can easily keep output types consistent with input types (as we have examined in the past!).

I use it always now in my private consulting code.

@martinfleis
Copy link
Member Author

@ljwolf do you have any recommendation on how to deal with docstrings with conditional input and output types? I've been looking for some examples with little to no luck. Everyone talks about code but how to deal with possibly messy docstrings is not covered anywhere.

@martinfleis
Copy link
Member Author

Also, the way the API docs deal with this is super weird. See:
Screenshot 2024-12-14 at 14 57 04

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants