Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

signal: Add type stubs to _waveforms.pyi. #195

Merged
merged 19 commits into from
Nov 25, 2024

Conversation

pavyamsiri
Copy link
Contributor

@pavyamsiri pavyamsiri commented Nov 24, 2024

Contributes to closing #99.

Adds type stubs to the file _waveforms.pyi and corresponding type tests to test overload coverage for the function gausspulse.

Notes

I am not used to the new optype.numpy types so tell me if I did anything suboptimally or wrong.

Also for chirp I found that for the float dtypes, float16 and float32 that the output dtype is same as the input dtype but for everything else that is float64. I wanted to make an overload to cover this behaviour but I don't know how to do it without the overloads overlapping.

Technically for some of functions you can pass in complex dtypes and it will work fine, but the functions want times and I don't think they were expecting imaginary times i.e. some functions return just the real part but the array would be complex dtype if the input is complex. For this reason I didn't allow complex arrays.

Funny Bug

Also I discovered a 19 year old bug in scipy. The intention of sawtooth and square is that the output dtype is the same as the input t's dtype if the dtype is float32 or float64 otherwise it is always float64. But the check for float32 can never pass because

if t.dtype.char in ['fFdD']:
    ytype = t.dtype.char
else:
    ytype = 'd'

is checking if a single character string is an element of a list of a single string instead of checking if the character is part of the string like t.dtype.char in 'fFdD'.

This makes the type checking easier because the return type is only a single possible type, but I find it a bit funny that this has been here for so long.

Copy link
Owner

@jorenham jorenham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not used to the new optype.numpy types so tell me if I did anything suboptimally or wrong.

That's of course no problem. I'm actually didn't expect you to notice the new optype at all haha.

You're already using the ToFloat and ToInt correctly 👌🏻.
For the ToInt{1,2,N}D array-like aliases, there's only one (subtle) difference with the np._typing._ArrayLike{}_co analogues which you might have missed, namely that they don't accept "bare" scalars like float or np.float16. That way, the e.g. ToFloat and ToFloatND types don't overload, which can help a lot when writing overloads. It does so by additionally requiring a __len__ method to be implemented in the new onp.CanArrayND protocol, instead of just __array__ like with np._typing._SupportsArray.

Also for chirp I found that for the float dtypes, float16 and float32 that the output dtype is same as the input dtype but for everything else that is float64. I wanted to make an overload to cover this behaviour but I don't know how to do it without the overloads overlapping.

I left a suggestion on a possible way to deal with that. But it's also fine by me without any overloads. Because usually functions that accept e.g. float64 arrays, also accept float16 arrays (although there are a couple of exceptions to this as you know).

Technically for some of functions you can pass in complex dtypes and it will work fine, but the functions want times and I don't think they were expecting imaginary times i.e. some functions return just the real part but the array would be complex dtype if the input is complex. For this reason I didn't allow complex arrays.

Ah yes I commented on one such case before I read this. As I also said there; I have no idea what "complex time" is used for, and have never seen it used in statistical and econometric literature. But if you decide to allow it, be sure to use a separate overload, so that real input results in real output.
Anyway, I'll leave this decision up to you.

scipy-stubs/signal/_waveforms.pyi Outdated Show resolved Hide resolved
scipy-stubs/signal/_waveforms.pyi Outdated Show resolved Hide resolved
scipy-stubs/signal/_waveforms.pyi Outdated Show resolved Hide resolved
scipy-stubs/signal/_waveforms.pyi Outdated Show resolved Hide resolved
scipy-stubs/signal/_waveforms.pyi Outdated Show resolved Hide resolved
scipy-stubs/signal/_waveforms.pyi Outdated Show resolved Hide resolved
scipy-stubs/signal/_waveforms.pyi Outdated Show resolved Hide resolved
scipy-stubs/signal/_waveforms.pyi Outdated Show resolved Hide resolved
scipy-stubs/signal/_waveforms.pyi Outdated Show resolved Hide resolved
tests/signal/test_waveforms.pyi Outdated Show resolved Hide resolved
@pavyamsiri
Copy link
Contributor Author

The gausspulse function has such a weird API. I did all the overloads I believe but this does bloat the stubs quite a bit.

@pavyamsiri
Copy link
Contributor Author

pavyamsiri commented Nov 24, 2024

It seems for some reason, the literal "cutoff" overlaps with onp.ToFloatND? Only for mypy it seems, pyright doesn't see an overlap.

@pavyamsiri
Copy link
Contributor Author

I think mypy gets confused because of the definition of ToFloatND

ToFloatND: TypeAlias = _To1D[_f_co, float] | Sequence["ToFloatND"]

The self-referential type is ignored by mypy and it substitutes "ToFloatND" with ... meaning str fits because it is a Sequence[str] which leads to overload overlaps.

@jorenham
Copy link
Owner

jorenham commented Nov 24, 2024

I think mypy gets confused because of the definition of ToFloatND

ToFloatND: TypeAlias = _To1D[_f_co, float] | Sequence["ToFloatND"]

The self-referential type is ignored by mypy and it substitutes "ToFloatND" with ... meaning str fits because it is a Sequence[str] which leads to overload overlaps.

Yea you're right:
https://mypy-play.net/?mypy=latest&python=3.12&flags=strict&gist=18fea0a692fc85f5da18d00bc578b2fa

from collections.abc import Sequence
from typing import TypeAlias

FloatVector: TypeAlias = Sequence[float]
FloatTensor: TypeAlias = FloatVector | Sequence["FloatTensor"]

rejected_float_vector: FloatVector = "ok"       # OK: rejected

accepted_float_tensor: FloatTensor = [[[3.14]]] # OK: accepted
rejected_float_vector1: FloatTensor = object()  # OK: rejected
rejected_float_vector2: FloatTensor = "fail"    # FAIL: accepted

So even though recursive types should be supported since mypy 1.7 (python/mypy#731), in this case, it results in a false negative

@jorenham
Copy link
Owner

jorenham commented Nov 24, 2024

As it turns out, this isn't limited to float sequences.

I raised it at python/mypy#18184

In the meantime, I'll try to figure out a workaround for this and publish a new optype release.

@jorenham
Copy link
Owner

jorenham commented Nov 25, 2024

@pavyamsiri I released a new optype version that includes the workaround. So if you rebase your branch, then the overlapping overload errors should disappear.

…yi`.

Some notes:

1. It seems the scipy implementation for `sawtooth` and `square` has a
   bug. The intention is that the dtype of the `t` array when it is
   `float32` or `float64` determines the output dtype but because
   the check wrongly uses a list instead of just a string this
   code path never runs. Therefore the output dtype is fixed to
   always be `float64`. This makes typing simpler however.
   This bug has been in the code for 19 years!
Wanted to add overloads depending on input dtype but because complex is
a superclass of float, the overloads overlap and so we can't do it
Add type tests as well to check overloads are complete
float16, float32 preserve their while everything else becomes float64.

Not sure how to represent this though.
Ruff complains that we can only use simple values and `float` (the type)
is not a simple value.
`t` can be either an array or a scalar or "cutoff". Add corresponding
tests
This doesn't fix `stubtest` however but it does fix `typetest`.
@pavyamsiri pavyamsiri force-pushed the improve/signal/waveforms branch from 633310c to 4d74dc8 Compare November 25, 2024 05:30
@pavyamsiri
Copy link
Contributor Author

@pavyamsiri I released a new optype version that includes the workaround. So if you rebase your branch, then the overlapping overload errors should disappear.

Thanks for the help with the mypy bug. There's an issue remaining with the overloads for chirp however. I guess the new change has caused pyright to think onp.ToFloatND to overlap with numpy._typing._ArrayLike and so we get conflicting overlaps.

@pavyamsiri
Copy link
Contributor Author

The simplest fix would be to just leave one of the overloads out, unless you know of a way to avoid this.

@jorenham
Copy link
Owner

I guess the new change has caused pyright to think onp.ToFloatND to overlap with numpy._typing._ArrayLike and so we get conflicting overlaps.

Well, I suppose that'd be correct; they actually overlap. I'm not sure why before this it wasn't detected though 🤷🏻. I had to change quite a lot of overloads because of this as well in #197. So maybe you a similar approach I used there to solve it in this case.

@pavyamsiri
Copy link
Contributor Author

I managed to avoid the overlap but I am not convinced the overloads work as intended. I tried making some type tests like passing np.zeros(4, dtype=np.float16) as t and there was no suitable overload for chirp.

@jorenham
Copy link
Owner

I managed to avoid the overlap but I am not convinced the overloads work as intended. I tried making some type tests like passing np.zeros(4, dtype=np.float16) as t and there was no suitable overload for chirp.

hmmm 🤔
Well in this case you could also brute force it by overloading each of the float{16,32,64} cases.
But I also don't mind if you just return an np.floating[Any] there, or keep it as it currently is. I'd be happy to merge either way. So let me know if I should

@jorenham jorenham self-requested a review November 25, 2024 15:20
@pavyamsiri
Copy link
Contributor Author

Yeah I think I will go with the Any approach because I fear the brute force approach will be combinatorially explosive due to the other parameters.

I added an overload to differentiate between scalar and non scalar array likes. Should be able to be merged now.

@jorenham jorenham merged commit 5559ee0 into jorenham:master Nov 25, 2024
2 checks passed
@jorenham
Copy link
Owner

Thanks Pavadol!

@jorenham jorenham modified the milestone: 1.14.1.4 Nov 25, 2024
@jorenham jorenham mentioned this pull request Nov 25, 2024
21 tasks
@jorenham jorenham added this to the 1.14.1.5 milestone Nov 25, 2024
@jorenham jorenham added the stubs: improvement Improve or refactor existing annotations label Nov 25, 2024
@pavyamsiri pavyamsiri deleted the improve/signal/waveforms branch November 26, 2024 01:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scipy.signal stubs: improvement Improve or refactor existing annotations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants