-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PEP 646] Explicitly allow *args: *Tuple[*Ts, T] #2125
Conversation
@gvanrossum How does this look to you now? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG. I do have another question though. Could I write a non-parametrized tuple? E.g.
def foo(x: int, *args: *tuple[int, int]) -> int:
...
which would be roughly equivalent to
def foo(x: int, __a: int, __b: int) -> int:
...
(But there may be reasons to prefer writing it using *args
, e.g. passing that on to something else.)
Also, consider using built-in generic types, e.g. list[...]
, tuple[...]
instead of typing.List
, typing.Tuple
.
That should technically work since any tuple is unpackable. Some caveats below. A corollary of being able to unpack tuples is that One reason we would need However, the PEP currently forbids this, saying that it will introduce syntax for such arbitrary Tensors in a future PEP:
IIRC we added the above restriction because there was some pushback on the exact syntax for arbitrary-dimensional tensors. We could either:
I'm guessing (1) is more prudent. |
Hm. I don’t see how it could mean anything but what I proposed, so I don’t see why we’d have to disallow it on the *args position. (I don’t have an opinion on tensor types with tuple[T, …].) |
Ah, you mean allow |
Exactly. |
(Preferably you would also explain why the form with … is disallowed.) |
Oh gosh, I've learned so much about edge cases in programming languages writing this PEP... Ok, I've added a paragraph explicitly stating that I've also reorganised the section on @guido @pradeep90 How does this look to you?
Good point. I'll make this change if the PEP is accepted - I hesitate to just do a find-and-replace in case there could be any subtle gotchas that need thinking through more carefully. |
pep-0646.rst
Outdated
all arguments must be a ``Tuple`` parameterised with the same types. | ||
|
||
Also note that, following `Type Variable Tuples Must Have Known Length`_, | ||
the following should *not* type-check as valid (even though it is, of course, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we specify that we're simply leaving this behavior up to type checkers? There's no technical reason why it needs to be ruled out (Pyre does handle it correctly). The only reason we decided to leave it out is because we didn't reach a consensus on a syntax for arbitrary-sized Tensors.
TypeScript too allows passing an unbounded tuple to a variadic tuple.
Handling arbitrary-sized Tensors in a principled manner without a bunch of special-case hacks requires Tuple[int, ...]
. The problem with explicitly forbidding it is that our hands are tied in future PEPs where we may well need to specify the behavior for arbitrary-sized Tensors in terms of unbounded tuples.
Given that our aim here is to mention that the PEP doesn't specify the behavior, can we just leave it as unspecified? That will give us flexibility in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW all things being equal I'd rather just allow *args: *tuple[int, ...]
(with the obvious meaning) rather than making an exception of it. But I have to admit I don't understand the issue with arbitrary-sized tensors. I'll wait until you two have reached an agreement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that our aim here is to mention that the PEP doesn't specify the behavior, can we just leave it as unspecified? That will give us flexibility in the future.
Seems fine to me. I've just removed that paragraph altogether for now - do you think that's OK, or should we explicitly say that we leave the behaviour up to type checkers? (I'd prefer leaving it out completely, so we don't risk type checkers accidentally committing to handling it in a particular way because the PEP makes it sound like they can do what they like.)
FWIW all things being equal I'd rather just allow
*args: tuple[int, ...]
(with the obvious meaning) rather than making an exception of it.
I'm assuming you mean *args: *tuple[int, ...]
, the obvious meaning being *args: int
? Is that because it seems uglier to you to explicitly disallow something that seems intuitive than to have more than one way to do something?
But I have to admit I don't understand the issue with arbitrary-sized tensors.
In case you're curious :) In the future we may want to add a nice syntax for doing the following:
def manipulates_first_and_last_dimension(
x: Tensor[Batch, some arbitrary number of dimensions that we don't care about, Channels]
): ...
x: Tensor[Batch, Height, Width, Channels]
foo(x) # OK!
y: Tensor[Batch, Height, Width, Depth, Channels]
foo(y) # Also OK!
As of this PEP, we would probably write this with something like:
_ = TypeVarTuple('_')
def foo(x: Tensor[Batch, *_, Channels]): ...
But that feels a bit cumbersome; it seems more intuitive to write something like:
def foo(x: Tensor[Batch, ..., Channels]): ...
But obviously this use of ...
is very different than what ...
means in the context of tuple[int, ...]
. I personally would like to have a discussion in the future about whether this is possible/a good, but it seemed wise to try and put off that discussion until PEP 646 is done. Hence wanting to avoid committing to anything in PEP 646 about what the meaning of ...
should be in the context of variadic stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that our aim here is to mention that the PEP doesn't specify the behavior, can we just leave it as unspecified? That will give us flexibility in the future.
Seems fine to me. I've just removed that paragraph altogether for now - do you think that's OK, or should we explicitly say that we leave the behaviour up to type checkers? (I'd prefer leaving it out completely, so we don't risk type checkers accidentally committing to handling it in a particular way because the PEP makes it sound like they can do what they like.)
Honestly I don't like playing games with undefined behavior in a standard (even in type checkers). That usually means there's something where different implementations want to be able to claim conformance to the standard while still being different enough to lock in users. I realize that probably exactly this is the motivation -- but I can still register my displeasure, can't I? :-)
FWIW all things being equal I'd rather just allow
*args: tuple[int, ...]
(with the obvious meaning) rather than making an exception of it.I'm assuming you mean
*args: *tuple[int, ...]
, the obvious meaning being*args: int
?
Yes, and yes -- I've fixed my comment above. (I was typing on a phone.)
Is that because it seems uglier to you to explicitly disallow something that seems intuitive than to have more than one way to do something?
Yeah, the Zen of Python rule is really about a preference, it's not a hard rule, and it also has lines about special cases.
The specific thing here that there are two entirely different syntactic features that end up having end cases that mean the same thing. To me that's not enough to stop one of the syntactic features from supporting that end case, assuming the end case logically follows from the other cases -- which it does, in this case.
But I have to admit I don't understand the issue with arbitrary-sized tensors.
[nice explanation snipped for brevity]
But obviously this use of
...
is very different than what...
means in the context oftuple[int, ...]
. I personally would like to have a discussion in the future about whether this is possible/a good, but it seemed wise to try and put off that discussion until PEP 646 is done. Hence wanting to avoid committing to anything in PEP 646 about what the meaning of...
should be in the context of variadic stuff.
I think that may have to be resolved by declaring that the interpretation of ...
in subscripts for generic types is up to the type. Or we may end up declaring tuple[int, ...]
to be the odd one out (tuples are rather unique here, and we were careful to only define this syntax for Tuple
and tuple
, not for "regular" generic types).
In any case none of this seems to be a reason to disallow *args: *tuple[int, ...]
(or to declare its semantics undefined) -- we won't be able to change the meaning of tuple[int, ...]
based on context.
Also, to me at least, I don't see how allowing *args: *tuple[int, ...]
would force you to accept tensors with an indefinite (as opposed to arbitrary) size -- which IIUC is another point of contention, right? If you don't want to allow that you'll just have to disallow it at the call site.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Matthew]
I'm assuming you mean *args: *tuple[int, ...], the obvious meaning being *args: int? Is that because it seems uglier to you to explicitly disallow something that seems intuitive than to have more than one way to do something?
It's cleaner to just say that any tuple can be unpacked, than to say that one kind of tuple is explicitly forbidden: *args: *Tuple[int, *Ts]
, *args: *Tuple[int, int]
, *args: *Tuple[int, ...]
.
[Guido]
The specific thing here that there are two entirely different syntactic features that end up having end cases that mean the same thing. To me that's not enough to stop one of the syntactic features from supporting that end case, assuming the end case logically follows from the other cases -- which it does, in this case.
+1. We're doing more work to suppress a logical usage.
[Matthew]
_ = TypeVarTuple('')
def foo(x: Tensor[Batch, *, Channels]): ...
But that feels a bit cumbersome; it seems more intuitive to write something like:
def foo(x: Tensor[Batch, ..., Channels]): ...
I don't find that use case compelling enough to justify adding fresh semantics for ...
.
The way you solve that in Pyre today is:
def foo(x: Tensor[Batch, *Tuple[int, ...], Channels) -> None: ...
This:
- doesn't need a throwaway, out-of-line
TypeVarTuple
declaration, like you wanted. - is explicit about what's happening there and what the type of those elements must be (Explicit is better than Implicit :) )
- is wholly consistent with the way we use
*
everywhere else. Don't need to leave holes in our specification and don't need to add new syntax (since*Tuple<blah>
is valid). - is precisely how TypeScript works (playground example):
declare function firstAndLast<T, R>(arr: readonly [T, ...number[], R]): [T, R];
The only downside is that it's slightly more verbose. Given how infrequent this is, I'm ok with that. In the worst case, if people use it often enough to find it verbose, we could always add syntax for it later. Though we'd still need a way to specify the expected type - maybe Tensor[Batch, [int, ...], Channels]
, but it's not very illuminating.
Proposal: How about we allow any tuple to be unpacked in this PEP? If we really want syntax for ease of use, we could add it later (even though I'm not in favor of adding syntax unless strongly justified).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Note that I used by GitHub admin privileges to add attributes to some of your quotes, since you alternate between quoting Matthew and me.)
Proposal: How about we allow any tuple to be unpacked in this PEP? If we really want syntax for ease of use, we could add it later (even though I'm not in favor of adding syntax unless strongly justified).
+1. But I'm guessing this is a big change that we'd need to take back to the SIG for a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Note that I used by GitHub admin privileges to add attributes to some of your quotes, since you alternate between quoting Matthew and me.)
Didn't even know that was possible :)
+1. But I'm guessing this is a big change that we'd need to take back to the SIG for a bit.
We could probably do another round but it's not a big change. The two sections affected are:
(i) Using a Type Variable Tuple Inside a Tuple: we would say that x: Tuple[str, *Tuple[int, ...]]
behaves as if we wrote x: Tuple[str, *Ts__fresh]
except that it insists on the type being int
. We would also specify that, e.g., *args: *Tuple[T, ...]
means that we expect zero or more T
s (like *args: T
).
(ii) Type Variable Tuples Must Have Known Length: here, we would no longer forbid passing Tuple[int, ...]
to Tuple[*Ts]
. We would also say that any Ts
not provided is treated as Tuple[Any, ...]
just like any T
not provided is treated as Any
. So, x: Tensor
behaves identically to x: Tensor[Any, *Tuple[Any, ..]]
. (No longer need to special-case x: Tensor
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, if Guido's happy with *args: *Tuple[int, ...]
being allowed, I'm happy too. I've made the corresponding changes.
[Pradeep]
The way you solve that in Pyre today is:
def foo(x: Tensor[Batch, *Tuple[int, ...], Channels) -> None: ...
The only downside is that it's slightly more verbose. Given how infrequent this is, I'm ok with that.
I think it could end up being used more frequently than we think. It's one of the features I've been asked about most often from DeepMind folks - we have a lot of library functions that operate on generic time series objects (with shape [Time, Batch, ...]).
For me the *Tuple[int, ...]
version feels unacceptably verbose, but I think a crux between us here is that I'm still thinking of the version of tensor typing where the labels like Batch and Channel are types rather than ints. I agree that for the version where the labels are ints, this is probably a reasonable solution.
[Pradeep]
Proposal: How about we allow any tuple to be unpacked in this PEP? If we really want syntax for ease of use, we could add it later (even though I'm not in favor of adding syntax unless strongly justified).
I strongly hesitate to do this - I'd really like to get the PEP finished, and I think it's something we would have to think carefully about the ramifications of (I bet there are subtle implications we're not thinking of off the top of our heads).
I could be persuaded, though. If I understand correctly, this would also imply going back on the 'type variable tuples must have known length' thing altogether. So the arguments for pursuing the change would be:
- It simplifies the PEP (both conceptually, and in terms of length) by getting rid of the 'Type Variable Tuples Must Have Known Length' business
- It means we can be explicit about how an unparameterised objects work (
x: Array
=x: Array[Any, ...]
) - It means we do have a mechanism for specifying an arbitrary number of axes (putting
*Tuple[int, ...]
or*Tuple[Any, ...]
in the type parameter list). - It's consistent with us allowing
*args: *Tuple[int, ...]
.
Are these all the arguments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me the *Tuple[int, ...] version feels unacceptably verbose, but I think a crux between us here is that I'm still thinking of the version of tensor typing where the labels like Batch and Channel are types rather than ints.
Even in that case, you have *Tuple[Any, ...]
.
Succinct syntax is something we could always add later if justified by actual usage. My point here is that this is semantically sound.
If I understand correctly, this would also imply going back on the 'type variable tuples must have known length' thing altogether.
Not sure what we would be going back on. The explanation in that section of the PEP doesn't justify it:
Type variables tuples may not be bound to a type with unknown length. That is:
If this is confusing - didn't we say that type variable tuples are a stand-in for an arbitrary number of types? - note the difference between the length of the type variable tuple itself, and the length of the type it is bound to. Type variable tuples themselves can be of arbitrary length - that is, they can be bound to Tuple[int], Tuple[int, int], and so on - but the types they are bound to must be of known length - that is, Tuple[int, int], but not Tuple[int, ...].
A function expecting x: Tuple[*Ts]
is agnostic to the length of the tuple. It cannot do x[0]
or x[-1]
without a type error. So, from the function's point of view, there is no difference whether it is passed Tuple[int, int]
or Tuple[int, ...]
.
Note that, as a result of this rule, omitting the type parameter list is the only way of instantiating a generic type with an arbitrary number of type parameters. ... For example, an unparameterised Array may behave like Array[Any, ...], but it cannot be instantiated using Array[Any, ...], because this would bind its type variable tuple to Tuple[Any, ...]:
There is no need for this limitation.
Proposal: How about we allow any tuple to be unpacked in this PEP? If we really want syntax for ease of use, we could add it later (even though I'm not in favor of adding syntax unless strongly justified).
I strongly hesitate to do this - I'd really like to get the PEP finished, and I think it's something we would have to think carefully about the ramifications of (I bet there are subtle implications we're not thinking of off the top of our heads).
I'm strongly in favor of doing this. This isn't some feature "off the top of our heads". We have strong evidence that this is a sound application: a fully-fleshed out reference implementation in Pyre with lots of testing, actual usage in many Tensor stubs, plus a preexisting reference implementation in TypeScript that is widely used. Any questions about edge cases you may have can be answered by trying it out in Pyre. At this point, if we want to say that there are subtle implications, we'd need to show what. Otherwise, that's a generic objection we can levy against any proposal.
Overall, I suspect there might be some miscommunication here about what needs to be changed. It looks like you think there will be a lot of change in behavior. I'm saying that all of the specified behavior remains the same. The only change is to remove the artificial restriction above.
What I understand of your objections:
...
is shorter than*Tuple[Any, ...]
: True and if this is actually used a lot, we could always add that syntax later. I don't see why that has to block this semantic change. If the real objection is that, in that future, there would be two ways to specify it, I think that is fine. As Guido said, we don't have to be too dogmatic about "one - and preferable only - obvious way to do it". That zen refers to the obvious way to do it; we shouldn't have to restrict the longer way just because there's a syntactic shortcut (e.g.,x[0]
vsx.__getitem__(0)
)- This will take more time to amend: If you're busy with other stuff, I'm happy to take on the task of doing a quick round of feedback from Eric and others and making the necessary change within a few days. I would personally be fine with just a quick review from Eric on this Github thread given that this is a pretty minimal change in reality, but ymmv. I would rather put in that time than have artificial limitations that are hard to fix later.
Are these all the objections?
…o a function with signature '*args: *Ts'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This did change quite a bit since I last saw it. Here's some feedback.
I have never understood the reason why we had to tread so carefully around tensors of indefinite length. I don't think it will affect the acceptability of the PEP if we remove the exception for that case. |
(Sorry for the slow reply - I am indeed busy with some other stuff, but nonetheless will try to respond as quickly as I have the energy for.) [Guido]
Right; the main reason for me is that I didn't want us to accidentally commit to making the most obvious syntax, But I'm conscious this explanation is very similar to the one I gave at #2125 (comment), so I'm feeling like I might have misunderstood your question? [Pradeep]
Fair point.
Just to be absolutely sure we're on the same page about this - by "remove the artificial restriction", we're talking about the removal of the "Type Variable Tuples Must Have Known Length" restriction, right? I'm confused because when you say:
Removing the "Type Variable Tuples Must Have Known Length" restriction does imply a change in behaviour doesn't it? Or is your point just that removing that restriction wouldn't change the behaviour of anything else in the PEP - its effects would be limited? Having said all that, I went back and re-read the mailing list threads (Review of PEP 646 and Variadic generics PEP draft) that resulted in the decision to disallow binding TypeVarTuples to arbitrary-length types. As far as I can tell, the reasons were:
Reason 1 can certainly be worked around by using Reason 2 is no longer relevant, because we decided to remove Reason 3 I don't put too much weight in given that I don't understand what the reason was. So actually I feel pretty good about allowing TypeVarTuples to be bound to arbitrary-length types now, and therefore also about allowing type tuple-unpacking in general. I do agree we should double-check with folks on typing-sig though. I'll do that now. |
[Matthew]
I'm sorry, I can't keep the entire thread in my head any more. :-( That said, there is nothing in PEP 646 that would seem to give a meaning to the syntax I think what's bugging Pradeep and me about the offending paragraph in PEP 646 is that it talks about forbidding |
Oh, sorry, fair enough! It's barely fitting in my head any more either.
Fwiw I do completely agree this part of the PEP is weird and if we can I'd be happy to be rid of it. |
So the SC just accepted our PEP without this change! I still think we should proceed with this change, it's not unusual to have some churn in a PEP before the implementation lands. |
Closing this PR, since it's replaced by #2162. |
Per discussion at https://mail.python.org/archives/list/typing-sig@python.org/thread/3RFXH6C3HB5XL65ZZFXKIZP4EPE4B4PK/.
@pradeep90