-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Do not consider iterators as scalars in broadcast #25356
Conversation
Consider that all types implementing start() are collections, and throw an error for SizeUnknown and IsInfinite iterators. This makes broadcast() fail by default for most iterators, since the current fallback functions assume that collections support indexing. Custom iterators could implement their own methods, but the default ones should probably be improved to collect iterators without requring indexing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I didn't notice the original discussion. I think that with a small change this may be workable. Nanosoldier will be critical.
Needs tests, though.
BroadcastStyle(::Type) = Scalar() | ||
hasshape_ndims(::Base.HasShape{N}) where {N} = N | ||
function BroadcastStyle(::Type{T}) where T | ||
if method_exists(start, Tuple{T}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not inferrable, and thus might be pretty bad for code like
function foo(x)
y = Float64.(x)
# now do something with y
end
Unfortunately #16422 wouldn't help.
At the same time, I recognize that any problematic type can be optimized by adding a specific defintion.
At a minimum we may have to change the typeof
calls in collect_styles
to Core.Typeof
, and then specialize
BroadcastStyle(::Type{Type{T}}) where T = Scalar()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could require iterators to define a trait. That would be more consistent with what we do elsewhere, and that wouldn't be a terrible burden either. I had contemplated adding an NotIterable
type to Base.iteratorsize
, but that could also be a separate function like isiterable
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On balance I think it's better to require iterators to define a trait, and make scalars the default.
One slightly-crazy thought is that the trait name could be the output of BroadcastStyle
. But on balance I'm not sure this is a good idea, because there may be reasons to have things that act like scalars that don't return Scalar()
. It's probably better to have a separate isiterable
trait.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could, but it hasn't been in the past since we don't want the coupling. Also, #25261 will break this.
@@ -53,7 +53,7 @@ end | |||
abstract type IteratorSize end | |||
struct SizeUnknown <: IteratorSize end | |||
struct HasLength <: IteratorSize end | |||
struct HasShape <: IteratorSize end | |||
struct HasShape{N} <: IteratorSize end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@@ -57,3 +57,20 @@ struct RangeStepRegular <: TypeRangeStep end # range with regular step | |||
struct RangeStepIrregular <: TypeRangeStep end # range with rounding error | |||
|
|||
TypeRangeStep(instance) = TypeRangeStep(typeof(instance)) | |||
|
|||
## iterable trait |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code isn't used in the PR currently, but it illustrates the alternative approach based on a trait rather than o method_exists(start, Tuple{T})
. It fixes the type inference issue (provided the fallback doesn't call method_exists
as it currently does).
BTW, I've noted an inconsistency in the naming of traits: we have iteratorsize
, iteratoreltype
, but IndexStyle
, TypeRangeStep
, TypeArithmetic
and TypeOrder
. Looks like the CamelCase variants are more numerous and more recent, so maybe we should adopt that convention everywhere? Added to #20402.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uppercase makes the most sense when what will be returned is a type-instance---for example, IndexStyle(a)
will return T()
where T<:IndexStyle
. With better constant-prop it's less obvious that we need to return a dedicated type-instance, although that does have some advantage in clarity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👎 Adding a new thing every iterable type needs to define is not ideal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not "ideal", but not the end of the world either IMHO given that you need to define several methods anyway. And if we really don't want to add another trait, we can add a new type to iteratorsize
(and rename it), which will have the advantage of making the choice of the type more explicit.
Anyway I'm all ears if somebody has a better solution that works (i.e. is inferrable, see @timholy's comment above).
prod_iteratorsize(::HasLength, ::HasLength) = HasShape{2}() | ||
prod_iteratorsize(::HasLength, ::HasShape{N}) where {N} = HasShape{N+1}() | ||
prod_iteratorsize(::HasShape{N}, ::HasLength) where {N} = HasShape{N+1}() | ||
prod_iteratorsize(::HasShape{M}, ::HasShape{N}) where {M,N} = HasShape{M+N}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think these will be inferrable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@code_warntype
seems to be happy with a similar example:
f(::AbstractArray{S,M}, ::AbstractArray{T,N}) where {S,T,M,N} = Array{M+N}()
@code_warntype f([1], [1 2])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, you're right.
Superseded by #26435. |
Consider that all types implementing
start
are collections, and throw an error forSizeUnknown
andIsInfinite
iterators. This makesbroadcast
fail by default for most iterators, since the current fallback functions assume that collections support indexing. Custom iterators could implement their own methods, but the default ones should probably be improved to collect iterators without requring indexing.Possible fix for #18618.
This change is not terribly appealing as-is as it does not add any new feature, it just throws an error when calling
broadcast
on iterators, where they were previously treated as scalar. Treating iterators as scalars isn't very useful, as can be seen from the fast that only one test relied on this. In the perspective of the feature freeze, throwing errors is good since it will allow supportingbroadcast
on iterators without breaking existing code. But I'm not yet sure what the fallback implementation can look like: it would probably have to process one element at a time, but if multiple iterators are passed one of them would have to be collected temporarily somewhere (since repeated indexing is not possible in general).