broadcast should not drop zero-dimensional results to scalars #28866
Comments
This is an intentional feature. Zero-dimensional arrays act like scalars in broadcast. |
OK seems like it's a feature. But I still wonder why |
Oh my, you're right — I missed your third example there. That indeed is a bug. |
In short: we have long depended upon broadcasting to implement a number of array functions that implicitly work elementwise. We just add additional size checks. Unfortunately we now also need to add a zero-dimensional check, too. Just around the definition of |
IMHO, we should make things consistent here, if My suggestion is, broadcast SHOULD treat 0d arrays as scalars, but SHOULD NOT implicitly cast them to scalars in the final result. |
Yes, I'm in complete agreement. That's the bug, and it needs to be fixed. Edit (a year later): I'm confused by my comment. I'm pretty sure I didn't mean to say what this says. I just wanted to fix the |
The casting of 0D arrays to scalars at the end of broadcast has been bothering me too. |
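For readers following along, here is a minimal sketch of the behavior under discussion, assuming a Julia 1.x session. The variable names are illustrative only. A dotted call whose arguments are all zero-dimensional (or scalar) returns a bare scalar, while mixing a zero-dimensional array with a vector returns a vector as usual.

```julia
a = fill(1)          # zero-dimensional Array{Int,0} holding 1
b = fill(2)

all_zero_d = a .+ b        # the 0-d container is dropped here
mixed = a .+ [10, 20]      # `a` acts like a scalar against the vector

# One manual way to keep the container today is to re-wrap the result.
rewrapped = fill(a .+ b)

println(typeof(all_zero_d))
println(mixed)
println(typeof(rewrapped))
```

Re-wrapping with `fill` is only a workaround at the call site; it does not change what broadcast itself returns.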
I just noticed this too while working to update the tests for https://github.com/mbauman/InvertedIndices.jl to 1.0. Specifically, A = fill(1)
@test A[Not(A.==1)] == [] fails with
since |
This is a simple workaround for the handful of elementwise operations that are defined on arrays _without_ the need for explicit broadcast but use broadcasting (with an extra shape check) in their implementation. These were the only affected cases I could find.
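A small sketch of what this workaround appears to change, assuming a Julia version that includes #32122: the undotted elementwise operations keep the zero-dimensional container, while the explicitly dotted forms still unwrap to a scalar.

```julia
a = fill(1)
b = fill(2)

undotted = a + b     # elementwise operation defined on arrays, no explicit broadcast
negated = -a         # unary minus on an array
dotted = a .+ b      # explicit broadcast, still returns a scalar

println(typeof(undotted))
println(typeof(negated))
println(typeof(dotted))
```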
IIUC, what you are calling "treating 0d arrays as scalars" is not an exception, it's just an application of the usual broadcasting rules, as |
Here's the crux of the problem: What should As far as broadcast is concerned, it's exactly the same as |
IMO, the following chain of equivalences should hold: 1 .+ 2 === broadcast(+, 1, 2) === broadcast(+, convert(1), convert(2)) == fill(3) I say this because the type signature I don't see any principled reason for |
I tend to agree; I have a strong distaste for behaviors like this on principle. The reasons were entirely practical. I initially tried to make This is something we could reconsider as a breaking change for 2.0, but in order for it to be feasible I think we'd also need to allow 0-dimensional arrays to participate in the linear algebra of vectors and matrices. I'm also still not convinced that it's actually all that terrible in practice, despite my own distaste for it in theory. For more details, check out #17318. |
What’s the relationship between |
It's just that folks would see zero-dimensional arrays much more frequently; making them more capable and allowing them to do things like |
I think for big software projects, it's best to have the guts of the language be meticulous about degenerate cases, and to have the convenient-but-slightly-unprincipled stuff be found only very close to the surface, so as not to muddy the ability to reason consistently about the software. I'd say in this bug we are reaching precisely the point at which the motivation for that thinking becomes apparent: because Julia handled zero-dimensional broadcasting inconsistently since the beginning, it's now much easier to make a band-aid fix to the internals that special-cases zero-dimensional arrays, than to rework the internals so that the special case is not necessary. (This is totally a hindsight observation, and I don't think anyone necessarily did anything wrong in this regard during the development trajectory of Julia, since there were trade-offs to make at every step of the way.) |
What is the intended meaning of |
Yes, history and inertia is one part of the story. The other part is that it is practically quite useful and (I think) friendlier. Arithmetic on a zero-dimensional array is quite limited, but there would be ways of defining it to be a bit friendlier (like having Now, I'm much more sympathetic to cases like |
My concern there is that the One thought that comes to mind is that perhaps we could have an infix operator for the tensor product. It seems to me that confusion of what
Hm. I would say that any dotted operator should always return an array. That's a consistent and easy-to-remember rule, and I think having that consistency could save headaches like this without burdening programmers much. What it might do is be slightly surprising to people who are used to math notation which often just uses juxtaposition for all the many notions of multiplication (scalar, matrix, function composition, tensor). But I think it's good that we don't carry that notational ambiguity into this language :) And converting an |
I would describe the situation as this: There is a type f(w::W, x::U, y::U)::U (in our case |
Honestly, I never much liked that In very generic code the one thing you do have to mentally track is which variables are scalar and which are containers, so I don't see any impediment to generic code if for example it were forbidden to write Also note that if we support |
This is specific to the rank-2 case, right? |
The best way to evaluate this change would be to try it out. Personally, I don't find the status quo so abhorrent, so I won't be pushing for this change, and clearly it'd be a breaking v2.0 change, but it's a really easy change to make. Just steal the logic from #32122 — it really shouldn't be more than 10LOC between all the builtin broadcast implementations. Then the big question is how many LOC worth of tests and packages need to change. I'm up for changing my mind on the practicality here if the evidence weighs against it! |
Here's a compelling argument: https://discourse.julialang.org/t/broadcasting-and-pairs-using/24739 In short: p = "2".=>"two"
replace.(["123", "246"], p) vs. replace.(["123", "246"], "2".=>"two") The first fails, whereas the second succeeds, but the two should be equivalent in their end result. This only happens because we've lost the 0-d container. |
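To make the lost container concrete, here is a hedged sketch. Whether the first `replace.` call above errors may depend on the Julia version, so this only checks that `"2" .=> "two"` evaluates to a plain `Pair` rather than a zero-dimensional container, and shows wrapping it in `Ref` as one explicit way to have broadcast treat it as a scalar.

```julia
p = "2" .=> "two"          # both sides are scalars, so this is just a Pair

# Ref(p) tells broadcast to treat the Pair as a single scalar argument.
replaced = replace.(["123", "246"], Ref(p))

println(typeof(p))
println(replaced)
```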
@mbauman, I'd be interested in trying your suggestion in #28866 (comment) to see how much work it is. I'm new to contributing to Julia -- can you point me to what I need to know to get my local clone to the point where I can make a change to the broadcast implementation, run some compile/test commands, and see what breaks? I can do that quickly, then over the next few weeks work on it as I have time. |
Here you go: https://github.com/JuliaLang/julia/compare/mb/true28866 Just: git clone https://github.com/JuliaLang/julia.git
cd julia
git checkout mb/true28866
make
make testall |
Just an administrative note: this issue started out describing both broadcast's design as well as the bug in #32122 — and I wrote the commit message there before we really started hashing out broadcast's design here. So we should re-open this issue if that commit message ends up auto-closing this — it only addresses the "easy" half of this issue. |
I find
Then Maybe there should be a definition for |
Broadcasting should return a 0-dimensional array! Here is another use case in support: julia> conj(Zygote.Fill(1.0im))
0-dimensional Array{FillArrays.Fill{ComplexF64, 0, Tuple{}}, 0}:
Fill(0.0 - 1.0im) The This is caused by:
Line 860 in 938da26
Is it really safe to treat all arrays as dense array types? The above code is very bad for the ecosystem. It is now causing an OMEinsum AD test failure: |
Ironically, that's happening because FillArray's It'd work as you want if FillArrays would match Base's current broadcasting semantics. |
It is not only a FillArrays problem. This implementation in Julia Base ( Line 860 in 938da26
length(axes(bc)) == 0 ? fill!(similar(bc, typeof(r)), r) : r Two aspects can be improved
I won't be happy if |
It tries to; that's what the
It is not a type instability because the length of the axes is encoded into the broadcast's type. |
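The one-line expression quoted above can be tried in isolation with the following sketch. The function name `zero_d_preserving_sketch` is hypothetical and this is not presented as the exact Base definition; it only wraps the quoted line using `broadcasted` and `materialize` from `Base.Broadcast`. The branch depends on `length(axes(bc))`, which is the quantity the reply above says is encoded in the broadcast's type.

```julia
using Base.Broadcast: broadcasted, materialize

# Hypothetical helper: re-wrap a scalar result when the broadcast has no axes.
function zero_d_preserving_sketch(f, As...)
    bc = broadcasted(f, As...)   # lazy broadcast object
    r = materialize(bc)          # a scalar if every argument is zero-dimensional
    return length(axes(bc)) == 0 ? fill!(similar(bc, typeof(r)), r) : r
end

zero_d = zero_d_preserving_sketch(+, fill(1), fill(2))
vec_result = zero_d_preserving_sketch(+, [1, 2], [3, 4])

println(typeof(zero_d))
println(vec_result)
```

Note that `typeof(r)` is used as the element type of the new container, which is how a result that is itself an array type (as in the FillArrays report above) can end up nested inside a zero-dimensional `Array`.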
I see what you mean. So it is a weird interaction of |
Hi, |
I came to realize the broadcasting behavior on 0-dimensional arrays pretty late. This will mean a significant implementation change for what I'm working on. A minimal example of the issue I'm running into:
|
This is especially problematic for GPU code because scalarization strips any allocation information present in a zero-dimensional array: julia> x = MtlArray(fill(1f0))
0-dimensional MtlArray{Float32, 0, Private}:
1.0
julia> x .+ x
2.0f0 which now lives on the CPU. |
Is it a bug, or a feature?