Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: setindex for mutable collections #33495

Closed
wants to merge 6 commits into from
Closed

Conversation

tkf
Copy link
Member

@tkf tkf commented Oct 8, 2019

setindex was documented recently #32738 and there are also some interests in defining it for general arrays (e.g., ArrayInterface.jl, Setfield.jl).

I propose the following definition of setindex for (possibly) mutable collections. I believe this is more useful than more straightforward implementation like setindex!(copy(collection), value, key...). This is because it can be used as a building block of the "mutate-or-widen" strategy that is used in functions like collect and materialize.

I'm using this definition in BangBang.jl and it turned out to be quite useful for writing efficient collect etc. for Transducers.jl while avoiding depending on the compiler's inference API.

Proposed API

setindex(collection, value, key...)

Create a new collection with the element/elements of the location specified by key
replaced with value(s).

Implementation

setindex(collection, value, key...) must have the property such that

y1 = setindex(x, value, key...)

y2 = deepcopy(x)
setindex!(y2, value, key...)

@assert isequal′(y1, y2)

with a suitable definition of isequal′ (e.g., elements are approximately but possibly
not exactly equal), if setindex! supports given arguments.

setindex should support more combinations of arguments by widening collection
type as required.

I also included the implementations for arrays and dicts.

@jw3126
Copy link
Contributor

jw3126 commented Dec 10, 2019

I think this PR is quite useful for writing generic code that works on both immutable and mutable collections. To me this looks is a basic interface, that really should live in Base.

Copy link
Sponsor Member

@mbauman mbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! I'm in full support of adding this functionality. It still doesn't fully complete the copy-and-widen story, but it's a huge helping block in getting there.

base/array.jl Outdated
```

with a suitable definition of `isequal′` (e.g., elements are approximately but possibly
not exactly equal), _if_ `setindex!` supports given arguments.
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The not exactly equal part here is just because we have, e.g., convert(Float64, Int(maxintfloat()) + 1) != Int(maxintfloat()) + 1, right? Maybe it'd be a bit clearer to directly say that it's equal after conversion? I think I'd rather focus on the copy side of things than the equality side of things:

That is, something more like:

y1 = setindex(x, value, key...)
y2 = copy′(x)
y2[key...] = value

@assert convert.(eltype(y2), y1) == y2

with a suitable definition of copy′ that ensures value can be assigned into the collection. This is where the focus should be. Getting the @assert exactly right (which it's not right now because broadcast) is slightly less important than getting the widening right, I think.

That way we don't have to bend over backwards with awkward approximately-but-maybe-not-exact sorts of phrasings. It's just a (possible convert plus) copy plus setindex!.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great idea! Yes, it was a very awkward part of my version of the specification. Your version is much more elegant and precise.

base/array.jl Outdated
function setindex(xs::AbstractArray, v, I...)
T = promote_type(eltype(xs), typeof(v))
ys = similar(xs, T)
if eltype(xs) !== Union{}
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what cases will promote_type return Union{}? And why can't we copy! a Union{} array?

julia> copy!(Array{Union{}}(undef, 0), Array{Union{}}(undef, 0))
0-element Array{Union{},1}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is sometimes useful to specify the container type and shape without specifying its element type (e.g., for implementing multi-dim map in the mutate-or-widen style). Union{} element type is a nice way to encode this. But you can't copy! from it.

julia> copy!(Vector{Int}(undef, 3), Vector{Union{}}(undef, 3))
ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] getindex(::Array{Union{},1}, ::Int64) at ./array.jl:758
 [2] copyto!(::Array{Int64,1}, ::Int64, ::Array{Union{},1}, ::Int64, ::Int64) at ./abstractarray.jl:842
 [3] append! at ./array.jl:926 [inlined]
 [4] copy!(::Array{Int64,1}, ::Array{Union{},1}) at ./abstractarray.jl:709
 [5] top-level scope at REPL[5]:1

This error itself is not specific to Union{}; i.e., copy! does not handle undef values. eltype(xs) !== Union{} is just an easy shortcut to handle a useful case. But maybe this logic should be moved to copy! function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just removed this branching for now. Adding this to copy! (in a different PR) seems to be the right direction.

function setindex end

function setindex(xs::AbstractArray, v, I...)
T = promote_type(eltype(xs), typeof(v))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use promote_typejoin instead?

ATM, eltype(setindex([0, 0], 0.5, 1)) is Float64. It'd be Real if we use promote_typejoin. The former is compatible with how vcat work and the latter is compatible with how collect and broadcast work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say promote_type is the right choice. Otherwise you'd very often get abstract element types, which really kill performance.

@@ -2651,6 +2651,14 @@ end
# Throws ArgumentError for negative dimensions in Array
@test_throws ArgumentError fill('a', -10)

@testset "setindex" begin
==ₜ(_, _) = false
==ₜ(x::T, y::T) where T = x == y
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

==ₜ seems so handy, maybe it should actually live in e.g. Test?

base/array.jl Show resolved Hide resolved
setindex(collection, value, key...)

Create a new collection with the element/elements of the location specified by `key`
replaced with `value`(s).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also mention that the new collection may have a different element type, as determined by promote_type.

function setindex end

function setindex(xs::AbstractArray, v, I...)
T = promote_type(eltype(xs), typeof(v))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say promote_type is the right choice. Otherwise you'd very often get abstract element types, which really kill performance.

base/dict.jl Show resolved Hide resolved
test/arrayops.jl Outdated
==ₜ(_, _) = false
==ₜ(x::T, y::T) where T = x == y

@test Base.setindex(Int[1, 2], 3.0, 2) ==ₜ [1.0, 3.0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@test Base.setindex(Int[1, 2], 3.0, 2) ==ₜ [1.0, 3.0]
@test Base.setindex([1, 2], 3.0, 2) ==ₜ [1.0, 3.0]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is good to have Int explicit, since the test is about change of eltype.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point since that doesn't make any difference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

emphasis

test/arrayops.jl Outdated
==ₜ(x::T, y::T) where T = x == y

@test Base.setindex(Int[1, 2], 3.0, 2) ==ₜ [1.0, 3.0]
@test Base.setindex(Int[1, 2, 3], :two, 2) ==ₜ [1, :two, 3]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@test Base.setindex(Int[1, 2, 3], :two, 2) ==ₜ [1, :two, 3]
@test Base.setindex([1, 2, 3], :two, 2) ==ₜ [1, :two, 3]

@jw3126
Copy link
Contributor

jw3126 commented Apr 23, 2021

I would like to have this functionality in Base. @tkf I guess you don't have much time for this. Would it be okay for you if I take this PR and rebase/finish it?

@MasonProtter
Copy link
Contributor

@jw3126 Are you still interested in doing that? If so, I heartily encourage you to do it!

@jw3126
Copy link
Contributor

jw3126 commented May 10, 2023

@MasonProtter I would love to have that. There is also #46453 and I think there is also at least one additional similar PR somewhere.
I am willing to try yet another PR, but only if somebody (ideally with merge rights) promises to do the required lobby work so it gets either merged or rejected for good technical reasons.

@vchuravy
Copy link
Member

I am going to close this in favor of #46453 (and me sneaking in parts of it in #51066)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
collections Data structures holding multiple items, e.g. sets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants