Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement broadcasting for AbstractDict and NamedTuple #26158

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

andyferris
Copy link
Member

This is an implementation of broadcast and broadcast! over the indices of both AbstractDicts and NamedTuples. The current behavior is to treat these types as a scalar for the purpose of broadcasting. I feel it would be quite useful to allow users to broadcast over the values in these containers - I know I'd use it, at least.

I followed the semantic that for 1D containers the job of broadcast is to match up the indices of the containers (and broadcast the scalars) and apply the given function to the resulting values. Here, dictionaries cannot be broadcast with matrices or higher-dimensional arrays, but if the keys of a vector or tuple happen to be the same as the keys of a dictionary than that's OK.

There are a few things I just "made up" such as the new precedence orders between dictionaries <--> arrays / tuples / named tuples; comments welcome.

There are a few things that could be improved (in this PR or later):

  • Friendlier support for dictionary types other than Dict.
  • Make NamedTuple broadcasting faster (I currently observe type inference issues with the constant propagation of the field names).

I'm pinging triage to consider this now as v1.0 users may rely on the broadcast-as-a-scalar behavior of dictionaries and named tuples and it should thus be considered a breaking change for v1.x.

I'll also note that there have been discussions around AbstractDict and how they map and iterate. Note that this acts on dictionary values and not key-value pairs like the current implementation of map. On that note, I have been considering that map has something to do with iteration and broadcast has something (a bit more complex) to do with indexing (and the relationship between iteration and indexing is currently different for dictionaries as compared to arrays, tuples, and named tuples).

A few small examples:

julia> Dict(1 => 10, 2 => 20) .* 2
Dict{Int64,Int64} with 2 entries:
  2 => 40
  1 => 20

julia> (a=1, b=2) .+ (b=2, a=1)
(a = 2, b = 4)

julia> sum.((a=[1,2,3], b=[4,5,6]))
(a = 6, b = 15)

The third one could be considered summing the columns of a "table", or perhaps the groups of a container returned by some kind of groupby function.

Closes #25904. CC @timholy since you have obviously put a lot of effort into our broadcast design.

@andyferris andyferris added triage This should be discussed on a triage call broadcast labels Feb 22, 2018
@mbauman mbauman added the breaking This change will break code label Feb 22, 2018
function Dict{K,V}(::Uninitialized, inds::KeySet{<:Dict{K}}) where {K, V}
d = inds.dict
n = length(d.keys)
Dict{K,V}(copy(d.slots), copy(d.keys), Vector{V}(uninitialized, n), d.ndel, d.count, d.age, d.idxfloor, d.maxprobe)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would probably need to go inside the struct block.

@JeffBezanson
Copy link
Member

I like it; I do think we should treat Dicts as having indices consisting of their keys (in broadcasting and elsewhere). But I think this is the kind of thing we should be working on in 1.x, so maybe we should give an error for now.

@mbauman mbauman removed the triage This should be discussed on a triage call label Feb 22, 2018
@mbauman
Copy link
Member

mbauman commented Feb 22, 2018

I'm currently working on a prototype that would fix #18618 — with my current WIP it will be very easy to add a deprecation/error for AbstractDicts.

@andyferris
Copy link
Member Author

so maybe we should give an error for now.

Sure, that too. I thought I'd provide a (tested) implementation since it doesn't seem to hurt unless there's some risk the semantics are wrong or the implementation is dodgy. I was also thinking it could potentially be a good deprecation path for map if we choose to fiddle with that (e.g. map(f, dict) -> broadcast(f, pairs(dict)) or something...)

a prototype that would fix #18618

What does "fix" mean in this context? I'm asking because I actually kinda disagree with the premise of that issue - "broadcast is not working with iterators" - to my mind broadcasting has to do with indexing and is somewhat orthogonal to iteration per se. I feel people should stick to map (and anymous functions as necessary) for iterables that don't have indexes (we don't have many of those containers anyway).

@JeffBezanson
Copy link
Member

I agree broadcast inherently requires indexing. That f.(xs) syntax is just so tempting though...

@mbauman mbauman added broadcast Applying a function over a collection and removed broadcast labels Apr 24, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking This change will break code broadcast Applying a function over a collection
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants