Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Nullables as collections #16961

Merged
merged 10 commits into from
Dec 29, 2016
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 19 additions & 2 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,13 @@ This section lists changes that do not have deprecation warnings.

* `broadcast` now treats `Ref` (except for `Ptr`) arguments as 0-dimensional
arrays ([#18965]).
* `broadcast` now handles missing data (`Nullable`s) allowing operations to
be lifted over `Nullable`s, as if the `Nullable` were like an array with
zero or one element. ([#16961]). Note that many situations where `Nullable`
types had been treated like scalars before will no longer work. For
example, `get.(xs)` on `xs::Array{T <: Nullable}` will now treat the
nullables as a container, and attempt to operate on the data contained.
This use case will need to be migrated to `map(get, xs)`.

* The runtime now enforces when new method definitions can take effect ([#17057]).
The flip-side of this is that new method definitions should now reliably actually
Expand Down Expand Up @@ -109,6 +116,10 @@ Library improvements

* Additional methods for `ones` and `zeros` functions to support the same signature as the `similar` function ([#19635]).

* Methods for `map` and `filter` with `Nullable` arguments have been
implemented; the semantics are as if the `Nullable` were a container with
zero or one elements ([#16961]).

Compiler/Runtime improvements
-----------------------------

Expand Down Expand Up @@ -620,6 +631,7 @@ Language tooling improvements
calling C++ code from Julia.

<!--- generated by NEWS-update.jl: -->
[#265]: https://github.com/JuliaLang/julia/issues/265
[#550]: https://github.com/JuliaLang/julia/issues/550
[#964]: https://github.com/JuliaLang/julia/issues/964
[#1090]: https://github.com/JuliaLang/julia/issues/1090
Expand Down Expand Up @@ -731,10 +743,12 @@ Language tooling improvements
[#16731]: https://github.com/JuliaLang/julia/issues/16731
[#16854]: https://github.com/JuliaLang/julia/issues/16854
[#16953]: https://github.com/JuliaLang/julia/issues/16953
[#16961]: https://github.com/JuliaLang/julia/issues/16961
[#16972]: https://github.com/JuliaLang/julia/issues/16972
[#16986]: https://github.com/JuliaLang/julia/issues/16986
[#17033]: https://github.com/JuliaLang/julia/issues/17033
[#17037]: https://github.com/JuliaLang/julia/issues/17037
[#17057]: https://github.com/JuliaLang/julia/issues/17057
[#17075]: https://github.com/JuliaLang/julia/issues/17075
[#17132]: https://github.com/JuliaLang/julia/issues/17132
[#17261]: https://github.com/JuliaLang/julia/issues/17261
Expand All @@ -748,17 +762,17 @@ Language tooling improvements
[#17510]: https://github.com/JuliaLang/julia/issues/17510
[#17546]: https://github.com/JuliaLang/julia/issues/17546
[#17599]: https://github.com/JuliaLang/julia/issues/17599
[#17623]: https://github.com/JuliaLang/julia/issues/17623
[#17668]: https://github.com/JuliaLang/julia/issues/17668
[#17758]: https://github.com/JuliaLang/julia/issues/17758
[#17785]: https://github.com/JuliaLang/julia/issues/17785
[#18330]: https://github.com/JuliaLang/julia/issues/18330
[#18339]: https://github.com/JuliaLang/julia/issues/18339
[#18346]: https://github.com/JuliaLang/julia/issues/18346
[#18442]: https://github.com/JuliaLang/julia/issues/18442
[#18473]: https://github.com/JuliaLang/julia/issues/18473
[#18628]: https://github.com/JuliaLang/julia/issues/18628
[#18644]: https://github.com/JuliaLang/julia/issues/18644
[#18690]: https://github.com/JuliaLang/julia/issues/18690
[#18628]: https://github.com/JuliaLang/julia/issues/18628
[#18839]: https://github.com/JuliaLang/julia/issues/18839
[#18931]: https://github.com/JuliaLang/julia/issues/18931
[#18965]: https://github.com/JuliaLang/julia/issues/18965
Expand All @@ -768,3 +782,6 @@ Language tooling improvements
[#19288]: https://github.com/JuliaLang/julia/issues/19288
[#19305]: https://github.com/JuliaLang/julia/issues/19305
[#19469]: https://github.com/JuliaLang/julia/issues/19469
[#19543]: https://github.com/JuliaLang/julia/issues/19543
[#19598]: https://github.com/JuliaLang/julia/issues/19598
[#19635]: https://github.com/JuliaLang/julia/issues/19635
124 changes: 106 additions & 18 deletions base/broadcast.jl
Original file line number Diff line number Diff line change
Expand Up @@ -4,17 +4,27 @@ module Broadcast

using Base.Cartesian
using Base: promote_eltype_op, linearindices, tail, OneTo, to_shape,
_msk_end, unsafe_bitgetindex, bitcache_chunks, bitcache_size, dumpbitcache
_msk_end, unsafe_bitgetindex, bitcache_chunks, bitcache_size, dumpbitcache,
nullable_returntype, null_safe_eltype_op, hasvalue, is_nullable_array
import Base: broadcast, broadcast!
export bitbroadcast, dotview
export broadcast_getindex, broadcast_setindex!

## Broadcasting utilities ##

broadcast_array_type() = Array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a more general mechanism for this? Cc: @pabloferz @martinholters

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used as a marker to select whether to recurse or otherwise. Perhaps that's best done with a different trait.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the new subtyping algorithm gets merged before, I believe you could remove this and have

@inline broadcast(f, As::AbstractArray...) = broadcast_c(f, Array, As...)
@inline broadcast(f, As::A...) where A<:AbstractArray{E} where E<: Nullable = broadcast_c(f, Array{Nullable}, As...)
containertype{T<:AbstractArray}(::Type{T}) = Array
containertype{E<:Nullable,T<:AbstractArray{E}}(::Type{T}) = Array{Nullable}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a more generally useful design indeed. It's likely that these type system improvements will be merged before or at the same time as this PR, since both need to be in before the feature freeze. Though for now let's keep the custom version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to avoid changing this for now because the subtyping algorithm will likely provide a better way to do it.

broadcast_array_type(A, As...) =
if is_nullable_array(A) || broadcast_array_type(As...) === Array{Nullable}
Array{Nullable}
else
Array
end

# fallbacks for some special cases
@inline broadcast(f, x::Number...) = f(x...)
@inline broadcast{N}(f, t::NTuple{N}, ts::Vararg{NTuple{N}}) = map(f, t, ts...)
@inline broadcast(f, As::AbstractArray...) = broadcast_c(f, Array, As...)
@inline broadcast(f, As::AbstractArray...) =
broadcast_c(f, broadcast_array_type(As...), As...)

# special cases for "X .= ..." (broadcast!) assignments
broadcast!(::typeof(identity), X::AbstractArray, x::Number) = fill!(X, x)
Expand All @@ -30,7 +40,9 @@ containertype(::Type) = Any
containertype{T<:Ptr}(::Type{T}) = Any
containertype{T<:Tuple}(::Type{T}) = Tuple
containertype{T<:Ref}(::Type{T}) = Array
containertype{T<:AbstractArray}(::Type{T}) = Array
containertype{T<:AbstractArray}(::Type{T}) =
is_nullable_array(T) ? Array{Nullable} : Array
containertype{T<:Nullable}(::Type{T}) = Nullable
containertype(ct1, ct2) = promote_containertype(containertype(ct1), containertype(ct2))
@inline containertype(ct1, ct2, cts...) = promote_containertype(containertype(ct1), containertype(ct2, cts...))

Expand All @@ -39,16 +51,26 @@ promote_containertype(::Type{Array}, ct) = Array
promote_containertype(ct, ::Type{Array}) = Array
promote_containertype(::Type{Tuple}, ::Type{Any}) = Tuple
promote_containertype(::Type{Any}, ::Type{Tuple}) = Tuple
promote_containertype(::Type{Any}, ::Type{Nullable}) = Nullable
promote_containertype(::Type{Nullable}, ::Type{Any}) = Nullable
promote_containertype(::Type{Nullable}, ::Type{Array}) = Array{Nullable}
promote_containertype(::Type{Array}, ::Type{Nullable}) = Array{Nullable}
promote_containertype(::Type{Array{Nullable}}, ::Type{Array{Nullable}}) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this one covered by the ::Type{T} method below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed to avoid an ambiguity.

Array{Nullable}
promote_containertype(::Type{Array{Nullable}}, ::Type{Array}) = Array{Nullable}
promote_containertype(::Type{Array}, ::Type{Array{Nullable}}) = Array{Nullable}
promote_containertype(::Type{Array{Nullable}}, ct) = Array{Nullable}
promote_containertype(ct, ::Type{Array{Nullable}}) = Array{Nullable}
promote_containertype{T}(::Type{T}, ::Type{T}) = T

## Calculate the broadcast indices of the arguments, or error if incompatible
# array inputs
broadcast_indices() = ()
broadcast_indices(A) = broadcast_indices(containertype(A), A)
broadcast_indices(::Type{Any}, A) = ()
broadcast_indices(::Union{Type{Any}, Type{Nullable}}, A) = ()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, Nullable is treated as any scalar here. So why do we need a specific method? Same below for Array{Nullable} and _broadcast_getindex.

Copy link
Contributor Author

@TotalVerb TotalVerb Dec 20, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type information is computed assuming Nullable is a scalar, but that information is not fully propagated through. Perhaps it should be?

Copy link
Member

@nalimilan nalimilan Dec 20, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what this means, I'm not really familiar with that code. @martinholters @pabloferz?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean that nullable is only treated as a scalar as a special case.

broadcast_indices(::Type{Tuple}, A) = (OneTo(length(A)),)
broadcast_indices(::Type{Array}, A) = indices(A)
broadcast_indices(::Type{Array}, A::Ref) = ()
broadcast_indices{T<:Array}(::Type{T}, A) = indices(A)
@inline broadcast_indices(A, B...) = broadcast_shape((), broadcast_indices(A), map(broadcast_indices, B)...)
# shape (i.e., tuple-of-indices) inputs
broadcast_shape(shape::Tuple) = shape
Expand Down Expand Up @@ -123,6 +145,8 @@ end
Base.@propagate_inbounds _broadcast_getindex(A, I) = _broadcast_getindex(containertype(A), A, I)
Base.@propagate_inbounds _broadcast_getindex(::Type{Array}, A::Ref, I) = A[]
Base.@propagate_inbounds _broadcast_getindex(::Type{Any}, A, I) = A
Base.@propagate_inbounds _broadcast_getindex(::Union{Type{Any},
Type{Nullable}}, A, I) = A
Base.@propagate_inbounds _broadcast_getindex(::Any, A, I) = A[I]

## Broadcasting core
Expand Down Expand Up @@ -272,19 +296,29 @@ end
ftype(f, A) = typeof(f)
ftype(f, A...) = typeof(a -> f(a...))
ftype(T::Type, A...) = Type{T}
typestuple(a) = (Base.@_pure_meta; Tuple{eltype(a)})
typestuple(T::Type) = (Base.@_pure_meta; Tuple{Type{T}})
typestuple(a, b...) = (Base.@_pure_meta; Tuple{typestuple(a).types..., typestuple(b...).types...})
ziptype(A) = typestuple(A)
ziptype(A, B) = (Base.@_pure_meta; Iterators.Zip2{typestuple(A), typestuple(B)})
@inline ziptype(A, B, C, D...) = Iterators.Zip{typestuple(A), ziptype(B, C, D...)}

_broadcast_type(f, T::Type, As...) = Base._return_type(f, typestuple(T, As...))
_broadcast_type(f, A, Bs...) = Base._default_eltype(Base.Generator{ziptype(A, Bs...), ftype(f, A, Bs...)})
# nullables need to be treated like scalars sometimes and like containers
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The occurrence of "sometimes" here definitely raises a red flag.

# other times, so there are two variants of typestuple.

# if the first argument is Any, then Nullable should be treated like a
# scalar; if the first argument is Array, then Nullable should be treated
# like a container.
typestuple(::Type, a) = (Base.@_pure_meta; Tuple{eltype(a)})
typestuple(::Type{Any}, a::Nullable) = (Base.@_pure_meta; Tuple{typeof(a)})
typestuple(::Type, T::Type) = (Base.@_pure_meta; Tuple{Type{T}})
typestuple{T}(::Type{T}, a, b...) = (Base.@_pure_meta; Tuple{typestuple(T, a).types..., typestuple(T, b...).types...})

# these functions take the variant of typestuple to be used as first argument
ziptype{T}(::Type{T}, A) = typestuple(T, A)
ziptype{T}(::Type{T}, A, B) = (Base.@_pure_meta; Iterators.Zip2{typestuple(T, A), typestuple(T, B)})
@inline ziptype{T}(::Type{T}, A, B, C, D...) = Iterators.Zip{typestuple(T, A), ziptype(T, B, C, D...)}

_broadcast_type{S}(::Type{S}, f, T::Type, As...) = Base._return_type(S, typestuple(S, T, As...))
_broadcast_type{T}(::Type{T}, f, A, Bs...) = Base._default_eltype(Base.Generator{ziptype(T, A, Bs...), ftype(f, A, Bs...)})

# broadcast methods that dispatch on the type of the final container
@inline function broadcast_c(f, ::Type{Array}, A, Bs...)
T = _broadcast_type(f, A, Bs...)
T = _broadcast_type(Any, f, A, Bs...)
shape = broadcast_indices(A, Bs...)
iter = CartesianRange(shape)
if isleaftype(T)
Expand All @@ -295,27 +329,59 @@ _broadcast_type(f, A, Bs...) = Base._default_eltype(Base.Generator{ziptype(A, Bs
end
return broadcast_t(f, Any, shape, iter, A, Bs...)
end
@inline function broadcast_c(f, ::Type{Array{Nullable}}, A, Bs...)
@inline rec(x) = broadcast(f, x)
@inline rec(x, y) = broadcast(f, x, y)
@inline rec(x, y, z) = broadcast(f, x, y, z)
@inline rec(xs...) = broadcast(f, xs...)
broadcast_c(rec, Array, A, Bs...)
end
function broadcast_c(f, ::Type{Tuple}, As...)
shape = broadcast_indices(As...)
n = length(shape[1])
return ntuple(k->f((_broadcast_getindex(A, k) for A in As)...), n)
end
@inline function broadcast_c(f, ::Type{Nullable}, a...)
nonnull = all(hasvalue, a)
S = _broadcast_type(Array, f, a...)
if isleaftype(S) && null_safe_eltype_op(f, a...)
Nullable{S}(f(map(unsafe_get, a)...), nonnull)
else
if nonnull
Nullable(f(map(unsafe_get, a)...))
else
Nullable{nullable_returntype(S)}()
end
end
end
@inline broadcast_c(f, ::Type{Any}, a...) = f(a...)

"""
broadcast(f, As...)

Broadcasts the arrays, tuples, `Ref` and/or scalars `As` to a container of the
appropriate type and dimensions. In this context, anything that is not a
subtype of `AbstractArray`, `Ref` (except for `Ptr`s) or `Tuple` is considered
a scalar. The resulting container is established by the following rules:
Broadcasts the arrays, tuples, `Ref`, nullables, and/or scalars `As` to a
container of the appropriate type and dimensions. In this context, anything
that is not a subtype of `AbstractArray`, `Ref` (except for `Ptr`s) or `Tuple`,
or `Nullable` is considered a scalar. The resulting container is established by
the following rules:

- If all the arguments are scalars, it returns a scalar.
- If the arguments are tuples and zero or more scalars, it returns a tuple.
- If there is at least an array or a `Ref` in the arguments, it returns an array
(and treats any `Ref` as a 0-dimensional array of its contents and any tuple
as a 1-dimensional array) expanding singleton dimensions.

The following additional rules apply to `Nullable` arguments:

- If there is at least a `Nullable`, and all the arguments are scalars or
`Nullable`, it returns a `Nullable`.
- If there is at least an array or a `Ref` with `Nullable` entries, or there
is at least an array or a `Ref` (perhaps with scalar entries instead of
`Nullable` entries) and a nullable, then the result is an array of
`Nullable` entries.
- If there is a tuple and a nullable, the result is an error, as this case is
not currently supported.

A special syntax exists for broadcasting: `f.(args...)` is equivalent to
`broadcast(f, args...)`, and nested `f.(g.(args...))` calls are fused into a
single broadcast loop.
Expand Down Expand Up @@ -372,6 +438,28 @@ julia> string.(("one","two","three","four"), ": ", 1:4)
"two: 2"
"three: 3"
"four: 4"

julia> Nullable("X") .* "Y"
Nullable{String}("XY")

julia> broadcast(/, 1.0, Nullable(2.0))
Nullable{Float64}(0.5)

julia> [Nullable(1), Nullable(2), Nullable()] .* 3
3-element Array{Nullable{Int64},1}:
3
6
#NULL

julia> [1+im, 2+2im, 3+3im] ./ Nullable{Int}()
3-element Array{Nullable{Complex{Float64}},1}:
#NULL
#NULL
#NULL

julia> Ref(7) .+ Nullable(3)
0-dimensional Array{Nullable{Int64},0}:
10
```
"""
@inline broadcast(f, A, Bs...) = broadcast_c(f, containertype(A, Bs...), A, Bs...)
Expand Down
Loading