julep: Base.summary and ShowItLikeYouBuildIt #18909

Closed
timholy opened this issue Oct 13, 2016 · 19 comments
Labels
display and printing Aesthetics and correctness of printed representations of objects. julep Julia Enhancement Proposal needs decision A decision on this change is needed

@timholy
Member

timholy commented Oct 13, 2016

I'm preparing to register a package with the name above, but I realized that this may generate discussion and it seems julia may be a better place for that discussion than METADATA.jl. This has overlap with the concerns and goals of #18068, #18228 (CC @KristofferC), but this is focused (for now) on the workings of Base.summary (which prints the "tag line" for AbstractArrays), so to me this seems better as a separate issue. (I'm also happy to copy/paste this to #18068.)

Background

The good

Between base Julia and its packages, we now have a set of view-types for "changing" what appears to be every aspect of an array:

  • changing the apparent element type of an array (or even lazy elementwise computations): MappedArrays (although it's much more than this, it can be thought of as extending reinterpret to arbitrary AbstractArrays, and even making changes that are not possible with a bits-based reinterpretation)
  • selecting a subregion of an array: SubArray (Base)
  • changing the shape/dimensionality of an array: ReshapedArray (Base)
  • changing the order of dimensions of an array: PermutedDimsArray (Base, not exported)
  • changing the indices used to address elements of the array: OffsetArrays.

What's more, these are all composable with one another, with little or no performance cost for doing so.

The bad & ugly

Let's suppose someone wants to read a disk file that corresponds to an Array{UInt8,3}, which represents an RGB image with color along the slowest dimension. In the upcoming rewrite of Images---which leverages all this array goodness very heavily---this is a probable type for the resulting "interpreted" array that would display as an RGB image:

julia> typeof(c)
ImageCore.ColorView{ColorTypes.RGB{FixedPointNumbers.UFixed{UInt8,8}},2,MappedArrays.MappedArray{FixedPointNumbers.UFixed{UInt8,8},3,Base.PermutedDimsArrays.PermutedDimsArray{UInt8,3,(3,1,2),(2,3,1),Array{UInt8,3}},ImageCore.##29#30{FixedPointNumbers.UFixed{UInt8,8}},Base.#reinterpret}}

This is what appears as part of the output for summary, which is called whenever you display the array in the REPL.

As has been pointed out, that's a mouthful---there's a significant cognitive load spent on just counting brackets.

A proposal

Thinking this over, I noticed that it's often the case that the sequence of characters you type to construct an array is considerably shorter than the sequence of characters in the printed type. Consider this example:

julia> a = rand(3,5,7);

julia> b = view(a, :, 3, 2:5)
3×4 SubArray{Float64,2,Array{Float64,3},Tuple{Colon,Int64,UnitRange{Int64}},false}:
 0.616728  0.187771  0.0980549  0.473863
 0.540341  0.102319  0.592159   0.779258
 0.255608  0.790268  0.776601   0.420463

The proposal is that, with summary (which is called to display objects, not types), the tagline for the above would be shown as something like

3×4 view(::Array{Float64,3}, ::Colon, ::Int, ::UnitRange) with element type Float64

or even

3×4 view(::Array{Float64,3}, :, 3, 2:5) with element type Float64

For the more complex example above, the part corresponding to the type would be

ColorView{RGB}(ufixedview(N0f8, permuteddimsview(::Array{UInt8,3}, (3,1,2)))) with element type ColorTypes.RGB{FixedPointNumbers.UFixed{UInt8,8}}

Without going into details†, the "radical" thing about this proposal is that we are communicating information about the type via a nested sequence of function calls that would produce such a type.

I am planning to experiment with this in a package at first, but in the long run we may want to contemplate whether show(IOContext(STDERR, compact=true), ::Type{T}) should behave in this way.


† ColorView is both a type and a constructor; the parameter of RGB gets inferred from the eltype of the array you call it on. N0f8 comes from JuliaMath/FixedPointNumbers.jl#51. ufixedview "reinterprets" (via MappedArrays) the entries of an arbitrary (suitable) array as UFixed numbers. permuteddimsview is just a wrapper for Base.PermutedDimsArrays.PermutedDimsArray. For the moment, the trailing element type is completely explicit about the element type---it might stay that way or get abbreviated.
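
The comparison between the raw type and a build-style tagline can be tried interactively. The following is a minimal sketch, assuming a recent Julia 1.x release in which summary for Base's array wrappers prints in roughly the style proposed here; the exact wording differs between versions, so no output is shown.

```julia
a = rand(3, 5, 7)
b = view(a, :, 3, 2:5)

# The full type, with every parameter spelled out.
type_line = string(typeof(b))

# The one-line description that heads the REPL display of `b`.
tag_line = summary(b)

println(type_line)
println(tag_line)
```

Printing both lines side by side makes it easy to judge, for a given wrapper, whether the call-style form is actually shorter or clearer than the type.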

@kshyatt kshyatt added the julep Julia Enhancement Proposal label Oct 13, 2016
@andyferris
Member

Yes I've been torn between using the type and using a minimal constructor syntax. I think the latter is much better for users to read but the type has seemed more idiomatic. I feel this proposal is a good move.

(Generally, I prefer output that you can copy-paste and reconstruct the same (or a very similar) object. It would be wonderful (perhaps somewhat magical/impossible) if the AbstractArray output could achieve this and still look as good as it does now.)

@timholy
Member Author

timholy commented Oct 14, 2016

> (Generally, I prefer output that you can copy-paste and reconstruct the same (or a very similar) object.

I like that idea a lot, and frequently something along those lines kinda works, but there are cases where this runs into fundamental conceptual difficulties:

julia> using MappedArrays

julia> a = randn(3,3)
3×3 Array{Float64,2}:
  0.458636  -1.19035    1.53639 
  0.526736  -0.424218   1.42559 
 -0.526185  -0.143567  -0.135398

julia> b = mappedarray(x->x+1, a)
3×3 MappedArrays.ReadonlyMappedArray{Float64,2,Array{Float64,2},##5#6}:
 1.45864   -0.190348  2.53639 
 1.52674    0.575782  2.42559 
 0.473815   0.856433  0.864602

julia> b.data === a
true

Do you print the values that b has, or do you print the values you'd need to make an identical b? In this case they are not the same thing.

@andyferris
Member

I agree, it's an aspiration only and quite probably is impossible.

What about:

3×3 mapped(::##5#6, ::Array{Float64,2}):
 1.45864   -0.190348  2.53639 
 1.52674    0.575782  2.42559 
 0.473815   0.856433  0.864602

Probably a little silly to have a non-type in the header though?

@timholy
Member Author

timholy commented Oct 14, 2016

In some cases that may be impractical or unavoidable if we adopt this strategy, see discussion in JuliaImages/Images.jl#542 (comment)

@nalimilan
Member

+1 for making the summary line more readable, whatever solution is chosen.

I just wanted to note that long types happen for other array types which are not necessarily transformations of standard arrays. For these cases, printing the name and parameters of the transformation doesn't work.

For example, a NamedArray representing a frequency table (cf. my FreqTables package) gives:

julia> freqtable(x)
4-element NamedArrays.NamedArray{Int64,1,Array{Int64,1},Tuple{Dict{ASCIIString,Int64}}}
a 100
b 100
c 100
d 100

That's not the end of the world, but the final Tuple isn't of the highest importance. One simple solution is to override summary to only print the first 3 type parameters.

Anyway, it might need a different solution we could discuss separately, but I thought I would mention it in case we can find a common strategy.

@timholy
Member Author

timholy commented Oct 14, 2016

In the idiom described here, this would print as

4-element freqtable(::Array{Int64,1}) with element type Int64

Not sure whether you like that more or less. Certainly, specializing summary to not show the last parameter is a viable option (and has been for some time).

@nalimilan
Member

nalimilan commented Oct 14, 2016

I'm not sure I like it. In the one-argument case I presented above, freqtable looks like an array transformation just like view, but it's less obvious when passing several arguments, or when passing a data frame and column names like freqtable(df, :age, :sex). This also obscures the fact that the object is a NamedArray, which is of course not a problem when there's a one-to-one mapping between the function and the array type.

@timholy
Member Author

timholy commented Oct 14, 2016

If there isn't a 1-1 mapping, then indeed that abbreviated form should not be defined. I think you can only define this when the calls produce the type deterministically from any inputs that match the specified types.

I also agree this may not be a win in every case; in the package (now at https://github.com/JuliaArrays/ShowItLikeYouBuildIt.jl), see the shows_compactly logic. One could probably put some kind of threshold on it. Or always show the "outer" type completely, and use such logic for simplifying only the parameters of the type.
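
One way to picture the threshold idea is sketched below. This is not the package's shows_compactly logic; the helper name tagline and the 40-character cutoff are hypothetical, and the sketch assumes a recent Julia 1.x release where Base.showarg(io, x, toplevel) is available.

```julia
# Hypothetical helper: print the plain type when it is short,
# and fall back to the call-style description only for long types.
function tagline(A::AbstractArray; threshold::Int = 40)
    full = string(typeof(A))
    length(full) <= threshold && return full
    # `sprint(f, args...)` calls `f(io, args...)` and returns the text.
    return sprint(Base.showarg, A, true)
end

v = view(reshape(view(rand(10), 2:7), 2, 3), :, 1:2)

println(tagline([1, 2, 3]))  # short type, printed as-is
println(tagline(v))          # long type, printed call-style
```

A length cutoff is only one possible criterion; always showing the outer type and abbreviating just its parameters would need a different rule.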

@oxinabox
Contributor

> Generally, I prefer output that you can copy-paste and reconstruct the same (or a very similar) object.

This is how Python defines repr; that eval(repr(x)) == x.
Julia does not define repr this way, in the docs.
It is often true that this works.
(Conversely, in Python it is not particularly unusual if it doesn't.)

Is this a separate issue though?
The correct behavior of repr, vs. the behavior of summary.
In general, I think if repr is defined as eval-able,
then a repr is always an acceptable (if not a particularly good) summary.
But a summary makes no promises to be a repr.
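
The distinction can be checked directly. The following is a small sketch, assuming a recent Julia 1.x release; it shows a case where repr round-trips the values through parse and eval, and a case where the values survive but the wrapper type does not.

```julia
x = [1.5, 2.0, 3.25]

# For a plain Vector, parsing and evaluating the repr gives back an equal array.
roundtrip = eval(Meta.parse(repr(x)))
@assert roundtrip == x

# For a view, the repr carries the values only, so the round trip
# yields an equal array that is no longer a SubArray.
v = view(x, 1:2)
v_roundtrip = eval(Meta.parse(repr(v)))
@assert v_roundtrip == v
@assert !(v_roundtrip isa SubArray)

# summary describes the object without listing its values.
println(summary(v))
```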

@JeffBezanson JeffBezanson added the display and printing Aesthetics and correctness of printed representations of objects. label Jan 3, 2017
@timholy timholy added this to the 1.0 milestone Jul 11, 2017
@timholy timholy added the needs decision A decision on this change is needed label Jul 11, 2017
@timholy
Member Author

timholy commented Jul 11, 2017

Added a 1.0 milestone so we decide whether we want this or not. Relevant links:

Note that the important thing for the milestone is the decision; if we decide we want it, since ShowItLikeYouBuildIt already exists it could be moved into Base pretty quickly. So, decide!

@StefanKarpinski
Member

I like the idea of defining repr to give a parse+evalable string representation of something. That implies that repr should fail for objects that we don't know how to do that for, which would be quite breaking.

@timholy
Member Author

timholy commented Jul 11, 2017

I agree with the sentiment. OTOH, (1) we don't currently display enough digits with repr to reconstruct the array exactly (or even all that close), and (2) I think something like this proposal still makes sense even for giant arrays where that type-summary line can be rough going but where you don't want the value-dump to be the end of your julia session.

At least for the world of AbstractArrays, I think the current behavior of ShowItLikeYouBuildIt is Python's repr for the type portion, but is agnostic about how the actual values are printed.

@oxinabox
Contributor

> I like the idea of defining repr to give a parse+evalable string representation of something. That implies that repr should fail for objects that we don't know how to do that for, which would be quite breaking.

Yes, please.
See my comments on: https://discourse.julialang.org/t/expr-to-string-conversion/4118/3

But that is really its own separate issue

@JeffBezanson
Member

I think we already have at least an informal recommendation that (2-argument) show should print a parse-and-eval-able representation. This works in many cases. It would be fine to make it work in more cases, and clarify this in the docs and help. But I don't think we should give more errors, or add another kind of printing. If we really wanted that, a good approach might be to add an :evalable IOContext property, that tells you to throw an error from show if you can't print something appropriate. Seems minimally useful though.

@timholy
Member Author

timholy commented Jul 17, 2017

The whole eval-able point has gotten a bit conflated here with the original intent of this issue. In

julia> view(reshape(view(rand(10), 2:7), 2, 3), :, 1:2)
2×2 SubArray{Float64,2,Base.ReshapedArray{Float64,2,SubArray{Float64,1,Array{Float64,1},Tuple{UnitRange{Int64}},true},Tuple{}},Tuple{Base.Slice{Base.OneTo{Int64}},UnitRange{Int64}},true}:
 0.564978   0.434754
 0.0895906  0.520575

if you really wanted to print a statement that could eval to the output, you'd technically have to print all 10 random values. Likewise with

julia> a = reshape(mappedarray(sqrt, 1:12), 3, 4)
3×4 Base.ReshapedArray{Float64,2,MappedArrays.ReadonlyMappedArray{Float64,1,UnitRange{Int64},Base.#sqrt},Tuple{}}:
 1.0      2.0      2.64575  3.16228
 1.41421  2.23607  2.82843  3.31662
 1.73205  2.44949  3.0      3.4641 

you should display the output values rather than focusing on 1:12.

This proposal is really focused on Base.summary, and proposes something like

2×2 view(reshape(view(::Array{Float64,1}, 2:7), 2, 3), :, 1:2) with element type Float64

for the summary line in the first case, and

3×4 reshape(mappedarray(sqrt, ::UnitRange{Int}), 3, 4) with element type Float64

for the second.

@JeffBezanson
Member

It seems like we should probably do this; summary has not gotten much love. I believe this just boils down to (1) add the showarg function and tell people to define it for this kind of printing (which is neither a repr representation, nor a summary itself), (2) define it for some Base types, (3) update the AbstractArray summary method to use it. So if people are ok with adding showarg, everything else just falls out.

We should also perhaps have summary methods print types in a mode that excludes module names, for conciseness. Separate issue though.
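
The three steps can be sketched for a user-defined wrapper. This is an illustration only: it assumes a recent Julia 1.x release, where Base.showarg(io, x, toplevel) exists and the AbstractArray summary method calls it, and the Doubled type and doubled function are hypothetical names invented for the example.

```julia
# Hypothetical lazy wrapper that doubles every element of its parent array.
struct Doubled{T,N,A<:AbstractArray{T,N}} <: AbstractArray{T,N}
    parent::A
end
doubled(A::AbstractArray) = Doubled(A)

# Minimal AbstractArray interface.
Base.size(D::Doubled) = size(D.parent)
Base.getindex(D::Doubled{T,N}, I::Vararg{Int,N}) where {T,N} = 2 * D.parent[I...]

# Describe the object by the call that builds it. The parent is shown with
# toplevel = false, so nested wrappers compose without repeating the eltype.
function Base.showarg(io::IO, D::Doubled, toplevel)
    print(io, "doubled(")
    Base.showarg(io, D.parent, false)
    print(io, ')')
    if toplevel
        print(io, " with eltype ", eltype(D))
    end
end

d = doubled([1 2; 3 4])
println(summary(d))
```

Only showarg is specialized here; the dimensions prefix comes from the generic AbstractArray summary method, which is why nothing else needs to be overridden.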

@StefanKarpinski
Member

What is showarg?

@JeffBezanson
Member

@KristofferC --- any opinion as one of our resident experts in attractive terminal output?
