Julep: extended proposal for fixing show, print, & friends #14052
Comments
Nice julep. Thanks for writing this up. |
Defining new array printing is quite convoluted right now. Have you thought about what that would look like in this proposal? |
Seems reasonable except for the HTMLBuilder part. Henceforth the name "< Noun >Builder" is banished from the Julia language. |
Hear, hear. |
I really like #13825. It would fit really well with serial port communications. It would make it easy to use a dictionary of |
Yes. In fact, #13825 already addressed it (although a final round of cleanup and deprecation still remains).
Haha, OK. the text/plain-Builder is named IOBuffer, so accordingly, the text/html-Builder should probably be called HTMLBuffer :). As a practical matter, I actually expect to remove this code entirely from the PR once the PR is finished and move it to a separate branch. I developed it here as a way of understanding the generic interactions.
I'm not sure I follow what you meant, since that PR only deals with the output side. Matching them to the equivalent method on the input side does reveal a small clarification I need to make to the descriptive text in the above Julep, however: write -> read (aka serialize -> deserialize) |
trying to think out loud about this more, i think this statement might have been wrong. parse (like deserialize) is a special-case of read. this implies that write -> byte I/O So to fix the But trying to implement this poses a method ambiguity problem:
write(::IO, ::Any) = write(reinterpret(bytes))
write(::IO, ::AbstractString) = write(string)
write(::IO, ::Markdown) = write(render(string))
write(::StringBuilder, ::Any) = show() # method ambiguity!
write(::StringBuilder, ::is-overloaded-for-text-io) = write() # method ambiguity resolver
Which is a lot of text to arrive at the conclusion that I don't know how to sanely alias print and write, so they may need to remain independent. |
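For readers unfamiliar with the ambiguity being described, here is a minimal sketch of the same shape of problem in runnable Julia. The names `emit`, `AbstractSink`, `ByteSink`, `TextSink`, and `Doc` are hypothetical stand-ins (for `write`, `IO`, `StringBuilder`, and `Markdown`), not anything from Base or the PR.

```julia
# Hypothetical stand-ins for the stream and value types in the pseudocode above.
abstract type AbstractSink end
struct ByteSink <: AbstractSink end
struct TextSink <: AbstractSink end   # plays the role of StringBuilder
struct Doc end                        # plays the role of Markdown

emit(::AbstractSink, ::Any) = :bytes      # generic byte-level fallback
emit(::AbstractSink, ::Doc) = :rendered   # value-specific rendering
emit(::TextSink, ::Any) = :shown          # sink-specific text fallback

# (TextSink, Doc) matches the last two methods and neither is more specific,
# so the call throws a MethodError reporting the ambiguity.
ambiguous = try
    emit(TextSink(), Doc())
    false
catch err
    err isa MethodError
end

# Defining a method for the intersection of the two signatures resolves it.
emit(::TextSink, ::Doc) = :rendered_text
resolved = emit(TextSink(), Doc())
```

The resolver method has to be written once per value type that specializes the generic method, which is roughly why aliasing the two functions gets unwieldy.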
IJulia shouldn't call show to an HTMLBuilder. This is not how Jupyter works. You don't (generally) send rendered HTML, you send a "MIMEbundle" of different MIME representations of the object (text/html, text/latex, image/png, text/plain) and the front-end(s) picks which one to display and how. So, IJulia still needs to continue to call |
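As a rough illustration of the "MIME bundle" idea, the sketch below collects every representation an object supports using `showable` and `repr` from current Julia. The `Badge` type and the `mimebundle` helper are hypothetical and are not IJulia's actual code; the label is also not HTML-escaped here.

```julia
struct Badge
    label::String
end

# One show method per MIME type the object knows how to render.
Base.show(io::IO, ::MIME"text/plain", b::Badge) = print(io, "[", b.label, "]")
Base.show(io::IO, ::MIME"text/html", b::Badge) =
    print(io, "<span class=\"badge\">", b.label, "</span>")

# Hypothetical helper: gather each representation the object can produce,
# leaving the choice of which one to display to the front-end.
function mimebundle(x, mimes = ["text/plain", "text/html", "image/png"])
    bundle = Dict{String,Any}()
    for m in mimes
        if showable(m, x)
            bundle[m] = repr(m, x)
        end
    end
    return bundle
end

bundle = mimebundle(Badge("new"))
```

Here `image/png` is skipped because no `show` method exists for it, while `text/plain` and `text/html` are both included.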
The basic question in my mind is whether to attach metadata (like whether to print colors, whether to use compact or full display, etcetera) to the |
To summarize my reasons above, I'm not sure that adding metadata to MIME types generalizes particularly well beyond the unstructured text/plain format. But my strongest argument for the IOContext approach is that it doesn't require modifying any existing methods to add a new parameter. If the method correctly passes along IO, that is all that is required to encapsulate arbitrary additional metadata and type info.
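A minimal sketch of that argument, using `IOContext` as it exists in current Julia: the `show` signature stays the same and the method just reads a property from whatever `io` it is handed. The `Temperature` type is a hypothetical example.

```julia
struct Temperature
    celsius::Float64
end

# The method only reads properties from `io`; its signature never changes.
function Base.show(io::IO, t::Temperature)
    if get(io, :compact, false)
        print(io, round(t.celsius; digits=1), "°C")
    else
        print(io, "Temperature(", t.celsius, " °C)")
    end
end

buf = IOBuffer()

# Plain IO: no metadata attached, so the default branch runs.
show(buf, Temperature(21.456))
full = String(take!(buf))

# Same method, same buffer, with metadata attached by wrapping the stream.
show(IOContext(buf, :compact => true), Temperature(21.456))
compact = String(take!(buf))
```

Any intermediate function that simply forwards `io` carries the `:compact` setting along without knowing about it.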
Sorry, I didn't flesh that out very clearly since I intended it more as a rough example than an implementation guideline. It is likely true that there could be an It seems this needs another piece to the puzzle above: a conversion table from mime types to IO types as a generalization of the
with_formatter(io::IO, ::MIME"text/plain") = io; finalize_formatter(io::IO, io::IO) = assert-that io == io;
with_formatter(io::IO, ::MIME"text/html") = HTMLFormatter(io); finalize_formatter(io, buf) = io << buf;
... and the reverse:
mimetype(::IO) = MIME"text/plain"()
mimetype(::HTMLFormatter) = MIME"text/html"()
mimetype(io::IOContext) = mimetype(io.io)
which allows the use of THTT to implement output type overloading and transition freely between the two domains (I think this was where #7959 wanted to go with this as well, since this is finally integrating mimewrite with write, show, and the rest):
write(io::IO, md::Markdown) = write(io, mimetype(io), md)
write(io::IO, mime::MIME"text/plain", md::Markdown) = write(io, mime, rendered-as-text::AbstractString)
write(io::IO, mime::MIME"text/html", md::Markdown) = write(io, mime, rendered-as-html::AbstractString)
write(io::HTMLFormatter, mime::MIME"text/html", html::AbstractString) = adds-the-html-fragment-to-io
write(io::HTMLFormatter, mime::MIME"text/plain", text::AbstractString) = write(io, text)
write(io::IO, mime::MIME, value::Any) = write_with_format(mime, io, value)
This effectively replaces |
I think I realized the issue above is what Keno was trying to handle in #13256:
The old write methods required that both MIME types be implicit in the objects themselves, so that
In #7959, the proposal was to provide annotation of the output type via an extra parameter:
In #13256, the proposal was to annotate the input type via julia types:
In #13825, the proposal was to annotate the output type via julia types:
In this Julep, the challenge has been to understand how all of these proposals can be integrated. In the old methods, the missing piece was that there was no generic way for either the input or output types to declare their mime content. In #7959, the missing piece is that there was no way to annotate the input mime type (which makes it hard to create generic structured IO writers), and it is inconvenient to pass around mime types when they can be generally implicit in the IO writer. In #13256, the missing piece is that there was no way to annotate the output mime type (which makes it hard to write converters). In #13825, like the old methods, there was no generic way for the input and output types to declare their mime content for method dispatch.
So here's my proposal for merging all of the above such that the user doesn't have to specify mime types, but that the system is optionally mime-aware where it matters:
# Basic byte IO
write(io::IO, data::Any) = io << reinterpret{bytes}(data)
write(io::IO, vector::Bytes) = io << vector
write(io::IO, str::String) = io << str
write(io::IO, char::Char) = io << char
# Basic MIME-aware declaration for input objects (THTT)
write(io::IO, md::Markdown) = write(io, mimetype(io), md)
write(io::IO, html::HTML) = write(io, mimetype(io), html)
write(io::IO, data::MIMEData) = write(io, mimetype(io), data)
# Basic MIME-half-aware declaration for text/plain objects
write(io::IO, mime::MIME"text/plain", str::String) = write(io, str)
write(io::IO, mime::MIME"text/plain", char::Char) = write(io, char)
mimetype(::IO) = MIME"text/plain"()
# Catch-all MIME writers fallback methods
write{mime<:MIME}(io::IO, ::mime, data::MIMEData{mime}) = write(io, data.vector)
write(io::IO, ::MIME"text/plain", data::MIMEData{MIME"text/plain"}) = write(io, data.vector) # ambiguity resolver for below
write(io::IO, mime::MIME"text/plain", obj::Any) = show(io, obj) # calls e.g. write(io, "obj")
# Markdown input (with specification of the output types it understands & generic `::IO`)
write(io::IO, mime::MIME"text/plain", md::Markdown) = write(io, mime, md as MIMEData{MIME"text/plain"}) # or write(io, md as String)
write(io::IO, mime::MIME"text/html", md::Markdown) = write(io, mime, md as MIMEData{MIME"text/html"})
# HTML output (with unknown object & unknown IO; and with specific override for catchall writer for the right mimetype)
write(io::IO, mime::MIME"text/html", obj::Any) = write_with_format(mime, io, MIME"text/plain"(), obj) # calls write(::HTMLOutput, ::MIME"text/plain", obj)
write(io::HTMLOutput, mime::MIME"text/html", dom::MIMEData{MIME"text/html"}) = append-child(io, dom)
write{imagetype}(io::IO, mime::MIME"text/html", data::MIMEData{MIME{imagetype}}) = if isimage(data)
write(io, mime, "<img src=\"data:$imagetype;base64,$(base64encode(data.vector))\">")
else
write(io, MIME"text/plain"(), data)
end
mimetype(::HTMLOutput) = MIME"text/html"()
edit: I forgot to add that, as mentioned in #7959 and above, we would have to provide a replacement for print also. I've added the missing mime declarations for Char and String above such that the following is correct: |
To better integrate into the current system, we could change the fallback text/plain definition above to call print:
write(io::IO, mime::MIME"text/plain", obj::Any) = print(io, obj) # instead of show
Any object (like Markdown) that wanted to overload this method would instead continue to overload
print(io::IO, str::String) = write(io, str) # instead of write(io::IO, mime::MIME"text/plain", str::String)
print(io::IO, char::Char) = write(io, char) # instead of write(io::IO, mime::MIME"text/plain", char::Char)
print(io::IO, md::Markdown) = write(io, md as String)
This effectively requires the usage of a short-hand method definition notation when the mime type is "text/plain", which is perhaps not a bad thing (instead of deprecating it). The fallback print methods then are mostly unchanged, aside from now becoming text/plain specialized and mime-aware otherwise:
print(io::IO, data...) = for-each(data) do x; print(io, x); end
print(io::IO, data::Any) = if is(mimetype(io), MIME"text/plain"())
show(io, data)
else
write(io, mimetype(io), data)
end |
@vtjnash, I agree that the backwards compatibility of putting the metadata in We discussed replacing |
I wasn't specifically aware of that method, since it doesn't seem particularly necessary. I could swap the argument order, but in reality, there's no need for the two-argument form of In #8987, I see your concern was that |
So, how would IJulia decide whether to send a |
Swapping the argument order to |
i'm not sure IJulia can reliably determine that ahead-of-time. it could post-process the html and determine that it has no formatting marks (e.g. but this Julep is supposed to allow for seamless merging mime-aware and mime-unaware objects, so that if a mime-unaware (text/plain) object tries to write a mime-aware object to a mime-aware stream, the child object will be able to render itself correctly. For example, this property is illustrated by the IOInterpolate type mentioned originally.
can you explain this more? even if they share the same name, I believe they are relatively independent concepts (in all cases, distinguished by their argument number). and If they didn't share the same name, I believe it would make sense to define: |
It's just that I dislike having more Detecting HTML text seems likely to be unreliable; I'm worried about false positives. I suppose you could manually define |
|
@vtjnash, suppose I am trying to display an object It sounds like, in your scheme, I would have to write |
The lossless representation of a plain text string in html requires the addition of |
Automatically adding |
inline html tags are valid for nesting inside in general, i believe this nesting question is no harder than for printing a text/plain array: the first array above prints nicely only because the alignment and printing methods have been hand-customized to degrade gracefully as the content becomes more complex. (note, my primary design goal with this part of the Julep is to fix JuliaLang/IJulia.jl#260; although I plan to leave the final implementation of the html backend to a separate PR, or external package) |
Right, I was thinking of |
Losslessly rendering random text into Markdown is probably a harder but semi-related problem. I'm not sure many properties even can be represented (color? structure? nested attributes?) I think my biggest question would be how to render mixed text: if the user inserts a small amount of text/html into a block of text, should it wrap the whole thing in a |
Keep in mind that you don't need a markdown syntax for everything you can render. The way of getting colored text could be something like this: """
This is some **Markdown** text with $(Color(:red, "colorful text")).
""" Then you support rendering Color span nodes to various output formats. |
that's no longer a markdown document, but a text-serialized Julia AST |
Maybe this? #18634 (comment) |
I think the main pieces are done. What remains seems to essentially be cleanup to make this solid for other packages to extend it:
|
This still needs a pass for 1.0 but it's mostly there. |
Anything left for 1.0 here? |
Are we happy with how |
There was a proposal at some point to make |
Currently we define that
|
I'm fine with that too. Given how often I use |
Only documentation seems to be needed here. |
On the topic of color / format: I've decided to go in a different direction with handling formatting commands in the out, and so far am happy with how it's looking. The PR for this exploratory current work is #27430. So far, I like to think of this new approach as the dual of IOContext: where IOContext lets you pass additional metadata into the input, the new IOFormatBuffer lets you return additional metadata in the output. This gives the consumer ultimate control over formatting decisions: with an Anyways, I still need to do more work to finish it (esp. fixing up array_show, documenting it with examples, and adding |
Anything left to be done here? Write the docs? |
Status?
Hi. I'm wondering if this is documented yet. If not, I might give it a try. I must admit I am often confused on this subject, and I'm pretty certain I have misused the show/display/print/... system in the past. I am looking to answer questions such as:
Also:
Some useful references I have found:
Sample contradictory info
Note that in his talk, Fredrik seems to imply (by his examples) we should typically be defining:
And yet one of the posts from @vtjnash seems to imply we should typically be defining the opposite:
(Though I admit this post is a bit old, and might be out-of-date). |
Clearly (I hope) explained in the manual: https://docs.julialang.org/en/v1/manual/types/#man-custom-pretty-printing
Generally neither. As noted in the link above, (A rare exception is if you are defining a new |
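For reference, the pattern described in the linked manual section looks roughly like the sketch below: a two-argument `show` for the compact form, plus an optional three-argument `show` for `text/plain` when a more verbose display is wanted. The `Polar` type is a hypothetical example, and this is only an illustration of the manual's advice, not a summary of the truncated comment above.

```julia
struct Polar
    r::Float64
    theta::Float64
end

# Two-argument show: the single-line form used by repr, print, string,
# and when the object appears inside a container.
Base.show(io::IO, p::Polar) = print(io, "Polar(", p.r, ", ", p.theta, ")")

# Three-argument show for text/plain: the more verbose form used when the
# object is displayed on its own, e.g. at the REPL.
function Base.show(io::IO, ::MIME"text/plain", p::Polar)
    print(io, "Polar coordinate:\n  r = ", p.r, "\n  θ = ", p.theta)
end

p = Polar(2.0, 0.5)
short = repr(p)                  # uses the two-argument method
verbose = repr("text/plain", p)  # uses the three-argument method
printed = string(p)              # print falls back to two-argument show
```

Note that `print` was not overloaded here; it falls back to the two-argument `show`, which is why `string(p)` and `repr(p)` agree.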
My suspicion is that this issue from 2015 can be closed nowadays. |
Thanks. I did not notice that before. I really would like to map out these call trees in a more visual way somehow. I'd also like to link this to the display subsystem more explicitly as well. I think what is a bit difficult for me is to extract the implicit purpose of a given function given its position in the call hierarchy. For example, the pretty-printing section puts a lot of emphasis to tie
(Well, at least that's the message that comes across the strongest for me) But I find the following intent (or purpose) of this method to be much more useful:
And @stevengj : I think your "exception" for the
I agree that the default behaviour of |
Problem statement:
Some objects have two irreconcilable representations: one view shows the structure of the object, while the other renders the content of the object. (Historically in Julia, these have been named show and print, respectively.)
The REPL usually wants to show the structure of an object, not just its value, so it wants to call show. This leads to the desired method call tree
display(ans) -> display(d, ans) -> show(io, ans) -> print(io, 'ans')
On the other hand, when outputting a document, the goal is to show only the value. This leads to the desired method call tree
stringmime(mime, ans) -> mimewrite(io, mime, ans) -> print(io, ans) -> print(io, 'ans')
In short, display/show are a pair, and mimewrite/print are a pair, and there are important differences between the two. The first pair prints a representation; the second pair prints the rendered content. Within each pair, there is also a difference: the first item is a document-based operation while the second is a streaming operation.
I posit that these are fundamental distinctions.
You might notice this is not the situation today; currently there is some confusion over this in Base: mimewrite is being called by display, which has meant that it defaults to calling show to make the REPL case not look broken. But the Markdown code correctly implements this as print, making the actual resulting behavior inconsistent across types. The problem is that doc-string printing works at the expense of Markdown objects not interacting properly in the REPL like other types.
By analogy to other parts of the Julia system, you could think of printing as the evaluation of a particular sort of AST tree, where evaluation = printing and AST tree = objects. For some objects, the evaluation step has no effect (e.g. print = show). For other objects (e.g. text and text-like documents), the evaluation step has the effect of stripping formatting from the string. I make this analogy because it hints at a component potentially missing from our IO system: quote and interpolation nodes.
More food-for-thought:
Because show/print are streaming operations, it is generally valid to make multiple calls and assume that the end result will be a valid / cohesive unit.
By contrast, it is typically invalid to call mimewrite multiple times on an IO object. It should be assumed that mimewrite also writes all of the header and footer information to complete the file object.
Display is a bit different from mimewrite in that it is valid to call it multiple times with the same display object. However, since display manages the document context internally, it is generally assumed that it is a document creator (while mimewrite is a document writer). I assume this dichotomy is what drove the current design of display -> mimewrite -> (show | print), but I believe this may have been incorrect (per above).
Proposed solution elements:
with_output_color in introduce IOContext and ImmutableDict to fix some of show, print, & friends #13825 for the intended implementation of this)
print(out::IO, io::IO) as equivalent to sendfile(out, in) (meaning that the rendered form of an IO object is the content that it contains). this has application for item 2
@doc would return an IOInterpolate(MarkdownDoc). Although, I think IOInterpolate would still print some sort of header like "$type Rendering:\n" for text/plain, or make a scrollable frame for text/html).
show to STDOUT. IJulia would call show to an HTMLBuilder
print(::HTMLBuilder, ::OtherType) to directly add HTML content. The fallback implementation of mimewrite for text/html would be print(io, "<html>", print(HTMLBuilder, value), "</html>").
I've completed a large portion of the work already in #13825 during my quest to better understand the nuances of this problem. (In particular, I believe IOContext and with_output_color are the complicated additions while the remaining pieces now are potentially just a bit of restructuring of the IO usage).