update docs following CSV.jl 0.9 release #2865

Merged
merged 9 commits into from
Sep 9, 2021
6 changes: 3 additions & 3 deletions src/abstractdataframe/io.jl
@@ -49,7 +49,7 @@ function getmaxwidths(df::AbstractDataFrame,

undefstrwidth = ourstrwidth(io, "#undef", buffer, truncstring)

-ct = show_eltype ? batch_compacttype(Any[eltype(c) for c in eachcol(df)]) : String[]
+ct = show_eltype ? batch_compacttype(Any[eltype(c) for c in eachcol(df)], 9) : String[]
j = 1
for (col_idx, (name, col)) in enumerate(pairs(eachcol(df)))
# (1) Consider length of column name
@@ -211,7 +211,7 @@ function _show(io::IO, ::MIME"text/html", df::AbstractDataFrame;
# which the users can hover over. The limit of 256 characters is arbitrary, but
# we want some maximum limit, since the types can sometimes get really-really long.
types = Any[eltype(df[!, idx]) for idx in 1:mxcol]
-ct, ct_title = batch_compacttype(types), batch_compacttype(types, 256)
+ct, ct_title = batch_compacttype(types, 9), batch_compacttype(types, 256)
for j in 1:mxcol
s = html_escape(ct[j])
title = html_escape(ct_title[j])
@@ -380,7 +380,7 @@ function _show(io::IO, ::MIME"text/latex", df::AbstractDataFrame;
write(io, "\t\\hline\n")
if eltypes
write(io, "\t& ")
-ct = batch_compacttype(Any[eltype(df[!, idx]) for idx in 1:mxcol])
+ct = batch_compacttype(Any[eltype(df[!, idx]) for idx in 1:mxcol], 9)
header = join(latex_escape.(ct), " & ")
write(io, header)
mxcol < size(df, 2) && write(io, " & ")
4 changes: 2 additions & 2 deletions src/abstractdataframe/show.jl
@@ -82,7 +82,7 @@ function batch_compacttype(types::Vector{Any}, maxwidths::Vector{Int})
end
end

-function batch_compacttype(types::Vector{Any}, maxwidth::Int=8)
+function batch_compacttype(types::Vector{Any}, maxwidth::Int)
cache = Dict{Type, String}()
return map(types) do T
get!(cache, T) do
@@ -100,7 +100,7 @@ For displaying data frame we do not want string representation of type to be
longer than `maxwidth`. This function implements rules how type names are
cropped if they are longer than `maxwidth`.
"""
-function compacttype(T::Type, maxwidth::Int=8)
+function compacttype(T::Type, maxwidth::Int)
maxwidth = max(8, maxwidth)

T === Any && return "Any"
12 changes: 6 additions & 6 deletions test/io.jl
@@ -22,7 +22,7 @@ import Main: QuoteTestType
\\begin{tabular}{r|ccccccc}
\t& A & B & C & D & E & F & G\\\\
\t\\hline
\t& Int64 & String & String & Float64? & Cat…? & String & MD\\\\
\t& Int64 & String & String & Float64? & Cat…? & String & MD\\\\
\t\\hline
\t1 & 1 & \\\$10.0 & A & 1.0 & a & \\emph{\\#undef} & \\href{http://juliadata.github.io/DataFrames.jl}{DataFrames.jl} \\\\
\t2 & 2 & M\\&F & B & 2.0 & \\emph{missing} & \\emph{\\#undef} & \\#\\#\\#A \\\\
@@ -167,7 +167,7 @@ end
@test repr(MIME("text/html"), df) ==
"<div class=\"data-frame\"><p>4 rows × 2 columns</p>" *
"<table class=\"data-frame\"><thead><tr><th></th><th>A</th><th>B</th></tr><tr><th></th>" *
"<th title=\"Int64\">Int64</th><th title=\"Markdown.MD\">MD</th></tr></thead>" *
"<th title=\"Int64\">Int64</th><th title=\"Markdown.MD\">MD</th></tr></thead>" *
"<tbody><tr><th>1</th><td>1</td>" *
"<td><div class=\"markdown\"><p><a href=\"http://juliadata.github.io/DataFrames.jl\">DataFrames.jl</a></p>\n</div></td></tr>" *
"<tr><th>2</th><td>4</td><td><div class=\"markdown\"><p>###A</p>\n</div></td></tr>" *
@@ -196,7 +196,7 @@ end
"<th></th>" *
"<th title=\"String\">String</th>" *
"<th title=\"Any\">Any</th>" *
-"<th title=\"QuoteTestType{&apos;&quot;&apos;}\">QuoteTe…</th>" *
+"<th title=\"QuoteTestType{&apos;&quot;&apos;}\">QuoteTes…</th>" *
"</tr>" *
"</thead><tbody>" *
"<tr>" *
@@ -290,7 +290,7 @@ end
"""
8×2 DataFrame
Row │ A B
│ Int64 MD
│ Int64 MD
─────┼──────────────────────────────────────────
1 │ 1 DataFrames.jl (http://juliadat…
2 │ 4 \\frac{x^2}{x^2+y^2}
@@ -305,7 +305,7 @@ end
"""
8×2 DataFrame
Row │ A B
│ Int64 MD
│ Int64 MD
─────┼──────────────────────────────────────────
1 │ 1 DataFrames.jl (http://juliadat…
2 │ 4 \\frac{x^2}{x^2+y^2}
@@ -355,7 +355,7 @@ end
"<div class=\"data-frame\"><p>8 rows × 2 columns</p>" *
"<table class=\"data-frame\"><thead>" *
"<tr><th></th><th>A</th><th>B</th></tr>" *
"<tr><th></th><th title=\"Int64\">Int64</th><th title=\"Markdown.MD\">MD</th></tr>" *
"<tr><th></th><th title=\"Int64\">Int64</th><th title=\"Markdown.MD\">MD</th></tr>" *
"</thead>" *
"<tbody>" *
"<tr><th>1</th><td>1</td><td><div class=\"markdown\">" *
4 changes: 2 additions & 2 deletions test/show.jl
@@ -301,8 +301,8 @@ end
@test sprint(show, df) == """
1×3 DataFrame
Row │ a b c
-│ Date DateTime Day
-─────┼────────────────────────────────────────
+│ Date DateTime Dates.Day
Member

Just curious: why don't we print Dates.DateTime?

Anyway it would make sense to never print the module name to avoid this kind of variation (which is hard to understand for users).

Member Author


> Just curious: why don't we print Dates.DateTime?

Because we match the width of the type name only to the width of the column name, not to the displayed contents (which can vary). I guess @ronisbr could change this if we asked for it. E.g.:

julia> DataFrame(x=Dates.DateTime(1))
1×1 DataFrame
 Row │ x
     │ DateTime…
─────┼─────────────────────
   1 │ 0001-01-01T00:00:00

julia> DataFrame(xxxxxxxxxxxxxxxx=Dates.DateTime(1))
1×1 DataFrame
 Row │ xxxxxxxxxxxxxxxx
     │ Dates.DateTime
─────┼─────────────────────
   1 │ 0001-01-01T00:00:00

> Anyway it would make sense to never print the module name to avoid this kind of variation (which is hard to understand for users).

We could find the last dot (`.`) in a sequence like `XXX.YYYY.ZZZZ` that consists of valid Julia identifiers and strip everything before it, but it is quite tricky to do this in a 100% correct way (although it is doable).
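
As a rough illustration of the idea, here is a minimal sketch. The helper name `strip_module_prefix` is hypothetical and is not part of DataFrames.jl; it only handles a module path in front of the outermost type name and deliberately leaves anything inside `{...}` untouched, which is one of the cases that makes a fully correct version tricky.

```julia
# Hypothetical helper (an assumption for illustration, not DataFrames.jl API):
# drop a leading module path such as "Dates." from a printed type name.
function strip_module_prefix(s::AbstractString)
    # Split the string at the first '{' so that dots inside type
    # parameters are never considered.
    i = findfirst('{', s)
    head = i === nothing ? s : SubString(s, 1, prevind(s, i))
    tail = i === nothing ? "" : SubString(s, i)

    parts = split(head, '.')
    # Strip only when every dot-separated piece is a valid Julia identifier;
    # otherwise return the input unchanged.
    all(Base.isidentifier, parts) || return String(s)
    return String(last(parts)) * tail
end

strip_module_prefix("Dates.DateTime")     # "DateTime"
strip_module_prefix("Int64")              # "Int64"
strip_module_prefix("Vector{Dates.Day}")  # unchanged: the dot is inside braces
```

With such a rule both `Dates.Day` and `Dates.DateTime` would be printed without the module name regardless of the column width.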

Member


If that's doable, I'd say it would be better to take into account the final width of the column (including its contents). Otherwise some space is wasted that in some cases could be used to print relevant information.
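
A minimal sketch of what this could mean, assuming the rendered cell strings are already available when the type row is sized. The function `type_row_width` and its `minwidth` keyword are hypothetical names used only for illustration; the lower bound of 8 mirrors the `max(8, maxwidth)` clamp in `compacttype`.

```julia
# Hypothetical helper (an assumption for illustration, not DataFrames.jl API):
# width available for the type name of one column when both the column name
# and the rendered cell contents are taken into account.
function type_row_width(name::AbstractString, cells::Vector{String};
                        minwidth::Int=8)
    # Widest rendered cell; 0 for a column with no rows.
    content = isempty(cells) ? 0 : maximum(textwidth, cells)
    return max(minwidth, textwidth(name), content)
end

w = type_row_width("x", ["0001-01-01T00:00:00"])
# The column is as wide as its longest cell, so "Dates.DateTime"
# would fit without truncation.
textwidth("Dates.DateTime") <= w
```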

Member Author


@ronisbr - is this EASILY doable? (If it is hard, it is probably better to focus on HTML/LaTeX support first.) Thank you!

+─────┼────────────────────────────────────────────
1 │ 2020-02-11 2020-02-11T15:00:00 1 day"""

# Irrational