update docs following CSV.jl 0.9 release #2865

bkamins · 2021-09-08T06:42:00Z

Fixes #2864

bkamins · 2021-09-08T08:04:16Z

waiting for a decision in JuliaData/WeakRefStrings.jl#85

bkamins · 2021-09-08T22:35:32Z

@nalimilan - I have updated the docs. I left String[X] types and added a paragraph explaining how they work.

quinnj

I restarted the jobs now that WeakRefStrings has released

nalimilan

Looks much better now!

nalimilan · 2021-09-09T08:06:26Z

Though doctests fail due to that String15… weirdness:

│ Evaluated output:
│ 
│ 3×5 DataFrame
│  Row │ Species          SepalLength_mean  SepalWidth_mean  PetalLength_mean  P ⋯
│      │ String15…        Float64           Float64          Float64           F ⋯
│ ─────┼──────────────────────────────────────────────────────────────────────────
│    1 │ Iris-setosa                 5.006            3.418             1.464    ⋯
│    2 │ Iris-versicolor             5.936            2.77              4.26
│    3 │ Iris-virginica              6.588            2.974             5.552
│                                                                 1 column omitted
│ 
│ Expected output:
│ 
│ 3×5 DataFrame
│  Row │ Species          SepalLength_mean  SepalWidth_mean  PetalLength_mean  P ⋯
│      │ String15         Float64           Float64          Float64           F ⋯
│ ─────┼──────────────────────────────────────────────────────────────────────────
│    1 │ Iris-setosa                 5.006            3.418             1.464    ⋯
│    2 │ Iris-versicolor             5.936            2.77              4.26
│    3 │ Iris-virginica              6.588            2.974             5.552
│                                                                 1 column omitted

Cc: @ronisbr

bkamins · 2021-09-09T08:44:04Z

> Though doctests fail due to that String15… weirdness:

I will check the reason for the failure and fix it; it is most likely on the DataFrames.jl side. The strange thing is that what I put into this PR passes on my local machine.

bkamins · 2021-09-09T09:06:27Z

The problem was the following:

julia> using DataFrames

julia> import Dates

julia> DataFrame(d=Dates.Date(1))
1×1 DataFrame
 Row │ d
     │ Date…
─────┼────────────
   1 │ 0001-01-01

julia> using Dates

julia> DataFrame(d=Dates.Date(1))
1×1 DataFrame
 Row │ d
     │ Date
─────┼────────────
   1 │ 0001-01-01

I have now fixed it so that the module name is never printed as a prefix of the type (so the latter output is shown consistently).

This way a data frame is displayed in the same way whether the module is loaded with using, with import, or only indirectly by some other module.

nalimilan · 2021-09-09T09:25:17Z

src/abstractdataframe/show.jl

@@ -106,12 +106,12 @@ function compacttype(T::Type, maxwidth::Int=8)
    T === Any && return "Any"
    T === Missing && return "Missing"

-    sT = string(T)
+    sT = string(T isa Union ? T : nameof(T))


For full consistency we should recursively call nameof on each part of the union, as otherwise the module name is printed.

bkamins

It is even more complicated 😞. I will try to propose something reasonable. The problem is that nameof drops parameters from parametric types.
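To illustrate the point about nameof, here is a small sketch using only Base Julia (it is not the code from this PR). It shows that nameof returns the bare type name without a module prefix but also discards type parameters, which is why it cannot simply replace string(T):

```julia
# nameof returns a Symbol with no module prefix, but also no type parameters.
plain = nameof(Float64)          # :Float64
dropped = nameof(Vector{Int})    # :Array (element type and dimension are lost)

println(plain)
println(dropped)

# Union types are special-cased in the diff above. Whether nameof accepts
# them can be checked without risking an error:
println(applicable(nameof, Union{Int, Missing}))
```

So a column of type Vector{Int} would be labelled just Array if nameof were used on its own.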

bkamins · 2021-09-09T10:58:41Z

The patches led me to discover that we previously printed the MD type incorrectly, which went uncaught, and that we printed type information inconsistently in text vs HTML/LaTeX. I have now made it more consistent (using a width of 9 text characters by default; @ronisbr, this might affect your work on HTML and LaTeX output, but these are minor changes).

bkamins · 2021-09-09T12:52:47Z

Just a comment on why 9, in case someone asks: the String[X] type names are up to 9 characters long (and this is also the number we already used in text/plain; I just synced HTML and LaTeX with this reference).
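As a quick check of the width argument, assume the String[X] family consists of the names listed below. The list is an assumption for illustration; it is not spelled out in this thread. Under that assumption the longest name has 9 characters:

```julia
# Assumed names of the fixed-width string types (String[X]); illustrative only.
string_type_names = ["String1", "String3", "String7", "String15",
                     "String31", "String63", "String127", "String255"]

# Width needed so that none of these names has to be truncated with "…".
widest = maximum(length, string_type_names)
println(widest)
```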

bkamins · 2021-09-09T12:53:11Z

Thank you!

nalimilan · 2021-09-09T15:09:22Z

test/show.jl

@@ -301,8 +301,8 @@ end
    @test sprint(show, df) == """
        1×3 DataFrame
         Row │ a           b                    c
-             │ Date        DateTime             Day
-        ─────┼────────────────────────────────────────
+             │ Date        DateTime             Dates.Day


Just curious: why don't we print Dates.DateTime?

Anyway it would make sense to never print the module name to avoid this kind of variation (which is hard to understand for users).

bkamins

> Just curious: why don't we print Dates.DateTime?

Because we match the type width only to the column name, not to the shown contents (which can vary). I guess @ronisbr could change this if we asked for it. E.g.:

julia> DataFrame(x=Dates.DateTime(1))
1×1 DataFrame
 Row │ x
     │ DateTime…
─────┼─────────────────────
   1 │ 0001-01-01T00:00:00

julia> DataFrame(xxxxxxxxxxxxxxxx=Dates.DateTime(1))
1×1 DataFrame
 Row │ xxxxxxxxxxxxxxxx
     │ Dates.DateTime
─────┼─────────────────────
   1 │ 0001-01-01T00:00:00

> Anyway it would make sense to never print the module name to avoid this kind of variation (which is hard to understand for users).

We could find the last dot (i.e. .) where the sequence XXX.YYYY.ZZZZ consists of valid Julia identifiers and strip the front of the string, but it is quite tricky to do in a 100% correct way (though of course it is doable).
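A rough sketch of the string-based idea described above, for illustration only. The helper name strip_module_prefixes and the regular expression are hypothetical and are not what DataFrames.jl implements. As noted, getting this fully correct is tricky, and this version only handles ordinary dotted identifier chains:

```julia
# Hypothetical helper (not part of DataFrames.jl): remove leading `Module.`
# qualifiers from every dotted identifier chain in a type's string form.
function strip_module_prefixes(s::AbstractString)
    # One or more `identifier.` groups that start a chain (not preceded by a
    # word character or a dot) and are followed by the start of an identifier.
    qualifier = r"(?<![\w.])(?:[\p{L}_][\w!]*\.)+(?=[\p{L}_])"
    return replace(s, qualifier => "")
end

println(strip_module_prefixes("Dates.Day"))
println(strip_module_prefixes("Union{Missing, Dates.DateTime}"))
```

Because the pattern is applied to the whole string, qualifiers inside type parameters and unions are stripped as well, so parameters are kept, unlike with nameof.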

nalimilan

If that's doable, I'd say that taking into account the final width of the column (including contents) would be better. Otherwise there's some wasted space which in some cases could be used to print relevant information.

bkamins

@ronisbr - is this EASILY doable? (If it is hard, it is probably better to focus on HTML/LaTeX support first.) Thank you!

update docs following CSV.jl 0.9 release

be6158b

bkamins added the doc label Sep 8, 2021

bkamins added this to the patch milestone Sep 8, 2021

bkamins requested a review from nalimilan September 8, 2021 06:42

update for CSV.jl 0.9.1

10efa01

quinnj approved these changes Sep 9, 2021


nalimilan approved these changes Sep 9, 2021


fix type printing inconsistency

d79aa5d

fix union case

27fdc88

nalimilan reviewed Sep 9, 2021


bkamins added 3 commits September 9, 2021 11:35

another attempt to fix things

4c21c80

correct off-by-one issue in printing

2edfa1e

remove NEWS.md entry

336f66e

nalimilan approved these changes Sep 9, 2021


small updates in printing

c6a99c2

small test update

aeae38b

bkamins merged commit 39c04ac into main Sep 9, 2021

bkamins deleted the bk/docs_csv branch September 9, 2021 12:53

nalimilan reviewed Sep 9, 2021

