-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
complete update for 0.7 and lots of test fixes #1364
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This looks mostly good, but it can indeed be cleaned/simplified in light of #1364.
src/other/index.jl
Outdated
@@ -158,7 +158,7 @@ function Base.getindex(x::AbstractIndex, idx::AbstractVector) | |||
end | |||
length(idxs) == 0 && return Int[] # special case of empty idxs | |||
if idxs[1] isa Real | |||
if !all(v -> v isa Integer && !(v isa Bool), idxs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because it was breaking when a Symbol
was passed in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But what changed between 0.6 and 0.7?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing, it was breaking in 0.6 also. Since it only produced a warning nobody seems to have noticed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bkamins Does that look OK to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was breaking because we do not allow mixing indexing by number and symbol in the same index. It has to be either all numbers or all symbols. We have to exclude Bool
because we do not allow indexing by Bool
in this case and Bool <: Integer
.
Now conversion to Int
will fail on Symbol
anyway in line 169.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I think I was very confused about what the intent was there. will change.
src/other/utils.jl
Outdated
@@ -46,7 +46,9 @@ function makeidentifier(s::AbstractString) | |||
end | |||
end | |||
|
|||
return String(take!(res)) | |||
# return String(take!(res)) | |||
# TODO this is to fix a bug in Compat for 0.6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give more details? Is there an issue open in Compat.jl?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see this issue. I'm hoping it will be resolved one way or another before I'm even done with this PR...
src/DataFrames.jl
Outdated
esc(:(isdefined($sym))) | ||
end | ||
using Compat.IOBuffer | ||
const normalize = normalize_string # requires that we don't use LinearAlgebra normalize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather wrap this inside module Unicode
, and avoid using Unicode: normalize
in the other branch, so that you write Unicode.normalize
everywhere to avoid the ambiguity with LinearAlgebra.normalize
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do...
src/DataFrames.jl
Outdated
const kwpairs = pairs | ||
else | ||
kwpairs(x::AbstractArray) = (first(v) => last(v) for v in x) | ||
macro isdefined(sym) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to contribute this to Compat if you have the time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did that a while back but it was rejected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, as far as I know @isdefined
has been around for quite a while, so I think there's some reason it's not in Compat
, though I have no idea what that is. @ararslan , do you think you could find that issue or thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, can we perhaps keep it in DataFrames though? I'm trying very hard to put only 0.7 valid code into the main body so that we don't really have to do anything once we no longer support 0.6 (which considering the nature of 0.7 may not be very long)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, let's discuss this in Compat. But here I think you can just do isdefined(Base, :unique!)
, the fact that Base
was omitted is most likely an oversight.
Please don't hesitate to merge #1363 on account of this PR. I plan to incorporate those changes as well... |
src/DataFrames.jl
Outdated
@@ -99,6 +108,6 @@ include("dataframerow/show.jl") | |||
include("abstractdataframe/sort.jl") | |||
include("dataframe/sort.jl") | |||
|
|||
include("deprecated.jl") | |||
# include("deprecated.jl") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commenting this out was done for testing purposes... will be restored, please ignore
Can I remove code for pre 0.6? see here |
Ok, this has been way worse than I was expecting, but I'm slowly getting there. CategoricalArrays is in pretty bad shape for 0.7, looks like there will be a ton of warnings and even a couple of errors that may have to wait for that. The |
@@ -781,6 +782,11 @@ Base.hcat(df1::AbstractDataFrame, df2::AbstractDataFrame, dfn::AbstractDataFrame | |||
end | |||
end | |||
|
|||
|
|||
# TODO this function to be removed after CategoricalArrays update | |||
_promote_col_type(::Type{T}, n::Integer) where T = T(uninitialized, n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What kind of update is needed? Just support for the uninitialized
syntax?
(FWIW, use {T}
and {T<:...}
with one-line function syntax.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make the semantics consistent, they should support the T(uninitialized, n)
constructor. They don't necessarily have to do that, as far as I know Julia isn't necessarily trying to enforce this as a standard. Of course, I think they should support it. It's a design decision that will have to be made at some point, but for now, here's a work-around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I've prepared JuliaData/CategoricalArrays.jl#130.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought I'm not even sure defining this constructor for CategoricalArray
would be consistent. See my response in your PR.
src/abstractdataframe/io.jl
Outdated
@@ -139,10 +139,11 @@ function latex_char_escape(char::AbstractString) | |||
return string("\\", char) | |||
end | |||
end | |||
# in 0.7 this function will only ever get passed Char, in 0.6 always gets SubString | |||
latex_char_escape(char::Char) = latex_char_escape(string(char)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then better put definitions which are not needed on 0.7 under a version check, even if you need to duplicate code. The goal is that in a few months we have the cleanest code possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. 0.6 only works with SubString
and 0.7 only works with Char
, so I'll have to think about the best way of doing this...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The easiest solution is to have two completely separate blocks, one for each version, and duplicate the code as needed.
src/dataframe/dataframe.jl
Outdated
@@ -121,6 +121,8 @@ function DataFrame(pairs::Pair{Symbol,<:Any}...; makeunique::Bool=false)::DataFr | |||
DataFrame(columns, Index(colnames, makeunique=makeunique)) | |||
end | |||
|
|||
DataFrame(dict::Dict{Symbol,<:Any}) = DataFrame((x for x ∈ dict)...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this was really interesting I forgot to mention it. 0.7 throws a deprecation warning without this explicit constructor because the "automatic conversion" constructors are deprecated, but for the life of me I have no idea what the Dict
constructor actually wound up calling before. My first instinct was to do away with the Pair
constructor and have it feed into a Dict
constructor, but I had to do it the other way around because of makeunique
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Then Base.convert(::Type{DataFrame}, d::AbstractDict)
was called automatically on 0.6. On 0.7, the recommendation added in JuliaLang/julia#23273 is to define constructors, and have convert
methods call constructors explicitly (so the inverse of what we do now, mostly).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Note that splatting should generally be avoided when the number of elements can be large, as is the case here: it's much more efficient to pass the collection and loop over it inside the function.)
src/other/index.jl
Outdated
@@ -158,7 +158,7 @@ function Base.getindex(x::AbstractIndex, idx::AbstractVector) | |||
end | |||
length(idxs) == 0 && return Int[] # special case of empty idxs | |||
if idxs[1] isa Real | |||
if !all(v -> v isa Integer && !(v isa Bool), idxs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But what changed between 0.6 and 0.7?
test/cat.jl
Outdated
# | ||
# hcat | ||
# | ||
|
||
nvint = [1, 2, missing, 4] | ||
nvstr = ["one", "two", missing, "four"] | ||
|
||
df2 = DataFrame(Any[nvint, nvstr]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes are no longer needed now.
test/show.jl
Outdated
@@ -1,12 +1,19 @@ | |||
module TestShow | |||
using Compat, Compat.Test, DataFrames | |||
|
|||
if VERSION ≥ v"0.7.0-" | |||
using Random |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use Compat.Random
. Instead of dataframe_str
, I think you can just do $DataFrame
inside strings.
test/utils.jl
Outdated
@@ -1,5 +1,11 @@ | |||
module TestUtils | |||
using Compat, Compat.Test, DataFrames, StatsBase | |||
if VERSION ≤ v"0.7.0-" | |||
using Compat.Random |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't need to be under a version check. Also use Compat.@warn
.
test/utils.jl
Outdated
# @test_throws ArgumentError DataFrames.make_unique([:x, :x, :x_1, :x2], makeunique=false) | ||
@test DataFrames.make_unique([:x, :x_1, :x2], makeunique=false) == [:x, :x_1, :x2] | ||
# TODO below is commented out for now for warning sanity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol, I don't even remember... will change
test/utils.jl
Outdated
@@ -30,7 +37,7 @@ module TestUtils | |||
@test rw == DataFrames.RESERVED_WORDS | |||
end | |||
else | |||
warn("Unable to validate reserved words against parser. ", | |||
@warn("Unable to validate reserved words against parser. ", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect indentation.
test/utils.jl
Outdated
@@ -150,6 +157,8 @@ module TestUtils | |||
|
|||
""" | |||
out = String(take!(io)) | |||
# TODO this test is very sensitive to irrelevant details of show implementations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that you can use $Missing
and $CategoricalValue
to include some variations automatically.
Ok! I have cleaned everything up the best I think is currently possible. Some comments:
Hopefully we can avoid master having DataFrames related test errors in the future? 😉 Tagging @nalimilan , please let me know any changes you'd like me to make so we can get this merged. |
Hello all, I'd like to make a general plea that warnings be taken more seriously in the future. Getting this going on 0.7 was surprisingly painful and that was much exacerbated by the many warnings already appearing in the tests under normal circumstances. Not only to these make updates more difficult, but they can hide real issues, a couple of examples of which can be seen in this PR. This also makes it a lot harder to notice legitimate warnings from Base Julia concerning needed changes and warnings that might be inappropriately given to users. Lastly, the warnings make it more difficult for people to contribute to the package, since it can be hard to figure out what then intent of some of the tests and warnings are (as it sometimes was for me). Note that I have moved all tests of deprecated functionality which deliberately produce warnings to a separate |
src/dataframe/dataframe.jl
Outdated
@@ -121,6 +121,8 @@ function DataFrame(pairs::Pair{Symbol,<:Any}...; makeunique::Bool=false)::DataFr | |||
DataFrame(columns, Index(colnames, makeunique=makeunique)) | |||
end | |||
|
|||
DataFrame(dict::Dict{Symbol,<:Any}) = convert(DataFrame, dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be Dict
or AbstractDict
?
src/other/index.jl
Outdated
@@ -158,7 +158,7 @@ function Base.getindex(x::AbstractIndex, idx::AbstractVector) | |||
end | |||
length(idxs) == 0 && return Int[] # special case of empty idxs | |||
if idxs[1] isa Real | |||
if !all(v -> v isa Integer && !(v isa Bool), idxs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was breaking because we do not allow mixing indexing by number and symbol in the same index. It has to be either all numbers or all symbols. We have to exclude Bool
because we do not allow indexing by Bool
in this case and Bool <: Integer
.
Now conversion to Int
will fail on Symbol
anyway in line 169.
z = vcat(v, x) | ||
|
||
# TODO: uncomment the line below after deprecation | ||
# @test_throws ArgumentError z[:, [1, 1, 2]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this line (61 and 62) is removed? I think we should test that it throws an error after deprecation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but currently this is not fully deprecated behavior so it does not yet throw an error. When automatic generation of column names from indexing is fully deprecated, this line should be un-commented. Note that we test the current behavior in deprecation.jl
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better keep this commented out then.
test/index.jl
Outdated
@@ -13,11 +14,11 @@ inds = Any[1, | |||
[big(1)], | |||
big(1):big(1), | |||
[:A], | |||
Union{Bool, Missing}[true, false], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On my machine it does not throw deprecation warning. The vector does not contain any missing and getindex
for vector of Union{Bool, Missing}
tests for it. And this specific case is not tested in "deprecated.jl".
@ExpandingMan Huge work. Thank you. I have left my comments inline. |
Think I've complied with all comments now, so we should be good. Glad to help! I have a few things that used DataFrames that I was eager to get running on 0.7, so this seemed fairly high priority. Note that I still get a non- CategoricalArrays seems like it's in much worse shape. I hope to at least make enough fixes to that so that it doesn't screw everything else up, including DataFrames (currently we still have failing tests on 0.7). |
I agree that warnings are problematic in general. I don't like warnings because they indicate something is wrong, yet no failure happens. Ideally we would turn deprecations from DataFrames into errors (but not those from other packages), like Julia does, but I don't think that's possible currently, and tests for deprecated features would not be possible. |
test/cat.jl
Outdated
@test df[1] == CategoricalVector(1:10) | ||
DataFrames.hcat!(df, 1:10, makeunique=true) | ||
@test df[2] == collect(1:10) | ||
DataFrames.hcat!(df, collect(1:10), makeunique=true) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for this new line break.
test/cat.jl
Outdated
@@ -51,23 +51,27 @@ module TestCat | |||
|
|||
@testset "hcat ::AbstractVectors" begin | |||
df = DataFrame() | |||
DataFrames.hcat!(df, CategoricalVector{Union{Int, Missing}}(1:10)) | |||
dfb = DataFrame(x1=CategoricalVector{Union{Int,Missing}}(1:10)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change needed? Same below. Please revert the name changes.
test/data.jl
Outdated
using Compat, Compat.Test, DataFrames | ||
importall Base # so that we get warnings for conflicts | ||
using Compat, Compat.Test, DataFrames, Compat.Random | ||
# TODO this already generates a huge wall of warnings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving a note that something needs to be done about this.
test/data.jl
Outdated
df3 = DataFrame(Any[[1, 2, missing, 4]]) | ||
df4 = DataFrame(Any[Vector{Union{Int, Missing}}(1:4), Vector{Union{Int, Missing}}(1:4)]) | ||
df5 = DataFrame(Any[Union{Int, Missing}[1, 2, 3, 4], ["one", "two", missing, "four"]]) | ||
df2 = DataFrame(Any[[1, 2, missing, 4], ["one", "two", missing, "four"]], [:x1, :x2]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are no longer needed AFAICT. Same below in several places.
z = vcat(v, x) | ||
|
||
# TODO: uncomment the line below after deprecation | ||
# @test_throws ArgumentError z[:, [1, 1, 2]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better keep this commented out then.
test/join.jl
Outdated
@test join(nameid, jobid, on = :ID, kind = :right) == right[on] | ||
@test join(nameid, jobid, on = :ID, kind = :semi) == semi[on] | ||
@test join(nameid, jobid, on = :ID, kind = :anti) == anti[on] | ||
on_ = [:ID] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid renaming variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All such renames are in order to avoid extraneous warnings. Do you want me to revert to warnings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, as I said I prefer keeping the code clean and wait until the (spurious) warnings go away than living with weird variable names forever.
test/join.jl
Outdated
@@ -71,63 +72,56 @@ module TestJoin | |||
# Cross joins don't take keys | |||
@test_throws ArgumentError join(df1, df2, on = :A, kind = :cross) | |||
|
|||
@testset "Test empty inputs 1" begin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but these are the same warnings as when reusing object names, we don't care since we don't want to overwrite the method. Please keep this.
test/utils.jl
Outdated
Number Unique: 4 | ||
Number Missing: 0 | ||
% Missing: 0.000000 | ||
|
||
""" | ||
out = String(take!(io)) | ||
@test (out == missingfirst || out == missingsecond) | ||
# TODO this test is very sensitive to irrelevant details of show implementations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better remove this comment.
test/utils.jl
Outdated
@@ -66,7 +67,7 @@ module TestUtils | |||
describe(io, df) | |||
DRT = CategoricalArrays.DefaultRefType | |||
# Julia 0.7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's nothing 0.7-specific AFAICT. Same below.
@@ -781,6 +782,11 @@ Base.hcat(df1::AbstractDataFrame, df2::AbstractDataFrame, dfn::AbstractDataFrame | |||
end | |||
end | |||
|
|||
|
|||
# TODO this function to be removed after CategoricalArrays update |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once JuliaLang/METADATA.jl#13598 is merged, please revert this, require CategoricalArrays 0.3.5, and also adjust uses of the current constructors, e.g. in allocate
.
@ExpandingMan Regarding your question about |
Yeah, my confusion has cleared up a bit. It seems that |
Exactly! Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I think it's almost ready after fixing the things I've noted. Just remove all the unnecessary churn due to renaming variables, which should make the PR much more readable.
test/data.jl
Outdated
const ≅ = isequal | ||
|
||
#test_group("constructors") | ||
df1 = DataFrame(Any[[1, 2, missing, 4], ["one", "two", missing, "four"]], [:Ints, :Strs]) | ||
df2 = DataFrame(Any[[1, 2, missing, 4], ["one", "two", missing, "four"]]) | ||
df3 = DataFrame(Any[[1, 2, missing, 4]]) | ||
df3 = DataFrame(Any[[1, 2, missing, 4]], [:x1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed.
test/data.jl
Outdated
using Compat, Compat.Test, DataFrames | ||
importall Base # so that we get warnings for conflicts | ||
using Compat, Compat.Test, DataFrames, Compat.Random | ||
# TODO: need to find a better way of accomplishing below |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving another note since the previous one has been closed for some reason. Please either keep this or remove it, but adding comments doesn't help fixing the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You want me to retain the importall
or remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say remove it, anyway the comment probably referred to ambiguity warnings, which are no longer printed.
src/DataFrames.jl
Outdated
const kwpairs = pairs | ||
else | ||
kwpairs(x::AbstractArray) = (first(v) => last(v) for v in x) | ||
using Compat.IOBuffer | ||
import Base.LinAlg: normalize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I suggested before, what happens if you remove this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a Compat update so I was really hoping I could do this properly, but it's still really screwed up. When you do using Compat.Unicode: normalize
in 0.6 it doesn't import the normalize_str
methods. Anyway, this can't be removed, because it throws an error telling you that you must explicitly import LinAlg.normalize
. Sorry I've been hesitant to make PR's to Compat, but there are so many gotchas and existing bugs that doing that is rather difficult.
This should not break if Compat gets fixed, especially because Unicode
doesn't even export normalize
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Doesn't it work if you do using Compat.Unicode
and then Unicode.normalize
for every use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, you should be able to use it soon (don't forget to update REQUIRE): JuliaLang/METADATA.jl#13615
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome, thanks, I will wait until METADATA gets updated and try it out.
test/dataframe.jl
Outdated
@@ -522,7 +508,7 @@ module TestDataFrame | |||
end | |||
|
|||
@testset "misc" begin | |||
df = DataFrame(Any[collect('A':'C')]) | |||
df = DataFrame(Any[collect('A':'C')], [:x1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed (same below).
src/dataframe/dataframe.jl
Outdated
@@ -121,6 +121,8 @@ function DataFrame(pairs::Pair{Symbol,<:Any}...; makeunique::Bool=false)::DataFr | |||
DataFrame(columns, Index(colnames, makeunique=makeunique)) | |||
end | |||
|
|||
DataFrame(dict::AbstractDict) = convert(DataFrame, dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You appear to have missed my previous comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
? Wasn't it regarding splatting? This is no longer splatting. Can you explain what you'd like me to change here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean this:
On 0.7, the recommendation added in JuliaLang/julia#23273 is to define constructors, and have convert methods call constructors explicitly (so the inverse of what we do now, mostly).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry I had missed that. Will change now.
src/abstractdataframe/io.jl
Outdated
return "\\textasciitilde{}" | ||
else | ||
return string("\\", char) | ||
if VERSION ≥ v"0.7.0-" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's 0.7.0-DEV.4059
.
x = DataFrame(a = [1, 2, 3], b = [4, 5, 6]) | ||
v = DataFrame(a = [5, 6, 7], b = [8, 9, 10]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still need to do something about this: https://github.com/JuliaData/DataFrames.jl/pull/1364/files#r171626211
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did, it's in deprecated.jl
now. This dataframe v
is removed here because it is no longer used in this file (appears again in deprecated.jl
).
@test df[1] == CategoricalVector(1:10) | ||
DataFrames.hcat!(df, 1:10, makeunique=true) | ||
@test df[2] == collect(1:10) | ||
DataFrames.hcat!(df, collect(1:10), makeunique=true) | ||
@test df[3] == collect(1:10) | ||
|
||
df = DataFrame() | ||
df2 = hcat(CategoricalVector{Union{Int, Missing}}(1:10), df) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you revert all the renames in this file and the following files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, don't worry, I'm not ignoring you, I was busy...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries. I leave new comments because GitHub hides the old ones and it's hard to keep track.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I also never know if you are aware that I made changes, though by now it's seeming that you must get a notification whenever I commit...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I send approve not to block merging when @nalimilan is OK with the PR (but I will try to review the final version of changes again if I will have time).
test/grouping.jl
Outdated
df = DataFrame(Key1 = ["A", missing, "B", "B", "A"], | ||
Key2 = CategoricalArray(["B", "A", "A", missing, "A"]), | ||
Value = 1:5) | ||
df= DataFrame(Key1 = ["A", missing, "B", "B", "A"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrong indentation.
test/show.jl
Outdated
│ Col # │ Name │ Eltype │ Missing │ Values │ | ||
├───────┼──────┼──────────────────────────────────┼─────────┼─────────────────┤ | ||
│ 1 │ Fish │ String │ 0 │ Suzy … Amir │ | ||
│ 2 │ Mass │ Union{Float64, Missings.Missing} │ 1 │ 1.5 … missing │""" | ||
│ 2 │ Mass │ Union{Float64, $Missing} │ 1 │ 1.5 … missing │""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just realized this probably won't work on 0.7, where the Missing
prefix won't be present, as the other columns will be less wide than in the string. So I guess we'll need two blocks with a VERSION
check. There's another case below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch!
Thanks for pushing that last commit. Sorry, as this wraps up I'm starting to lose track of things. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, look good now!
Yay! In the future if I ever do something this big I'll probably split it into multiple PR's. I suspect that might save us all a bit of sanity. |
Hmm, I wonder why some lines are not covered by tests anymore. For example, compare: EDIT: scratch that, the links I posted actually show that the regression is older. EDIT: filed #1372 |
That was added for a reason, though now I don't remember exactly what that was. I expect this to break something. |
Yeah, there's a deprecation warning about it, but let's handle this in the next round of fixes. Else we need more tests, which is going to hold off the PR. |
Oh! I remember what it fixes. Without it you can't do e.g. |
Let's handle this in another PR, anyway we're not there yet! |
Acknowledged, opened an issue so we don't forget. |
Hello all, I'm not done with this yet but I wanted to put the PR up now so people know that it's being worked on and I have made significant progress.
Apart from the fixes for 0.7 I have noticed that the tests are quite a mess right now. It is very difficult to get 0.7 up and working without clean tests, so I am prioritizing getting the tests nice and spiffy.
I am fixing a huge number of deprecation warnings (that originate in DataFrames.jl) in the tests. I would like to suggest that if we want to keep tests of deprecated functionality, these should be moved to a dedicated test module because otherwise it makes cleaning up tests that are actually problematic rather difficult. This can also cover up real issues. For instance, does
getindex
require amakeunique
keyword? Suppose you dodf[:, [1,1,2]]
as in the tests? I'll post an issue about this (if there isn't one already).I may not get to finish this until the end of the week, but I will post updates soon.
Wow, what timing, looks like there will be significant overlap between here and #1363. Perhaps I'll go back and see if I can branch off of that.