Add :setequal to cols kwarg in vcat, push! and append! #1958
Good catch. I guess we should use
This PR also should be good for a review
This adds @rapus95: could you please have a look:
Thank you!
This PR should also cover the following case, right?
df = DataFrame()
push!(df, (a=1, b=2, c=3), cols=:union)
I would find this very convenient in practice: I store in a data frame some observations taken during the iterations of a simulation, and while prototyping I quite often change my mind about which observations to actually store.
Actually:
works already. This PR would cover the following case additionally:
and
Right, I don't know why I led myself to believe that didn't work. Sorry for the noise.
No problem - it is better to be explicit - let us just make sure that what we provide is what users want and would use 😄.
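For readers following along, here is a minimal sketch of the prototyping use case described above. It assumes a released DataFrames.jl in which `push!` accepts `cols=:union`; the column names `a`, `b`, `c`, `d` are purely illustrative.

```julia
using DataFrames

# Start from an empty data frame; the first pushed row defines the columns.
df = DataFrame()
push!(df, (a=1, b=2, c=3), cols=:union)

# A later iteration records a different set of observations.
# With cols=:union the new column `d` is added (filled with `missing`
# for earlier rows) and the absent columns `b` and `c` get `missing`.
push!(df, (a=4, d=5), cols=:union)

println(names(df))
```

The point of `:union` here is that the set of recorded observations can change between iterations without rebuilding the data frame.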
This took a lot of time... I hope some of the comments are helpful to you.
src/dataframerow/dataframerow.jl
function Base.push!(df::DataFrame, dfr::DataFrameRow; cols::Symbol=:equal,
                    columns::Union{Nothing,Symbol}=nothing)
    if isnothing(columns)
        columns = cols
Same as for the other push! (why not rename every columns)
How about:
"""
[...] vcat(left, right, cols=X)
where for X you have the following options:
* :identical, order and names of columns need to match
* :equal, names of columns need to match (order is not necessary)
* :intersect, newnames* = names(left) \cap names(right)
* :subset, newnames* = names(left)
* :union, newnames* = names(left) \cup names(right)
* missing data is filled with `missing`
"""
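To make the set-based options in the proposed docstring concrete, here is a small sketch. It uses only `:intersect` and `:union`, which exist under those names in released DataFrames.jl; the other option names in the proposal (`:identical`, `:equal`, `:subset`) are the commenter's suggestion and may differ from what was merged (the PR title mentions `:setequal`). The data frames `left` and `right` are made up for illustration.

```julia
using DataFrames

left  = DataFrame(a=[1, 2], b=[3, 4])
right = DataFrame(b=[5], c=[6])

# :intersect keeps only the columns present in both inputs.
common = vcat(left, right, cols=:intersect)

# :union keeps every column seen in any input and fills gaps with `missing`.
all_cols = vcat(left, right, cols=:union)

println(names(common))
println(names(all_cols))
```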
Btw, what should happen in this case?
df1 = DataFrame([[1], [1]], [:A, :C])
df2 = DataFrame([[2], [2]], [:B, :D])
df3 = vcat(df1, df2, cols=:subset)
df4 = push!(df1, df2[1,:], cols=:subset)
That situation would make a good test case, I guess, as it helps spot changes in behaviour.
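As a hedged sketch of what the `push!` half of that test case might look like, assuming the `:subset` semantics of released DataFrames.jl (columns of the target that are absent from the row receive `missing`, with column types promoted by default). The `vcat` half is left out because whether `vcat` accepts `:subset` is not settled in this thread. The data frames are built with keyword constructors rather than the vector-of-vectors form used above.

```julia
using DataFrames

# Disjoint column names, as in the case above.
df1 = DataFrame(A=[1], C=[1])
df2 = DataFrame(B=[2], D=[2])

# No column of df1 appears in the pushed row, so under :subset every
# column of df1 gets `missing`; columns B and D of the row are ignored.
push!(df1, df2[1, :], cols=:subset)

println(df1)
```

Pinning this down in a test would document whether a row with no overlapping columns is accepted silently or rejected.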
This is what I came up with after pondering on it:
I have also updated the tests to take into account the changes we make here. There are two notes for the future:
I have also resolved all comments (there were many of them, so by resolving I kept track of what was fixed). Thank you for contributing to this PR.
How I will update this PR soon (hopefully today :)):
This should be good to have a look at (especially test coverage, as we added many options in this PR). Thank you!
Sorry, I've spotted more stylistic things. Since I still have one substantial point, it is probably worth fixing them too.
Co-Authored-By: Milan Bouchet-Valat <nalimilan@club.fr>
@nalimilan - as a second thought - given the discussion in #1993 we sort columns of In other words - the constructor considers the order of columns in
Well it sounds OK to sort column names by default, but if you pass
OK
Thank you for working on it.
Fixes #1904
I think we should make `vcat` and `push!` consistent in the name of the kwarg. Should it be `cols` (now used in `vcat`) or `columns` (now used in `push!`)? Any opinions?