Correctly handle Tables.AbstractRow in operation specification #3348
This is mildly breaking, but I assume it is OK to add it in the 1.6 release. The point is that I assume that when someone uses |
@@ -770,6 +773,15 @@ function _add_multicol_res(res::DataFrameRow, newdf::DataFrame, df::AbstractData
    _insert_row_multicolumn(newdf, df, allow_resizing_newdf, colnames, res)
end

function _add_multicol_res(res::Tables.AbstractRow, newdf::DataFrame, df::AbstractDataFrame,
While we're at it, maybe we should make DataFrameRow <: AbstractRow? That would avoid duplicating a few methods and would simplify type unions.
I was thinking about it.
First (for future readers): DataFrameRow fully supports the Tables.AbstractRow interface, so this is purely a question of code design.
Pros of subtyping:
- Less code.
- It is clearer which functionality lives at the more abstract level.
Cons of subtyping:
- Most methods for DataFrameRow and Tables.AbstractRow are the same, but not all of them. Some differ because DataFrameRow has richer functionality than Tables.AbstractRow. Keeping DataFrameRow and Tables.AbstractRow separate makes it easier (at least for me) to find, in the future, all places in the source code where DataFrameRow is used. I know this is not a very strong reason, but with the size of the code that we have I often update code by running "find in all files" on a certain code pattern (otherwise it is easy to miss a place where some functionality is used).
(For reference: I started implementing this change and noticed that it would affect more code than only what this PR touches.)
So we could add it, but it also has some practical downsides. Given this, what do you think we should do?
I don't have a strong opinion. Given that Tables.jl uses duck typing anyway, it's not super important to have DataFrameRow <: AbstractRow, and indeed nobody has requested it.
Yes - duck typing is my main reason for sticking with what we have.
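For future readers, a minimal sketch of what duck typing means here, assuming DataFrames.jl and Tables.jl are installed. The data frame and its column names are made up for illustration. A DataFrameRow can be queried through the Tables.jl row accessors even though it is not declared as a subtype of Tables.AbstractRow:

```julia
using DataFrames, Tables

# Illustrative data; the column names are arbitrary.
df = DataFrame(a = 1:3, b = ["x", "y", "z"])
dfr = df[2, :]                      # indexing a single row yields a DataFrameRow

# No subtype relation is needed for the row accessors to work.
# This prints `false` as long as the two types are kept separate, as agreed above.
println(DataFrameRow <: Tables.AbstractRow)

# The Tables.jl row accessors work on a DataFrameRow regardless.
println(Tables.columnnames(dfr))    # names of the columns in the row
println(Tables.getcolumn(dfr, :a))  # value in column :a, here 2
println(Tables.getcolumn(dfr, 2))   # value in the second column, here "y"
```

Code that only calls Tables.columnnames and Tables.getcolumn therefore does not care whether it receives a DataFrameRow or a Tables.AbstractRow.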
@@ -2939,4 +2939,82 @@ end
end
end

@testset "Tables.AbstractRow interface" begin
Maybe this should also be tested in other tests where we cover DataFrameRow?
I have added more such tests (where I managed to track them down).
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
The 32-bit failure seems unrelated but real?
Thank you (I will fix the 32-bit error in a separate PR).
Fixes #3335
After this PR, Tables.AbstractRow is treated in the same way as DataFrames.DataFrameRow in all combine/select/transform operations.
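A rough sketch of the kind of usage this affects, assuming a DataFrames.jl version that includes this PR. The SummaryRow type, the summarize function, and the column names are hypothetical and exist only for this example. A function in an operation specification returns a custom Tables.AbstractRow, which should then be expanded into columns the same way a DataFrameRow would be:

```julia
using DataFrames, Tables

# Hypothetical row type, defined only for illustration.
struct SummaryRow <: Tables.AbstractRow
    lo::Int
    hi::Int
end

# Tables.AbstractRow overloads getproperty, so fields are read with getfield.
Tables.columnnames(::SummaryRow) = (:lo, :hi)
Tables.getcolumn(r::SummaryRow, nm::Symbol) = getfield(r, nm)
Tables.getcolumn(r::SummaryRow, i::Int) = getfield(r, i)

df = DataFrame(x = [10, 20, 30])

# Hypothetical helper that returns a custom row built from a column.
summarize(x) = SummaryRow(minimum(x), maximum(x))

# With AsTable as the target, the returned row is spread over
# one output column per column name of the row.
res = combine(df, :x => summarize => AsTable)
println(res)
```

The same specification can be passed to select or transform; whether the single returned row is accepted there follows the rules that already apply to a DataFrameRow result.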