DataFrameRow and NamedTuple comparisons #2668

bkamins · 2021-03-22T16:29:29Z

Should we update isequal, ==, and isless to allow comparing NamedTuple and DataFrameRow directly, without conversion? (This would also affect hash.) See #2639 for a related discussion on GroupKey.

Also, should GroupKey then be comparable to DataFrameRow?

pdeffebach · 2021-03-23T03:12:45Z

I vote no. The user can convert to a NamedTuple if they need to.

Also no for a DataFrameRow. The user can also convert to a NamedTuple. In general I don't see GroupKeys and DataFrameRows as very similar.

nalimilan · 2021-03-23T08:36:05Z

I think it's worth defining == and isequal to be consistent with NamedTuple. That way we are closer to the row object types used by other Tables.jl tables, notably NamedTuple{<:AbstractVector} and AbstractVector{<:NamedTuple}, which use NamedTuple to represent rows. This isn't guaranteed by the Tables.jl interface, but it is convenient to have.

isless and < are not required, and can be added later as long as they currently throw errors.
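
For reference, the NamedTuple semantics that "consistent with NamedTuple" refers to can be checked in plain Julia without DataFrames.jl. The sketch below uses made-up values `x` and `y`, and shows only Base behavior, not what DataFrameRow does:

```julia
# Hypothetical values, used only to illustrate Base NamedTuple comparisons.
x = (a = 1, b = 2)
y = (a = 1, b = 3)

# Equality requires the same field names in the same order.
same_names = x == (a = 1, b = 2)        # true
reordered  = x == (b = 2, a = 1)        # false, the names tuples differ

# isless compares the values lexicographically when the names match.
ordered = isless(x, y)                  # true

# == propagates missing, while isequal treats missing as equal to itself.
eq_missing      = (a = missing,) == (a = missing,)
isequal_missing = isequal((a = missing,), (a = missing,))
```

The difference between == and isequal on missing values is one reason the two functions need separate definitions.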

bkamins · 2021-03-23T09:04:51Z

I agree. I will keep == and isequal but remove < and isless in #2669.
The thing is that this will allow the following code:

for r in eachrow(df)
    if r == (a=1, b=2)
        # do something
    end
end

and it is natural to allow for this without requiring a conversion.

bkamins · 2021-03-23T09:19:01Z

Now, thinking about it more: since we already allow isless for DataFrameRow, the same logic as above applies to it, as the following code is just as natural as the one above:

for r in eachrow(df)
    if r < (a=1, b=2)
        # do something
    end
end

Do we see any downsides of allowing < and isless across: NamedTuple, DataFrameRow and GroupKey?

Regarding the comment that you can always convert to NamedTuple: this is true, but converting to NamedTuple is an expensive operation (and the wider the data frame is, the more expensive it gets).
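
For comparison, the conversion-based alternative mentioned earlier in the thread might look like the sketch below. The data frame `df`, the reference values `target` and `bound`, and the result vectors are hypothetical; the only DataFrames.jl calls assumed are `DataFrame`, `eachrow`, and `NamedTuple(r)` for a DataFrameRow `r`:

```julia
using DataFrames

# Hypothetical data, used only for illustration.
df = DataFrame(a = [1, 1, 2], b = [2, 3, 2])
target = (a = 1, b = 2)
bound  = (a = 2, b = 0)

matches = Int[]   # rows equal to `target`
smaller = Int[]   # rows that sort before `bound`
for (i, r) in enumerate(eachrow(df))
    nt = NamedTuple(r)             # explicit conversion, done once per row
    nt == target && push!(matches, i)
    isless(nt, bound) && push!(smaller, i)
end
```

Each iteration builds a fresh NamedTuple with one field per column, which is the per-row cost that a direct comparison would avoid.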

bkamins added the decision label Mar 22, 2021

bkamins added this to the 1.0 milestone Mar 22, 2021

bkamins linked a pull request Mar 23, 2021 that will close this issue

add ==, isequal <, and isless for DataFrameRow and GroupKey #2669

Merged

bkamins closed this as completed in #2669 Mar 25, 2021
