Join on columns of different name #1297

Closed
cjprybol opened this issue Dec 1, 2017 · 3 comments
Comments

@cjprybol
Contributor

cjprybol commented Dec 1, 2017

Transfer of JuliaData/DataTables.jl#78

Idea summary: use a 2-element tuple to support joining DataFrames on columns that do not share a name. Currently, columns must be renamed so that they match before they can be used as join keys.

join(a, b, on=(:a_name, :b_name))
join(a, b, on=[(:a_name1, :b_name1), (:a_name2, :b_name2)])

pandas does this more explicitly with its left_on and right_on keywords, and R's data.table has by.x and by.y (jump to page 61 in the docs).

@ararslan
Member

ararslan commented Dec 1, 2017

I'd suggest using pairs, e.g. join(a, b, on=[:a_name1 => :b_name1, :a_name2 => :b_name2])

@nalimilan
Member

I'm hesitant about using Pair. That kind of suggests a direction or a movement, but here the two column names are treated symmetrically. It would make sense if the second column name were retained in the result, but using the first one would sound more natural (that's what R does at least). OTOH, pairs make it clearer that matching columns need to be grouped together, rather than grouping all left names and all right names as (left, right). Without them it's really hard to remember.

Anyway, if we use first and last internally, we can support both tuples and pairs, so we just need to choose which syntax we promote in the docs.
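To illustrate the point about first and last, here is a minimal sketch in plain Julia of how an `on` argument could be normalized so that tuples and pairs are accepted interchangeably. The helper name `split_on` is hypothetical and is not taken from DataFrames.jl or from #1312; it only shows that `first` and `last` behave the same way on a 2-tuple and on a Pair.

```julia
# Hypothetical helper (an assumption for illustration, not the DataFrames.jl implementation).
# It turns an `on` specification into two vectors: left column names and right column names.

# A bare Symbol means the column has the same name in both tables.
split_on(on::Symbol) = ([on], [on])

# `first` and `last` work identically on a 2-tuple and on a Pair,
# so a single method covers both syntaxes.
split_on(on::Union{Pair{Symbol,Symbol}, Tuple{Symbol,Symbol}}) = ([first(on)], [last(on)])

# A vector of specifications is handled element by element; entries may be mixed.
function split_on(on::AbstractVector)
    left = Symbol[]
    right = Symbol[]
    for spec in on
        l, r = split_on(spec)
        append!(left, l)
        append!(right, r)
    end
    return left, right
end

split_on(:id)                                            # ([:id], [:id])
split_on((:a_name, :b_name))                             # ([:a_name], [:b_name])
split_on([:a_name1 => :b_name1, (:a_name2, :b_name2)])   # ([:a_name1, :a_name2], [:b_name1, :b_name2])
```

With a normalization step like this, the choice between tuples and pairs becomes a documentation question only, since both forms reduce to the same pair of column lists before the join runs.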

@cjprybol
Contributor Author

cjprybol commented Dec 9, 2017

closed via #1312
