select/transform: old_column => fun => new_column_name syntax #2301

Closed
cleytonfar opened this issue Jun 21, 2020 · 1 comment

I get unexpected output when using select(), select!(), transform() and transform!() with an anonymous function.
For example:

using DataFrames
df = DataFrame(a = 1:3, b = 4:6)
select(df, :a => x->x*2)

the output will be

3×1 DataFrame
│ Row │ a_function │
│     │ Int64      │
├─────┼────────────┤
│ 1   │ 2          │
│ 2   │ 4          │
│ 3   │ 6          │

However, if I try to give a new name to the transformed column:

select(df, :a => x->x*2 =>:d)

the output will be:

│ Row │ a_function    │
│     │ Pair          │
├─────┼───────────────┤
│ 1   │ [2, 4, 6]=>:d │
│ 2   │ [2, 4, 6]=>:d │
│ 3   │ [2, 4, 6]=>:d │

One fix is to wrap the anonymous function using ByRow() :

select(df, :a => ByRow(x->x*2) => :d)
3×1 DataFrame
│ Row │ d     │
│     │ Int64 │
├─────┼───────┤
│ 1   │ 2     │
│ 2   │ 4     │
│ 3   │ 6     │

or define your own function:

function myfun(x)
    x*2
end

select(df, :a => myfun => :d)
3×1 DataFrame
│ Row │ d     │
│     │ Int64 │
├─────┼───────┤
│ 1   │ 2     │
│ 2   │ 4     │
│ 3   │ 6     │

I am not sure if this behavior is intended or some small issue. I think the code would be clearer if we did not need to wrap the anonymous function with ByRow or define our own functions to pass to select/transform.

Member

bkamins commented Jun 21, 2020

I think the code would be clearer if we did not need to wrap the anonymous function with ByRow

ByRow is not needed; you just need to wrap the function in parentheses because of Julia's operator precedence rules:

select(df, :a => (x->x*2) =>:d)

When debugging, it is usually simplest to paste the expression in question into the REPL. In this case you see:

julia> :a => x->x*2 =>:d
:a => var"#3#4"()

julia> :a => (x->x*2) =>:d
:a => (var"#5#6"() => :d)

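which shows you the root cause of the problem: without the parentheses, => :d is parsed as part of the anonymous function's body, so the function returns a Pair instead of being paired with the new column name.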
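
To check the same thing programmatically rather than by reading the REPL output, a minimal sketch in plain Julia (no DataFrames required) could look like the following. The variable names `unparenthesized` and `parenthesized` are only illustrative and do not come from the issue.

```julia
# Without parentheses the body of `->` extends to the end of the expression,
# so this is parsed as :a => (x -> ((x * 2) => :d)).
unparenthesized = :a => x -> x * 2 => :d

# With parentheses the anonymous function ends at the closing `)`,
# so this is parsed as :a => ((x -> x * 2) => :d).
parenthesized = :a => (x -> x * 2) => :d

# In the first case the second element is the function itself,
# and calling it on a column returns a Pair.
@assert unparenthesized.second isa Function
@assert unparenthesized.second([1, 2, 3]) == ([2, 4, 6] => :d)

# In the second case the second element is a function => name Pair,
# which is the form select/transform use for renaming.
@assert parenthesized.second isa Pair
@assert parenthesized.second.first([1, 2, 3]) == [2, 4, 6]
@assert parenthesized.second.second == :d
```

The first form is why the column in the earlier output holds `[2, 4, 6]=>:d` in every row: the function returns a single Pair, and that value is repeated for each row.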

@bkamins bkamins closed this as completed Jun 21, 2020