Constructor behavior on nested array vs array of tuple #2124

Closed
darrencl opened this issue Feb 19, 2020 · 1 comment

darrencl commented Feb 19, 2020

Hi,

I noticed that the constructor behaves differently for a nested array vs. an array of tuples.

julia> DataFrame([[1,2,3],[4,5,6]])
3×2 DataFrame
│ Row │ x1    │ x2    │
│     │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1   │ 1     │ 4     │
│ 2   │ 2     │ 5     │
│ 3   │ 3     │ 6     │

julia> DataFrame(Tuple.([[1,2,3],[4,5,6]]))
2×3 DataFrame
│ Row │ 1     │ 2     │ 3     │
│     │ Int64 │ Int64 │ Int64 │
├─────┼───────┼───────┼───────┤
│ 1   │ 1     │ 2     │ 3     │
│ 2   │ 4     │ 5     │ 6     │

Converting each element of the array to a tuple is quite slow (with a comprehension), and broadcasting (as in the example above) over large inner arrays is prone to a stack overflow error.

Is there an alternative that gives the same result as the array-of-tuples case without having to convert each element to a tuple?

Thanks!


bkamins commented Feb 19, 2020

I noticed that the constructor behaves differently for a nested array vs. an array of tuples.

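The difference is that the first example goes through a DataFrame constructor, which treats each inner vector as a column, while the second is a conversion from a type supported by the Tables.jl API to a DataFrame, where a vector of tuples is read as a table of rows.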

The second operation is necessarily slower than the first one. If you have many rows, something like this (with src being your vector of vectors) could potentially be faster:

DataFrame!([getindex.(src, i) for i in 1:maximum(length.(src))]);

(this will also check if all your inner vectors have the same length)
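To make the one-liner above easier to follow, here is a minimal sketch of the column-building step on its own, using only Base Julia. The variable `src` and its contents are a hypothetical stand-in for the nested array from the question, and `ncols` and `columns` are names introduced here for illustration.

```julia
# `src` is a hypothetical stand-in for the nested array from the question:
# each inner vector is one row of the desired table.
src = [[1, 2, 3], [4, 5, 6]]

# Number of columns = length of the longest inner vector.
ncols = maximum(length.(src))

# getindex.(src, i) collects the i-th element of every inner vector,
# giving one column per position. It throws a BoundsError if any inner
# vector is shorter than `ncols`, which is the length check mentioned above.
columns = [getindex.(src, i) for i in 1:ncols]

# columns == [[1, 4], [2, 5], [3, 6]]
@show columns
```

The resulting `columns` vector is what gets passed to the DataFrame constructor in the one-liner. In more recent DataFrames.jl releases, `DataFrame!` may no longer be available; I believe the equivalent call there is `DataFrame(columns, :auto, copycols=false)`, but treat that as an assumption and check it against your installed version.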

bkamins closed this as completed Feb 19, 2020