creating a DTable eagerly copy all entries? #59
Comments
Came up during the Julia HEP 2023 workshop. (cc @grasph)
but this seems to be fine?
julia> dt = DTable((a=A(3),));
Naive debugging during a talk :) I think the critical line in DTables.jl is https://github.com/JuliaParallel/DTables.jl/blob/main/src/table/dtable.jl#L109. If I didn't make a debugging mistake, it jumps to https://github.com/JuliaData/Tables.jl/blob/175e431eadaae9e439b5a8e020afce06dc6cf4f4/src/namedtuples.jl#L187 and then [...] to https://github.com/JuliaData/Tables.jl/blob/175e431eadaae9e439b5a8e020afce06dc6cf4f4/src/fallbacks.jl#L265, which is CopiedColumns(buildcolumns(schema(r), r)), i.e. the fallback that builds fresh columns from the rows.
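To make the copying fallback concrete, here is a minimal sketch using only Tables.jl (not DTables.jl itself). It assumes the general Tables.jl behavior that a NamedTuple of vectors is already a column table and is passed through as-is, while a row-oriented source has to be materialized into new column vectors. The variable names are just for illustration.

```julia
using Tables

# A column table: a NamedTuple of vectors.
cols = (a = [1, 2, 3],)

# A row table with the same contents: a Vector of NamedTuples.
rows = Tables.rowtable(cols)

# Already column-oriented: columntable hands the input back unchanged.
same = Tables.columntable(cols)

# Row-oriented: the columns have to be built from the rows, so new vectors are allocated.
copied = Tables.columntable(rows)

println(same.a === cols.a)    # identical array object, no copy
println(copied.a === cols.a)  # different array object
println(copied.a == cols.a)   # same values
```

So whenever a table operation hands rows (rather than ready-made columns) to the sink, a copy like the second one is expected.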
When we do
go into the For
which leads to the fallback that I linked in the post above.
So DTable calls a table operation; maybe it should not materialize like this?
The simple answer, without going into details, is that if you use the constructors that assume the partitioning of the input, then it will not copy the input (at least it shouldn't). The constructors above work on Tables/TableOperations, so they fall back to copies eventually, even if the code doesn't copy explicitly (I'm not aware whether this can be improved).

I'm not super happy with the constructors for DTable. There are a lot of cool things we could do to make it more efficient when using different input sources. As of right now it's made to be universal and take any Tables.jl input.
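As a rough illustration of what "assuming the partitioning of the input" can mean at the Tables.jl level, the sketch below only uses Tables.partitions and Tables.partitioner. It does not call DTables.jl, so it says nothing definite about what the DTable constructors do internally; it just shows that pre-partitioned inputs are handed out as the same objects, so a consumer that keeps them as-is has no need to copy. The names part1 and part2 are hypothetical.

```julia
using Tables

# Two column tables that we treat as pre-existing partitions.
part1 = (a = [1, 2],)
part2 = (a = [3, 4],)

# Wrap them so that Tables.partitions iterates over the given chunks.
parts = Tables.partitions(Tables.partitioner([part1, part2]))

# The first partition is the original object, not a copy.
println(first(parts) === part1)

# A plain table with no explicit partitioning is treated as a single partition,
# and that partition is again the table itself.
println(first(Tables.partitions(part1)) === part1)
```

Chunking by row count, in contrast, generally means iterating rows and rebuilding columns, which is where the copies come from.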