You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The issue: view allows to set its :rows field to whatever parent allows, but with 0 columns it should have 0 rows.
Problem 2. dropping columns with view
julia> df = DataFrame(a=1:3, b=4:6)
3×2 DataFrame
Row │ a b
│ Int64 Int64
─────┼──────────────
1 │ 1 4
2 │ 2 5
3 │ 3 6
julia> sdf = @view df[:, :]
3×2 SubDataFrame
Row │ a b
│ Int64 Int64
─────┼──────────────
1 │ 1 4
2 │ 2 5
3 │ 3 6
julia> dfr = df[1, :]
DataFrameRow
Row │ a b
│ Int64 Int64
─────┼──────────────
1 │ 1 4
julia> select!(df)
0×0 DataFrame
julia> sdf[:, :]
ERROR: BoundsError: attempt to access 0×0 DataFrame at index [1:3, 1:0]
julia> dfr
Error showing value of type DataFrameRow{DataFrame, DataFrames.Index}:
ERROR: BoundsError: attempt to access 0×0 DataFrame at index [[1], 1:0]
The issue here is that if we use : as column selector it is a valid operation to drop/add columns to a data frame.
However, if we move past 0-columns case we might even change number of rows in a data frame.
If something is not clear please let me know. In general I was not sure how to best handle it (there are several possible options, but in general it shows a hole in the design so I was not sure what is best).
The text was updated successfully, but these errors were encountered:
Regarding the second issue (changing set of columns in a parent of a view with : as column selector). While it is irritating, this is strictly speaking not an issue. If we drop all columns in the parent we change its number of rows - and this is disallowed when working with views. So we can leave things as is
Regarding the first issue (0-column view of a data frame with some rows). I was thinking of two solutions:
changing the rows argument when sub data frame is created to be empty;
changing the implementation of view and getindex so that they work correctly in this case.
The first (fixing rows) is easier to do, but it will give an incorrect parentindices result (i.e. you cannot recover the row indices used when creating a SubDataFrame. Though it is not strictly required that parentindices should allow for this. Also fixing rows would be technically breaking.
So I am leaning to changing view and getindex implementations, but keeping rows field untouched. What do you think?
@nalimilan we have the following related problems in the core design of DataFrames.jl. What do you think we should do about it?
Problem 1. view with no columns
The issue: view allows to set its
:rows
field to whatever parent allows, but with 0 columns it should have 0 rows.Problem 2. dropping columns with view
The issue here is that if we use
:
as column selector it is a valid operation to drop/add columns to a data frame.However, if we move past 0-columns case we might even change number of rows in a data frame.
If something is not clear please let me know. In general I was not sure how to best handle it (there are several possible options, but in general it shows a hole in the design so I was not sure what is best).
The text was updated successfully, but these errors were encountered: