Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

target setindex implementation #1899

Merged
merged 31 commits into from
Sep 2, 2019
Merged

Conversation

bkamins
Copy link
Member

@bkamins bkamins commented Jul 26, 2019

This is currently WIP (I have initially implemented new rules for things that we need to change to finish deprecations). It is not finalized yet, so have a look at it only to check what is added 😄.

Before I write tests and clean this up I would like to clarify what we do with df[!, cols] = v and df[!, cols] .= v syntax.

The particular, crucial, question is if we want to allow them to create columns (like df[!, col] allows) or not. If yes then the question is if we allow adding columns only at the end of a data frame or also reorder the columns already present.

The relevant case is for example:

df = DataFrame(a = [1,2], b=[3,4])
df[!, [:a,:c,:b]] .= 10

Do we want column :c to be created, and if yes should it be a last column of df (probably yes).

My problem is that the corner-cases will be surprising for users, so, as opposed to single column assignment, maybe it should be disallowed to add columns (but maybe you feel it is OK).

@nalimilan
Copy link
Member

I'd say it's fine to be restrictive for now. In particular, creating columns while providing names in an order different from the actual order sounds really weird. Let's start with a simple system, and possibly extend it later if we get requests.

docs/src/lib/indexing.md Outdated Show resolved Hide resolved
docs/src/lib/indexing.md Outdated Show resolved Hide resolved
docs/src/lib/indexing.md Outdated Show resolved Hide resolved
src/dataframerow/dataframerow.jl Outdated Show resolved Hide resolved
src/dataframerow/dataframerow.jl Outdated Show resolved Hide resolved
@bkamins
Copy link
Member Author

bkamins commented Aug 4, 2019

So the "restrictive" approach is to disallow adding columns using df[!, cols] on LHS - right?

@bkamins bkamins changed the title WIP: target setindex implementation target setindex implementation Aug 14, 2019
@bkamins
Copy link
Member Author

bkamins commented Aug 14, 2019

@nalimilan - this should be good to have a look at when you have some free time

docs/src/lib/indexing.md Outdated Show resolved Hide resolved
docs/src/lib/indexing.md Outdated Show resolved Hide resolved
docs/src/lib/indexing.md Outdated Show resolved Hide resolved
docs/src/lib/indexing.md Outdated Show resolved Hide resolved
docs/src/lib/indexing.md Outdated Show resolved Hide resolved
test/broadcasting.jl Show resolved Hide resolved
test/dataframe.jl Outdated Show resolved Hide resolved
test/deprecated.jl Show resolved Hide resolved
test/indexing.jl Outdated Show resolved Hide resolved
test/reshape.jl Show resolved Hide resolved
Co-Authored-By: Milan Bouchet-Valat <nalimilan@club.fr>
@bkamins
Copy link
Member Author

bkamins commented Aug 24, 2019

I guess there are ambiguities if you don't use a loop? That sounds too bad.

Yes - unfortunately if we want to keep the deprecations it is much easier to do it this way. The problem is, as discussed some time ago that using Union is less specific than generating methods for given types.

After the deprecations are removed from code base, we can revert to the standard Union approach.

@bkamins
Copy link
Member Author

bkamins commented Aug 25, 2019

This should be good for a second round of a review (probably there will be some comments - I tried to clean up everything, but the PR grew big). Thank you!

src/dataframerow/dataframerow.jl Outdated Show resolved Hide resolved
src/dataframe/dataframe.jl Outdated Show resolved Hide resolved
src/other/broadcasting.jl Outdated Show resolved Hide resolved
src/other/broadcasting.jl Outdated Show resolved Hide resolved
Co-Authored-By: Milan Bouchet-Valat <nalimilan@club.fr>
@bkamins
Copy link
Member Author

bkamins commented Sep 2, 2019

Regarding setindex! to a DataFrameRow and also push!. We have two options:

  • leave things as is
  • make LHS to be required to be: Base.Generator, Tuple, NamedTuple, AbstractDict, AbstractVector (or AbstractArray for the last, but I prefer AbstarctVector)

The Base.Generator would cover all generators. Then if we allow AbstractArray we would not check dimensionality of a generator and if we only accept AbstractVector we would check if Base.Generator is unidimensional.

What do you think would be better?

@nalimilan
Copy link
Member

I'd be fine with restricting as you suggest. But better make a new PR?

@bkamins
Copy link
Member Author

bkamins commented Sep 2, 2019

But for setindex! I have to define it in this PR (as we add this functionality in this PR).

The only question is if we should "math dimensionality" (so accept only AbstractVector and 1D generator) or accept all AbstractArrays and all generators?

For push! it will be a separate PR for sure (as we need to go through deprecation cycle).

@nalimilan
Copy link
Member

I don't have a strong opinion really.

@bkamins
Copy link
Member Author

bkamins commented Sep 2, 2019

OK - I will go with what Base allows for Arrays (any dimensionality) then.

@bkamins bkamins merged commit cbfef8d into JuliaData:master Sep 2, 2019
@bkamins
Copy link
Member Author

bkamins commented Sep 2, 2019

@nalimilan - merged. Thank you!

@bkamins bkamins deleted the target_setindex_rules branch September 2, 2019 21:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants