Ensure GroupedDataFrame starts at index 1 #1692

Closed
wants to merge 1 commit into from


nalimilan
Member

Storing indices of a ghost group could be confusing and makes the code more complex later.

Member

@bkamins bkamins left a comment


Looks good. Judging from its implementation, copyto! should work for overlapping blocks of data, although the documentation does not guarantee this. Maybe we should ask in Base to confirm it and to document the contract that overlapping blocks of memory may be passed to copyto!.

s = starts[2]
if s > 1
N = length(rperm) - s + 1
copyto!(rperm, 1, rperm, s, N)
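
For context, the shift discussed in this hunk can be sketched on toy data as below. The values of `rperm` and `s` are made up for illustration, and the trailing `resize!` is an assumption about how the stale tail would be dropped, not something shown in the diff. The sketch relies on `copyto!` handling overlapping ranges of the same `Vector`, which matches the current implementation but, as noted above, is not a documented guarantee.

```julia
# Hypothetical data: the first two entries belong to the ghost group.
rperm = [7, 9, 3, 1, 4, 2]
s = 3                            # assumed index where the first real group starts

if s > 1
    N = length(rperm) - s + 1    # number of entries that belong to real groups
    # Source and destination ranges overlap within the same vector.
    copyto!(rperm, 1, rperm, s, N)
    # Assumption: truncate the leftover tail so only real-group indices remain.
    resize!(rperm, N)
end
```

After this runs, `rperm` holds only the four entries that followed the ghost group, now starting at index 1.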
Member


Have you considered using a view instead of copyto!? I think it should be faster and use less memory.
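
A minimal sketch of the view-based alternative, again on made-up data: instead of moving elements, take a `view` of the tail of `rperm`. The name `kept` is hypothetical. Note the trade-off: the result is a `SubArray` sharing memory with `rperm`, so any code that requires a `Vector{Int}` would need to accept an `AbstractVector{Int}` instead.

```julia
# Hypothetical data: the first two entries belong to the ghost group.
rperm = [7, 9, 3, 1, 4, 2]
s = 3                                  # assumed index where the first real group starts

# No elements are copied; `kept` is a window onto rperm[s:end].
kept = view(rperm, s:length(rperm))

# Writes through the view are visible in the parent vector.
kept[1] = 30
```

Here `kept` indexes from 1 like an ordinary vector, while the ghost-group entries remain in the parent array and keep its memory alive.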

Member

bkamins commented Jan 21, 2019

We might also consider leaving out the missing group entirely, starting from the allocation:

rperm = Vector{Int}(undef, length(groups))
  • make this vector shorter from the start and drop the first entries of starts and stops if needed. This should be even a bit faster when there are missing values, as we would do less copying.
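
One way to picture this suggestion is a counting sort that never reserves space for skipped rows. Everything below is an illustrative assumption rather than the package's actual code: it assumes `groups` holds a group index per row with `0` marking rows of the missing group, and the names `counts` and `fill_pos` are hypothetical.

```julia
# Hypothetical input: group index per row, 0 marks rows to skip.
groups = [2, 0, 1, 2, 0, 1]
ngroups = 2

# Count rows per real group, ignoring the ghost group.
counts = zeros(Int, ngroups)
for g in groups
    g > 0 && (counts[g] += 1)
end

stops = cumsum(counts)            # last position of each group in rperm
starts = stops .- counts .+ 1     # first position of each group in rperm

# Allocate only as many slots as there are rows in real groups.
rperm = Vector{Int}(undef, sum(counts))
fill_pos = copy(starts)           # next free slot for each group
for (row, g) in enumerate(groups)
    g > 0 || continue             # skip rows of the ghost group
    rperm[fill_pos[g]] = row
    fill_pos[g] += 1
end
```

With this layout `starts` begins at 1 by construction, at the cost of an extra branch per row.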

Member

bkamins commented Jan 21, 2019

On second thought, it would probably slow things down a bit when we have a lot of small groups.

Member

bkamins commented Jan 22, 2019

If you are happy with the current #1689, I think you can close this PR (unless you want to clean it up to always start from 1, but as discussed that is probably not strictly necessary).

@nalimilan nalimilan closed this Jan 23, 2019
@nalimilan nalimilan deleted the nl/starts branch January 23, 2019 10:38
2 participants