add allcombinations #3031
src/DataFrames.jl
Outdated
@@ -49,6 +49,7 @@ export AbstractDataFrame,
disallowmissing!,
dropmissing!,
dropmissing,
expandgrid,
Could we find a name more closely related to fillcombinations, which is similar? Maybe this should even be a method of fillcombinations?
It would also make sense to mention the relationship between the two functions in their docstrings (dplyr does that IIRC).
dplyr mentions that expandgrid is used internally in their fillcombinations equivalent. However, that was not possible in DataFrames.jl (the reason is that in dplyr everything is a vector, while in Julia we have scalars and need pseudo-broadcasting).
Conceptually this is very different, as fillcombinations requires the vectors passed as input to have equal lengths (that is why they are stored in a data frame), while here each input can have (and usually has) a different length.
We could also just name this function product and not export it, but document that users can write DataFrames.product. This would be consistent with Iterators.product, as it does exactly the same thing (but with a set of rules that is consistent with pseudo-broadcasting).
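To make the comparison with Iterators.product concrete, here is a minimal sketch using only Base. It is not the implementation from this PR. The helper as_iterable is hypothetical and only approximates pseudo-broadcasting by wrapping non-vector inputs in a one-element tuple; the actual rules in DataFrames.jl may differ.

```julia
# Hypothetical helper (not part of DataFrames.jl): vectors and ranges are used
# as they are, anything else is treated as a scalar and wrapped in a 1-tuple.
as_iterable(x::AbstractVector) = x
as_iterable(x) = (x,)

# Inputs of different lengths, plus one scalar.
a = 1:2
b = ['x', 'y', 'z']
c = "const"

# Cartesian product of all inputs, flattened to a vector of tuples.
# With Iterators.product the first input varies fastest.
rows = vec(collect(Iterators.product(as_iterable(a), as_iterable(b), as_iterable(c))))

foreach(println, rows)
```

The result has 2 * 3 * 1 = 6 rows, and the scalar is repeated in every row, which is the behavior one would expect from a function that follows pseudo-broadcasting rules.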
Or we could use expandcombinations as a name if we want the name to be similar (and then export it).
However, I tend to prefer DataFrames.product, documented as part of the API but unexported.
(As for the reason why I cannot call expandgrid in fillcombinations: you can have a look at their code and compare the internals of the core loop that does the expansion. There are subtle differences, since there we repeat levels but need to allocate columns from the source vector. For example, for CategoricalArray it makes a difference and cannot be handled correctly if passed to expandgrid.)
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
OK - so what do we do about the name? Keep |
The drawback of |
I will go for |
@nalimilan - the PR should be ready for a final review. Thank you!
@nalimilan - so in the end do we stick with |
@nalimilan - I have added |
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
Thank you!
🎉
Isn't this a different API for the same functionality that |
is the same as
Most users find the latter a bit inconvenient to write. What would you propose exactly to add to the documentation?
I was thinking a "See also" entry.
The new function might be a bit simpler to type. I was mostly surprised that |
OK - I will add a link to |
Fixes #3027
Before I finalize this PR (add tests and NEWS.md) please comment if we like the design.
In particular, do you think the set of proposed signatures is OK?
Thank you!