Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support string repeat SQL #2728

Merged
merged 29 commits into from
Aug 5, 2021
Merged

Conversation

ttnghia
Copy link
Collaborator

@ttnghia ttnghia commented Jun 16, 2021

This PR implements repeat SQL, and closes #68.

Example:

strs = ['abc', null, '',  'xyz']
out  = repeat(strs, 3)
out is ['abcabcabc', null, '',  'xyzxyzxyz']

repeatTimes = [1, 2, 3, 4]
out = repeat(strs, repeatTimes)
out is ['abc', null, '',  'xyzxyzxyzxyz']

@ttnghia ttnghia added feature request New feature or request SQL part of the SQL/Dataframe plugin labels Jun 16, 2021
@ttnghia ttnghia requested review from jlowe and revans2 June 16, 2021 21:46
@ttnghia ttnghia self-assigned this Jun 16, 2021
@ttnghia ttnghia changed the title Repeat strings Support repeat SQL Jun 16, 2021
@ttnghia ttnghia changed the title Support repeat SQL Support string repeat SQL Jun 16, 2021
@jlowe
Copy link
Member

jlowe commented Jun 16, 2021

@ttnghia the commit needs to be signed off, see the signoff section in the contributing guide for details.

@ttnghia ttnghia marked this pull request as draft June 17, 2021 02:47
@sameerz sameerz added this to the July 19 - July 30 milestone Jul 16, 2021
@sameerz
Copy link
Collaborator

sameerz commented Jul 22, 2021

@ttnghia please retarget to branch 21.10 when you get a chance.

@ttnghia ttnghia changed the base branch from branch-21.08 to branch-21.10 July 27, 2021 19:10
@ttnghia ttnghia marked this pull request as ready for review July 28, 2021 16:36
Signed-off-by: Nghia Truong <nghiatruong.vn@gmail.com>
Signed-off-by: Nghia Truong <nghiatruong.vn@gmail.com>
jlowe
jlowe previously approved these changes Aug 4, 2021
Copy link
Member

@jlowe jlowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me other than a small assert result nit.

…UnitTest.scala

Co-authored-by: Jason Lowe <jlowe@nvidia.com>
…UnitTest.scala

Co-authored-by: Jason Lowe <jlowe@nvidia.com>
@ttnghia ttnghia requested a review from jlowe August 4, 2021 13:50
jlowe
jlowe previously approved these changes Aug 4, 2021
@jlowe
Copy link
Member

jlowe commented Aug 4, 2021

build

@revans2
Copy link
Collaborator

revans2 commented Aug 4, 2021

The latest test run failed because of a breaking change in the cudf APIs that was merged in and the corresponding plugin change has not gone in yet.

@revans2
Copy link
Collaborator

revans2 commented Aug 4, 2021

build

1 similar comment
@ttnghia
Copy link
Collaborator Author

ttnghia commented Aug 4, 2021

build

@pxLi
Copy link
Collaborator

pxLi commented Aug 5, 2021

hi. please check the build log before trigger again thx!

[2021-08-04T16:34:40.039Z] found modified files during mvn verify:
[2021-08-04T16:34:40.039Z]  M docs/configs.md
[2021-08-04T16:34:40.039Z]  M docs/supported_ops.md
[2021-08-04T16:34:40.039Z] [ERROR] Command execution failed.

this PR did not include RapidsConf and SupportedOpsDocs update well, so it failed the check

Signed-off-by: Nghia Truong <nghiatruong.vn@gmail.com>
@ttnghia
Copy link
Collaborator Author

ttnghia commented Aug 5, 2021

hi. please check the build log before trigger again thx!

[2021-08-04T16:34:40.039Z] found modified files during mvn verify:
[2021-08-04T16:34:40.039Z]  M docs/configs.md
[2021-08-04T16:34:40.039Z]  M docs/supported_ops.md
[2021-08-04T16:34:40.039Z] [ERROR] Command execution failed.

this PR did not include RapidsConf and SupportedOpsDocs update, so it failed the check

Oh thanks for letting me know. I have updated the docs.
@jlowe I hate to ask you to approve the third time but I have to. I also hate github that discards PR approval every time I push. Maybe we need to modify that behavior for better convenience.

@pxLi
Copy link
Collaborator

pxLi commented Aug 5, 2021

hi. please check the build log before trigger again thx!

[2021-08-04T16:34:40.039Z] found modified files during mvn verify:
[2021-08-04T16:34:40.039Z]  M docs/configs.md
[2021-08-04T16:34:40.039Z]  M docs/supported_ops.md
[2021-08-04T16:34:40.039Z] [ERROR] Command execution failed.

this PR did not include RapidsConf and SupportedOpsDocs update, so it failed the check

Oh thanks for letting me know. I have updated the docs.
@jlowe I hate to ask you to approve the third time but I have to. I also hate github that discards PR approval every time I push. Maybe we need to modify that behavior for better convenience.

Actually I think we should not, otherwise people could push new commits which could do whatever they want to our repo after the initial approval, this could be horrible :P

@ttnghia
Copy link
Collaborator Author

ttnghia commented Aug 5, 2021

Oh thanks for letting me know. I have updated the docs.
@jlowe I hate to ask you to approve the third time but I have to. I also hate github that discards PR approval every time I push. Maybe we need to modify that behavior for better convenience.

Actually I think we should not, otherwise people could push new commits which could do whatever they want to our repo after the initial approval, this could be horrible :P

People should know what they are doing, not "whatever they want" like that 😃. I observed (from cudf) that after approvals the authors usually push more minor changes related to docs/typos etc. If every new change requires approvals again then it is very inconvenient for both the authors and the reviewers.

Of course, if there are new changes in the code then the authors can request re-review anytime.

@pxLi
Copy link
Collaborator

pxLi commented Aug 5, 2021

Oh thanks for letting me know. I have updated the docs.
@jlowe I hate to ask you to approve the third time but I have to. I also hate github that discards PR approval every time I push. Maybe we need to modify that behavior for better convenience.

Actually I think we should not, otherwise people could push new commits which could do whatever they want to our repo after the initial approval, this could be horrible :P

People should know what they are doing, not "whatever they want" like that 😃. I observed (from cudf) that after approvals the authors usually push more minor changes related to docs/typos etc. If every new change requires approvals again then it is very inconvenient for both the authors and the reviewers.

Of course, if there are new changes in the code then the authors can request re-review anytime.

haha yeah, I mean cudf approval is horrible. For our repo, if PR tests pass pre-merge CI but requires to add some minor changes like comments/docs only (not for generated docs like configs.md or supported_ops.md), people could just add [skip ci] to the title to skip the additional full tests run w/ new commit approval :)

@sameerz
Copy link
Collaborator

sameerz commented Aug 5, 2021

Oh thanks for letting me know. I have updated the docs.
@jlowe I hate to ask you to approve the third time but I have to. I also hate github that discards PR approval every time I push. Maybe we need to modify that behavior for better convenience.

Actually I think we should not, otherwise people could push new commits which could do whatever they want to our repo after the initial approval, this could be horrible :P

People should know what they are doing, not "whatever they want" like that 😃. I observed (from cudf) that after approvals the authors usually push more minor changes related to docs/typos etc. If every new change requires approvals again then it is very inconvenient for both the authors and the reviewers.
Of course, if there are new changes in the code then the authors can request re-review anytime.

haha yeah, I mean cudf approval is horrible. For our repo, if PR tests pass pre-merge CI but requires to add some minor changes like comments/docs only (not for generated docs like configs.md or supported_ops.md), people could just add [skip ci] to the title to skip the additional full tests run w/ new commit approval :)

It may be inconvenient, but a review on a change, even a small one, is worth the trouble to minimize potential problems. Doing a review on every change is a good habit. We can still move quickly while doing the reviews.

@jlowe
Copy link
Member

jlowe commented Aug 5, 2021

build

@ttnghia ttnghia merged commit 081ecfc into NVIDIA:branch-21.10 Aug 5, 2021
@ttnghia ttnghia deleted the repeat_strings branch August 19, 2021 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request SQL part of the SQL/Dataframe plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] support StringRepeat
7 participants