[GLUTEN-842][VL] convert expand op to expand exec in velox #1361
Thanks for opening a pull request! Could you open an issue for this pull request on GitHub Issues? https://github.com/oap-project/gluten/issues Then could you also rename the commit message and pull request title in the following format?
We should follow Substrait's solution. Does this PR require changes to Substrait?
Yes. This PR adds a new ExpandRel message to algebra.proto. It follows the Substrait community's solution, except for the definition in the PR.
Thanks, updated.
@JkSelf, @zhouyuan and @lgbo-ustc, could you please take a look?
LGTM except small comments.
gluten-core/src/main/scala/io/glutenproject/execution/ExpandExecTransformer.scala
You mention that
The new 'ExpandRel' is different and is not compatible with CH. The original Rel is renamed to 'GroupIdRel', and the corresponding changes are made on the CH side as well.
It's great. This implementation is simple. I think it could solve some problems we have met. LGTM.
Thanks for the review, @lgbo-ustc. Will you work on the CH side to consume the new ExpandRel contract?
We will do it soon.
👍
* init change
* convert expand op to expand exec in velox
* add pre-project & add ut
* minor change
* fix ut
* update algebra.proto
* fix build
* fix build
* fix build
* add ut
* revert velox branch

---------

Co-authored-by: zhli1142015 <zhli@pczhlich.fareast.corp.microsoft.com>
I have a question about the two cases: could you give some SQL examples? We don't know when Spark generates them.
Hello @zhanglistar ,
@zhli1142015 Thanks! For the sql
What changes were proposed in this pull request?
Here are the project expression layouts we observed in the expand operation:
1. agg exprs + group by exprs + gid.
2. agg exprs + group by exprs + gid + _gen_grouping_pos. The last column handles duplicate grouping sets.
3. group by exprs + gid + agg exprs. Here gid is calculated differently from the two cases above: it is assigned the sequence number of the project set.
The original ExpandExecTransformer can only handle the first case. I'm adding the expand exec on the Velox side in this PR: https://github.com/oap-project/velox/pull/199/files. This PR converts Spark's ExpandExec to the Expand operator in Velox, so we no longer need to do the column mapping in Gluten.
The original ExpandExecTransformer is renamed to GroupIdExecTransformer so as not to break ClickHouse.
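To make the layouts listed above concrete, here is a minimal Python sketch of what an expand operator does: for each input row it emits one output row per projection set. It is only an illustration, not the Gluten or Velox implementation. The helper names (`expand`, `col`, `lit`), the sample columns, and the concrete gid values are assumptions made for this example.

```python
from typing import Any, Callable, List, Sequence, Tuple

Row = Tuple[Any, ...]
Expr = Callable[[Row], Any]


def col(index: int) -> Expr:
    """Hypothetical helper: reference an input column by position."""
    return lambda row: row[index]


def lit(value: Any) -> Expr:
    """Hypothetical helper: a literal (e.g. NULL or a gid constant)."""
    return lambda row: value


def expand(rows: Sequence[Row], projections: Sequence[Sequence[Expr]]) -> List[Row]:
    """Emit one output row per (input row, projection set) pair."""
    out: List[Row] = []
    for row in rows:
        for projection in projections:
            out.append(tuple(expr(row) for expr in projection))
    return out


# Layout 1: agg exprs + group by exprs + gid.
# Input columns (assumed): (amount, city, year).
# Projection sets for the grouping sets (city, year), (city), ().
# The gid values are an illustrative bitmask of the nulled-out columns.
grouping_rows = [(10, "NYC", 2020), (5, "SF", 2021)]
grouping_projections = [
    [col(0), col(1), col(2), lit(0)],
    [col(0), col(1), lit(None), lit(1)],
    [col(0), lit(None), lit(None), lit(3)],
]
grouping_out = expand(grouping_rows, grouping_projections)

# Layout 3: group by exprs + gid + agg exprs.
# Input columns (assumed): (city, a, b). Here gid is simply the sequence
# number of the projection set; starting from 0 is an assumption.
distinct_rows = [("NYC", 1, 2)]
distinct_projections = [
    [col(0), lit(0), col(1), lit(None)],
    [col(0), lit(1), lit(None), col(2)],
]
distinct_out = expand(distinct_rows, distinct_projections)

for out_row in grouping_out + distinct_out:
    print(out_row)
```

Because the operator only evaluates each projection set as given, the same code covers all the layouts; the column order and the way gid is computed are decided by the plan that builds the projection sets, which is why no extra column mapping is needed in this sketch.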
How was this patch tested?
Unit test.