Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLUTEN-1201][CORE] Skip scalar subquery execution during validation #1202

Merged
merged 5 commits into from
Mar 29, 2023

Conversation

loudongfeng
Copy link
Contributor

@loudongfeng loudongfeng commented Mar 24, 2023

What changes were proposed in this pull request?

Please check #1201

How was this patch tested?

Test manually by explain tcpds q14-1, scale factor 0.1

@github-actions
Copy link

#1201

@zhztheplayer zhztheplayer changed the title [Gluten-1201][Gluten-core]skip scala subquery execution during validation [GLUTEN-1201][CORE] skip scala subquery execution during validation Mar 24, 2023
@zhztheplayer zhztheplayer changed the title [GLUTEN-1201][CORE] skip scala subquery execution during validation [GLUTEN-1201][CORE] Skip scala subquery execution during validation Mar 24, 2023
@PHILO-HE
Copy link
Contributor

So this is an optimization to avoid the unnecessary execution of scalar subquery in validation phase. Currently, it does in validation phase, but there is no functionality issue. Right?

@loudongfeng
Copy link
Contributor Author

Yes, only performance issue.

@PHILO-HE PHILO-HE requested a review from rui-mo March 24, 2023 07:49
Copy link
Contributor

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks the purpose of this PR is partially the same as #1097.

@loudongfeng
Copy link
Contributor Author

Looks the purpose of this PR is partially the same as #1097.

#1097 can fix this issue too.

@loudongfeng
Copy link
Contributor Author

All checks have passed.Will address review comment soon.

@rui-mo rui-mo requested a review from zzcclp March 27, 2023 07:20
@loudongfeng loudongfeng force-pushed the scala_subquery_execute_twice branch from da365a5 to fb6455a Compare March 27, 2023 07:39
rui-mo
rui-mo previously approved these changes Mar 27, 2023
zzcclp
zzcclp previously approved these changes Mar 28, 2023
Copy link
Contributor

@zzcclp zzcclp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@loudongfeng loudongfeng dismissed stale reviews from zzcclp and rui-mo via 97bde69 March 28, 2023 08:35
@rui-mo rui-mo added the CORE works for Gluten Core label Mar 29, 2023
@PHILO-HE
Copy link
Contributor

@WangGuangxin

@WangGuangxin
Copy link
Contributor

@PHILO-HE The change LGTM to avoid execute subquery when do validation.

But I think there is another improvement that's submit subquery concurrently before doTransform, by utilize SparkPlan.prepare(), like https://github.com/oap-project/gluten/pull/1097/files#diff-2229bea3ccd763cfbccc11ad3337540c895b216b5568a10b1ddaee56919862faR184

What do you think?

@PHILO-HE
Copy link
Contributor

@PHILO-HE The change LGTM to avoid execute subquery when do validation.

But I think there is another improvement that's submit subquery concurrently before doTransform, by utilize SparkPlan.prepare(), like https://github.com/oap-project/gluten/pull/1097/files#diff-2229bea3ccd763cfbccc11ad3337540c895b216b5568a10b1ddaee56919862faR184

What do you think?

I agree. Please keep that part of fix in your PR and rebase your code after this patch is merged.

@PHILO-HE
Copy link
Contributor

Will merge the patch soon. Thanks for the contribution!

@PHILO-HE PHILO-HE changed the title [GLUTEN-1201][CORE] Skip scala subquery execution during validation [GLUTEN-1201][CORE] Skip scalar subquery execution during validation Mar 29, 2023
@PHILO-HE PHILO-HE merged commit f53d8b1 into apache:main Mar 29, 2023
@loudongfeng
Copy link
Contributor Author

Many thanks for your review. @rui-mo @PHILO-HE @zzcclp @WangGuangxin

Comment on lines +57 to +61
def valueSensitiveDataType(dataType: DataType): Boolean = {
dataType.isInstanceOf[MapType] ||
dataType.isInstanceOf[ArrayType] ||
dataType.isInstanceOf[StructType]
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @loudongfeng , I just bumped into this code. Do you still remember the reason of using this condition?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CORE works for Gluten Core
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants