[OPPRO-314] Correctly fall back scan operator with unsupported filters #559

Merged: 6 commits, Nov 16, 2022

@PHILO-HE PHILO-HE commented Nov 15, 2022

The code that handles filter push-down should check whether the scan is transformable, because an unsupported filter can be pushed down into the scan.
Reproducible issue link: #560.
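
The check described above can be illustrated with a minimal, self-contained sketch. It is a toy model, not the code in this PR: every name here (`Expr`, `Scan`, `NativeScan`, `scanTransformable`, `FallbackSketch`) is hypothetical and only stands in for the real Spark and Gluten classes. The assumption it illustrates is that a filter is folded into the scan only when the resulting scan would still be transformable; otherwise the original plan is kept so that it falls back.

```scala
// Toy plan model. All names are hypothetical and are NOT Gluten's real classes.
sealed trait Expr
case class Supported(name: String) extends Expr
case class Unsupported(name: String) extends Expr

sealed trait Plan
case class Scan(table: String, pushedFilters: Seq[Expr] = Nil) extends Plan
case class Filter(condition: Expr, child: Plan) extends Plan
case class NativeScan(table: String, pushedFilters: Seq[Expr]) extends Plan

object FallbackSketch {
  // Assumption: a scan is transformable only if every filter it carries is supported.
  def scanTransformable(scan: Scan): Boolean =
    scan.pushedFilters.forall {
      case _: Supported   => true
      case _: Unsupported => false
    }

  def transform(plan: Plan): Plan = plan match {
    case Filter(cond, scan: Scan) =>
      // Build the scan as it would look after the push-down, then validate it.
      val candidate = scan.copy(pushedFilters = scan.pushedFilters :+ cond)
      if (scanTransformable(candidate)) NativeScan(candidate.table, candidate.pushedFilters)
      else plan // fall back: leave the filter and the scan untouched
    case scan: Scan if scanTransformable(scan) =>
      NativeScan(scan.table, scan.pushedFilters)
    case other => other
  }
}
```

In this sketch, a plan whose filter is unsupported comes back unchanged, while a plan with only supported filters is replaced by a single `NativeScan` that carries them.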

@github-actions

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[Gluten-${ISSUES_ID}] ${detailed message}

@@ -94,26 +94,26 @@ case class TransformPreOverrides() extends Rule[SparkPlan] {
       case TransformHint.TRANSFORM_UNSUPPORTED =>
         logDebug(s"Columnar Processing for ${plan.getClass} is under row guard.")
         plan match {
-          case plan: ShuffledHashJoinExec =>
+          case shj: ShuffledHashJoinExec =>
Contributor

This is not a functional change, right?

Contributor Author

Yes, the changes in this part have no impact on functionality; they only make the code more readable. Thanks!

@PHILO-HE PHILO-HE changed the title Fix fallback issues for FileSourceScanExec [OPPRO-314] Fix issue: scan operator with unsupported filters is not correctly falled back Nov 15, 2022
@PHILO-HE PHILO-HE changed the title [OPPRO-314] Fix issue: scan operator with unsupported filters is not correctly falled back [OPPRO-314] Correctly fall back scan operator with unsupported filters Nov 15, 2022
// With this setting, row based input will be used in the fallback case of data source v1.
// TODO: remove this setting if vanilla spark's vectorized input is supported.
conf.setConfString("spark.sql.parquet.enableVectorizedReader", "false")

Contributor

What happens if we use columnar input? Will it add a ColumnarToRow, then a RowToArrowColumnar?

Contributor Author

At least for Velox, we have no support for transforming vanilla Spark columnar input into Velox columnar input. So if we do use vanilla columnar input, an exception will be thrown and execution will be interrupted. cc @rui-mo.
BTW, we found that setting the configuration here can cause some issues, so we may have to tell users to set it manually.
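
For readers who need to set the option manually, a minimal sketch follows. It assumes a local `SparkSession` built by the application itself; the object name, app name and `local[*]` master are hypothetical and only there to keep the example self-contained. The configuration key is the one used in the diff above.

```scala
import org.apache.spark.sql.SparkSession

object DisableVectorizedParquetReader {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session for illustration; the app name is hypothetical.
    val spark = SparkSession.builder()
      .appName("gluten-fallback-example")
      .master("local[*]")
      // Use row-based Parquet input so that a fallen-back data source v1 scan
      // does not produce vanilla Spark columnar batches.
      .config("spark.sql.parquet.enableVectorizedReader", "false")
      .getOrCreate()

    // Read the value back to confirm which setting the session is using.
    println(spark.conf.get("spark.sql.parquet.enableVectorizedReader"))

    spark.stop()
  }
}
```

The same key can also be passed on the command line, for example `--conf spark.sql.parquet.enableVectorizedReader=false` with `spark-submit`.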

@rui-mo rui-mo merged commit 61aed95 into apache:main Nov 16, 2022
@FelixYBW
Contributor

Can it fall back to Spark's columnar scan now?

@PHILO-HE
Contributor Author

> Can it fall back to Spark's columnar scan now?

Will fix it in another PR. Thanks!

@rui-mo
Contributor

rui-mo commented Nov 17, 2022

> Can it fall back to Spark's columnar scan now?

For now, the config to disable Spark's columnar output is still needed.
