Making transformer plan log more obvious #1100

Closed
loudongfeng opened this issue Mar 9, 2023 · 2 comments
Labels
CORE works for Gluten Core enhancement New feature or request

Comments

@loudongfeng
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Make the transformer plan log more obvious. For the SQL query

select count(*) from my_char where name = 'Nemon'

the plan is currently printed as:

CHNativeColumnarToRow
+- *(2) HashAggregateTransformer(keys=[], functions=[count(1)], output=[count(1)#7])
   +- ShuffleQueryStage 0
      +- ColumnarExchangeAdaptor SinglePartition, ENSURE_REQUIREMENTS, false, [plan_id=49], [id=#49], [OUTPUT] List(count:LongType), [OUTPUT] List(count:LongType)
         +- *(1) HashAggregateTransformer(keys=[], functions=[partial_count(1)], output=[count#10L])
            +- *(1) ProjectExecTransformer
               +- *(1) FilterExecTransformer (isnotnull(name#2) AND (name#2 = Nemon))
                  +- FileScan orc tpcds_parquet.my_char[name#2] Batched: true, DataFilters: [isnotnull(name#2), (name#2 = Nemon)], Format: ORC, ...

A more obvious version would look like this:

CHNativeColumnarToRow
+- ^(2) HashAggregateTransformer(keys=[], functions=[count(1)], output=[count(1)#7])
   +- ShuffleQueryStage 0
      +- ColumnarExchangeAdaptor SinglePartition, ENSURE_REQUIREMENTS, false, [plan_id=49], [id=#49], [OUTPUT] List(count:LongType), [OUTPUT] List(count:LongType)
         +- ^(1) HashAggregateTransformer(keys=[], functions=[partial_count(1)], output=[count#10L])
            +- ^(1) ProjectExecTransformer
               +- ^(1) FilterExecTransformer (isnotnull(name#2) AND (name#2 = Nemon))
                  +- NativeFileScan orc tpcds_parquet.my_char[name#2] Batched: true, DataFilters: [isnotnull(name#2), (name#2 = Nemon)],...
@loudongfeng loudongfeng added the enhancement New feature or request label Mar 9, 2023
@loudongfeng loudongfeng changed the title Making transformer plan more obvious Making transformer plan log more obvious Mar 9, 2023
@PHILO-HE
Contributor

PHILO-HE commented Mar 13, 2023

Are you proposing to name the plan node NativeFileScan to distinguish it from Spark's? That can be achieved by overriding the nodeName method, and it looks reasonable. I am not sure why ^ replaces * in your proposal, though. Is there a reason?
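
As a rough illustration of the nodeName idea, here is a minimal, self-contained Python sketch. It is a toy model, not Spark's or Gluten's API: `PlanNode`, `FileScan`, `FileScanTransformer`, `node_name` and `simple_string` are all hypothetical names chosen for this example. It only shows that overriding the display-name hook changes the label printed for a node while leaving everything else about the node alone.

```python
class PlanNode:
    """Toy stand-in for a physical plan node (hypothetical, not Spark's TreeNode)."""

    def __init__(self, detail=""):
        self.detail = detail

    def node_name(self):
        # Default behaviour: derive the display name from the class name.
        return type(self).__name__

    def simple_string(self):
        # One-line description as it would appear in a printed plan.
        return f"{self.node_name()} {self.detail}".rstrip()


class FileScan(PlanNode):
    """Hypothetical vanilla scan node."""


class FileScanTransformer(FileScan):
    """Hypothetical native scan node; only the display name is overridden."""

    def node_name(self):
        return "NativeFileScan"


vanilla = FileScan("orc tpcds_parquet.my_char[name#2]")
native = FileScanTransformer("orc tpcds_parquet.my_char[name#2]")
print(vanilla.simple_string())
print(native.simple_string())
```

In a real implementation the override would live on the Scala plan node class; the sketch above only mirrors the shape of that change.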

@loudongfeng
Contributor Author

Using ^ lets us distinguish whole-stage codegen from whole-stage transformer in the plan.
In a Spark plan, * usually means the plan node is code-generated, and nodes with the same ID belong to the same whole-stage codegen stage.
We can keep the ID part but use ^ or another symbol instead.
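
To make the prefix rule concrete, here is a small Python sketch of a plan printer. It is a toy under stated assumptions, not Spark or Gluten code: the `Node` class, its `stage_id` and `native` fields, and the `prefix` and `tree_string` helpers are all hypothetical. The stage ID is kept in both cases; only the marker differs, `*` for a codegen stage and `^` for a transformer stage.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    stage_id: Optional[int] = None  # None: node is not part of a fused stage
    native: bool = False            # True: node belongs to a whole-stage transformer
    children: List["Node"] = field(default_factory=list)


def prefix(node: Node) -> str:
    """Return the stage marker: '*(id) ' for codegen, '^(id) ' for transformer."""
    if node.stage_id is None:
        return ""
    marker = "^" if node.native else "*"
    return f"{marker}({node.stage_id}) "


def tree_string(node: Node, depth: int = 0) -> str:
    """Render the tree in the '+- ' indented style used by plan logs."""
    indent = "" if depth == 0 else "   " * (depth - 1) + "+- "
    lines = [indent + prefix(node) + node.name]
    for child in node.children:
        lines.append(tree_string(child, depth + 1))
    return "\n".join(lines)


plan = Node("HashAggregateTransformer", 1, True, [
    Node("ProjectExecTransformer", 1, True, [
        Node("FilterExecTransformer", 1, True, [Node("NativeFileScan")]),
    ]),
])
print(tree_string(plan))
```

Keeping the marker choice in one small function means the symbol can be swapped later without touching the tree rendering.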

3 participants