Grouping by column position #110
Labels
datafusion
Changes in the datafusion crate
Comments
Comment from Jorge Leitão(jorgecarleitao) @ 2020-10-23T06:03:45.335+0000:
Could you describe the use case? E.g., in which situation does this improve readability over having the column name, or in which case does the column name not work?

Comment from Pavel Tiunov(paveltiunov) @ 2020-10-23T06:41:53.205+0000:
Hey [~jorgecarleitao]! Thanks for the quick turnaround. The use case is usually to support grouping by complex expressions. For example
{code:java}
SELECT
  date_trunc(convert_tz(`page_views__time_hour`, 'America/Los_Angeles'), 'week') `page_views__time_week`,
  sum(`page_views__count`) `page_views__count`
FROM page_views
GROUP BY date_trunc(convert_tz(`page_views__time_hour`, 'America/Los_Angeles'), 'week')
LIMIT 10000
{code}
becomes
{code:java}
SELECT
  date_trunc(convert_tz(`page_views__time_hour`, 'America/Los_Angeles'), 'week') `page_views__time_week`,
  sum(`page_views__count`) `page_views__count`
FROM page_views
GROUP BY 1
LIMIT 10000
{code}
Calculated columns can be pretty complex, and repeating them in the GROUP BY clause usually just bloats the SQL significantly. I believe it's beneficial to have this feature, as most modern query engines have it.

Comment from Jorge Leitão(jorgecarleitao) @ 2020-10-23T08:00:39.544+0000:
Thanks for the example, [~paveltiunov]. Makes sense. Meanwhile, I think that you can use {{<expression> AS name}} and {{GROUP BY name}}.
Closed
Thanks @jychen7
Note: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-10374
It would be great to have support for grouping by column position instead of grouping by the exact expression. For example, for a query like
It can be expressed in the same way using numbers to refer to other items in the select list.
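As a rough illustration of the intended semantics, the sketch below compares grouping by the full expression, by select-list position, and by alias. It uses SQLite through Python's standard `sqlite3` module purely as a stand-in engine that already accepts `GROUP BY 1`; it is not DataFusion. The table layout, the column names `time_hour` and `cnt`, and the use of `date()` in place of the more complex expression are all assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in engine (not DataFusion).
conn = sqlite3.connect(":memory:")
# Hypothetical table and column names, chosen for this illustration.
conn.execute("CREATE TABLE page_views (time_hour TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [
        ("2020-10-19 01:00:00", 3),
        ("2020-10-19 17:00:00", 4),
        ("2020-10-20 09:00:00", 5),
    ],
)

# 1. Group by repeating the full expression.
by_expression = """
    SELECT date(time_hour) AS day, sum(cnt) AS total
    FROM page_views
    GROUP BY date(time_hour)
    ORDER BY day
"""

# 2. Group by position: 1 refers to the first item in the select list.
by_position = """
    SELECT date(time_hour) AS day, sum(cnt) AS total
    FROM page_views
    GROUP BY 1
    ORDER BY day
"""

# 3. Group by alias, the workaround suggested in the comments.
by_alias = """
    SELECT date(time_hour) AS day, sum(cnt) AS total
    FROM page_views
    GROUP BY day
    ORDER BY day
"""

results = [
    conn.execute(query).fetchall()
    for query in (by_expression, by_position, by_alias)
]

# All three spellings should produce the same grouped rows.
for rows in results:
    print(rows)
```

In this stand-in engine, all three queries return the same rows, which is the behavior the issue asks DataFusion to support for the positional form.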
However, this does not work today in DataFusion: