Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GroupBy aggregate fix #803

Merged
merged 5 commits into from
Aug 2, 2024
Merged

GroupBy aggregate fix #803

merged 5 commits into from
Aug 2, 2024

Conversation

Jolanrensen
Copy link
Collaborator

With the debug mode enabled, one failing test of #713 was in GroupBy.aggregate { this into "" }.

I narrowed it down to a NamedValue being created in GroupByReceiverImpl with KType AggregateGroupedDsl<DfType>. Since this value is an AggregateInternalDsl<*> too, it is unpacked in the yield() function to get the underlying dataframe (by copying the NamedValue and replacing its value with its value.df), however, the KType of this NamedValue was left unchanged and thus mismatched the value.

This PR fixes that by updating that KType value too

@Jolanrensen Jolanrensen requested a review from zaleslaw July 31, 2024 12:44
@Jolanrensen Jolanrensen force-pushed the groupByAggregateFix branch from cb374a4 to 23046d8 Compare July 31, 2024 12:53
@Jolanrensen Jolanrensen marked this pull request as ready for review August 1, 2024 10:44
@Jolanrensen
Copy link
Collaborator Author

@zaleslaw the last commit can be reviewed if you like. It builds on top of #801 which should be merged first

Copy link
Contributor

github-actions bot commented Aug 1, 2024

Generated sources will be updated after merging this PR.
Please inspect the changes in here.

is AggregateInternalDsl<*> -> yield(value.copy(value = value.value.df))
is AggregateInternalDsl<*> -> {
// Attempt to create DataFrame<Type> from AggregateInternalDsl<Type>
val dfType = value.type?.arguments?.firstOrNull()?.type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little bit worried about this expression with 6 ? - what if something goes wrong and will be null, I could not explain myself the logic, should we handle the alternative (or common alternative)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what the ?: is for at the end. We try to create DataFrame<Type1> from AggregateInternalDsl<Type1>. However, in theory, any KType can be supplied by the user. KTypes like GroupByReceiverImpl<Type1> , DataFrame<Type1> (which would both still work), but unfortunately also KTypes like CompletelyCustomType without arguments. So in cases where this type argument cannot be fetched, we default to DataFrame<*> with the ?:.

@Jolanrensen Jolanrensen merged commit 82e5555 into master Aug 2, 2024
5 checks passed
@Jolanrensen Jolanrensen deleted the groupByAggregateFix branch August 2, 2024 11:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants