Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update compiler plugin #832

Merged
merged 9 commits into from
Aug 22, 2024
Merged

Update compiler plugin #832

merged 9 commits into from
Aug 22, 2024

Conversation

koperagen
Copy link
Collaborator

No description provided.

@koperagen koperagen requested a review from Jolanrensen August 21, 2024 18:50
@koperagen koperagen added the Compiler plugin Anything related to the DataFrame Compiler Plugin label Aug 21, 2024
Copy link
Collaborator

@Jolanrensen Jolanrensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to have a lot more functions supported in the plugin :) I just have a few small comments, mostly regarding clarification of the new toDataFrame overload, otherwise, it's great. Could you mention which new functions are now supported in the PR notes? That'll help us find it back later

@@ -209,7 +213,7 @@ public fun DataFrame.Companion.readExcel(
* @param range comma separated list of Excel column letters and column ranges (e.g. “A:E” or “A,C,E:F”)
*/
@JvmInline
public value class StringColumns(public val range: String)
public value class StringColumns @Interpretable("StringColumns") constructor(public val range: String)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The linter fails here, it expects it to be like:

@JvmInline
public value class StringColumns
    @Interpretable("StringColumns")
    constructor(public val range: String)

I recommend using the KtLint plugin (if you run the IDE in K1 mode) or run the ktlint task manually

@@ -151,6 +151,19 @@ df.add("length") { value.length }

<!---END-->

Creates a [`DataFrame`](DataFrame.md) with one column
made from [`Iterable`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/-iterable/) of values:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a while to figure out how this case is different from the other cases. I'd specify a) that this uses non-basic types, since we have overloads for those already, and b) that by specifying the column name, the properties of the given objects aren't unfolded like in the case below. This should probably also be in kdocs next to the function

Copy link
Collaborator Author

@koperagen koperagen Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creates a DataFrame from Iterable<T> with one column: "columnName: DataColumn<T>"
Is it better?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! But probably also specify that the properties are not unfolded, aka, you get one "value column" and not a "column group".

@@ -436,6 +436,8 @@ internal fun guessValueType(values: Sequence<Any?>, upperBound: KType? = null, l
collectionClasses.add(it.javaClass.kotlin)
}

is Function<*> -> classes.add(Function::class)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not :) Maybe we should change the rendering for functions in dataframes though. After a quick test I found it looks like:

⌌---------------------------------------------------------------------------------------------------------------⌍
|  |                                                                                        a:Function<*>| b:Int|
|--|-----------------------------------------------------------------------------------------------------|------|
| 0| org.jetbrains.kotlinx.dataframe.testSets.person.DataFrameTests$$Lambda$60/0x000000010013a040@64ee819|     2|
⌎---------------------------------------------------------------------------------------------------------------⌏

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly for such lambda objects toString is weird. I tried to look at the object in the debugger, but there's literally nothing that hints at signature or anything useful

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm you'd think there was a way in kotlin to detect it's a () -> Int or something :/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually...

image

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It already renders correctly often!

image

might just be a fluke in the tests if the lambda is serialized as interface

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what kernel version do you use?
image

Copy link
Collaborator

@Jolanrensen Jolanrensen Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I run the dev version of this PR's branch in the notebook. (so publish to maven local and use v=0.14.0-dev)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok, so the fix is needed anyway

fun box(): String {
val df = dataFrameOf("a", "b")(1, null, null, "")
val df1 = df.fillNulls { b }.with { "empty" }
val b: DataColumn<String> = df1.b
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is beautiful behavior :) Literally how DataFrame is meant to be! Anatoly would be proud I'm sure

val Arguments.separator: String by arg(defaultValue = Present("."))

override fun Arguments.interpret(): PluginDataFrameSchema {
return receiver.asDataFrame().flatten(keepParentNameForColumns, separator).toPluginDataFrameSchema()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can use the actual flatten function :D awesome!

@Jolanrensen
Copy link
Collaborator

oh btw, can we already update the kotlin version of the compiler plugin to 2.0.20-RC2? That's the latest now, the beta version is not available anymore

public fun <T, C> Update<T, C>.with(expression: UpdateExpression<T, C, C?>): DataFrame<T> =
@Refine
@Interpretable("UpdateWith0")
public fun <T, C, R : C?> Update<T, C>.with(expression: UpdateExpression<T, C, R>): DataFrame<T> =
Copy link
Collaborator

@Jolanrensen Jolanrensen Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my testrun of the compiler plugin I now cannot use update {}.with {} anymore, just fillNulls {}.with {}. It gives

[NONE_APPLICABLE] None of the following candidates is applicable: val DataRow<Into_93I>.age: Int? val ColumnsContainer<Into_93I>.age: DataColumn<Int?>

when trying to access the updated column. Is this intended for now?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now yes, there's such issue because plugin fails to interpret update { }.with { } (update not supported) and fallback to an empty schema. Will fix

@koperagen koperagen force-pushed the compiler-plugin-member-functions branch from 327ec74 to 525d6b1 Compare August 22, 2024 11:15
`with` used to have C? in UpdateExpression return type position, and so it was always inferred as nullable. Even for fillNulls { }.with { 123 }
From my quick research, reflection doesn't know anything about these values. They don't have invoke methods, nor any supertypes. So for now i decided to simply fix NPE by falling back to generic Function type for such columns. It will then at least work together with compiler plugin
@koperagen koperagen force-pushed the compiler-plugin-member-functions branch from 525d6b1 to d86d589 Compare August 22, 2024 12:25
@koperagen koperagen self-assigned this Aug 22, 2024
Copy link
Contributor

Generated sources will be updated after merging this PR.
Please inspect the changes in here.

@koperagen koperagen merged commit a268c40 into master Aug 22, 2024
3 checks passed
@koperagen koperagen deleted the compiler-plugin-member-functions branch August 26, 2024 13:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compiler plugin Anything related to the DataFrame Compiler Plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants