Implementations of UDF for different parameterizations of Map or List are not distinguished #2085

Closed
anekdoti opened this issue Oct 23, 2018 · 2 comments · May be fixed by #2094
Labels
enhancement user-defined-functions Tickets about UDF, UDAF, UDTF

anekdoti commented Oct 23, 2018

I have implemented a simple UDF to return the length of an array:

package test.ksql.udf;

import io.confluent.ksql.function.udf.Udf;
import io.confluent.ksql.function.udf.UdfDescription;

import java.util.List;

@UdfDescription(name = "sizeof", description = "returns the size of an array")
public class SizeOf {
    // Overload taking a list of Integer.
    @Udf(description = "returns the size of an array")
    public long sizeOfListInteger(final List<Integer> list) { return list.size(); }

    // Overload taking a list of Long; same raw parameter type (List) as above.
    @Udf(description = "returns the size of an array")
    public long sizeOfListLong(final List<Long> list) { return list.size(); }
}

Loading this UDF in the KSQL server on startup results in a KsqlException:

[2018-10-23 15:13:45,840] INFO Adding function sizeof for method public long test.ksql.udf.SizeOf.sizeOfListLong(java.util.Map) (io.confluent.ksql.function.UdfLoader:238)
[2018-10-23 15:13:45,848] WARN Failed to add UDF to the MetaStore. name=sizeof method=public long escid.esp.analytics.ksql.udf.SizeOf.sizeOfListLong(java.util.List) (io.confluent.ksql.function.UdfLoader:213)
io.confluent.ksql.util.KsqlException: Can't add function KsqlFunction{returnType=Schema{INT64}, arguments=[ARRAY], functionName='sizeof', kudfClass=class io.confluent.ksql.function.udf.PluggableUdf, description='returns the size of a map
', pathLoadedFrom='[OMITTED]/ksql/ext/analytics-ksql.1.0-SNAPSHOT-standalone.jar'} as a function with the same name and argument types already exists KsqlFunction{returnType=Schema{INT64}, arguments=[ARRAY], functionName='sizeof', kudfClass=class io.confluent.ksql.function.udf.PluggableUdf, description='returns the size of a map', pathLoadedFrom='[OMITTED]/ksql/ext/analytics-ksql.1.0-SNAPSHOT-standalone.jar'}
        at io.confluent.ksql.function.UdfFactory.checkCompatible(UdfFactory.java:69)
        at io.confluent.ksql.function.UdfFactory.addFunction(UdfFactory.java:56)
        at io.confluent.ksql.function.InternalFunctionRegistry.addFunction(InternalFunctionRegistry.java:104)
        at io.confluent.ksql.metastore.MetaStoreImpl.addFunction(MetaStoreImpl.java:227)
        at io.confluent.ksql.function.UdfLoader.addFunction(UdfLoader.java:246)
        at io.confluent.ksql.function.UdfLoader.lambda$handleUdfAnnotation$8(UdfLoader.java:208)
        at io.github.lukehutch.fastclasspathscanner.scanner.ScanSpec$9.lookForMatches(ScanSpec.java:1390)
        at io.github.lukehutch.fastclasspathscanner.scanner.ScanSpec.callMatchProcessors(ScanSpec.java:696)
        at io.github.lukehutch.fastclasspathscanner.FastClasspathScanner.scan(FastClasspathScanner.java:1606)
        at io.github.lukehutch.fastclasspathscanner.FastClasspathScanner.scan(FastClasspathScanner.java:1678)
        at io.github.lukehutch.fastclasspathscanner.FastClasspathScanner.scan(FastClasspathScanner.java:1704)
        at io.confluent.ksql.function.UdfLoader.loadUdfs(UdfLoader.java:138)
        at io.confluent.ksql.function.UdfLoader.lambda$load$2(UdfLoader.java:108)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
        at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
        at java.util.Iterator.forEachRemaining(Iterator.java:116)
        at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
        at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
        at io.confluent.ksql.function.UdfLoader.load(UdfLoader.java:108)
        at io.confluent.ksql.rest.server.KsqlRestApplication.buildApplication(KsqlRestApplication.java:254)
        at io.confluent.ksql.rest.server.KsqlServerMain.createExecutable(KsqlServerMain.java:83)
        at io.confluent.ksql.rest.server.KsqlServerMain.main(KsqlServerMain.java:45)

It seems that the different type parameters (Integer, Long) of the List in the two implementations are not distinguished in the signature of the KsqlFunction (in both cases it's arguments=[ARRAY]).

The same problem also arises when using parameterized Maps as the argument types.
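To illustrate why the two overloads can collapse into one signature, here is a minimal sketch in plain Java reflection. It does not use any KSQL code: the class ErasureDemo and its helpers rawSignature, genericSignature, and method are hypothetical names made up for this example. It only shows the general effect that a signature built from raw parameter types cannot tell List<Integer> from List<Long>, whereas one built from generic parameter types can.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; it mirrors the shape of the two UDF overloads above.
class ErasureDemo {
    public long sizeOfListInteger(final List<Integer> list) { return list.size(); }
    public long sizeOfListLong(final List<Long> list) { return list.size(); }

    // Signature from raw parameter classes only: type arguments are lost.
    static String rawSignature(final Method m) {
        return Arrays.toString(m.getParameterTypes());
    }

    // Signature from generic parameter types: type arguments are kept.
    static String genericSignature(final Method m) {
        return Arrays.toString(m.getGenericParameterTypes());
    }

    // Looks up one of the demo methods, wrapping the checked exception.
    static Method method(final String name) {
        try {
            return ErasureDemo.class.getMethod(name, List.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(final String[] args) {
        final Method a = method("sizeOfListInteger");
        final Method b = method("sizeOfListLong");
        // Raw signatures collide, generic signatures do not.
        System.out.println(rawSignature(a).equals(rawSignature(b)));
        System.out.println(genericSignature(a).equals(genericSignature(b)));
    }
}
```

Running main prints true for the raw comparison and false for the generic one, which matches the arguments=[ARRAY] collision seen in the log.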

The bug might be related to #2029.

anekdoti commented

I took a closer look at the sources. In io.confluent.ksql.function.UdfFactory.mapToFunctionParameter, the FunctionParameter object representing the (parameterized) argument type of the method defining the UDF (e.g., List<String>) is created from the schema.type() of the org.apache.kafka.connect.data.SchemaBuilder object schema for the argument. However, this returns only the raw type org.apache.kafka.connect.data.Schema.Type.ARRAY, not the schema.valueSchema() that represents the type parameter String.

I would suggest adding attributes for the keySchema as well as the valueSchema in the case of complex types to the class FunctionParameter and updating the methods of the class correspondingly.
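A rough sketch of what this suggestion could look like, written as a standalone class rather than a patch to KSQL. Everything here is an assumption for illustration: ParamSignature, its Kind enum, and the factory methods are hypothetical and are not the real FunctionParameter API. The point is only that once the key and value descriptors take part in equals and hashCode, ARRAY of INT32 and ARRAY of INT64 stop comparing as equal.

```java
import java.util.Objects;

// Hypothetical parameter descriptor; NOT the actual KSQL FunctionParameter class.
final class ParamSignature {
    enum Kind { INT32, INT64, STRING, ARRAY, MAP }

    private final Kind kind;
    private final ParamSignature key;    // set only for MAP
    private final ParamSignature value;  // set for ARRAY and MAP

    private ParamSignature(final Kind kind, final ParamSignature key, final ParamSignature value) {
        this.kind = kind;
        this.key = key;
        this.value = value;
    }

    static ParamSignature primitive(final Kind kind) {
        return new ParamSignature(kind, null, null);
    }

    static ParamSignature array(final ParamSignature element) {
        return new ParamSignature(Kind.ARRAY, null, element);
    }

    static ParamSignature map(final ParamSignature key, final ParamSignature value) {
        return new ParamSignature(Kind.MAP, key, value);
    }

    // Equality includes the nested key and value descriptors, not just the kind.
    @Override
    public boolean equals(final Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof ParamSignature)) {
            return false;
        }
        final ParamSignature other = (ParamSignature) o;
        return kind == other.kind
            && Objects.equals(key, other.key)
            && Objects.equals(value, other.value);
    }

    @Override
    public int hashCode() {
        return Objects.hash(kind, key, value);
    }

    @Override
    public String toString() {
        if (kind == Kind.ARRAY) {
            return "ARRAY<" + value + ">";
        }
        if (kind == Kind.MAP) {
            return "MAP<" + key + ", " + value + ">";
        }
        return kind.name();
    }
}
```

With a descriptor like this, a registry keyed on the parameter list would treat the two sizeof overloads as distinct entries, while two overloads that both take a list of Integer would still be rejected as duplicates.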

rodesai commented Nov 12, 2019

This will be fixed in KSQL 5.4.

@rodesai rodesai closed this as completed Nov 12, 2019