Custom Call Signature Per Script Context #20426

jdconrad · 2016-09-12T16:12:36Z

We should make each context execute scripts based on exactly what gets passed in and returned. This eliminates a huge number of potential boxes for something like scoring that returns a double value out of a script. This may also eliminate a large number of hash look ups as certain parameters will be passed directly into the method instead of accessed from a map. Over millions of documents these performance gains really add up. Another benefit will be prevention of accessing null variables that may not belong in a context. For this to work the context will have to be passed in at compile-time.

I need to run through all the script cases and figure out what the contexts should be as I'm not sure if the current ones cover all the situations.

For file based scripts, I would propose for Painless and Expressions that the script be named something like script.score.painless now to embed the context in the name. For something like Groovy and other languages we can have an "unknown" context so these remain backwards compatible. The "unknown" context would revert to the old call signatures. Once the older languages are fully removed then we can remove the "unknown" context.

nik9000 · 2016-09-12T16:47:18Z

I agree with the idea of having the context at compile time. Are you thinking you'd have to pass in a Class and we'd would analyze the publicly visible getters? Something more explicit and less magic? Like a Map<String, ContextField> where ContextField has Method references for getting and (optionally) setting?

jdconrad · 2016-09-12T16:55:15Z

@nik9000 I can't say this is 100% thought-through yet as I need to go through the code and figure out what the contexts should be (I don't think the existing ones completely fit all the cases), but as I think you've seen in the past from me I prefer explicit over magic typically :) . I was thinking more along the lines of perhaps an interface per context with the signature that must be implemented by the script object that is returned from compile. CompiledScripts that implement the specific interface are what get's executed rather than a call back to the ScriptService. I envision a CompiledScript containing more than one interface possibly so we hopefully wouldn't end up with an excessive number of new classes.

nik9000 · 2016-09-12T17:11:06Z

That makes sense. I think it'd be pretty cool to say "here is some text that a user sent me, please compile it and return something that implements this interface". We wouldn't have to pass in there ctx objects any more at all which'd be cool.

I like the flexibility of letting the user of the script engine specify the interface that it needs. It lines up pretty well with "what can this script engine do?" because implementations like Mustache and Expressions can just choke when the caller asks them for an interface they don't understand but groovy and painless can actually implement the interface. Groovy through dynamic stuff and painless with static stuff.

jdconrad · 2016-09-12T17:17:12Z

@nik9000 Exactly what I was thinking! As a bonus, I really think this will help clean up the ScriptService quite a bit too.

nik9000 · 2016-09-12T17:17:29Z

++

jdconrad · 2016-09-12T21:46:32Z

So I began to modify the code to allow context at compilation for script engines that would like it, and ran into an issue with stored scripts.

In RestPutStoredScriptAction we add handlers like this:

    protected RestPutStoredScriptAction(Settings settings, RestController controller, boolean registerDefaultHandlers) {
        super(settings);
        if (registerDefaultHandlers) {
            controller.registerHandler(POST, "/_scripts/{lang}/{id}", this);
            controller.registerHandler(PUT, "/_scripts/{lang}/{id}", this);
        }
    }

I would like to optionally have a context as part of this API, so I would propose that we change the handlers to be /_scripts/{id} and take in JSON as we do for other API's. This would allow us to add an optional context parameter.

Definitely looking for some feedback on this issue.

rjernst · 2016-09-12T22:29:06Z

Changing to just key on id in the stored scripts api sounds right to me. The cache is based on script id alone right?

jdconrad · 2016-09-12T22:31:55Z

The cache is based on both id and lang. And with context added, it would be based on all three.

s1monw · 2016-09-13T07:50:04Z

I would like to optionally have a context as part of this API, so I would propose that we change the handlers to be /_scripts/{id} and take in JSON as we do for other API's. This would allow us to add an optional context parameter.

you can still add an optional parameter here even with keeping this URL?

 controller.registerHandler(POST, "/_scripts/{lang}/{id}", this);
 controller.registerHandler(PUT, "/_scripts/{lang}/{id}", this);
 controller.registerHandler(POST, "/_scripts/{lang}/{id}/{context}", this);
 controller.registerHandler(PUT, "/_scripts/{lang}/{id}/{context}", this);

I am not sure if that is sufficient but I thought I'd bring it up.

++ on this issue in order to get good performance from scripts we have to require this and I really wonder if we should make it non-optional and require it for all scripts. Existing ones can still use Unknown or All? We have the infra to do this added when we required the lang to be present on all scripts. WDYT?

clintongormley · 2016-09-13T11:01:02Z

I love the idea but i would prefer it if it were implicit (ie not user specified). Also, I think that specifying eg aggs is insufficient as some aggs scripts return doubles and some return strings/objects.

How about this:

We have a number of contexts defined internally, eg returns_double, whatever. When we execute (eg) a file script, we pass in the context plus the script file name to the script service. The service tries to retrieve a compiled script that is keyed off the file name PLUS the context. If it doesn't exist, it retrieves the script from the file, compiles it for the context, and caches it with the context.

So if the same script is used with different contexts, then it will be cached multiple times. That said, most scripts will only be cached once or maybe twice.

Our contexts can be as specific as we need and this is totally transparent to the user.

nik9000 · 2016-09-13T11:27:53Z

I suppose the nice thing about knowing the context up front is that you can
compile the script in the context right away and get useful error messages.

I, too, had envisioned this be done when the script is first used and not
at storage time, but I'd jack likes it at storage time that is fine with me
so long as we can work it out. I wonder if we can do this at first usage
time in one PR so we and then look at the separate storage in a second PR?

On Sep 13, 2016 7:01 AM, "Clinton Gormley" notifications@github.com wrote:

I love the idea but i would prefer it if it were implicit (ie not user
specified). Also, I think that specifying eg aggs is insufficient as some
aggs scripts return doubles and some return strings/objects.

How about this:

We have a number of contexts defined internally, eg returns_double,
whatever. When we execute (eg) a file script, we pass in the context plus
the script file name to the script service. The service tries to retrieve a
compiled script that is keyed off the file name PLUS the context. If it
doesn't exist, it retrieves the script from the file, compiles it for the
context, and caches it with the context.

So if the same script is used with different contexts, then it will be
cached multiple times. That said, most scripts will only be cached once or
maybe twice.

Our contexts can be as specific as we need and this is totally transparent
to the user.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#20426 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AANLohtYrmaoHlQl5EnFiiijXQZUoJHOks5qpoJygaJpZM4J6wjq
.

s1monw · 2016-09-13T14:52:04Z

I mean that magic applied here is really nice until it doesn't work. Then all kinds of ugly stuff happens. What if a script can't compile for a certain context? Do we reject it when it's added? I think we should? Do we require the contexts that it's valid for in such a case?
I really wonder if we are trying to be too user friendly on the wrong end here?

rjernst · 2016-09-13T15:20:00Z

I love the idea but i would prefer it if it were implicit (ie not user specified).

Implicit here is not actually user friendly: it means the user can write a broken script that uses something not available, but not find out until runtime. For file scripts, this is inevitable, since they are loaded completely asynchronously. However, for stored scripts, I think this is important feedback to have as soon as possible, rather than when later (possibly a different) user hits an error at runtime for using something not available.

Also, I think that specifying eg aggs is insufficient as some aggs scripts return doubles and some return strings/objects.

This is why we need more contexts than what we have now. There are actually many more uses of scripts than the existing contexts lead you to think. Scripted metrics is even more complicated, it should have multiple contexts, one for each phase.

jdconrad · 2016-09-13T16:44:54Z

The contexts that are there would only be the starting point, but once we make the changes to handle custom signatures with the existing contexts, it should be easier to add more. I think that's the time to go through all the uses of compile and then figure out which contexts make sense and add them. I envision us having an interface for each context's signature, and having a subclass of CompiledScript implement the interface(s) that will then be called directly by the CompiledScript requester. There is no need to have these calling back to the ScriptService when they already have a CompiledScript.

It would be great to enforce that all scripts be required to have context, but this would kill backcompat for Groovy, Expressions and Javascript right now. I think the thing to do about this is to make Painless require it for 5.0, then remove Groovy, Javascript, and have Expressions require it for 6.0.

I also don't think implicit scripts is a good idea (we do need some of this for backcompat unfortunately), but moving forward I would absolutely require the context be provided with all scripts. It will eliminate much frustration to the issues that Simon pointed out, and most scripts won't be able to be useful in multiple contexts anyway.

On the storage issue, we are not very well set up to compile with context on first use, since we only cache the already compiled scripts. Painless will be able to make good use of the context at compile, so I would prefer to have it with storage the way it is now, otherwise a lot more needs to suddenly be refactored.

jdconrad · 2016-09-13T16:47:13Z

@s1monw I actually did consider adding more handlers, and I do believe that's the easiest way to do that, but @rjernst was thinking it might be better to only have the API cache on an ID (just the name) for a stored script and possibly file scripts too, and that would require a JSON blob to take in other pieces like lang and context.

rjernst · 2016-09-13T17:44:20Z

Adding more handlers isn't good for a couple reasons:

It is confusing because really the id should be globally unique. Right now we have IMO a difficult to use api to reference stored/file scripts in that you must also specify the language. We should be able to lookup a stored script simply by its id. Changing the handlers to /_scripts/{id} will make it clear all scripts share the same namespace, regardless of lang or context.
Accepting the same json that the eg search api accepts normally allows for changes there to be reflected automatically in stored scripts. For example, if we add an options map with compile time options, there would be a place for that in stored scripts. Right now, there is no way to add such options.

apatrida · 2017-03-28T11:14:19Z

This will benefit the new secure Kotlin script engine for Elasticsearch as well. It can execute both text based scripts where we can compile the script to a different base class that handles the context interface, and also can ship client-side lambdas as binary inline scripts that are extensions to a receiver class which represents the context interface or the Lambda as a SAM conversion depending on how a context's interface will be represented. Then type inference, code completion and other IDE features can kick in to help in script development.

Currently everything compiles against one class that unifies the contexts and helps a bit with extensions trying to make things a bit more uniform, but it is still confusing and has lower performance than if each call/context was uniquely suited to the script's real job. And as the comments about Painless above mention, we can provide more useful errors at compilation since we know what can/should be accessed.

Is there any branch with changes underway for this? I'd like to track it if so.

nik9000 · 2017-03-28T12:19:46Z

Is there any branch with changes underway for this? I'd like to track it if so.

Given how we work long running branches are a huge pain to maintain. No one is actively working on this at the moment. I was a few days ago, but that turned out to be a dead end: #23744

jdconrad · 2017-05-10T18:41:31Z

Been a bit, but this is progressing again. Some notes on the current discussions:

Allow users to create custom contexts that specify the name of the context, the signature for the method to call against a generated script, and a whitelist of classes, methods, and fields that are to be allowed be used in the script (may not matter for some languages).
Allow users to create custom languages that will have to decide how and which contexts to support. Custom languages do not necessarily need to be able to support all contexts. Expressions will only support search, scripted fields, and possibly aggs.
Possibly remove stored templates and only use the stored scripting API. (Remove Stored Template API in favor of Stored Scripting API #24596). With custom contexts, a template context can be made that accepts a Map<String, Object> and returns a String. With this in place, there is no reason to only allow Mustache for templates. Any language that supports this context should be allowed.
Settings will be allowed to turn enable and disable any context.

rjernst · 2017-05-30T03:18:25Z

This work has been slowly progressing in #24868, #24818, #24817, #24873, #24877, #24883, #24896 and #24897.

…n script contexts ScriptContexts currently understand a FactoryType that can produce instances of the script InstanceType. However, for search scripts, this does not work as we have the concept of LeafSearchScript that is created per lucene segment. This commit effectively renames the existing SearchScript class into SearchScript.LeafFactory, which is a new, optional, class that can be defined within a ScriptContext. LeafSearchScript is effectively renamed back into SearchScript. This change allows the model of stateless factory -> stateful factory -> script instance to continue, but in a generic way that any script context may take advantage of. relates elastic#20426

…n script contexts (#24974) ScriptContexts currently understand a FactoryType that can produce instances of the script InstanceType. However, for search scripts, this does not work as we have the concept of LeafSearchScript that is created per lucene segment. This commit effectively renames the existing SearchScript class into SearchScript.LeafFactory, which is a new, optional, class that can be defined within a ScriptContext. LeafSearchScript is effectively renamed back into SearchScript. This change allows the model of stateless factory -> stateful factory -> script instance to continue, but in a generic way that any script context may take advantage of. relates #20426

This commit creates TemplateScript and associated classes so that templates no longer need a special ScriptService.compileTemplate method. The execute() method is equivalent to the old run() method. relates elastic#20426

This commit creates TemplateScript and associated classes so that templates no longer need a special ScriptService.compileTemplate method. The execute() method is equivalent to the old run() method. relates #20426

jdconrad · 2017-11-29T16:04:14Z

This is done. Closing.

jdconrad added discuss :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache :Plugin Lang Painless v5.0.0-beta1 labels Sep 12, 2016

nik9000 mentioned this issue Sep 13, 2016

Support assert in Painless #20264

Closed

clintongormley added v5.0.0 and removed v5.0.0-beta1 labels Sep 14, 2016

jdconrad mentioned this issue Sep 21, 2016

Update Scripting API to Store Context for Scripts #20621

Closed

3 tasks

jasontedor removed the discuss label Sep 23, 2016

clintongormley added v5.1.1 and removed v5.0.0 labels Oct 18, 2016

clintongormley added v5.1.2 and removed >enhancement v5.1.1 labels Dec 7, 2016

clintongormley added >enhancement and removed v5.1.2 labels Jan 12, 2017

clintongormley mentioned this issue Mar 28, 2017

Update Script plugin API so that script compilation knows what type of script it is compiling #23751

Closed

rjernst mentioned this issue May 30, 2017

Specialize the search script API #18863

Closed

rjernst mentioned this issue May 30, 2017

Scripting: Conslidate SearchScript and LeafSearchScript #24916

Closed

rjernst mentioned this issue May 30, 2017

Add StatefulFactoryType as optional intermediate factory in script contexts #24974

Merged

rjernst mentioned this issue Jun 2, 2017

Scripting: Convert CompiledTemplate to a ScriptContext #25032

Merged

rjernst mentioned this issue Jun 12, 2017

Get all search templates/scripts #25179

Closed

jdconrad closed this as completed Nov 29, 2017

clintongormley added :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache and removed :Plugin Lang Painless labels Feb 14, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Custom Call Signature Per Script Context #20426

Custom Call Signature Per Script Context #20426

jdconrad commented Sep 12, 2016 •

edited

Loading

nik9000 commented Sep 12, 2016

jdconrad commented Sep 12, 2016

nik9000 commented Sep 12, 2016

jdconrad commented Sep 12, 2016

nik9000 commented Sep 12, 2016

jdconrad commented Sep 12, 2016 •

edited by nik9000

Loading

rjernst commented Sep 12, 2016

jdconrad commented Sep 12, 2016

s1monw commented Sep 13, 2016

clintongormley commented Sep 13, 2016

nik9000 commented Sep 13, 2016

s1monw commented Sep 13, 2016

rjernst commented Sep 13, 2016

jdconrad commented Sep 13, 2016

jdconrad commented Sep 13, 2016

rjernst commented Sep 13, 2016

apatrida commented Mar 28, 2017 •

edited

Loading

nik9000 commented Mar 28, 2017

jdconrad commented May 10, 2017 •

edited

Loading

rjernst commented May 30, 2017

jdconrad commented Nov 29, 2017

Custom Call Signature Per Script Context #20426

Custom Call Signature Per Script Context #20426

Comments

jdconrad commented Sep 12, 2016 • edited Loading

nik9000 commented Sep 12, 2016

jdconrad commented Sep 12, 2016

nik9000 commented Sep 12, 2016

jdconrad commented Sep 12, 2016

nik9000 commented Sep 12, 2016

jdconrad commented Sep 12, 2016 • edited by nik9000 Loading

rjernst commented Sep 12, 2016

jdconrad commented Sep 12, 2016

s1monw commented Sep 13, 2016

clintongormley commented Sep 13, 2016

nik9000 commented Sep 13, 2016

s1monw commented Sep 13, 2016

rjernst commented Sep 13, 2016

jdconrad commented Sep 13, 2016

jdconrad commented Sep 13, 2016

rjernst commented Sep 13, 2016

apatrida commented Mar 28, 2017 • edited Loading

nik9000 commented Mar 28, 2017

jdconrad commented May 10, 2017 • edited Loading

rjernst commented May 30, 2017

jdconrad commented Nov 29, 2017

jdconrad commented Sep 12, 2016 •

edited

Loading

jdconrad commented Sep 12, 2016 •

edited by nik9000

Loading

apatrida commented Mar 28, 2017 •

edited

Loading

jdconrad commented May 10, 2017 •

edited

Loading