-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Custom Call Signature Per Script Context #20426
Comments
I agree with the idea of having the context at compile time. Are you thinking you'd have to pass in a |
@nik9000 I can't say this is 100% thought-through yet as I need to go through the code and figure out what the contexts should be (I don't think the existing ones completely fit all the cases), but as I think you've seen in the past from me I prefer explicit over magic typically :) . I was thinking more along the lines of perhaps an interface per context with the signature that must be implemented by the script object that is returned from compile. CompiledScripts that implement the specific interface are what get's executed rather than a call back to the ScriptService. I envision a CompiledScript containing more than one interface possibly so we hopefully wouldn't end up with an excessive number of new classes. |
That makes sense. I think it'd be pretty cool to say "here is some text that a user sent me, please compile it and return something that implements this interface". We wouldn't have to pass in there I like the flexibility of letting the user of the script engine specify the interface that it needs. It lines up pretty well with "what can this script engine do?" because implementations like Mustache and Expressions can just choke when the caller asks them for an interface they don't understand but groovy and painless can actually implement the interface. Groovy through dynamic stuff and painless with static stuff. |
@nik9000 Exactly what I was thinking! As a bonus, I really think this will help clean up the ScriptService quite a bit too. |
++ |
So I began to modify the code to allow context at compilation for script engines that would like it, and ran into an issue with stored scripts. In RestPutStoredScriptAction we add handlers like this:
I would like to optionally have a context as part of this API, so I would propose that we change the handlers to be Definitely looking for some feedback on this issue. |
Changing to just key on id in the stored scripts api sounds right to me. The cache is based on script id alone right? |
The cache is based on both id and lang. And with context added, it would be based on all three. |
you can still add an optional parameter here even with keeping this URL? controller.registerHandler(POST, "/_scripts/{lang}/{id}", this);
controller.registerHandler(PUT, "/_scripts/{lang}/{id}", this);
controller.registerHandler(POST, "/_scripts/{lang}/{id}/{context}", this);
controller.registerHandler(PUT, "/_scripts/{lang}/{id}/{context}", this); I am not sure if that is sufficient but I thought I'd bring it up. ++ on this issue in order to get good performance from scripts we have to require this and I really wonder if we should make it non-optional and require it for all scripts. Existing ones can still use |
I love the idea but i would prefer it if it were implicit (ie not user specified). Also, I think that specifying eg How about this: We have a number of contexts defined internally, eg returns_double, whatever. When we execute (eg) a file script, we pass in the context plus the script file name to the script service. The service tries to retrieve a compiled script that is keyed off the file name PLUS the context. If it doesn't exist, it retrieves the script from the file, compiles it for the context, and caches it with the context. So if the same script is used with different contexts, then it will be cached multiple times. That said, most scripts will only be cached once or maybe twice. Our contexts can be as specific as we need and this is totally transparent to the user. |
I suppose the nice thing about knowing the context up front is that you can I, too, had envisioned this be done when the script is first used and not On Sep 13, 2016 7:01 AM, "Clinton Gormley" notifications@github.com wrote:
|
I mean that magic applied here is really nice until it doesn't work. Then all kinds of ugly stuff happens. What if a script can't compile for a certain context? Do we reject it when it's added? I think we should? Do we require the contexts that it's valid for in such a case? |
Implicit here is not actually user friendly: it means the user can write a broken script that uses something not available, but not find out until runtime. For file scripts, this is inevitable, since they are loaded completely asynchronously. However, for stored scripts, I think this is important feedback to have as soon as possible, rather than when later (possibly a different) user hits an error at runtime for using something not available.
This is why we need more contexts than what we have now. There are actually many more uses of scripts than the existing contexts lead you to think. Scripted metrics is even more complicated, it should have multiple contexts, one for each phase. |
The contexts that are there would only be the starting point, but once we make the changes to handle custom signatures with the existing contexts, it should be easier to add more. I think that's the time to go through all the uses of compile and then figure out which contexts make sense and add them. I envision us having an interface for each context's signature, and having a subclass of CompiledScript implement the interface(s) that will then be called directly by the CompiledScript requester. There is no need to have these calling back to the ScriptService when they already have a CompiledScript. It would be great to enforce that all scripts be required to have context, but this would kill backcompat for Groovy, Expressions and Javascript right now. I think the thing to do about this is to make Painless require it for 5.0, then remove Groovy, Javascript, and have Expressions require it for 6.0. I also don't think implicit scripts is a good idea (we do need some of this for backcompat unfortunately), but moving forward I would absolutely require the context be provided with all scripts. It will eliminate much frustration to the issues that Simon pointed out, and most scripts won't be able to be useful in multiple contexts anyway. On the storage issue, we are not very well set up to compile with context on first use, since we only cache the already compiled scripts. Painless will be able to make good use of the context at compile, so I would prefer to have it with storage the way it is now, otherwise a lot more needs to suddenly be refactored. |
@s1monw I actually did consider adding more handlers, and I do believe that's the easiest way to do that, but @rjernst was thinking it might be better to only have the API cache on an ID (just the name) for a stored script and possibly file scripts too, and that would require a JSON blob to take in other pieces like lang and context. |
Adding more handlers isn't good for a couple reasons:
|
This will benefit the new secure Kotlin script engine for Elasticsearch as well. It can execute both text based scripts where we can compile the script to a different base class that handles the context interface, and also can ship client-side lambdas as binary inline scripts that are extensions to a receiver class which represents the context interface or the Lambda as a SAM conversion depending on how a context's interface will be represented. Then type inference, code completion and other IDE features can kick in to help in script development. Currently everything compiles against one class that unifies the contexts and helps a bit with extensions trying to make things a bit more uniform, but it is still confusing and has lower performance than if each call/context was uniquely suited to the script's real job. And as the comments about Painless above mention, we can provide more useful errors at compilation since we know what can/should be accessed. Is there any branch with changes underway for this? I'd like to track it if so. |
Given how we work long running branches are a huge pain to maintain. No one is actively working on this at the moment. I was a few days ago, but that turned out to be a dead end: #23744 |
Been a bit, but this is progressing again. Some notes on the current discussions:
|
…n script contexts ScriptContexts currently understand a FactoryType that can produce instances of the script InstanceType. However, for search scripts, this does not work as we have the concept of LeafSearchScript that is created per lucene segment. This commit effectively renames the existing SearchScript class into SearchScript.LeafFactory, which is a new, optional, class that can be defined within a ScriptContext. LeafSearchScript is effectively renamed back into SearchScript. This change allows the model of stateless factory -> stateful factory -> script instance to continue, but in a generic way that any script context may take advantage of. relates elastic#20426
…n script contexts (#24974) ScriptContexts currently understand a FactoryType that can produce instances of the script InstanceType. However, for search scripts, this does not work as we have the concept of LeafSearchScript that is created per lucene segment. This commit effectively renames the existing SearchScript class into SearchScript.LeafFactory, which is a new, optional, class that can be defined within a ScriptContext. LeafSearchScript is effectively renamed back into SearchScript. This change allows the model of stateless factory -> stateful factory -> script instance to continue, but in a generic way that any script context may take advantage of. relates #20426
This commit creates TemplateScript and associated classes so that templates no longer need a special ScriptService.compileTemplate method. The execute() method is equivalent to the old run() method. relates elastic#20426
This commit creates TemplateScript and associated classes so that templates no longer need a special ScriptService.compileTemplate method. The execute() method is equivalent to the old run() method. relates #20426
This is done. Closing. |
We should make each context execute scripts based on exactly what gets passed in and returned. This eliminates a huge number of potential boxes for something like scoring that returns a double value out of a script. This may also eliminate a large number of hash look ups as certain parameters will be passed directly into the method instead of accessed from a map. Over millions of documents these performance gains really add up. Another benefit will be prevention of accessing null variables that may not belong in a context. For this to work the context will have to be passed in at compile-time.
I need to run through all the script cases and figure out what the contexts should be as I'm not sure if the current ones cover all the situations.
For file based scripts, I would propose for Painless and Expressions that the script be named something like script.score.painless now to embed the context in the name. For something like Groovy and other languages we can have an "unknown" context so these remain backwards compatible. The "unknown" context would revert to the old call signatures. Once the older languages are fully removed then we can remove the "unknown" context.
The text was updated successfully, but these errors were encountered: