-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Ingest] Add comparison processor for supporting conditionals in pipeline definition #14647
Comments
Should we try and expand a bit more on what can be a condition and what syntax should be supported there? For instance can there be field values on the right side or only constants? I would probably change naming, if it's an if why call it compare? and I think true_processors and false_processors might get confusing, why not use simply if and else and maybe call the processor "condition" or something along those lines? I think we should have @clintongormley have a look at the api and validate it also :) |
I was thinking of enabling the same mustache templating that is introduced here. an example templated field value: This would enable fields to be compared to other fields.
do you propose something more similar to this syntax?
|
OK, @javanna @martijnvg, let me know what you think! here is an in-progress branch of what I've come up with so far: link a bit about what has been done: Conditionals would be described as a separate processor within the pipeline. Branches of execution would be delegated within the here is an example ConditionalProcessor config definition:
To not create a dependency on the metadata/mustache-templating work, this Processor will start off only supporting hardcoded values provided within the config. One major limitation of what has been done so far is that there is no "OR" logic anywhere... I will keep exploring to see how I could get that in. Also, all comparisons except for ** all naming is still up in the air, just trying out other names. |
I think this is ok for the first version of the
Maybe we should then add an extra level to the ...
"field_conditions": {
"and" : {
"message" : { "neq": "hello world" },
"age": { "gte": 13, "lt": 18 }
},
"or" : {
...
}
}
... If any comparisons are defined in the |
I realize there is a huge limitation with this format... a check to see if a field is either all lowercase or all uppercase for example... if we place the boolean operator outside the field map, this would not be possible since the field name is a key. "field_conditions": {
"or": {
"message" : { "eq": "hello world" },
"message" : { "eq": "HELLO WORLD" }
}
} if we placed the boolean operator within each field, it would still not work since "field_conditions": {
"message" : { "or" : { "eq": "hello world", "eq": "HELLO WORLD"} }
} How about this? "field_conditions": {
("and"|"or") : [
{ "message" : { "neq" : "hello world" } },
{ "message" : { "neq" : "HELLO WORLD" } },
{ "age" : { "gte" : 13 } },
{ "age" : { "lt" : 18 } }
]
} the biggest beef I have with the suggestion above is that is can look rather |
since it is starting to grow in complexity, I thought it would be a good thought experiment to see what "field_conditions": {
"or" : [
{
"and" : [
{ "teenager" : { "eq" : true } },
{ "age" : { "gte" : 13 } },
{ "age" : { "lte" : 19 } }
]
},
{
"and" : [
{ "teenager" : { "eq" : false } },
{ "not" : { "and" : [
{ "age" : {"gte" : 13} },
{ "age" : {"lte" : 19} }
] }
}
]
}
]
} hmmm... looks like too much for me. what do you think? |
This looks complex :) My idea is that the conditions don't support nesting. (all conditions need to be on the same level, so there is a list of 'and' conditions and a list of 'or' conditions) Nesting could be achieved via adding another |
so, I think I have implemented most of what you meant. I'll go ahead and make a PR for it now so that it is easier to view the branch |
How about this:
So the |
oh i mixed |
thanks @clintongormley! Offline, we were possibly discussing a simpler processor that would only operate on a single field. here is what that looks like: https://github.com/talevy/elasticsearch/blob/if_conditions_else_processor/docs/plugins/ingest.asciidoc#field-conditional-processor
The statement above demonstrates something that is similar, but still lacking in the way this processor
Since the subject here is not a field, and instead it is a simple string value, this relationship will probably need to be supported. Since we do not yet have string templates (mustache templates) for field referencing, I think only supporting field references instead of raw string values is ok for now. does that make sense? |
why are we trying to write a scripting language in json? this all looks off to me. We should probably just use real scripts for conditions... otherwise, there's no end to this. |
another option is to have the following structure: {
"condition" : {
"if" : {
// a conditional structure
},
"then" : [
// a list of processor that will run if the condition is not met
],
"else" : [
// a list of processor that will run if the condition is not met
]
}
} where there can be different conditional structures: {
"if" : {
"script" : "_doc.key == 2"
}
} or {
"if" : {
"compare_fields" : {
{ "eq" : { "field" : "key", "value" : "2"} }
}
}
} |
From reading this issue, I understand that there are multiple ways of implementing such a feature. Are conditionals still on the roadmap, or have you dropped this for now? |
@byronvoorbach. Conditionals have been pushed off the roadmap for now. The plan is to assess its use-cases after the initial release of the Ingest API. For now, any error-related branching can be achieved via the on_failure option |
I still agree with @uboness on
We have a real scripting language in Elasticsearch that we've now embedded in the Ingest Node processors. If we do this, I think all the manual |
@eskibars indeed, now that we have the flexibility of a real language and the boolean operators that come with it... there is no need for this. in the future, if someone wants to make a designated Closing |
The Ingest Pipeline should support the ability for conditional branching of processor execution.
The proposal here is to define a
CompareProcessor
which will accept a list of conditions that must all be met for specific processors to be executed. This processor would have aconditions
parameter that would accept various boolean operators, such asgt
,lt
,gte
,lte
,eq
,neq
. If all of the provided conditions evaluate to true, then a set of processors defined in a field potentially calledif_processors
would be executed. If any conditions in this set evaluate to false, then processors in theelse_processors
list option would be executed. This type of conditional can be though of as an "if/else" processor, where theconditions
field is a compounded boolean expression joined with anAND
. Continuing with this analogy -- theif_processors
block would be theif
block, and theelse_processors
would be the else block.Whether both
if_processors
andelse_processors
configuration options would be necessary is still up in the air, at least one must be defined.Since the
CompareProcessor
is just another processor, nested comparisons would be possible.Here is a potential processor definition for the
CompareProcessor
Here is an example pipeline definition with a Compare Processor
The text was updated successfully, but these errors were encountered: