Impossible to create pipeline with multiple processors with same name #1810
Comments
Probably the best workaround would be to add full support, or at least some support, for this "natively" in Elastica?
I think it will need some enhancements to how processors are stored in Pipeline (maybe even params) and/or how Pipeline is serialized before being sent to Elasticsearch, but I still don't understand that part of the code well enough to be sure how to do it.
I saw you opened a PR; that will probably make the discussion easier.
So! I found a much better workaround (might be an idea to follow):
use Elastica\Processor\Attachment;
use Elastica\Processor\Lowercase;
use Elastica\Processor\Remove;
use Elastica\Request;
//Create first attachment processor
$attachprocDesc = new Attachment('descBinary');
$attachprocDesc->setIndexedChars(-1);
$attachprocDesc->setTargetField('desc_attachment');
//Create first remove processor
$removeprocDesc = new Remove('descBinary');
//Create second attachment processor (used in a foreach processor)
$attachprocNeweditor = new Attachment('_ingest._value.contentBinary');
$attachprocNeweditor->setIndexedChars(-1);
$attachprocNeweditor->setTargetField('_ingest._value.content');
//Create second remove processor (used in a foreach processor)
$removeprocNeweditor = new Remove('_ingest._value.contentBinary');
$pipelineId = 'mypipeline';
$pipeline = [
'description' => 'a pipeline',
'processors' => [
$attachprocDesc->toArray(), //1st attachment
$removeprocDesc->toArray(), //1st remove
[ //1st foreach (manual because not implemented in Elastica)
'foreach' => [
'field' => 'subContents',
'ignore_missing' => true,
'processor' => $attachprocNeweditor->toArray(), //2nd attachment
],
],
[ //2nd foreach (manual because not implemented in Elastica)
'foreach' => [
'field' => 'subContents',
'ignore_missing' => true,
'processor' => $removeprocNeweditor->toArray(), //2nd remove
],
],
(new Lowercase('somefield'))->toArray(),
],
];
$path = "_ingest/pipeline/{$pipelineId}";
$client->request($path, Request::PUT, json_encode($pipeline));
This will produce the following pipeline (named 'mypipeline'):
{
"description": "a pipeline",
"processors": [
{
"attachment": {
"field": "descBinary",
"indexed_chars": -1,
"target_field": "desc_attachment"
}
},
{
"remove": {
"field": "descBinary"
}
},
{
"foreach": {
"field": "subContents",
"ignore_missing": true,
"processor": {
"attachment": {
"field": "_ingest._value.contentBinary",
"indexed_chars": -1,
"target_field": "_ingest._value.content"
}
}
}
},
{
"foreach": {
"field": "subContents",
"ignore_missing": true,
"processor": {
"remove": {
"field": "_ingest._value.contentBinary"
}
}
}
},
{
"lowercase": {
"field": "somefield"
}
}
]
}
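As a quick sanity check (a minimal sketch, assuming the same $client and $pipelineId as above), the stored pipeline can be fetched back through the same low-level request API:
//Fetch the pipeline definition back from Elasticsearch to verify it was created
$response = $client->request("_ingest/pipeline/{$pipelineId}", Request::GET);
var_dump($response->getData());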
Refactored processor handling to more closely resemble what the Elasticsearch ingest pipeline endpoint expects. Fixes ruflin#1810
Edit 1: last sample workaround in #1810 (comment)
Original
Hi,
Using:
Elasticsearch 7.9.2
Elastica 7.0.0
I would like to create the following pipeline from PHP with Elastica:
Through a manual curl query and some indexing tests it works well, but when I try to do this from PHP, every processor type that appears more than once is erased (only the last one added is kept).
But then the produced pipeline is:
As we can see, only the second 'attachment' and 'remove' processors are kept.
The problem is caused by the associative array used by the Pipeline class, which uses the processor type as the key.
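A minimal sketch of that behaviour (illustration only, not the actual Pipeline source):
//Storing processors keyed by their type means a later processor of the same
//type silently replaces the earlier one
$processors = [];
$processors['attachment'] = ['field' => 'descBinary'];
$processors['remove'] = ['field' => 'descBinary'];
$processors['attachment'] = ['field' => '_ingest._value.contentBinary']; //overwrites the first 'attachment'
$processors['remove'] = ['field' => '_ingest._value.contentBinary']; //overwrites the first 'remove'
echo json_encode(['processors' => $processors]);
//Only one "attachment" and one "remove" remain, matching the observed result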
I tried to find a workaround by using setRawProcessors, but this produces an error on Elasticsearch (most likely because PHP uses 0, 1, 2... as keys).
I will try to use the Pipeline processor as a second workaround and do nested pipelines (sketched below).
I am open to better workarounds because this one isn't ideal.
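For reference, a rough sketch of that nested-pipelines idea, written in the same raw-array style as the workaround above; the sub-pipeline names 'attach-desc' and 'attach-subcontents' are hypothetical, and each sub-pipeline would hold only one processor of each type:
//Hypothetical parent pipeline chaining two sub-pipelines through the 'pipeline' processor
$parent = [
    'description' => 'parent pipeline',
    'processors' => [
        ['pipeline' => ['name' => 'attach-desc']], //sub-pipeline holding the first attachment/remove pair
        ['pipeline' => ['name' => 'attach-subcontents']], //sub-pipeline holding the foreach attachment/remove pair
    ],
];
$client->request('_ingest/pipeline/parent-pipeline', Request::PUT, json_encode($parent));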