[ML] Remove noisy 'Could not find trained model' message #100760
Conversation
Hi @davidkyle, I've created a changelog YAML for you.
Pinging @elastic/ml-core (Team:ML)
```java
@@ -745,14 +750,13 @@ private void cacheEvictionListener(RemovalNotification<String, ModelAndConsumer>

    @Override
    public void clusterChanged(ClusterChangedEvent event) {
        final boolean prefetchModels = event.state().nodes().getLocalNode().isIngestNode();
        // If we are not prefetching models and there were no model alias changes, don't bother handling the changes
        if ((prefetchModels == false)
```
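The early-exit check in the diff can be sketched as follows. This is a simplified illustration with hypothetical names (`shouldHandle`, `changedAliases`), not the actual Elasticsearch types: the idea is that a non-ingest node with no model alias changes skips the event entirely.

```java
import java.util.Set;

final class ModelLoadingSketch {
    // Hypothetical distillation of the condition added in clusterChanged():
    // skip all processing when the local node is not an ingest node (so it
    // will not prefetch models) and no model aliases changed in this event.
    static boolean shouldHandle(boolean isIngestNode, Set<String> changedAliases) {
        if (isIngestNode == false && changedAliases.isEmpty()) {
            return false; // nothing to do for this cluster state update
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(shouldHandle(false, Set.of()));        // non-ingest node, no alias changes
        System.out.println(shouldHandle(true, Set.of()));         // ingest node still prefetches
        System.out.println(shouldHandle(false, Set.of("alias"))); // alias change must be handled
    }
}
```

The design choice is the usual cluster-state-listener pattern: bail out as early and as cheaply as possible, since the listener runs on every cluster state update.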
Is this changing the behaviour beyond what the release note says?
Previously in the case where an ingest pipeline referenced a model that didn't exist, that model would be prefetched on ingest nodes as soon as it was created. But now that doesn't happen? Or am I misreading this change?
Maybe the old behaviour was unacceptably resource-intensive, because it was doing a lot of processing on ingest nodes on every cluster state update, and changing it is justified. But in that case I think the release note should say this rather than pretend the change is only about logging.
I'll make those changes in another PR
LGTM
💚 Backport successful
) A 'Could not find trained model' message was logged when a new ingest pipeline using an inference processor was created but the referenced model could not be found. This isn't unusual, as it is common to create a pipeline before the model is loaded. This change stops logging the error message in this situation.
…100775) A 'Could not find trained model' message was logged when a new ingest pipeline using an inference processor was created but the referenced model could not be found. This isn't unusual, as it is common to create a pipeline before the model is loaded. This change stops logging the error message in this situation.
ModelLoadingService watches for the creation of ingest pipelines that use the inference processor and pre-loads the model referenced in the new pipeline. If the model does not exist because the pipeline was created before the model, an error message is logged. This can happen quite often, producing spurious error messages; frequently it is because a solution creates an ingest pipeline referencing the ELSER model before ELSER has been downloaded.

ModelLoadingService only handles DFA/boosted tree models, which are loaded on ingest nodes; it does not load PyTorch/NLP models, but it first has to read the model config to know what type the model is. It is very easy to cause the error message to be logged: just create an ingest pipeline with an inference processor that references an unknown model id.
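For example, a pipeline definition along these lines reproduces it (`my-pipeline` and `unknown-model-id` are placeholder names chosen for illustration):

```
PUT _ingest/pipeline/my-pipeline
{
  "processors": [
    {
      "inference": {
        "model_id": "unknown-model-id"
      }
    }
  ]
}
```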
And the following is logged.