Enrich documents with inference results at Fetch #53230
Conversation
Pinging @elastic/es-search (:Search/Search)
Pinging @elastic/ml-core (:ml)
I think it would be beneficial to have a way to "deploy" models onto nodes. The downside is that deployment vs. access race conditions would probably still result in synchronous loading or something similar.
I haven't taken a close look at the code yet, but have some high-level comments first.
I actually don't find this awkward. Building on this observation, I wanted to share another option for the API: we could model the inference as a new section of the search request, similar to script fields, with its own top-level name (sketched below).
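A hedged sketch of what such a section might look like, modeled on script_fields; the section name inference_fields and its parameters are illustrative guesses, not names proposed in the thread:

```json
GET my-index/_search
{
  "query": {
    "match_all": {}
  },
  "inference_fields": {
    "my_inference_result": {
      "model_id": "my-model"
    }
  }
}
```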
Note that the
Ack, will give this some thought. I think my API suggestion above would also look much more natural without being nested under ext.
We've been quite unsure whether it's better to run inference on the coordinating vs. the data nodes. Is there a known set of investigations/discussions we need to complete to reach clarity on this decision? Perhaps this would involve determining the types of models we want to support in a v1, and thinking through @benwtrent's idea about model deployment? It would be nice to have this list somewhere (I'm happy to move this conversation to an issue/design doc so as not to make the PR too noisy).
I like this syntax; it looks like aggregations and fits better with the query DSL. The only problem is that the inference processor is configured differently, and it would be confusing if the config could not be copied and pasted between the two. But we should consider the option.
```java
public interface InferenceResults extends NamedWriteable {

    void writeResult(IngestDocument document, String parentResultField);

    Map<String, Object> writeResultToMap(String parentResultField);
}
```
I wonder if we should implement ToXContentObject instead of having a bespoke method that creates a map.
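A minimal sketch of what that suggestion would look like; the class and field names are illustrative, not the PR's actual code (a real implementation would also implement InferenceResults):

```java
import java.io.IOException;

import org.elasticsearch.common.xcontent.ToXContentObject;
import org.elasticsearch.common.xcontent.XContentBuilder;

// Hypothetical inference result that renders itself as XContent instead of
// exposing a bespoke writeResultToMap method.
public class ExampleInferenceResults implements ToXContentObject {

    private final Object predictedValue;

    public ExampleInferenceResults(Object predictedValue) {
        this.predictedValue = predictedValue;
    }

    @Override
    public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
        builder.startObject();
        builder.field("predicted_value", predictedValue);
        builder.endObject();
        return builder;
    }
}
```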
The map is converted to DocumentFields; we couldn't do that with ToXContentObject.
Cool :D
```java
try {
    listener.onResponse(trainedModelDefinition.infer(fields, config));
} catch (Exception e) {
    listener.onFailure(e);
}

public InferenceResults infer(Map<String, Object> fields, InferenceConfig config) {
```
I think this is backwards. We should have the synchronous method call the asynchronous method; this is the prevalent pattern everywhere else. Also, it is possible to make something asynchronous synchronous, but not really the other way around.
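A minimal sketch of that pattern, assuming an asynchronous overload of infer that takes an ActionListener (illustrative, not the PR's actual code):

```java
import java.util.Map;

import org.elasticsearch.action.support.PlainActionFuture;

// The synchronous variant delegates to the asynchronous one and blocks for
// the result, rather than the async variant wrapping a synchronous call.
public InferenceResults infer(Map<String, Object> fields, InferenceConfig config) {
    PlainActionFuture<InferenceResults> future = PlainActionFuture.newFuture();
    infer(fields, config, future); // assumed async overload taking an ActionListener
    return future.actionGet();     // block the calling thread until the result arrives
}
```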
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is backwards. The function called by LocalModel::infer is TrainedModelDefinition::infer, which does not have an async version. In this case we want the work to be done in the calling thread because the model is local to the call; for single-threaded models I can't think of a situation where we would want to spawn another thread to do the work, as we know inference is cheap. For models that could be parallelised, with the work split over multiple threads, then yes, you would want to make it async.

Do we even need the async method right now? It is only called by TransportInternalInferModelAction and could easily be changed.

I removed the default method because it is backwards and wrong, then implemented it in LocalModel.
> Do we even need the async method right now?

Maybe not, but it is much more difficult to make things asynchronous once they are synchronous. Assumptions are made about where the model lives when it is synchronous. What if this was a natively loaded model? Would we pause the calling thread for the data to be serialized down to the native process? I am not sure about this, but I did not want to paint us into a corner.
```java
modelLoadingService.get().getModel(infBuilder.getModelId(), listener);

try {
    // Eeek blocking on a latch we can't be doing that
```
Agreed :D. We may want this call to fail if the model is not deployed in the provided model service, especially since there is no way to load it just in time :/
Adds a FetchSubPhase which adds a new field to the search hits, containing the result of model inference performed on the hit. There isn't a direct way of configuring FetchSubPhases, so SearchExtSpec is used for the purpose.
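A rough sketch of the shape such a sub phase might take; the class name InferencePhase appears in this PR, but the ext name "inference" and the body details here are illustrative assumptions:

```java
import java.io.IOException;

import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.fetch.FetchSubPhase;
import org.elasticsearch.search.internal.SearchContext;

// Hypothetical fetch sub phase running model inference on each hit.
public class InferencePhase implements FetchSubPhase {

    @Override
    public void hitsExecute(SearchContext context, SearchHit[] hits) throws IOException {
        // The sub phase is configured through the request's "ext" section;
        // "inference" is an assumed name for the SearchExtSpec.
        if (context.getSearchExt("inference") == null) {
            return; // no inference requested for this search
        }
        for (SearchHit hit : hits) {
            // ... extract features from the hit, run the model, and attach
            // the inference result to the hit as a document field
        }
    }
}
```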
Why Here
Search hits can be modified at fetch time, with new fields added. Fetch sub phases run on the data node, so additional features used by the model can be extracted from Lucene.
Configuration
There isn't a direct way of configuring FetchSubPhases, so I have commandeered SearchExtSpec for the purpose. The ext spec is accessible via the SearchContext passed to the fetch sub phase. Parsed SearchExtSpecs come under the "ext" field of the search request, forcing a rather clunky nested config upon us.
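A sketch of the shape, with illustrative option names (the real parameter names may differ):

```json
GET my-index/_search
{
  "query": {
    "match_all": {}
  },
  "ext": {
    "inference": {
      "model_id": "my-model",
      "target_field": "my_inference_result"
    }
  }
}
```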
The usual config options apply.
Modifying the Search Hit
The goal is to append a field to each search hit with the inference result. I see 2 options for doing so:

1. Modify the hit's _source, adding the inference result to it.
2. Add the result as a DocumentField. The new field will appear under the fields section of the search hit as if it had been asked for in the search request via docvalue_fields.

I've opted for the 2nd choice, as modifying the source seems a little underhand. Again this is awkward, putting the result where we would expect doc value fields; depending on the outcome of #49028 the future may offer another way to add fields to the search hit. A sketch of a resulting hit is below.
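For illustration, a hit might then come back looking something like this (field names assumed, matching the hypothetical config above):

```json
{
  "_index": "my-index",
  "_id": "1",
  "_score": 1.0,
  "_source": {
    "message": "some document text"
  },
  "fields": {
    "my_inference_result": [
      {
        "predicted_value": "class_a"
      }
    ]
  }
}
```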
The Problem
The InferencePhase class has access to the ModelLoadingService, which neatly deals with the model caching problem, but there is still a blocking call to load the model (which may or may not be cached) the first time InferencePhase.hitsExecute(SearchContext, SearchHit[]) is called.
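The blocking pattern in question looks roughly like this; a sketch only, assuming a getModel(modelId, listener) method as in the diff above, a modelLoadingService field, and Model standing in for the loaded model type:

```java
import java.util.concurrent.CountDownLatch;

import org.apache.lucene.util.SetOnce;
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.LatchedActionListener;

// Sketch of a synchronous model load inside a fetch sub phase: the async
// load is wrapped in a latch that the search thread then waits on. This is
// the blocking behaviour called out above as a problem.
private Model loadModelBlocking(String modelId) throws Exception {
    SetOnce<Model> model = new SetOnce<>();
    SetOnce<Exception> failure = new SetOnce<>();
    CountDownLatch latch = new CountDownLatch(1);
    modelLoadingService.getModel(modelId, new LatchedActionListener<>(
        ActionListener.wrap(model::set, failure::set), latch));
    latch.await(); // a search thread blocked on model loading
    if (failure.get() != null) {
        throw failure.get();
    }
    return model.get();
}
```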
Wish List
Why Here (Reprise)
Executing locally on the data node has the advantage of being close to any shard-level features we want to extract and use in inference. But it now occurs to me that those features could be extracted in a fetch sub phase and returned with the hit; inference would then run on the coordinating node and the blocking call to load the model could be dropped.
This PR is raised against the feature branch feature/search-inference.