
Enrich documents with inference results at Fetch #53230

Merged
merged 8 commits into from
Mar 11, 2020

Conversation

davidkyle
Member

Why here

Search hits can be modified at fetch with new fields added. Fetch sub phases run on the data node, so additional features used by the model can be extracted from Lucene.

Configuration

There isn't a direct way of configuring FetchSubPhases, so I have commandeered SearchExtSpec for the purpose. The ext spec is accessible via the SearchContext passed to the fetch sub phase. As parsed here, SearchExtSpecs come under the "ext" field, forcing this rather clunky nested config on us (a registration sketch follows the example):

    "query": { },
    "ext": {
        "ml_inference" : {
            "model_id" : "an_inference_model",
            "target_field" : "ml_results_field",
            "field_mappings": {"doc_field_name": "name_expected_by_model"},
            "inference_config": {
                "classification": {
                      "num_top_classes": 1
                }
            }
        }
    }

The usual config options apply.
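To make the plumbing concrete, here is a minimal sketch of how a plugin can register such an ext section and the fetch sub phase, assuming the SearchPlugin/SearchExtSpec APIs of the time; InferenceSearchExtBuilder and InferencePhase are illustrative names, not necessarily the classes in this PR.

// Sketch only: registering an "ext" section and a fetch sub phase in a SearchPlugin.
// InferenceSearchExtBuilder and InferencePhase are illustrative stand-ins.
import java.util.Collections;
import java.util.List;

import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.plugins.SearchPlugin;
import org.elasticsearch.search.fetch.FetchSubPhase;

public class InferenceSearchPlugin extends Plugin implements SearchPlugin {

    @Override
    public List<SearchExtSpec<?>> getSearchExts() {
        return Collections.singletonList(new SearchExtSpec<>(
            "ml_inference",                          // the key under "ext"
            InferenceSearchExtBuilder::new,          // stream reader
            InferenceSearchExtBuilder::fromXContent  // parser for the request body
        ));
    }

    @Override
    public List<FetchSubPhase> getFetchSubPhases(FetchPhaseConstructionContext context) {
        // The sub phase later reads the parsed config back with
        // searchContext.getSearchExt("ml_inference").
        return Collections.singletonList(new InferencePhase());
    }
}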

Modifying the Search Hit

The goal is to append a field to each search hit with the inference result. I see 2 options for doing so:

  1. Modify the document source
  2. Add a DocumentField. The new field will appear under the fields section of the search hit as if it had been asked for in the search request via docvalue_fields

I've opted for the 2nd choice, as modifying the source seems a little underhand. Again this is awkward, putting the result where we would expect doc value fields; depending on the outcome of #49028, the future may offer another way to add fields to the search hit. (A sketch of how the field is attached follows the example response below.)

"hits" : [
 {
   "_index" : "store",
   "_id" : "NIxgsHABziobK4pvOW_2",
   "_score" : 0.52354836,
   "_source" : {
        ...
   },
   "fields" : {           <-- Usually doc value fields
     "ml_results_field" : [
       {
         "top_classes" : [
           {
             "class_name" : "hot dog",
             "class_probability" : 0.99,
             "class_score" : 1.0
           }
         ],
         "predicted_value" : "hot dog"
       }
     ]
   }
 },
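For illustration, a rough sketch of a hitsExecute implementation that attaches the result as a DocumentField; the helper methods (getOrLoadModel, mapFields) and the exact SearchHit mutator are assumptions rather than the PR's actual code.

// Sketch only: attach the inference result to each hit as a DocumentField so it
// surfaces under "fields" in the response. Helper names are illustrative.
@Override
public void hitsExecute(SearchContext context, SearchHit[] hits) {
    InferenceSearchExtBuilder ext =
        (InferenceSearchExtBuilder) context.getSearchExt("ml_inference");
    if (ext == null) {
        return; // inference was not requested for this search
    }
    Model model = getOrLoadModel(ext.getModelId()); // may block on first use, see "The Problem" below
    for (SearchHit hit : hits) {
        // Rename document fields to the names the model expects, per "field_mappings"
        Map<String, Object> modelInput = mapFields(hit.getSourceAsMap(), ext.getFieldMappings());
        InferenceResults results = model.infer(modelInput, ext.getInferenceConfig());
        Map<String, Object> resultMap = results.writeResultToMap(ext.getTargetField());
        // setDocumentField is assumed here; the real code may populate the fields map directly
        hit.setDocumentField(ext.getTargetField(),
            new DocumentField(ext.getTargetField(), Collections.singletonList(resultMap)));
    }
}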

The Problem

The InferencePhase class has access to the ModelLoadingService, which neatly deals with the model caching problem, but there is still a blocking call to load the model (which may or may not be cached) the first time InferencePhase.hitsExecute(SearchContext, SearchHit[]) is called.

Wish List

  • A way to configure the fetch phase without nesting it inside the "ext" field

Why Here (Reprise)

Executing locally on the data node has the advantage of being close to any shard-level features we want to extract and use in inference. But it now occurs to me that those features could be extracted in a fetch sub phase and returned with the hit. Inference would then run on the coordinating node and the blocking call to load the model could be dropped.

This PR is raised against the feature branch feature/search-inference

@davidkyle davidkyle added :Search/Search Search-related issues that do not fall into other categories :ml Machine learning labels Mar 6, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-search (:Search/Search)

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

@davidkyle davidkyle requested a review from jtibshirani March 6, 2020 15:51
@benwtrent
Member

but there is still a blocking call to load the model (which may or may not be cached) the first time

I think it would be beneficial to have a way to "deploy" models onto nodes. The downside is that deployment vs. access race conditions would probably still result in synchronous loading or something similar.

@jtibshirani
Contributor

jtibshirani commented Mar 6, 2020

I haven't taken a close look at the code yet, but have some high-level comments first.

Again this is awkward putting the result where we would expect doc value fields

I actually don't find this awkward, since the fields section in the response contains not only the result of docvalue_fields, but also the result of other request fields like script_fields and stored_fields.

Building on this observation, I wanted to share another option for the API. We could model the inference as a new section similar to script fields, perhaps called ml_inference_fields:

"query": { ... },
"ext": {
  "ml_inference_fields": {
    "ml_results_field": {
      "model_id" : "an_inference_model",
      "field_mappings": {"doc_field_name": "name_expected_by_model"},
      "inference_config": {
        "classification": {
          "num_top_classes": 1
        }
      }
    }
  }
}

Note that the target_field parameter is no longer present; instead we are defining a new 'dynamic' field named ml_results_field. I'm curious what you think of this option, given your knowledge of the inference API and where it's headed.

A way to configure the fetch phase without nesting it inside the "ext" field

Ack, will give this some thought. I think my API suggestion above would also look much more natural without being nested under ext.

Inference would then run on the coordinating node and the blocking call to load the model could be dropped.

We've been quite unsure if it's better to run inference on the coordinating vs. data nodes. Is there a known set of investigations/discussions we need to complete to reach clarity on this decision? Perhaps this would involve determining the types of models we want to support in a v1, and thinking through @benwtrent's idea about model deployment? It would be nice to have this list somewhere (I'm happy to move this conversation to an issue/design doc to not make the PR too noisy).

@davidkyle
Member Author

"query": { ... },
"ext": {
  "ml_inference_fields": {
    "ml_results_field": {
      "model_id" : "an_inference_model",
      "field_mappings": {"doc_field_name": "name_expected_by_model"},
      "inference_config": {
        "classification": {
          "num_top_classes": 1
        }
      }
    }
  }
}

Note that the target_field parameter is no longer present

I like this syntax: it looks like aggregations and fits better with the query DSL. The only problem is that the inference processor is configured differently, and it would be obtuse to make it so the config cannot be copied and pasted between the two.

But we should consider the option.

@davidkyle davidkyle changed the base branch from feature/search-inference to master March 9, 2020 12:26
@davidkyle davidkyle changed the base branch from master to feature/search-inference March 9, 2020 12:26
@davidkyle
Member Author

run elasticsearch-ci/2
run elasticsearch-ci/bwc
run elasticsearch-ci/default-distro

@benwtrent benwtrent self-requested a review March 10, 2020 16:27
public interface InferenceResults extends NamedWriteable {

    void writeResult(IngestDocument document, String parentResultField);

    Map<String, Object> writeResultToMap(String parentResultField);
Member

I wonder if we should implement ToXContentObject instead of having a bespoke method that creates a map.

Member Author

The map is converted to DocumentFields; we couldn't do that with ToXContentObject.

Member

Cool :D

    try {
        listener.onResponse(trainedModelDefinition.infer(fields, config));
    } catch (Exception e) {
        listener.onFailure(e);
    }

public InferenceResults infer(Map<String, Object> fields, InferenceConfig config) {
Member

I think this is backwards.

We should have the synchronous method call the asynchronous method. This is the prevalent pattern everywhere else. Also, it is possible to make something asynchronous -> synchronous, not really the other way around.

Member Author

Yes, it is backwards.

The function called by LocalModel::infer is TrainedModelDefinition::infer, which does not have an async version. In this case we want the work to be done in the calling thread because the model is local to the call; for single-threaded models I can't think of a situation where we would want to spawn another thread to do the work, as we know inference is cheap. For models that could be parallelised, with the work split over multiple threads, then yes, you would want to make it async.

Do we even need the async method right now? It is only called by TransportInternalInferModelAction and could easily be changed.

I removed the default method, because it is backwards and wrong, and implemented it in LocalModel instead.

Member

Do we even need the async method right now?

Maybe not, but it is much more difficult to make things asynchronous once they are synchronous.

Assumptions are made about where the model lives when it is synchronous.

What if this was a natively loaded model?
Would we pause the calling thread for the data to be serialized down to the native process?

I am not sure about this, but I did not want to paint us into a corner.
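
For reference, the pattern being advocated here (the synchronous method delegating to the asynchronous one) could look roughly like this; a sketch only, using PlainActionFuture, with illustrative class and field names rather than the PR's exact code.

// Sketch only: the async infer is the primary implementation and the synchronous
// variant wraps it with a PlainActionFuture and blocks.
import java.util.Map;

import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.support.PlainActionFuture;

public class LocalModel {

    private final TrainedModelDefinition trainedModelDefinition;

    public LocalModel(TrainedModelDefinition trainedModelDefinition) {
        this.trainedModelDefinition = trainedModelDefinition;
    }

    // Primary, asynchronous entry point.
    public void infer(Map<String, Object> fields, InferenceConfig config,
                      ActionListener<InferenceResults> listener) {
        try {
            listener.onResponse(trainedModelDefinition.infer(fields, config));
        } catch (Exception e) {
            listener.onFailure(e);
        }
    }

    // Synchronous convenience wrapper: delegate to the async method and wait.
    public InferenceResults infer(Map<String, Object> fields, InferenceConfig config) {
        PlainActionFuture<InferenceResults> future = PlainActionFuture.newFuture();
        infer(fields, config, future);
        return future.actionGet();
    }
}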


modelLoadingService.get().getModel(infBuilder.getModelId(), listener);
try {
    // Eeek blocking on a latch we can't be doing that
Member

Agreed :D. We may want this call to fail if the model is not deployed in the provided model service, especially since there is no way to load it just in time :/
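
As an illustration of the fail-fast behaviour suggested above, the fetch sub phase could refuse to run if the model isn't already loaded on the node; getModelIfCached is a hypothetical method, not an existing ModelLoadingService API (the getModel used in this PR is asynchronous).

// Sketch only: fail the search instead of blocking to load the model just in time.
// getModelIfCached is hypothetical and stands in for some synchronous cache lookup.
Model model = modelLoadingService.get().getModelIfCached(infBuilder.getModelId());
if (model == null) {
    throw new ElasticsearchStatusException(
        "model [{}] is not loaded on this node; deploy it before searching",
        RestStatus.CONFLICT, infBuilder.getModelId());
}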

@davidkyle davidkyle merged commit 54fb29f into elastic:feature/search-inference Mar 11, 2020
@davidkyle davidkyle deleted the fetch-inference branch March 11, 2020 10:15
davidkyle added a commit that referenced this pull request Mar 13, 2020
Adds a FetchSubPhase which adds a new field to the search hits with the result of the model 
inference performed on the hit. There isn't a direct way of configuring FetchSubPhases so 
SearchExtSpec is used for the purpose.