[FEATURE] Separate Local Model Registration to support Custom, Pretrained and Sparse Encoding Models #377

Closed
joshpalis opened this issue Jan 5, 2024 · 1 comment
Labels: enhancement, v2.12.0

Comments

@joshpalis
Member

joshpalis commented Jan 5, 2024

Is your feature request related to a problem?

Integration tests have been failing intermittently because of the memory limitations of the GitHub Actions runners that run them (Reference). Mitigating this requires using a smaller model to test local model registration within an integration test cluster.

The amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1 model has been determined to be the smallest model we can use for testing (Documentation). However, the required fields for registering a sparse encoding model are different from the required fields for registering a custom text embedding model.

To replace the model used in integration testing with this sparse encoding model, the RegisterLocalModelStep must be separated into a RegisterCustomLocalModelStep and a RegisterSparseEncodingLocalModelStep. Additionally, we require support for registering an OpenSearch-provided pretrained model, which does not require a URL.

What solution would you like?

The RegisterCustomLocalModelStep relates to the following documentation and will have the following required and optional fields:

Required keys:

  • name
  • version
  • model_format
  • function_name
  • model_content_hash_value
  • url
  • model_type
  • embedding_dimension
  • framework_type

Optional keys:

  • description
  • model_group_id
  • all_config
  • deploy

The RegisterSparseEncodingModelStep relates to the following documentation and will have the following required and optional fields:

Required keys:

  • name
  • version
  • model_format
  • function_name
  • model_content_hash_value
  • url

Optional keys:

  • description
  • model_group_id
  • deploy

The RegisterLocalPretrainedModelStep relates to the following documentation and will have the following required and optional fields:

Required keys:

  • name
  • version
  • model_format

Optional keys:

  • description
  • model_group_id
  • deploy
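
To make the difference between the three required-key lists above easier to see, here is a small sketch that computes which keys each step requires beyond the next simpler one. The class and method names (RequiredKeyDiffSketch, extraKeys) are hypothetical and exist only for this illustration; the key lists are copied from the lists above.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustration only: this class is hypothetical and not part of the plugin.
class RequiredKeyDiffSketch {

    // Required keys as listed in this issue for each proposed step.
    static final List<String> PRETRAINED = List.of("name", "version", "model_format");

    static final List<String> SPARSE_ENCODING = List.of(
        "name", "version", "model_format",
        "function_name", "model_content_hash_value", "url");

    static final List<String> CUSTOM = List.of(
        "name", "version", "model_format",
        "function_name", "model_content_hash_value", "url",
        "model_type", "embedding_dimension", "framework_type");

    // Returns the keys present in `larger` but not in `smaller`, sorted alphabetically.
    static Set<String> extraKeys(List<String> larger, List<String> smaller) {
        Set<String> extra = new TreeSet<>(larger);
        extra.removeAll(smaller);
        return extra;
    }

    public static void main(String[] args) {
        // Keys a sparse encoding model needs on top of a pretrained model.
        System.out.println(extraKeys(SPARSE_ENCODING, PRETRAINED));
        // Keys a custom model needs on top of a sparse encoding model.
        System.out.println(extraKeys(CUSTOM, SPARSE_ENCODING));
    }
}
```

Read this way, the pretrained step needs only the three identifying keys, the sparse encoding step adds the function name, content hash and URL, and the custom step adds the three model-configuration keys on top of that.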
@dbwiddis
Member

dbwiddis commented Jan 6, 2024

Suggestion:

  • Turn the existing RegisterLocalModelStep into an abstract parent step.
  • Add abstract methods to get the required and optional keys.
  • Create 3 subclasses implementing those required/optional key methods, each with its own NAME.
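
A minimal sketch of what this suggestion could look like, assuming nothing about the plugin's actual step interface. The parent class name AbstractRegisterLocalModelStep, the NAME values, the missingRequiredKeys helper and the sample input values are all hypothetical placeholders; the subclass names and key lists are taken from the issue description. The real steps would also carry the registration logic, which is omitted here.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustration only: the wrapper class and all NAME values are hypothetical.
class LocalModelStepSketch {

    // Hypothetical abstract parent holding logic shared by all local model steps.
    static abstract class AbstractRegisterLocalModelStep {
        abstract String getName();
        abstract Set<String> getRequiredKeys();
        abstract Set<String> getOptionalKeys();

        // Returns the required keys absent from the supplied inputs, sorted.
        Set<String> missingRequiredKeys(Map<String, Object> inputs) {
            Set<String> missing = new TreeSet<>(getRequiredKeys());
            missing.removeAll(inputs.keySet());
            return missing;
        }
    }

    static class RegisterCustomLocalModelStep extends AbstractRegisterLocalModelStep {
        static final String NAME = "register_custom_local_model"; // placeholder value
        String getName() { return NAME; }
        Set<String> getRequiredKeys() {
            return Set.of("name", "version", "model_format", "function_name",
                "model_content_hash_value", "url", "model_type",
                "embedding_dimension", "framework_type");
        }
        Set<String> getOptionalKeys() {
            return Set.of("description", "model_group_id", "all_config", "deploy");
        }
    }

    static class RegisterSparseEncodingModelStep extends AbstractRegisterLocalModelStep {
        static final String NAME = "register_sparse_encoding_model"; // placeholder value
        String getName() { return NAME; }
        Set<String> getRequiredKeys() {
            return Set.of("name", "version", "model_format", "function_name",
                "model_content_hash_value", "url");
        }
        Set<String> getOptionalKeys() {
            return Set.of("description", "model_group_id", "deploy");
        }
    }

    static class RegisterLocalPretrainedModelStep extends AbstractRegisterLocalModelStep {
        static final String NAME = "register_local_pretrained_model"; // placeholder value
        String getName() { return NAME; }
        Set<String> getRequiredKeys() {
            return Set.of("name", "version", "model_format");
        }
        Set<String> getOptionalKeys() {
            return Set.of("description", "model_group_id", "deploy");
        }
    }

    public static void main(String[] args) {
        // Placeholder inputs carrying only the three keys every step requires.
        Map<String, Object> inputs =
            Map.of("name", "example-model", "version", "1.0.0", "model_format", "TORCH_SCRIPT");
        for (AbstractRegisterLocalModelStep step : new AbstractRegisterLocalModelStep[] {
                new RegisterCustomLocalModelStep(),
                new RegisterSparseEncodingModelStep(),
                new RegisterLocalPretrainedModelStep() }) {
            System.out.println(step.getName() + " is missing " + step.missingRequiredKeys(inputs));
        }
    }
}
```

With this shape, validation lives once in the parent and each subclass only declares its keys and its NAME, so adding a fourth model type would mean adding one small class.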

@dbwiddis dbwiddis added v2.12.0 and removed untriaged labels Jan 7, 2024
@joshpalis joshpalis changed the title [FEATURE] Separate Local Model Registration to support Custom and Sparse Encoding Models [FEATURE] Separate Local Model Registration to support Custom, Pretrained and Sparse Encoding Models Jan 8, 2024