[ML] Add embedding_size to text embedding config #95176

Merged 3 commits on Apr 17, 2023

davidkyle
Member

Adds an optional field, embedding_size, to the text embedding config for NLP models. The field should be set at model creation and cannot be modified later. If defined, embedding_size should be used to set the number of dimensions for the dense_vector field mapping in which the embedding will be indexed.
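
To make the relationship between embedding_size and the dense_vector mapping concrete, here is a minimal sketch, not code from this PR. The class DenseVectorMappingSketch, the method mappingFor, and the field name text_embedding are hypothetical; the only detail taken from Elasticsearch itself is that a dense_vector mapping declares its dimension count in dims.

```java
// Hypothetical helper, not part of Elasticsearch: renders a create-index body
// whose dense_vector field uses the model's embedding_size as `dims`.
final class DenseVectorMappingSketch {

    // fieldName is inserted verbatim (no JSON escaping), so this sketch
    // assumes a plain field name such as "text_embedding".
    static String mappingFor(String fieldName, int embeddingSize) {
        if (embeddingSize <= 0) {
            throw new IllegalArgumentException(
                "embedding size must be positive, got " + embeddingSize);
        }
        return "{\"mappings\":{\"properties\":{\"" + fieldName + "\":"
            + "{\"type\":\"dense_vector\",\"dims\":" + embeddingSize + "}}}}";
    }

    public static void main(String[] args) {
        // 384 is an arbitrary example size, not a value taken from the PR.
        System.out.println(mappingFor("text_embedding", 384));
    }
}
```

A client that reads embedding_size from the model config could pass it straight to such a helper instead of hard-coding the dimension count in the index mapping.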

@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label Apr 12, 2023
@elasticsearchmachine
Collaborator

Hi @davidkyle, I've created a changelog YAML for you.

@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

) {
    this.vocabularyConfig = Optional.ofNullable(vocabularyConfig)
        .orElse(new VocabularyConfig(InferenceIndexConstants.nativeDefinitionStore()));
    this.tokenization = tokenization == null ? Tokenization.createDefault() : tokenization;
    this.resultsField = resultsField;
    this.embeddingSize = embeddingSize;
Contributor

Since this is serialized as strictly non-negative, there should be a check in the public constructor that a negative number hasn't been supplied.

Member Author

👍 I've added that check to the constructor; it makes no sense for the size to be <= 0.

commit: 1fbbdde
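
The check agreed on in this thread could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from commit 1fbbdde: TextEmbeddingConfigSketch is a hypothetical stand-in that models only the embedding size, and the exception type and message wording are guesses.

```java
// Hypothetical stand-in for the config class discussed above; only the
// embedding size handling is modelled.
final class TextEmbeddingConfigSketch {

    // null means the optional embedding_size field was not set.
    private final Integer embeddingSize;

    TextEmbeddingConfigSketch(Integer embeddingSize) {
        // Reject zero and negative sizes up front so that an invalid value
        // can never reach serialization.
        if (embeddingSize != null && embeddingSize <= 0) {
            throw new IllegalArgumentException(
                "[embedding_size] must be greater than 0; got [" + embeddingSize + "]");
        }
        this.embeddingSize = embeddingSize;
    }

    Integer getEmbeddingSize() {
        return embeddingSize;
    }

    public static void main(String[] args) {
        System.out.println(new TextEmbeddingConfigSketch(384).getEmbeddingSize());
        try {
            new TextEmbeddingConfigSketch(-1);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Validating in the constructor rather than at write time means a bad value fails where the config is built, which is the behaviour the review comment asks for.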

Contributor

droberts195 left a comment

LGTM

Labels
>enhancement, :ml (Machine learning), Team:ML (Meta label for the ML team), v8.8.0
3 participants