Is your feature request related to a problem?
We released the ml-commons plugin in OpenSearch 1.3. It supports model training and prediction. ML models generally consume more resources, especially during training. The community wants support for bigger ML models, which may require more resources and special hardware like GPUs.
As OpenSearch doesn't support an ML node role, we dispatch ML tasks to data nodes only. That means if a user wants to train a large model, they need to scale up all data nodes, which can be costly. ML tasks also consume shared resources on data nodes, which may impact the core search/indexing functions.
What solution would you like?
Support a dedicated ML node so that users don't need to scale up their data nodes at all. Instead, they can configure a new ML node (with different settings and a more powerful instance type) and add it to the cluster via the YAML file (requires a cluster restart). By doing so, users can separate resource usage better by running ML tasks on dedicated nodes, reducing the impact on other critical tasks like search/ingestion.
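A minimal sketch of what the YAML configuration for such a node might look like. This assumes the `ml` role name and reuses the standard `node.roles` setting; the exact setting accepted once dynamic node roles land may differ.

```yaml
# opensearch.yml on the dedicated ML node (illustrative; requires the
# dynamic-node-role change in core so a non-built-in role is accepted)
node.name: ml-node-1
node.roles: [ ml ]
```

Because node roles are read from the YAML file at startup, adding such a node to an existing cluster requires a restart of that node, as noted above.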
OpenSearch core checks the node role when a node starts. If the role is not a built-in role like the data role, it throws an exception and the node can't start. To support a dedicated ML node, we have to remove this limitation in OpenSearch core. That is done in this PR, which adds dynamic node role support: opensearch-project/OpenSearch#3436.
With that, we can enhance the ml-commons code to dispatch tasks to ml nodes first, falling back to data nodes if no ml nodes exist.
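The dispatch-with-fallback idea above could be sketched as follows. This is a hypothetical illustration, not the actual ml-commons API; the `Node` record, the `eligibleNodes` method, and the role strings are all assumptions made for the example.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: prefer nodes with the "ml" role when dispatching an
// ML task, and fall back to data nodes when no ml nodes are present.
public class MlTaskDispatcher {
    // Simplified stand-in for a cluster node and its roles.
    public record Node(String id, List<String> roles) {}

    public static List<Node> eligibleNodes(List<Node> clusterNodes) {
        List<Node> mlNodes = clusterNodes.stream()
                .filter(n -> n.roles().contains("ml"))
                .collect(Collectors.toList());
        if (!mlNodes.isEmpty()) {
            return mlNodes; // dedicated ML nodes take priority
        }
        // Fallback: dispatch to data nodes as ml-commons does today.
        return clusterNodes.stream()
                .filter(n -> n.roles().contains("data"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Node> cluster = List.of(
                new Node("n1", List.of("data")),
                new Node("n2", List.of("ml")));
        // With an ml node present, only it is eligible.
        System.out.println(eligibleNodes(cluster).get(0).id()); // n2
    }
}
```

The real task dispatcher would also need to consider node load and task type, but the priority order (ml nodes first, then data nodes) is the core of the proposal.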
Do you have any additional context?
Original Proposal