ARCHITECTURES_2_TASK is limiting the tasks able to be deployed with HF DLC #112
Hello,
@philschmid Is there a reason why we do not support all the tasks?
ack, so will this eventually be supported via
We might work on that in the future, but for now the simplest is to create an inference.py, see here https://www.philschmid.de/custom-inference-huggingface-sagemaker
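The linked post walks through the full setup. As a hedged sketch of what such a script looks like: the toolkit documents `model_fn`/`predict_fn` as override hooks in `code/inference.py`, but the embedding model and payload keys below are illustrative assumptions, not the toolkit's fixed schema.

```python
# inference.py -- sketch of a custom handler for the SageMaker HF DLC.
# The toolkit looks for these hook names in code/inference.py; the
# sentence-similarity model and payload shape are illustrative.

def model_fn(model_dir):
    """Called once at container start-up to load the model."""
    # Import inside the hook so the module can be imported even where
    # the (container-only) dependency is not installed.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_dir)

def predict_fn(data, model):
    """Called per request with the already-deserialized payload."""
    sentences = data.get("inputs", [])
    embeddings = model.encode(sentences)
    return {"vectors": embeddings.tolist()}
```

Because the deserialization and serialization defaults are kept, the customer only overrides the two hooks that matter for an unsupported task.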
@philschmid there are a number of multimodal tasks missing here (object detection, text-to-speech, audio classification, etc.). In addition, the text2text-generation task provided in the map doesn't have schemas in hf/tasks (huggingface/huggingface.js@b70ac6c). Is there a way to extend this map further for new use cases? This would also save customers the effort of creating the inference.py for these formats.
Where is "here"?
What map?
We're basically trying to extend inference capabilities by storing/generating additional metadata for a larger set of task types. Today we use the pipeline tag/task as a core filter of the selection logic, under the assumption that the pySDK would be compatible with all transformers task types. The explicit types defined in ARCHITECTURES_2_TASK are blocking us currently, and my questions above are about possibilities to extend it without using the inference.py route. There is a proposed fix in this issue, but given the lack of transformers support, I am not sure whether making the fix will help.
@samruds To address this gap, we should provide InferenceSpec support for the HF DLC in ModelBuilder. That way a customer can provide custom inference script logic. The workaround today is to treat this as a BYOC ModelBuilder scenario, doing something like this:
So error handling on our end should be: in ModelBuilder
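The snippet referenced above is not included in this thread. As a hedged sketch of the shape being described: an InferenceSpec-style object exposes `load()` and `invoke()` hooks. In the real SDK this would subclass `InferenceSpec` from `sagemaker.serve.spec.inference_spec` and be passed to `ModelBuilder(inference_spec=...)`; the class below is written without the SDK import so the shape is clear, and the model/payload details are illustrative.

```python
# Sketch of the InferenceSpec shape for ModelBuilder. Illustrative only:
# in practice this subclasses InferenceSpec from
# sagemaker.serve.spec.inference_spec and is handed to
# ModelBuilder(model_path=..., inference_spec=...).

class SentenceSimilaritySpec:
    def load(self, model_dir):
        # Runs in the container at start-up; returns the model object
        # that is later handed back to invoke().
        from sentence_transformers import SentenceTransformer  # container-only dep
        return SentenceTransformer(model_dir)

    def invoke(self, input_object, model):
        # Runs per request; input_object is the deserialized payload.
        sentences = input_object.get("inputs", [])
        return {"vectors": model.encode(sentences).tolist()}
```

This sidesteps the ARCHITECTURES_2_TASK lookup entirely, since the customer's own `invoke()` decides how the model is called.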
There are two things here
Customer has two options here
Issue:
Customer is deploying HF ID BAAI/bge-m3 with Task: sentence-similarity, using:
But getting the following error from within the HF DLC:
Root Cause:
The ARCHITECTURES_2_TASK mapping is too constraining and does not include all admissible pipeline tasks: sagemaker-huggingface-inference-toolkit/src/sagemaker_huggingface_inference_toolkit/transformers_utils.py (line 79 in 80634b3)
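For context on why BAAI/bge-m3 falls through: the toolkit suffix-matches the architecture name from the model's config.json against a fixed table. The excerpt below is illustrative, not the verbatim table from transformers_utils.py (see the permalink above for that), and `infer_task` is a simplified stand-in for the toolkit's lookup.

```python
# Illustrative sketch of an ARCHITECTURES_2_TASK-style lookup: the
# architecture string from config.json is suffix-matched against a
# fixed table. Entries here are examples, not the full real list.
ARCHITECTURES_2_TASK = {
    "ForQuestionAnswering": "question-answering",
    "ForTokenClassification": "token-classification",
    "ForSequenceClassification": "text-classification",
    "ForMaskedLM": "fill-mask",
    "ForCausalLM": "text-generation",
    "ForConditionalGeneration": "text2text-generation",
}

def infer_task(architecture: str) -> str:
    for suffix, task in ARCHITECTURES_2_TASK.items():
        if architecture.endswith(suffix):
            return task
    # Architectures outside the table (e.g. a plain *Model used for
    # sentence-similarity) fall through here and deployment fails.
    raise ValueError(f"Task couldn't be inferred from {architecture}")
```

Any task whose architecture name doesn't end in one of the mapped suffixes is rejected before the pipeline is ever constructed, which is the constraint this issue is about.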
All we do is pass the task to get_pipeline from within the handler_service.py, as far as I can see: sagemaker-huggingface-inference-toolkit/src/sagemaker_huggingface_inference_toolkit/handler_service.py (line 115 in 80634b3)
Shouldn't the logic be the following?
This way we allow for a best-effort deployment, a.k.a. a pass-through of the HF_TASK to get_pipeline(). If that fails, then so be it; we should propagate the right error messaging to the customer, stating that we tried our best and that xyz went wrong.
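The proposed control flow could be sketched like this. Everything below is a hedged sketch: `get_pipeline` and `infer_task_from_architecture` stand in for the toolkit's real functions, and the signatures are illustrative, not the toolkit's actual API.

```python
# Sketch of the proposed best-effort flow: if the customer supplied
# HF_TASK explicitly, pass it straight through to get_pipeline() and
# let pipeline creation itself validate it; only fall back to the
# architecture mapping when no task was given. Names are illustrative.

def resolve_task(env, infer_task_from_architecture, architecture):
    explicit_task = env.get("HF_TASK")
    if explicit_task:
        # Best-effort pass-through: do not pre-filter against the map.
        return explicit_task
    # No explicit task -> fall back to the architecture mapping.
    return infer_task_from_architecture(architecture)

def load_pipeline(env, get_pipeline, infer_task_from_architecture,
                  architecture, model_dir):
    task = resolve_task(env, infer_task_from_architecture, architecture)
    try:
        return get_pipeline(task=task, model_dir=model_dir)
    except Exception as exc:
        # Propagate a clear message instead of failing on the map lookup.
        raise RuntimeError(
            f"Tried to load task '{task}' for the model in {model_dir} "
            f"but pipeline creation failed: {exc}"
        ) from exc
```

The key change is that an explicit HF_TASK never hits the ARCHITECTURES_2_TASK gate at all; the error surface moves to pipeline construction, where transformers can report exactly what is unsupported.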