
ARCHITECTURES_2_TASK is limiting the tasks able to be deployed with HF DLC #112

Closed
gwang111 opened this issue Feb 13, 2024 · 11 comments

@gwang111

gwang111 commented Feb 13, 2024

Issue:

Customer is deploying HF ID: BAAI/bge-m3 with Task: sentence-similarity using:

model_builder = ModelBuilder(
    model="BAAI/bge-m3",
    schema_builder=SchemaBuilder(sample_input, sample_output),
    model_path=path, #local path where artifacts will be saved
    mode=Mode.LOCAL_CONTAINER,
    env_vars={
        "HF_TASK": "sentence-similarity"
    }
)

model_builder.deploy()

But getting the following from within the HF DLC:

ModelBuilder: DEBUG:     2024-02-13 00:05:57,567 [INFO ] W-BAAI__bge-m3-58-stdout 
com.amazonaws.ml.mms.wlm.WorkerLifeCycle - ValueError: Task couldn't be inferenced from XLMRobertaModel.Inference Toolkit 
can only inference tasks from architectures ending with ['TapasForQuestionAnswering', 'ForQuestionAnswering', 
'ForTokenClassification', 'ForSequenceClassification', 'ForMultipleChoice', 'ForMaskedLM', 'ForCausalLM', 
'ForConditionalGeneration', 'MTModel', 'EncoderDecoderModel', 'GPT2LMHeadModel', 'T5WithLMHeadModel'].Use env
`HF_TASK` to define your task.

Root Cause:

The ARCHITECTURES_2_TASK mapping is too constraining and does not include all admissible pipeline tasks.

As far as I can see, all we do is pass the task to get_pipeline from within handler_service.py:

hf_pipeline = get_pipeline(task=os.environ["HF_TASK"], model_dir=model_dir, device=self.device)

Shouldn't the logic be the following?

if "HF_TASK" provided -> set task to "HF_TASK"

else:
    fetch architecture from config.json
    if architecture is not in ARCHITECTURES_2_TASK -> throw error
    set task to mapped task

This way we allow for a best-effort deployment, i.e. a pass-through of HF_TASK to get_pipeline(). If that fails, so be it; we should propagate clear error messaging to the customer stating that we tried our best and exactly what went wrong.
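The proposed fallback could be sketched roughly like this (a hypothetical sketch, not the toolkit's actual code; `resolve_task` and the trimmed-down map are illustrative):

```python
import json
import os

# Trimmed-down illustration of the toolkit's map; the real one is larger.
ARCHITECTURES_2_TASK = {
    "ForQuestionAnswering": "question-answering",
    "ForCausalLM": "text-generation",
}

def resolve_task(model_dir: str) -> str:
    # 1. If HF_TASK is provided, pass it straight through to get_pipeline().
    task = os.environ.get("HF_TASK")
    if task:
        return task
    # 2. Otherwise fall back to inferring the task from config.json.
    with open(os.path.join(model_dir, "config.json")) as f:
        architecture = json.load(f)["architectures"][0]
    for suffix, mapped_task in ARCHITECTURES_2_TASK.items():
        if architecture.endswith(suffix):
            return mapped_task
    # 3. Only error out when neither route yields a task.
    raise ValueError(
        f"Task couldn't be inferred from {architecture}; "
        "use the HF_TASK env var to define your task."
    )
```

With this ordering, an explicit HF_TASK such as sentence-similarity would reach get_pipeline() even though XLMRobertaModel matches no suffix in the map.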

@philschmid
Collaborator

Hello,
The sentence-similarity task is sentence-transformers specific, which is not yet supported.

@mohanasudhan

@philschmid Is there a reason why we do not support all the tasks?

@gwang111
Author

Ack. So will this eventually be supported via get_pipeline? Or will it only be limited to:

from sentence_transformers import SentenceTransformer
model = SentenceTransformer("BAAI/bge-m3")

@philschmid
Collaborator

We might work on that in the future, but for now the simplest option is to create an inference.py; see https://www.philschmid.de/custom-inference-huggingface-sagemaker

@samruds

samruds commented Feb 14, 2024

@philschmid there are a number of multimodal tasks missing here (object detection, text to speech, audio classification, etc.).

In addition, the text2text-generation task provided in the map doesn't have schemas in hf/tasks:

huggingface/huggingface.js@b70ac6c

Is there a way to extend this map further for new use-cases? This will also save customer effort to create the inference.py for these formats.

@samruds samruds self-assigned this Feb 14, 2024
@philschmid
Collaborator

> @philschmid there are a number of multimodal tasks missing here (object detection, text to speech, audio classification etc.)

Where is "here"?

> Is there a way to extend this map further for new use-cases? This will also save customer effort to create the inference.py for these formats.

What map?

@samruds

samruds commented Feb 16, 2024

We're basically trying to extend inference capabilities by storing/generating additional metadata for a larger set of task types. Today we use the pipeline tag/task as a core filter of the selection logic, under the assumption that the pySDK would be compatible with all transformers task types. The explicit types defined in ARCHITECTURES_2_TASK are blocking us currently, and my questions above are about possibilities to extend it without using the inference.py route. There is a proposed fix in this issue, but given the lack of transformers support, I am not sure whether making the fix will help.

> @philschmid there are a number of multimodal tasks missing here (object detection, text to speech, audio classification etc.)
>
> Where is "here"?

ARCHITECTURES_2_TASK:

ARCHITECTURES_2_TASK = {
"TapasForQuestionAnswering": "table-question-answering",
"ForQuestionAnswering": "question-answering",
"ForTokenClassification": "token-classification",
"ForSequenceClassification": "text-classification",
"ForMultipleChoice": "multiple-choice",
"ForMaskedLM": "fill-mask",
"ForCausalLM": "text-generation",
"ForConditionalGeneration": "text2text-generation",
"MTModel": "text2text-generation",
"EncoderDecoderModel": "text2text-generation",
# Model specific task for backward comp
"GPT2LMHeadModel": "text-generation",
"T5WithLMHeadModel": "text2text-generation",
}
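For context, the toolkit resolves an architecture name from config.json against this map by suffix matching. A minimal sketch (`infer_task` is an illustrative name, not toolkit code) of why BAAI/bge-m3's XLMRobertaModel falls through:

```python
# Suffix-match lookup against a trimmed-down copy of the map above.
ARCHITECTURES_2_TASK = {
    "TapasForQuestionAnswering": "table-question-answering",
    "ForQuestionAnswering": "question-answering",
    "ForCausalLM": "text-generation",
    "ForConditionalGeneration": "text2text-generation",
}

def infer_task(architecture: str):
    for suffix, task in ARCHITECTURES_2_TASK.items():
        if architecture.endswith(suffix):
            return task
    return None  # no suffix matches -> the ValueError in the original report
```

An encoder checkpoint whose config.json lists a bare XLMRobertaModel matches none of the suffixes, which is exactly the failure reported at the top of this issue.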

> Is there a way to extend this map further for new use-cases? This will also save customer effort to create the inference.py for these formats.
>
> What map?

ARCHITECTURES_2_TASK

@philschmid
Collaborator

ARCHITECTURES_2_TASK is not actively used, since it's no longer always possible to derive the task from a model: the same "architecture" can have multiple tasks. That's why we always (in every example, guide, doc, etc.) set the TASK explicitly.

@gwang111
Author

gwang111 commented Mar 7, 2024

@samruds To address this gap, we should provide InferenceSpec support for the HF DLC in ModelBuilder. This way a customer can provide custom inference script logic.

The workaround today is to treat this as a BYOC ModelBuilder scenario, doing something like this:

class MySentenceTransformerModel(InferenceSpec):
    def load(self, model_dir: str):
        from sentence_transformers import SentenceTransformer, util
        model = SentenceTransformer("BAAI/bge-m3")
        return model

    def invoke(self, data: object, model: object):
        sentences = data["inputs"]

        embedding_1 = model.encode(sentences[0], convert_to_tensor=True)
        embedding_2 = model.encode(sentences[1], convert_to_tensor=True)

        similarity_score = util.pytorch_cos_sim(embedding_1, embedding_2)

        return {"score": similarity_score.numpy()}

sample_input = {
    "inputs": ["I'm happy", "I'm full of happiness"]
}

sample_output = {
    "score": [0.999]
}

mb = ModelBuilder(
    inference_spec=MySentenceTransformerModel(),
    schema_builder=SchemaBuilder(sample_input=sample_input, sample_output=sample_output),
    model_path="/home/ec2-user/SageMaker/test_dir",
    image_uri="763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-inference:2.1.0-transformers4.37.0-cpu-py310-ubuntu22.04"
)

my_model = mb.build(mode=Mode.LOCAL_CONTAINER)
my_model.deploy()

So error handling on our end should be:

In ModelBuilder

  • If TASK cannot be inferred or is not supported -> please provide InferenceSpec

@samruds

samruds commented Mar 7, 2024

There are two things here

  1. Tasks being restricted by old implementation - fix is out for that.
  2. Some transformers don't have tasks that have been exposed via a pipeline tag yet - the option you provided is one solution (another could be exposing them as pipeline tags on a release cycle). I think we would need a list of libraries that need custom inference in order to prioritize such an implementation. If custom inference is needed, is this the long-term plan for such tasks? This looks to be more a problem of supporting tasks at an extremely nascent stage of development. Maybe @philschmid can weigh in on the long-term plan.

@samruds

samruds commented Mar 13, 2024

The customer has two options here:

  1. Set the HF task via env variables.
  2. Provide an inference.py for non-task models like sentence similarity.
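For option 2, the score such a custom handler ultimately returns (as in the InferenceSpec workaround above) is just the cosine similarity between the two sentence embeddings. A dependency-free sketch of that computation, with toy vectors standing in for real model embeddings:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product of the vectors over the product of
    # their Euclidean norms; this is what util.pytorch_cos_sim computes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```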

@samruds samruds closed this as completed Mar 13, 2024