ARCHITECTURES_2_TASK is limiting the tasks able to be deployed with HF DLC #112
Hello,
@philschmid Is there a reason why we do not support all the tasks?
ack, so will this eventually be supported via
We might work on that in the future, but for now the simplest is to create an inference.py, see here https://www.philschmid.de/custom-inference-huggingface-sagemaker
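The linked post walks through the full setup. As a hedged sketch of what such a script looks like: the toolkit documents `model_fn`/`predict_fn` as override hooks in `code/inference.py`, but the embedding model and payload keys below are illustrative assumptions, not the toolkit's fixed schema.

```python
# inference.py -- sketch of a custom handler for the SageMaker HF DLC.
# The toolkit looks for these hook names in code/inference.py; the
# sentence-similarity model and payload shape are illustrative.

def model_fn(model_dir):
    """Called once at container start-up to load the model."""
    # Import inside the hook so the module can be imported even where
    # the (container-only) dependency is not installed.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_dir)

def predict_fn(data, model):
    """Called per request with the already-deserialized payload."""
    sentences = data.get("inputs", [])
    embeddings = model.encode(sentences)
    return {"vectors": embeddings.tolist()}
```

Because the deserialization and serialization defaults are kept, the customer only overrides the two hooks that matter for an unsupported task.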
@philschmid there are a number of multimodal tasks missing here (object detection, text-to-speech, audio classification, etc.). In addition, the text2text-generation task provided in the map doesn't have schemas in hf/tasks (huggingface/huggingface.js@b70ac6c). Is there a way to extend this map further for new use cases? This would also save customers the effort of creating the inference.py for these formats.
Where is "here"?
What map?
We're basically trying to extend inference capabilities by storing/generating additional metadata for a larger set of task types. Today we use the pipeline tag/task as a core filter of the selection logic, under the assumption that the pySDK would be compatible with all transformers task types. The explicit types defined in ARCHITECTURES_2_TASK are blocking us currently, and my questions above are about possibilities to extend it without using the inference.py route. There is a proposed fix in this issue, but given the lack of transformers support, I am not sure whether making the fix will help.
@samruds To address this gap, we should provide InferenceSpec support for the HF DLC in ModelBuilder. That way a customer can provide custom inference script logic. The workaround today is to treat this as a BYOC ModelBuilder scenario, doing something like this:
So error handling on our end should be: in ModelBuilder
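The snippet referenced above is not included in this thread. As a hedged sketch of the shape being described: an InferenceSpec-style object exposes `load()` and `invoke()` hooks. In the real SDK this would subclass `InferenceSpec` from `sagemaker.serve.spec.inference_spec` and be passed to `ModelBuilder(inference_spec=...)`; the class below is written without the SDK import so the shape is clear, and the model/payload details are illustrative.

```python
# Sketch of the InferenceSpec shape for ModelBuilder. Illustrative only:
# in practice this subclasses InferenceSpec from
# sagemaker.serve.spec.inference_spec and is handed to
# ModelBuilder(model_path=..., inference_spec=...).

class SentenceSimilaritySpec:
    def load(self, model_dir):
        # Runs in the container at start-up; returns the model object
        # that is later handed back to invoke().
        from sentence_transformers import SentenceTransformer  # container-only dep
        return SentenceTransformer(model_dir)

    def invoke(self, input_object, model):
        # Runs per request; input_object is the deserialized payload.
        sentences = input_object.get("inputs", [])
        return {"vectors": model.encode(sentences).tolist()}
```

This sidesteps the ARCHITECTURES_2_TASK lookup entirely, since the customer's own `invoke()` decides how the model is called.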
There are two things here
Customer has two options here
Issue:
Customer is deploying HF ID BAAI/bge-m3 with Task: sentence-similarity, using:
But getting the following error from within the HF DLC:
Root Cause:
The ARCHITECTURES_2_TASK mapping is too constraining and does not include all admissible pipeline tasks: sagemaker-huggingface-inference-toolkit/src/sagemaker_huggingface_inference_toolkit/transformers_utils.py (line 79 in 80634b3)
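For context on why BAAI/bge-m3 falls through: the toolkit suffix-matches the architecture name from the model's config.json against a fixed table. The excerpt below is illustrative, not the verbatim table from transformers_utils.py (see the permalink above for that), and `infer_task` is a simplified stand-in for the toolkit's lookup.

```python
# Illustrative sketch of an ARCHITECTURES_2_TASK-style lookup: the
# architecture string from config.json is suffix-matched against a
# fixed table. Entries here are examples, not the full real list.
ARCHITECTURES_2_TASK = {
    "ForQuestionAnswering": "question-answering",
    "ForTokenClassification": "token-classification",
    "ForSequenceClassification": "text-classification",
    "ForMaskedLM": "fill-mask",
    "ForCausalLM": "text-generation",
    "ForConditionalGeneration": "text2text-generation",
}

def infer_task(architecture: str) -> str:
    for suffix, task in ARCHITECTURES_2_TASK.items():
        if architecture.endswith(suffix):
            return task
    # Architectures outside the table (e.g. a plain *Model used for
    # sentence-similarity) fall through here and deployment fails.
    raise ValueError(f"Task couldn't be inferred from {architecture}")
```

Any task whose architecture name doesn't end in one of the mapped suffixes is rejected before the pipeline is ever constructed, which is the constraint this issue is about.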
All we do is pass the task to get_pipeline from within the handler_service.py, as far as I can see: sagemaker-huggingface-inference-toolkit/src/sagemaker_huggingface_inference_toolkit/handler_service.py (line 115 in 80634b3)
Shouldn't the logic be the following?
This way we allow for a best-effort deployment, a.k.a. a pass-through of the HF_TASK to get_pipeline(). If that fails, then so be it; we should propagate the right error messaging to the customer, stating that we tried our best and that xyz went wrong.
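The proposed control flow could be sketched like this. Everything below is a hedged sketch: `get_pipeline` and `infer_task_from_architecture` stand in for the toolkit's real functions, and the signatures are illustrative, not the toolkit's actual API.

```python
# Sketch of the proposed best-effort flow: if the customer supplied
# HF_TASK explicitly, pass it straight through to get_pipeline() and
# let pipeline creation itself validate it; only fall back to the
# architecture mapping when no task was given. Names are illustrative.

def resolve_task(env, infer_task_from_architecture, architecture):
    explicit_task = env.get("HF_TASK")
    if explicit_task:
        # Best-effort pass-through: do not pre-filter against the map.
        return explicit_task
    # No explicit task -> fall back to the architecture mapping.
    return infer_task_from_architecture(architecture)

def load_pipeline(env, get_pipeline, infer_task_from_architecture,
                  architecture, model_dir):
    task = resolve_task(env, infer_task_from_architecture, architecture)
    try:
        return get_pipeline(task=task, model_dir=model_dir)
    except Exception as exc:
        # Propagate a clear message instead of failing on the map lookup.
        raise RuntimeError(
            f"Tried to load task '{task}' for the model in {model_dir} "
            f"but pipeline creation failed: {exc}"
        ) from exc
```

The key change is that an explicit HF_TASK never hits the ARCHITECTURES_2_TASK gate at all; the error surface moves to pipeline construction, where transformers can report exactly what is unsupported.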