HuggingFaceModel does not properly accept script mode environment variables #3361

Closed
athewsey opened this issue Sep 16, 2022 · 1 comment · Fixed by #4175

@athewsey
Collaborator

Describe the bug

While Model / FrameworkModel's prepare_container_def() supports (here) manually configuring script mode environment variables for an existing model.tar.gz package, HuggingFaceModel's override implementation does not (here). User-configured env={"SAGEMAKER_PROGRAM": ..., "SAGEMAKER_SUBMIT_DIRECTORY": ..., ...} values are ignored regardless of whether re-packing of new entrypoint code is requested.

This is important for importing large (multi-GB) pre-trained models to SageMaker inference, because it forces us to use the SDK class' re-packing functionality to add inference code... which can be significantly slower: it adds tens of minutes of extra delay in some cases.

To reproduce

  • Prepare a model.tar.gz in S3 that already contains a code/inference.py alongside (whatever) model artifacts. For a simple reproduction, you could use no model artifacts at all and add a trivial custom model loader to inference.py, something like def model_fn(model_dir): return lambda x: x (a minimal sketch follows these steps).

In my current use case, my model artifacts are about 5GB and constructing/uploading this archive takes ~10min - regardless of whether the small script code is included.

  • Create and deploy a Hugging Face Model from the archive on S3 via SageMaker Python SDK, indicating what code directory and entry point should be used:
model = HuggingFaceModel(
    model_data = "s3://.../model.tar.gz",  # (Contains code/inference.py)  
    role=sagemaker.get_execution_role(),
    py_version="py38",
    pytorch_version="1.10",
    transformers_version="4.17",
    env={
        "SAGEMAKER_CONTAINER_LOG_LEVEL": "20",
        "SAGEMAKER_PROGRAM": "inference.py",
        "SAGEMAKER_REGION": "ap-southeast-1",
        "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code",
    },
)
predictor = model.deploy(instance_type="ml.g4dn.xlarge", initial_instance_count=1)
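
For reference, here is a minimal sketch of the archive from the first step (names are illustrative; model_fn is the standard SageMaker inference toolkit hook):

# code/inference.py - trivial custom model loader for the reproduction.
# Returns an identity "model" so no real artifacts are required.
def model_fn(model_dir):
    return lambda x: x

The expected layout inside model.tar.gz is then:

model.tar.gz
├── (model artifacts, if any, at the root)
└── code/
    └── inference.py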

Observed behavior

The endpoint will fail to find the inference.py entry point, so it will not use the custom model_fn() and the model will fail to load.

This is because HuggingFaceModel overrides the SAGEMAKER_PROGRAM and SAGEMAKER_SUBMIT_DIRECTORY environment variables with empty values even though no entry_point or source_dir were provided.
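
Roughly, the override builds the environment in this order (an illustrative sketch of the clobbering, not the SDK's exact code):

# Illustration only - approximate order of operations in prepare_container_def():
deploy_env = dict(self.env)  # user-provided env vars are applied first...
deploy_env.update(self._script_mode_env_vars())  # ...then the script mode vars,
# which are empty strings when no entry_point/source_dir were given,
# overwriting the user's SAGEMAKER_PROGRAM / SAGEMAKER_SUBMIT_DIRECTORY.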

Expected behavior

The HuggingFaceModel should correctly propagate the user-specified environment variables, to support using a pre-prepared model.tar.gz without re-packing. In this case, the container would find the pre-loaded inference.py entry point and correctly use the custom model_fn.

Screenshots or logs

N/A

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.92.1
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): HuggingFace
  • Framework version: 4.17
  • Python version: py38
  • CPU or GPU: GPU
  • Custom Docker image (Y/N): N

Additional context

I am able to deploy a working endpoint by having my code folder and inference.py locally and adding these options to the model: HuggingFaceModel(source_dir="code", entry_point="inference.py", ...).

The problem is that this more than doubles the time and resources taken to prepare the package:

  • 10min to produce an initial "model-raw.tar.gz" and load to S3
  • 10min for the SageMaker SDK to download that archive, extract and re-pack it to add code folder, and re-upload to a new location

Since the use case here is just to prepare the model from local artifacts + code, it would also be OK if model_data were able to accept a local, uncompressed folder, as the 10min tarball creation would then only need to be done once. From my tests, though, this doesn't seem to be possible?
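
(For what it's worth, the one-time packing step can at least be scripted. A minimal sketch using only the standard library, with placeholder local folders "model/" and "code/":)

import tarfile

# Pack local artifacts + code into a pre-packed model.tar.gz, once, up front.
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("model", arcname=".")    # model artifacts at the archive root
    tar.add("code", arcname="code")  # inference.py, requirements.txt, ...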

@athewsey athewsey added the bug label Sep 16, 2022
@athewsey
Collaborator Author

As an interim measure, users could override this behaviour with a patch like this:

from sagemaker.huggingface import HuggingFaceModel


class PatchedHuggingFaceModel(HuggingFaceModel):
    """Modified Model class to allow manually setting SM Script Mode env vars"""

    def prepare_container_def(self, *args, **kwargs):
        # Call the parent function:
        result = super().prepare_container_def(*args, **kwargs)
        # ...But allow our manual env vars configuration to override the internals:
        manual_env = dict(self.env)
        result["Environment"].update(manual_env)
        return result

Now you are free to:

model = PatchedHuggingFaceModel(
    model_data="s3://doc-example-bucket/pre-packed/model.tar.gz"
    env={
        # "SAGEMAKER_CONTAINER_LOG_LEVEL": "20",
        "SAGEMAKER_PROGRAM": "inference.py",
        # "SAGEMAKER_REGION": region,
        "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code",
    },
    ...
)
model.deploy(...)

(...Where the pre-uploaded tarball already contains code/inference.py, code/requirements.txt and whatever else)
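
To sanity-check the patch before deploying, you can inspect the container definition it produces (prepare_container_def accepts an instance_type argument):

container_def = model.prepare_container_def(instance_type="ml.g4dn.xlarge")
print(container_def["Environment"])  # should now include your SAGEMAKER_PROGRAM etc.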
