Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX Loading with AutoPeftModel.from_pretrained #1449

Conversation

BenjaminBossan
Copy link
Member

Fixes #1430

When Using AutoPeftModel.from_pretrained, there is a check to see if a tokenizer can be found. This check will include a search for the tokenizer on HF Hub. However, when the model is stored locally, the path may not be a valid HF Hub repo ID. In that case, an error is raised by huggingface_hub.

This PR consists of catching that error, and assuming that if the error occurs, the tokenizer does not exist. This resolves the issue.

Note

We do have tests for AutoPeftModel.from_pretrained, so why did they not catch the error? I think there was a mistake in the tests. The tests did something like this:

    model_id = "peft-internal-testing/tiny-OPTForCausalLM-lora"
    model = AutoPeftModelForCausalLM.from_pretrained(model_id)

    with tempfile.TemporaryDirectory() as tmp_dirname:
        model.save_pretrained(tmp_dirname)
        model = AutoPeftModelForCausalLM.from_pretrained(model_id)  # <==

As you can see, the model is loaded twice from the hub. I think the intent was that the 2nd time around, it should be loaded from the temporary directory (otherwise, why save it there?). I thus changed all the tests to load the model from the temporary directory. This way, the bug is triggered and the bug fix makes the tests pass again.

Please let me know if I misinterpreted the intent of the tests, then I will revert them and add a new test instead.

Fixes huggingface#1430

When Using AutoPeftModel.from_pretrained, there is a check to see if a
tokenizer can be found. This check will include a search for the
tokenizer on HF Hub. However, when the model is stored locally, the path
may not be a valid HF Hub repo ID. In that case, an error is raised by
huggingface_hub.

This PR consists of catching that error, and assuming that if the error
occurs, the tokenizer does not exist. This resolves the issue.

Note

We do have tests for AutoPeftModel.from_pretrained, so why did they not
catch the error? I think there was a mistake in the tests. The tests did
something like this:

    model_id = "peft-internal-testing/tiny-OPTForCausalLM-lora"
    model = AutoPeftModelForCausalLM.from_pretrained(model_id)

    with tempfile.TemporaryDirectory() as tmp_dirname:
        model.save_pretrained(tmp_dirname)
        model = AutoPeftModelForCausalLM.from_pretrained(model_id)  # <==

As you can see, the model is loaded twice from the hub. I think the
intent was that the 2nd time around, it should be loaded from the
temporary directory (otherwise, why save it there?). I thus changed all
the tests to load the model from the temporary directory. This way, the
bug is triggered and the bugfix makes the tests pass again.

Please let me know if I misinterpreted the intent of the tests, then I
will revert them and add a new test instead.
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copy link
Contributor

@pacman100 pacman100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @BenjaminBossan for the fixes!

@BenjaminBossan BenjaminBossan merged commit 055e4db into huggingface:main Feb 9, 2024
14 checks passed
@BenjaminBossan BenjaminBossan deleted the fix-issue-autopeftmodel-loading-locally branch February 9, 2024 10:00
BenjaminBossan added a commit to BenjaminBossan/peft that referenced this pull request Mar 14, 2024
Fixes huggingface#1430

When Using AutoPeftModel.from_pretrained, there is a check to see if a
tokenizer can be found. This check will include a search for the
tokenizer on HF Hub. However, when the model is stored locally, the path
may not be a valid HF Hub repo ID. In that case, an error is raised by
huggingface_hub.

This PR consists of catching that error, and assuming that if the error
occurs, the tokenizer does not exist. This resolves the issue.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

huggingface_hub.utils._validators.HFValidationError when AutoPeftModelForCausalLM.from_pretrained
3 participants