Update `huggingface_hub` Version in the storage initializer to fix ImportError #2180

helenxie-bit · 2024-07-21T05:58:07Z

What this PR does / why we need it:
`split_torch_state_dict_into_shards` is not available in `huggingface_hub` v0.19.3, so importing it raises an `ImportError`. I therefore updated the `huggingface_hub` version pinned in the storage initializer's requirements.txt to fix the error.

Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged):
Fixes #2179

Checklist:

Signed-off-by: helenxie-bit <helenxiehz@gmail.com>

coveralls · 2024-07-21T06:02:30Z

Pull Request Test Coverage Report for Build 10026064625

Details

0 of 0 changed or added relevant lines in 0 files are covered.
No unchanged relevant lines lost coverage.
Overall first build on helenxie/fix_huggingface_hub_version at 35.398%

Totals
Change from base Build 9999203579:	35.4%
Covered Lines:	4377
Relevant Lines:	12365

💛 - Coveralls

tenzen-y

It seems that this error is unrelated to the huggingface_hub version.
Which peft version do you use in your local environment?

I guess that your local peft version is newer than v0.3.0:

training-operator/sdk/python/kubeflow/storage_initializer/requirements.txt

Line 1 in f55a91d

peft==0.3.0
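
One way to answer the version question without guessing is to compare the pins in a requirements file against what is actually installed. The following is a minimal sketch, not part of this PR: the helper names (`parse_pins`, `installed_version`, `find_mismatches`) are hypothetical, only the `peft==0.3.0` pin comes from the thread, and the parser handles plain `name==version` lines only (no extras or environment markers).

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {package: version} for lines of the form `name==version`."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and requirements that are not exact pins
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins):
    """Return {package: (pinned, installed)} for every pin that is not met."""
    mismatches = {}
    for package, pinned in pins.items():
        installed = installed_version(package)
        if installed != pinned:
            mismatches[package] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    # Illustrative input: only the peft pin is taken from the thread.
    example = "peft==0.3.0\n"
    for package, (pinned, installed) in find_mismatches(parse_pins(example)).items():
        print(f"{package}: pinned {pinned}, installed {installed}")
```

Note that this only checks the packages that are pinned directly. Transitive dependencies that are not pinned in the file can still resolve to newer releases.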

helenxie-bit · 2024-07-29T00:36:27Z

@tenzen-y My local peft version is 0.3.0:

Name: peft
Version: 0.3.0
Summary: Parameter-Efficient Fine-Tuning (PEFT)
Home-page: https://github.com/huggingface/peft
Author: The HuggingFace team
Author-email: sourab@huggingface.co
License: Apache
Location: /opt/homebrew/anaconda3/envs/kubeflow/lib/python3.11/site-packages
Requires: accelerate, numpy, packaging, psutil, pyyaml, torch, transformers
Required-by:

And here is the full traceback of the error:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/app/storage_initializer/storage.py", line 2, in <module>
    from .hugging_face import HuggingFace, HuggingFaceDataset
  File "/app/storage_initializer/hugging_face.py", line 8, in <module>
    from peft import LoraConfig
  File "/usr/local/lib/python3.11/site-packages/peft/__init__.py", line 22, in <module>
    from .mapping import MODEL_TYPE_TO_PEFT_MODEL_MAPPING, PEFT_TYPE_TO_CONFIG_MAPPING, get_peft_config, get_peft_model
  File "/usr/local/lib/python3.11/site-packages/peft/mapping.py", line 16, in <module>
    from .peft_model import (
  File "/usr/local/lib/python3.11/site-packages/peft/peft_model.py", line 22, in <module>
    from accelerate import dispatch_model, infer_auto_device_map
  File "/usr/local/lib/python3.11/site-packages/accelerate/__init__.py", line 16, in <module>
    from .accelerator import Accelerator
  File "/usr/local/lib/python3.11/site-packages/accelerate/accelerator.py", line 34, in <module>
    from huggingface_hub import split_torch_state_dict_into_shards
ImportError: cannot import name 'split_torch_state_dict_into_shards' from 'huggingface_hub' (/usr/local/lib/python3.11/site-packages/huggingface_hub/__init__.py)

Do you have any idea where the problem could be?
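
The traceback shows `accelerate` importing `split_torch_state_dict_into_shards` from `huggingface_hub`, so a quick check is whether the installed `huggingface_hub` exposes that name at all. The sketch below assumes nothing beyond the package and symbol names in the traceback; the helpers `has_symbol` and `describe` are hypothetical. The traceback paths (`/app/storage_initializer/...`, `/usr/local/lib/python3.11/site-packages/...`) differ from the `pip show` location above, so the check is probably most informative when run in the environment that raised the error.

```python
import importlib
from importlib import metadata


def has_symbol(module_name, symbol):
    """Return True if `module_name` imports cleanly and exposes `symbol`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, symbol)


def describe(package):
    """Return 'package==version', or 'package (not installed)'."""
    try:
        return f"{package}=={metadata.version(package)}"
    except metadata.PackageNotFoundError:
        return f"{package} (not installed)"


if __name__ == "__main__":
    # Packages that appear in the import chain of the traceback.
    for package in ("peft", "accelerate", "huggingface_hub"):
        print(describe(package))
    available = has_symbol("huggingface_hub", "split_torch_state_dict_into_shards")
    print("split_torch_state_dict_into_shards available:", available)
```

If the symbol is reported as unavailable while `accelerate` still tries to import it, the installed `accelerate` and `huggingface_hub` versions are incompatible, regardless of the `peft` pin.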

andreyvelich · 2024-07-29T19:21:26Z

@tenzen-y @helenxie-bit I'm getting the same error on my side with huggingface_hub==0.19.3.
I think this update may be related: #2056.

@tenzen-y @johnugeorge @deepanker13 Should we move this forward to fix the errors in the train API?

Additionally, @helenxie-bit, if you could help us with some simple e2e tests for the train API, that would be amazing!

helenxie-bit · 2024-07-29T23:02:48Z

Yeah, of course. I can help with the e2e tests.

tenzen-y · 2024-07-31T05:28:24Z

SGTM

andreyvelich

I think we can merge it. Thanks @helenxie-bit!
/lgtm
/approve

google-oss-prow · 2024-08-05T12:44:32Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andreyvelich

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~sdk/python/OWNERS~~ [andreyvelich]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

* Update `huggingface_hub` Version in the storage initializer to fix ImportError (#2180)
  Signed-off-by: helenxie-bit <helenxiehz@gmail.com>
  Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
* [SDK] Fix trainer error: Update the version of base image and add "num_labels" for downloading pretrained models (#2230)
  * fix trainer error
    Signed-off-by: helenxie-bit <helenxiehz@gmail.com>
  * rerun tests
    Signed-off-by: helenxie-bit <helenxiehz@gmail.com>
  * update the process of num_labels in trainer
    Signed-off-by: helenxie-bit <helenxiehz@gmail.com>
  * rerun tests
    Signed-off-by: helenxie-bit <helenxiehz@gmail.com>
  * adjust the default value of 'num_labels'
    Signed-off-by: helenxie-bit <helenxiehz@gmail.com>
  ---------
  Signed-off-by: helenxie-bit <helenxiehz@gmail.com>
  Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
---------
Signed-off-by: helenxie-bit <helenxiehz@gmail.com>
Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
Co-authored-by: Hezhi Xie <hezxie@ucdavis.edu>
Co-authored-by: Hezhi (Helen) Xie <helenxiehz@gmail.com>

fix the version of huggingface_hub

9e0e27e

Signed-off-by: helenxie-bit <helenxiehz@gmail.com>

google-oss-prow bot requested review from jinchihe and kuizhiqing July 21, 2024 05:58

google-oss-prow bot added the size/XS label Jul 21, 2024

tenzen-y reviewed Jul 26, 2024

andreyvelich approved these changes Aug 5, 2024

google-oss-prow bot assigned andreyvelich Aug 5, 2024

google-oss-prow bot added the lgtm label Aug 5, 2024

google-oss-prow bot added the approved label Aug 5, 2024

google-oss-prow bot merged commit 0f8588a into kubeflow:master Aug 5, 2024
39 checks passed

helenxie-bit mentioned this pull request Aug 19, 2024

[GSoC] Project 4: Hyperparameter Optimization API in Katib for LLMs kubeflow/katib#2339

Open

6 tasks

tenzen-y mentioned this pull request Aug 29, 2024

[Release] Training operator 1.8.1 release #2241

Closed

4 tasks

andreyvelich pushed a commit to andreyvelich/training-operator that referenced this pull request Aug 29, 2024

Update huggingface_hub Version in the storage initializer to fix Im…

bccd0ae

…portError (kubeflow#2180) Signed-off-by: helenxie-bit <helenxiehz@gmail.com>

andreyvelich mentioned this pull request Aug 29, 2024

Cherry pick of #2180 #2230 into v1.8-branch #2242

Merged
