[Hotfix] Fix BOFT mixed precision #1925

Merged · 7 commits · Aug 7, 2024

Conversation

@Edenzzzz (Contributor) commented on Jul 14, 2024:

The authors of BOFT seemingly forgot to cast some data types in the bf16/fp16 mixed precision setting, so @srguo24 and I fixed them during our research project.
See the error below (reproducible when running any model with the transformers Trainer in bf16).
(screenshot of the error)
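For context, a minimal sketch of the kind of dtype alignment such a fix needs. The names mirror the BOFT layer code, but the exact cast sites in the merged patch may differ:

```python
import torch

def apply_boft(weight: torch.Tensor, boft_rotation: torch.Tensor, boft_scale: torch.Tensor) -> torch.Tensor:
    # Under bf16/fp16 mixed precision the base weight may be half precision while the
    # BOFT rotation/scale tensors are still fp32; the matmul between them then raises a
    # dtype-mismatch RuntimeError. Casting to the weight's dtype avoids it.
    boft_rotation = boft_rotation.to(weight.dtype)
    boft_scale = boft_scale.to(weight.dtype)
    return (boft_rotation @ weight) * boft_scale
```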

@BenjaminBossan (Member) left a comment:

Thanks a lot for providing this fix for BOFT. Do you have a small example that produces the error you showed? Ideally, we can turn this into a unit test for this bug.

@@ -77,9 +78,6 @@ def get_fbd_cuda():
    if _FBD_CUDA is not None:
        return _FBD_CUDA

    # This import initializes cuda context and should thus be local, see issue 1877
    from torch.utils.cpp_extension import load
Member: Could you please undo this change? The import should be local. Maybe merging with/rebasing on the latest main is sufficient.
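For reference, a simplified sketch of the lazy-loading pattern the reviewer wants preserved; the real get_fbd_cuda builds the CUDA extension, and the source file names below are placeholders:

```python
_FBD_CUDA = None

def get_fbd_cuda():
    global _FBD_CUDA
    if _FBD_CUDA is not None:
        return _FBD_CUDA

    # Importing torch.utils.cpp_extension at module import time would initialize the
    # CUDA context eagerly (see issue 1877), so the import stays inside the function.
    from torch.utils.cpp_extension import load

    # Placeholder source paths; the actual extension sources live in the BOFT package.
    _FBD_CUDA = load(name="fbd_cuda", sources=["fbd_cuda.cpp", "fbd_cuda_kernel.cu"])
    return _FBD_CUDA
```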

Contributor Author: done

boft_rotation = butterfly_oft_mat @ boft_rotation
boft_scale = boft_s * boft_scale

Member: Remove.

@BenjaminBossan (Member):

@Edenzzzz do you still plan to work on this PR?

@Edenzzzz (Contributor, Author) commented on Aug 5, 2024:

@BenjaminBossan Sorry for the late update. I have added the test.

@BenjaminBossan (Member) left a comment:

Thanks for adding the tests. I ran them locally and indeed they fail without your fix. Let's still make some small changes to the tests; please check my comments. Also, could you please run make style so that the linter/formatter can fix a few things?

@@ -1135,6 +1135,20 @@ def test_dora_ephemeral_gpu_offload(self):
        # The results should be the same
        assert torch.allclose(out_peft_model_cpu, out_peft_model_ego)

    @require_torch_gpu
Member:
Could you please move this whole test to tests/test_gpu_examples.py? Please create a new test class TestBOFT at the very bottom of the file and place this test inside of the class. You could also split it into two tests, one for Linear and one for Conv2d.

        layer = nn.Linear(160, 160).cuda()
        layer = Linear(layer, "layer", boft_n_butterfly_factor=2).to(dtype=torch.bfloat16)
        x = torch.randn(160, 160, device="cuda", dtype=torch.bfloat16)
        x = layer(x)
Member (suggested change):
-        x = layer(x)
+        layer(x)  # does not raise

        conv = nn.Conv2d(1, 1, 4).cuda()
        conv = Conv2d(conv, "conv", boft_n_butterfly_factor=2).to(dtype=torch.bfloat16)
        x = torch.randn(1, 160, 160, device="cuda", dtype=torch.bfloat16)
        x = conv(x)
Member (suggested change):
-        x = conv(x)
+        conv(x)  # does not raise

@@ -50,7 +50,7 @@
)
from peft.import_utils import is_bnb_4bit_available, is_bnb_available
from peft.tuners.lora.config import LoraRuntimeConfig

from peft.tuners.boft.layer import Linear, Conv2d
Member: Let's import boft via "from peft.tuners import boft" and below use boft.layer.Linear and boft.layer.Conv2d.
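In other words, the test module would do something like the following sketch (constructor arguments follow the test code above):

```python
import torch.nn as nn

from peft.tuners import boft

base = nn.Linear(160, 160)
layer = boft.layer.Linear(base, "layer", boft_n_butterfly_factor=2)
```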

@Edenzzzz (Contributor, Author) commented on Aug 5, 2024:

Thanks for the feedback. Pushed changes

@BenjaminBossan (Member):

> Thanks for the feedback. Pushed changes

Could you please run make style once more? You probably ran it before you made the other changes.

@Edenzzzz (Contributor, Author) commented on Aug 5, 2024:

I just ran make style, but it only produced error messages without fixing them?
(screenshot of the make style output)

@BenjaminBossan (Member):
Could you please ensure that you have ruff==4.10.0 installed.

@Edenzzzz (Contributor, Author) commented on Aug 6, 2024:

Do you mean 0.4.1? Perhaps we can add that to the requirements?
(screenshot)

@Edenzzzz (Contributor, Author) commented on Aug 6, 2024:

Well, I see that in setup.py, but it didn't come with pip install -e . Anyway, I reran with ruff 0.4.10.

@HuggingFaceDocBuilderDev:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan (Member) left a comment:

Thanks for the update.

The new tests are failing because they are run despite the CI not having a GPU. I added a comment on how to circumvent this. There are also some other failing tests, but they appear to be unrelated; I'll investigate.

> but didn't come with pip install -e .

The reason is that ruff is an extra dependency (as it's only relevant for devs, not users). If you install with pip install -e .[test] it should be included.

@@ -3076,3 +3077,25 @@ def test_bnb_4bit_wrap_fsdp(self):
        init_process_group(world_size=1, rank=0)
        # check that this does not raise:
        FSDP(model, auto_wrap_policy=fsdp_auto_wrap_policy(model), use_orig_params=False, sync_module_states=True)


@require_torch_gpu
Member: This decorator doesn't work as is. The reason it works on other classes in this file but not here is that those classes inherit from unittest.TestCase. But instead of doing the same, let's just move the decorator onto the two test methods directly; then it works as expected.
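Putting the review suggestions together, a rough sketch of how the resulting TestBOFT class could look. The require_torch_gpu import location and the method names are assumptions; the merged test may differ in details:

```python
import torch
import torch.nn as nn
from transformers.testing_utils import require_torch_gpu  # assumed source of the decorator

from peft.tuners import boft


class TestBOFT:
    @require_torch_gpu
    def test_boft_half_linear(self):
        # bf16 forward through a BOFT-wrapped Linear should not raise a dtype-mismatch error
        layer = boft.layer.Linear(nn.Linear(160, 160).cuda(), "layer", boft_n_butterfly_factor=2)
        layer = layer.to(dtype=torch.bfloat16)
        x = torch.randn(160, 160, device="cuda", dtype=torch.bfloat16)
        layer(x)  # does not raise

    @require_torch_gpu
    def test_boft_half_conv2d(self):
        # same check for the Conv2d variant
        conv = boft.layer.Conv2d(nn.Conv2d(1, 1, 4).cuda(), "conv", boft_n_butterfly_factor=2)
        conv = conv.to(dtype=torch.bfloat16)
        x = torch.randn(1, 160, 160, device="cuda", dtype=torch.bfloat16)
        conv(x)  # does not raise
```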

@BenjaminBossan (Member) commented on Aug 6, 2024:

> There are also some other failing tests, but they appear to be unrelated; I'll investigate.

Okay, the other failing tests were caused by an interaction with the latest transformers patch release. Please rebase on/merge with main to remedy this.

@BenjaminBossan (Member) left a comment:

Thanks a lot for the fixes, this now LGTM.

I'll ping the BOFT authors @YuliangXiu @yfeng95 @Zeju1997 @DTennant in case they want to add something. If I don't hear back in a couple of days, I'll assume this is good to be merged.

@Zeju1997 (Contributor) commented on Aug 7, 2024:

Dear all, thanks a lot for the effort!

@BenjaminBossan merged commit c869664 into huggingface:main on Aug 7, 2024
14 checks passed
@BenjaminBossan (Member):
Thanks for checking, @Zeju1997.
