Fix edge cases in (de)serialize_torch_tensor #591
Conversation
Codecov Report
@@ Coverage Diff @@
## master #591 +/- ##
==========================================
+ Coverage 85.20% 85.37% +0.17%
==========================================
Files 81 81
Lines 8009 8022 +13
==========================================
+ Hits 6824 6849 +25
+ Misses 1185 1173 -12
hivemind/compression/floating.py (outdated diff)
@@ -12,22 +12,28 @@ class Float16Compression(CompressionBase):
    FP16_MIN, FP16_MAX = torch.finfo(torch.float16).min, torch.finfo(torch.float16).max

    def compress(self, tensor: torch.Tensor, info: CompressionInfo, allow_inplace: bool = False) -> runtime_pb2.Tensor:
        assert torch.is_floating_point(tensor) and tensor.dtype != torch.bfloat16
Is there a reason why we should fail with an error in case of bf16 inputs? It is indeed not sensible, but if the user wants to do it anyway, it's probably better to issue a warning instead of flat-out refusing to pass it through quantization.
Added a ValueError with a more user-legible reason.
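A hypothetical sketch of the change being discussed, assuming hivemind's module layout from the diff above; the exact error message and the elided body are assumptions, not the actual implementation:

```python
import torch

from hivemind.compression.base import CompressionBase, CompressionInfo
from hivemind.proto import runtime_pb2


class Float16Compression(CompressionBase):
    FP16_MIN, FP16_MAX = torch.finfo(torch.float16).min, torch.finfo(torch.float16).max

    def compress(self, tensor: torch.Tensor, info: CompressionInfo, allow_inplace: bool = False) -> runtime_pb2.Tensor:
        # Replace the bare assert with a ValueError that explains why the dtype is rejected
        if not torch.is_floating_point(tensor) or tensor.dtype == torch.bfloat16:
            raise ValueError(
                f"Float16Compression does not support {tensor.dtype} tensors; "
                "cast to float32 or choose a different compression type"  # assumed wording
            )
        ...  # rest of the original compress logic unchanged
```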
hivemind/compression/quantization.py (outdated diff)
@@ -135,14 +138,15 @@ def quantize(
    except ImportError:
        raise ImportError(BNB_MISSING_MESSAGE)

-    quantized, (absmax, codebook) = quantize_blockwise(tensor)
+    quantized, (absmax, codebook, *extra_params) = quantize_blockwise(tensor, blocksize=4096, nested=False)
+    assert tuple(extra_params) == (4096, False, tensor.dtype, None, None)  # blocksize, nested, dtype, offset, s2
Maybe we can make that tuple on the right a module-level constant? It's used twice in the code, better to make it clear we're using some predefined values
done, thanks for the suggestion
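A minimal sketch of one way to apply the suggestion, mirroring the unpacking shown in the diff above; the constant and helper names are assumptions, and since the dtype element of the expected tuple depends on the input, only the fixed parameters are lifted into module-level constants:

```python
from bitsandbytes.functional import quantize_blockwise

# Fixed arguments passed to quantize_blockwise, reused in the sanity check below
BLOCKWISE_BLOCKSIZE, BLOCKWISE_NESTED = 4096, False


def blockwise_quantize(tensor):
    quantized, (absmax, codebook, *extra_params) = quantize_blockwise(
        tensor, blocksize=BLOCKWISE_BLOCKSIZE, nested=BLOCKWISE_NESTED
    )
    # Fail loudly if a newer bitsandbytes release returns different metadata
    # (blocksize, nested, dtype, offset, state2)
    assert tuple(extra_params) == (BLOCKWISE_BLOCKSIZE, BLOCKWISE_NESTED, tensor.dtype, None, None)
    return quantized, absmax, codebook
```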
* serialize with requires_grad
* ensure that all compression methods return a tensor of the original dtype
* test that all compression methods preserve dtype and requires_grad

---------

Co-authored-by: Your Name <you@example.com>
Co-authored-by: Max Ryabinin <mryabinin0@gmail.com>

(cherry picked from commit 2873252)
An earlier patch lost the requires_grad property during serialize_torch_tensor. This PR adds it back.
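A rough sketch of the kind of roundtrip check the PR adds, assuming the serialization helpers are exposed from hivemind.compression and CompressionType from the generated protobuf module; the test name, tensor shape, and tolerance are made up for illustration:

```python
import torch

from hivemind.compression import deserialize_torch_tensor, serialize_torch_tensor
from hivemind.proto.runtime_pb2 import CompressionType


def test_roundtrip_preserves_dtype_and_requires_grad():
    tensor = torch.randn(64, 8, dtype=torch.float32, requires_grad=True)
    for compression in (CompressionType.NONE, CompressionType.FLOAT16):
        # Serialize and deserialize, then check that metadata survives the roundtrip
        restored = deserialize_torch_tensor(serialize_torch_tensor(tensor, compression))
        assert restored.dtype == tensor.dtype, compression
        assert restored.requires_grad == tensor.requires_grad, compression
        assert torch.allclose(restored, tensor, rtol=0, atol=1e-2), compression
```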