'nf4' compute datatype? #1321

Open
dorsa-zeinali opened this issue Aug 15, 2024 · 1 comment
Labels
question Further information is requested

Comments

@dorsa-zeinali

Feature request

In the QLoRA quantization procedure there are two datatypes: the 'nf4' storage datatype and the compute datatype (bfloat16 in the paper, which is the original datatype; please refer to the image). Values are dequantized from the storage datatype to the compute datatype for inference or for calculating the backward pass. When I tried using int8 as the compute datatype, matrix multiplication threw an error because it is not supported for this datatype. I have not tried inference with qint8(). Would it be possible to make 'nf4' an available compute datatype and have the relevant functions handle it?
[Image: Screenshot 2024-08-15 at 6 47 02 PM]
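
To make the storage/compute split concrete, here is a minimal pure-Python sketch of the idea. It is not the bitsandbytes implementation: the function names are hypothetical, the block is a plain list, and the codebook values are rounded approximations of the NF4 levels described in the QLoRA paper. Weights are stored as 4-bit codes plus one absmax scale per block, and are dequantized back to floats before any arithmetic happens.

```python
# Illustrative sketch only. Function names and the rounded codebook are
# assumptions for this example, not the bitsandbytes API.

# 16 levels (4 bits). Rounded approximations of the NF4 levels.
CODEBOOK = [-1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
            0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0]


def quantize_block(weights):
    """Return (absmax, codes): one float scale and one 4-bit code per weight."""
    absmax = max(abs(w) for w in weights) or 1.0  # avoid dividing by zero
    codes = []
    for w in weights:
        x = w / absmax  # normalize into [-1, 1]
        # Pick the index of the nearest codebook level.
        codes.append(min(range(16), key=lambda i: abs(CODEBOOK[i] - x)))
    return absmax, codes


def dequantize_block(absmax, codes):
    """Map 4-bit codes back to floats (storage datatype -> compute datatype)."""
    return [CODEBOOK[c] * absmax for c in codes]


def linear(activations, absmax, codes):
    """Dot product of float activations with a weight row stored as 4-bit codes."""
    weights = dequantize_block(absmax, codes)  # arithmetic happens on floats
    return sum(a * w for a, w in zip(activations, weights))


absmax, codes = quantize_block([0.5, -1.0, 0.0, 0.25])
result = linear([1.0, 1.0, 1.0, 1.0], absmax, codes)
```

Note that the codes are indices into a non-uniform lookup table, so multiplying or adding the codes themselves is not meaningful. That is one reason the multiplication in this sketch runs on the dequantized values rather than on the 4-bit representation.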

Motivation

Dequantizing values to perform calculations, and storing the results and updates in full precision, is still inefficient and infeasible on some hardware, especially edge devices (even though in QLoRA only a small set of adapter weights is updated). Research towards performing calculations accurately while the weights remain in 4 bits would be a desirable improvement.

Your contribution

I can try to submit a PR for this. I would just need some guidance in the right direction to help me get started.

@matthewdouglas
Member

Hi,
For nf4 quantization we only support computation with fp32, fp16, or bf16. We also do not quantize the activations.

Can you clarify what you mean by edge devices and what the goal is?

@matthewdouglas matthewdouglas added the question Further information is requested label Aug 23, 2024