Poll on relaxed mode bf16 dot product #106
Comments
Where did we settle on what the best deterministic, portable semantics of this instruction would be? How do we expect that deterministic semantics to perform relative to the deterministic semantics of other instructions? On what timescale do we expect to realize that 2x potential speedup in browsers?
I see two options to define deterministic semantics for Option 1: denormal inputs in Option 1 is better for the majority of existing systems: it is easy to implement in software by extracting even/odd numbers, extending them to IEEE FP32, and doing FMA operations. It also matches the semantics of two-instruction ( I would recommend Option 2 (treat denormals as zero) as the deterministic behavior, as both x86 and ARM are converging on this option going forward and it can be efficiently implemented on any hardware with Fused Multiply-Add.
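For readers following along, the software path described above (split into even/odd BF16 elements, widen to FP32, multiply-accumulate) can be sketched roughly as follows. This is an illustration only, not the proposal's reference semantics. The function names, the even-then-odd accumulation order, and the `flush_denormals` switch are assumptions made for this sketch, and Python's FP64 arithmetic followed by rounding to FP32 stands in for a hardware FMA.

```python
import struct


def bf16_to_f32(bits: int) -> float:
    """Widen a 16-bit BF16 pattern to FP32 by placing it in the high 16 bits."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def round_to_f32(x: float) -> float:
    """Round a Python float (FP64) to the nearest FP32 value.

    struct raises OverflowError for finite values outside the FP32 range,
    so this sketch is only meant for inputs that stay in range.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]


def flush_denormal(bits: int) -> int:
    """Replace a BF16 denormal (zero exponent, nonzero mantissa) with signed zero."""
    if (bits & 0x7F80) == 0 and (bits & 0x007F) != 0:
        return bits & 0x8000
    return bits


def dot_lane(a_pair, b_pair, acc, flush_denormals=False):
    """Compute one output lane: acc + a_even * b_even + a_odd * b_odd.

    a_pair and b_pair are (even, odd) tuples of raw BF16 bit patterns.
    The accumulation order (even first, then odd) is an assumption.
    """
    if flush_denormals:
        a_pair = tuple(flush_denormal(x) for x in a_pair)
        b_pair = tuple(flush_denormal(x) for x in b_pair)
    for a_bits, b_bits in zip(a_pair, b_pair):
        a = bf16_to_f32(a_bits)
        b = bf16_to_f32(b_bits)
        # The product of two BF16 values fits exactly in FP32, so only the
        # addition rounds; FP64 arithmetic stands in for a fused multiply-add.
        acc = round_to_f32(acc + a * b)
    return acc
```

With this sketch, `dot_lane((0x0001, 0x0000), (0x7F00, 0x0000), 0.0)` keeps the contribution of the denormal input `0x0001`, while passing `flush_denormals=True` treats it as zero, which is the difference between the two options discussed above.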
Regarding the "No BF16 standard": BFloat16 represents the high 16 bits of an IEEE FP32 number, and thus can be losslessly converted to FP32. All math in the proposed BF16 Dot Product instruction is done on IEEE FP32 representation, so it is standardized. Regarding hardware support: the following CPUs support ARM BF16 or AVX512-BF16:
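As a side note on the lossless-conversion claim above: since a BF16 value is the high 16 bits of an FP32 encoding, widening amounts to a 16-bit shift, and taking the high 16 bits back recovers the original pattern. The snippet below is a small sanity check of that claim, not part of the proposal. The helper names are made up for this illustration, and NaN patterns are skipped because NaN payloads may not survive a trip through Python floats on every platform.

```python
import struct


def bf16_bits_to_float(bits: int) -> float:
    """Interpret a BF16 bit pattern as the high half of an FP32 encoding."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def float_to_bf16_bits_truncate(value: float) -> int:
    """Return the high 16 bits of the FP32 encoding (truncation, no rounding)."""
    return struct.unpack("<I", struct.pack("<f", value))[0] >> 16


def is_nan_pattern(bits: int) -> bool:
    """True when the exponent is all ones and the mantissa is nonzero."""
    return (bits & 0x7F80) == 0x7F80 and (bits & 0x007F) != 0


def count_round_trip_mismatches() -> int:
    """Count non-NaN BF16 patterns that fail to survive BF16 -> FP32 -> BF16."""
    mismatches = 0
    for bits in range(0x10000):
        if is_nan_pattern(bits):
            continue
        if float_to_bf16_bits_truncate(bf16_bits_to_float(bits)) != bits:
            mismatches += 1
    return mismatches
```

Calling `count_round_trip_mismatches()` walks every non-NaN 16-bit pattern, including zeros, denormals, and infinities.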
Neoverse V1 (which Graviton3 is based on) also supports BFloat16 (the
@conrad-watt @sunfishcode @titzer @penzn Since you voted against BFloat16 Dot Product, could you comment on what the dealbreaker is about this instruction for you?
Well, I might have voted before there was a 'neutral' option, not sure. I think it is OK to drop it (as opposed to "we need to drop it"), if that is the compromise to move the proposal forward, especially since for now most engines would emulate it anyway.
@Maratyszcza Between your messages here and @akirilov-arm's, it's not clear to me which CPUs have which semantics. And it's not clear that the ARMv9.2 optional
Given the results of the poll, do we feel able to make a decision here?
@conrad-watt The decision is to remove BF16 Dot Product.
This is a poll on #88 in the context of relaxed mode (slides).
In the 2022-11-04 meetings (notes), we discussed this, main points:
👍 for inclusion of BF16 dot product (i.e. BF16 dot product stays in this proposal)
👎 against inclusion of BF16 dot product (i.e. remove BF16 dot product from this proposal)
update: 👀 for neutral option