Small gradient optimization #1885
Conversation
wgpu/src/shader/quad.wgsl
fn l(c: f32) -> f32 {
    if (c < 0.04045) {
        return c / 12.92;
    } else {
        return pow(((c + 0.055) / 1.055), 2.4);
    };
}

fn to_linear(color: u32) -> vec4<f32> {
    let c = unpack4x8unorm(color); // unpacks as a b g r
    return vec4<f32>(l(c.w), l(c.z), l(c.y), c.x);
}
After #1888, we are not always blending in linear RGB. This means the to_linear transformation only needs to happen if the web-colors feature is disabled.
Generally, we always upload colors to the GPU ready to use (i.e. ready for blending). This allows us to perform (or skip) the conversion in Rust code. However, using u32 for colors in linear RGB can apparently cause precision issues.
Generating different shader code at compile-time is quite a terrible solution as well, so I am a bit torn here...
Since we have plans to eventually use textures for storage, maybe we could keep using f32 for now? Would reducing the gradient colors to 6 unblock #1882?
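A rough sketch of the "upload colors ready for blending" idea, assuming the decision is made at compile time via the web-colors feature mentioned above; the pack_color helper and its signature are made up for illustration:

// sRGB -> linear transfer function for a single channel in [0, 1].
fn srgb_to_linear(c: f32) -> f32 {
    if c < 0.04045 {
        c / 12.92
    } else {
        ((c + 0.055) / 1.055).powf(2.4)
    }
}

// Hypothetical packing helper: colors are uploaded ready for blending, so the
// sRGB -> linear conversion happens (or is skipped) here rather than in WGSL.
fn pack_color(rgba: [f32; 4]) -> [f32; 4] {
    if cfg!(feature = "web-colors") {
        // Blending in gamma-encoded sRGB: upload the components as-is.
        rgba
    } else {
        // Blending in linear RGB: convert the color channels, leave alpha alone.
        [
            srgb_to_linear(rgba[0]),
            srgb_to_linear(rgba[1]),
            srgb_to_linear(rgba[2]),
            rgba[3],
        ]
    }
}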
I made an into_linear_u32 fn for Colors to use in 9554c78; this will do the linear conversion first with the correct precision, and then round it into a u8. This should only cause a loss of precision if a user is using more than 8 bits per color, which I think is pretty uncommon. This keeps any color conversion off the GPU as well.
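A minimal sketch of what such a conversion might look like (not the actual code from 9554c78; the byte order is assumed to match the a b g r layout that unpack4x8unorm produces in the shader above):

// sRGB -> linear transfer function for a single channel in [0, 1].
fn srgb_to_linear(c: f32) -> f32 {
    if c < 0.04045 {
        c / 12.92
    } else {
        ((c + 0.055) / 1.055).powf(2.4)
    }
}

// Converts an sRGB color (components in [0, 1]) to linear RGB at f32 precision,
// then rounds each channel to 8 bits and packs the result into a u32 with
// alpha in the lowest byte (so unpack4x8unorm yields a, b, g, r in x, y, z, w).
fn into_linear_u32(r: f32, g: f32, b: f32, a: f32) -> u32 {
    let to_u8 = |c: f32| (c.clamp(0.0, 1.0) * 255.0).round() as u32;

    to_u8(a)
        | (to_u8(srgb_to_linear(b)) << 8)
        | (to_u8(srgb_to_linear(g)) << 16)
        | (to_u8(srgb_to_linear(r)) << 24)
}

With the conversion done on the CPU like this, the shader would only need the unpack4x8unorm call, without applying l() to each channel.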
This should only cause a loss of precision if a user is using more than 8 bits per-color, which I think is pretty uncommon.
This isn't entirely true. The issue with using 8 bits for linear color is that the 256 possible values will not be evenly distributed for most displays. For instance, the values between 0-10 may describe very similar colors, while the difference between 11-12 may be very apparent. In sRGB, the 0-10 range in linear may just get 3 values, while the 11-12 range may get 10 (not accurate, completely making this up!).
This can cause a bunch of issues. For instance, you may be trying to make a subtle linear gradient in iced and notice that the start and end colors are not different, because they are being reduced to the same linear value.
This is why, generally, you don't want to store linear values with u8 components. The article The Importance of Being Linear mentions this in the "Intermediate Color Buffers" section:
Intermediate color buffers may lose precision in the darks if stored as 8-bit linear images, compared to the precision they would have as gamma-corrected images. Thus, it may be beneficial to use 16-bit floating-point or sRGB frame-buffer and sRGB texture formats for rendering and accessing intermediate color buffers.
I'm okay with it while we figure out the texture storage, though!
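A rough numerical illustration of the quantization argument above (the specific values are picked for demonstration, not taken from the discussion):

// Two dark sRGB values that are clearly distinct as 8-bit sRGB collapse to the
// same 8-bit value once converted to linear RGB and re-quantized.
fn srgb_to_linear(c: f32) -> f32 {
    if c < 0.04045 {
        c / 12.92
    } else {
        ((c + 0.055) / 1.055).powf(2.4)
    }
}

fn main() {
    for srgb in [20u8, 25u8] {
        let linear = srgb_to_linear(srgb as f32 / 255.0);
        let quantized = (linear * 255.0).round() as u8;

        // Both 20/255 and 25/255 in sRGB round to a linear value of 2/255,
        // so a subtle dark gradient between them becomes flat.
        println!("sRGB {srgb:3} -> linear {linear:.4} -> u8 {quantized}");
    }
}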
The only other option I can think of, if we want to retain the linear -> sRGB conversion precision, is to use the IEEE 754 binary16 representation for colors & offsets when packing gradients for upload to the GPU. This will nearly halve the total bytes required for everything we are packing in the gradient data, and I do not believe it will cause any (noticeable) precision loss. I could use it for colors, offsets, and (possibly) direction data. There is already a crate that implements this spec called half, though the bit shifting isn't too bad to do myself if we don't want to use it.
Switched to using f16s (using the half crate) for colors & offsets in 8bd5969; left direction as-is because it's unclear whether it might exceed f16::MAX in a situation where someone has a huge canvas or something, since it's being normalized on the GPU, not the CPU.
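A minimal sketch of packing two values into the binary16 halves of a u32 with the half crate (the pairing of an offset with a color channel here is purely illustrative, not the actual vertex layout from 8bd5969):

use half::f16;

// Packs two f32 values as IEEE 754 binary16, fitting both in the space of a
// single f32 / u32.
fn pack_f16_pair(a: f32, b: f32) -> u32 {
    (f16::from_f32(a).to_bits() as u32) | ((f16::from_f32(b).to_bits() as u32) << 16)
}

fn main() {
    // E.g. a color stop's offset and one linear color channel packed together.
    let packed = pack_f16_pair(0.25, 0.5);
    println!("{packed:#010x}");
}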
Cool! Let's merge this! 🎉
Packed colors into a u32 and unpacked them in the shader to free up some space in the gradient data. This will improve performance by reducing data transfer latency. Overall gain: 96 bytes per gradient vertex.
Ultimately I'd like to rewrite the gradient pipeline to use a texture to store data (no storage buffers if we want WASM support) and pass an index/length to the shader, but I will do that further down the line. Besides just being a performance gain, this will unblock #1882.