Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Defend against invalid GL_MAX_UNIFORM_BLOCK_SIZE. #82921

Conversation

jasonwinterpixel
Copy link
Contributor

I have a Radeon RX580, and I run Linux Mint latest stable, with up to date drivers.

I receive this error when running Godot 3.x games in chrome:
image

I think it might be causing a crash in the 3d renderer, but we are only using the 2d renderer.

My GPU driver, in chrome and firefox, returns an invalid value for GL_MAX_UNIFORM_BLOCK_SIZE.

https://webglreport.com/?v=2
image

It actually returns this binary number: 10000000000000000000000000000000, which is a negative int, and webglreport is rendering it as a 64 bit int.

Anyways, this commit uses the minimum value of 16384 as specified in the opengl spec to gaurd against this invalid value from the driver.

https://registry.khronos.org/OpenGL-Refpages/es3.0/html/glGet.xhtml

@jasonwinterpixel jasonwinterpixel requested a review from a team as a code owner October 6, 2023 18:02
@Calinou Calinou added bug platform:web topic:rendering cherrypick:3.5 Considered for cherry-picking into a future 3.5.x release labels Oct 6, 2023
@Calinou Calinou added this to the 3.6 milestone Oct 6, 2023
@bitsawer
Copy link
Member

bitsawer commented Oct 6, 2023

This probably fixes #75913 for 3.x

Similar PR was made earlier for 4.x: #80909

@Calinou
Copy link
Member

Calinou commented Oct 6, 2023

For future reference, these are the limits I get on Linux + GeForce RTX 4090 (NVIDIA 535.113.01). This platform doesn't use ANGLE and therefore behaves correctly, but this may be useful to refer to in the future.

There are some differences in reported limits between Firefox and Chromium, namely Max Combined Texture Image Units, Max Varying Vectors, Aliased Line Width Range and Max Samples.

WebGL 2.0

Firefox

image

Chromium

image

WebGL 1.0

Firefox

image

Chromium

image

@jasonwinterpixel
Copy link
Contributor Author

Yeah it seems unlikely that Godot really needs to query and use this value from the GPU at all. Did this code get added to fix a real error? Maybe it should be ignored on web. Annoying but seems like angle or the Radeon Linux driver is returning a wrong value here.

Copy link
Member

@clayjohn clayjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. #80909 fixed the same issue by using glGetInteger64v with an int64_t. But technically GL_MAX_UNIFORM_BLOCK_SIZE should be retrieved with glGetIntegerv.

So something is going very wrong in making it overflow, and I think in that case it is much safer to assume the minimum value than to use the giant value returned by the function.

@lawnjelly
Copy link
Member

lawnjelly commented Oct 7, 2023

But technically GL_MAX_UNIFORM_BLOCK_SIZE should be retrieved with glGetIntegerv.

Is that in the spec though? (I couldn't see it with a quick look for the other PR, but maybe it is in there)

@bruvzg 's comment here suggests it should be using 64 bit for web:
#80909 (comment)

I wonder whether retrieving as 64 bit (as seems to be required), then capping if desired, might make it easier to debug out of range values (we might want to e.g. print these values in verbose). 🤔

Also I'm not sure it is guaranteed that a 64 bit overflow will result in a negative value in 32 bit in every case (it just does in this particular bug report). In theory it should still be safe in all cases, but it seems a bit hacky versus just using 64 bit. It might come up as some random-ish positive value above the minimum.

This is of course assuming that the 64 bit glGet is supported on all platforms.

@bruvzg
Copy link
Member

bruvzg commented Oct 7, 2023

But technically GL_MAX_UNIFORM_BLOCK_SIZE should be retrieved with glGetIntegerv.

ANGLE use 64-bit version for it and some other values, specifically (the rest seems to use glGetIntegerv):

GL_TIMESTAMP_EXT
GL_MAX_ELEMENT_INDEX
GL_MAX_UNIFORM_BLOCK_SIZE
GL_MAX_COMBINED_VERTEX_UNIFORM_COMPONENTS
GL_MAX_COMBINED_FRAGMENT_UNIFORM_COMPONENTS
GL_MAX_SERVER_WAIT_TIMEOUT
GL_MAX_SHADER_STORAGE_BLOCK_SIZE

In any case, 64-bit version should work with all values since these functions do type conversion:

Type conversion is performed if data has a different type than the state variable value being requested. If glGetBooleanv is called, a floating-point (or integer) value is converted to GL_FALSE if and only if it is 0.0 (or 0). Otherwise, it is converted to GL_TRUE. If glGetIntegerv is called, boolean values are returned as GL_TRUE or GL_FALSE, and most floating-point values are rounded to the nearest integer value. Floating-point colors and normals, however, are returned with a linear mapping that maps 1.0 to the most positive representable integer value and −1.0 to the most negative representable integer value. If glGetFloatv or glGetDoublev is called, boolean values are returned as GL_TRUE or GL_FALSE, and integer values are converted to floating-point values.

@jasonwinterpixel
Copy link
Contributor Author

It appears as though my GPU is returning an invalid # on web no matter how you slice it.

Its returning 1 + max signed int32. I'm assuming that's totally invalid. I'm not sure if Godot can use that "correctly" at all. I'll post here with what # my GPU returns on non-web.

Is it possibly an ANGLE bug?

@lawnjelly
Copy link
Member

lawnjelly commented Oct 8, 2023

It appears as though my GPU is returning an invalid # on web no matter how you slice it.
Its returning 1 + max signed int32. I'm assuming that's totally invalid. I'm not sure if Godot can use that "correctly" at all. I'll post here with what # my GPU returns on non-web.
Is it possibly an ANGLE bug?

See: #80909 (comment)

It does appear to be stored as 64 bit in angle (line 298):
https://chromium.googlesource.com/angle/angle/+/refs/heads/main/src/libANGLE/Caps.h

It is initialized to the min:

caps.maxUniformBlockSize          = 16384;

I'm not sure when it is adjusted though (without downloading Angle source).

If not a bug, or error code, it is just possible that Angle has a path where it uses system memory for large UBO and has some kind of reference implementation so it can handle large uniform block size (without crashing). Whether it would be wise to use that is another matter, and capping it sounds sensible.

https://bugs.chromium.org/p/chromium/issues/detail?id=660670&q=angle%20GL_MAX_UNIFORM_BLOCK_SIZE&can=1

Buffer11: Keep system memory storage for large UBOs.
The system memory storage is needed to back UBOs larger than the max
constant buffer size in D3D11 Windows 7. Because readback from UBOs
can be tricky, the system memory storage keeps the canonical copy of
the data in this case.
We can also extend this to add a workaround to fix the outstanding
failure cases of UBOs on Intel.

This is a PR adding system memory backing:
https://chromium.googlesource.com/angle/angle/+/d89792f42b28c3d7ed822a4f3319ce2ebe73a11b

What platforms are these errors on? Is it possible Angle is being used on windows, it is converting OpenGL to DirectX, and DirectX is supporting larger blocks than expected using direct OpenGL?

The PR does suggest there are two figures over DirectX, the hardware supported size and system memory fallback, and in GL it is returning the larger of these, because there may be no distinction.

Our code

From the look of the code in 3.x it looks like this is used for max_ubo_lights, max_ubo_reflections, max_skeleton_bones. It may not even need more than 16384 bytes worst case, but I'm not super familiar with the GLES3 scene renderer.

In practice I don't think this currently is likely to matter much in 3.x as it doesn't look like we make use of very large uniform buffers. It looks as though the max figure reported may in fact be correct (and should be interpreted as 64 bit) but would likely run slower if we exceeded the GPU hardware limit (which we likely don't).

While this PR would work as is, it could do with some better explanation in comments. Although in most cases the effect would be the same, I would be tempted to:

  • Grab value in 64bit.
  • Cap at some threshold we are unlikely to use above, and may be supported in GPU hardware in the future (say 1mb?).

In general capping anything that reports a high value to the minimum is likely to unnecessarily cripple systems with system memory fallback. This probably doesn't matter in current 3.x, but is a concern in 4.x.

Overall

Given that this same problem can occur in other caps, it might be a good idea to mirror #80909 and fix all of these at once. Overall #80909 is a more sensible fix imo, although I think adding a sensible maximum cap could be good idea for 3.x.

@jasonwinterpixel
Copy link
Contributor Author

jasonwinterpixel commented Oct 9, 2023

On desktop linux, my GPU is returning 2147483647. In the browser, it returns the 64 bit number 2147483648, which when interpreted as a signed int, is a negative number. I think using 64 bits here is probably fine, but note that what I've written in this PR is probably closer to matching 'the spec' than what ANGLE is doing. Seems to me like this is a bug in ANGLE. We're calling glGetIntegerv and we're basically getting a 64 bit int back. Even with using 64 bits instead of 32, we should keep this MAX in to maintain the minimum defensively against any crazy GPUs.

But technically GL_MAX_UNIFORM_BLOCK_SIZE should be retrieved with glGetIntegerv.

Where is that specified?

@clayjohn
Copy link
Member

clayjohn commented Oct 9, 2023

I looked at the openGL 3.3 spec (page 321) the GL ES 3.0 / WebGL2 specs may differ.

edit: ah, OpenGL ES 3.0 spec (page 278) says to use GetInteger64v. Which is why ANGLE would always use it. So this is definitely an us problem, not an ANGLE problem

@clayjohn
Copy link
Member

Thanks for doing the initial work and spurring us to get this fixed. Closing as superseded by #83031

@clayjohn clayjohn closed this Nov 13, 2023
@clayjohn clayjohn removed this from the 3.6 milestone Nov 13, 2023
@clayjohn clayjohn added archived and removed cherrypick:3.5 Considered for cherry-picking into a future 3.5.x release labels Nov 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants