
Fix Image CowData crash when baking large lightmaps #94243

Merged

Conversation

Calinou
Member

@Calinou Calinou commented Jul 11, 2024

This switches to 64-bit integers in select locations of the Image class, so that image resolutions of 16384×16384 (used by lightmap texture arrays) can be used properly. Values that are larger should also work (the Image class' MAX_PIXELS constant is 2^28 pixels, and 2^24 pixels on each axis).

VRAM compression is also supported, although most VRAM-compressed formats are limited to individual slices of 16384×16384. WebP is limited to 16383×16383 due to format limitations.
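For context, a quick back-of-the-envelope check shows why 32-bit sizes overflow at exactly this resolution (a minimal sketch; the 8-bytes-per-pixel figure assumes an RGBA half-float format, as used for HDR lightmaps):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
	// 16384 x 16384 pixels at 8 bytes per pixel (RGBA, 16-bit float channels):
	int64_t bytes = 16384LL * 16384LL * 8LL; // 2147483648 bytes
	// That is exactly INT32_MAX + 1, so any 32-bit signed size or offset
	// computed for such an image overflows, hence the CowData crash.
	std::printf("%lld > %d\n", (long long)bytes, INT32_MAX);
	return 0;
}
```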

Note that very large lightmaps will cause the scene to load significantly slower. This would be addressed by using VRAM compression, but it's currently very slow on lightmaps this large. #91535 should make compression times more reasonable, but I haven't tested it together with this PR yet. #50574 can also help bring down file sizes and memory usage (and could be coupled with VRAM compression for further gains).

Please test this on your own projects 🙂

cc @jcostello @BlueCube3310 @SpockBauru @jitspoe @TokisanGames @DarioSamo

PS: How do we handle GDExtension compatibility on this one? Is it possible to add a compatibility method in this case?

Error: Validate extension JSON: Error: Field 'classes/Image/methods/get_mipmap_offset/return_value': meta changed value in new API, from "int32" to "int64".

Testing project: global_illumination_updated.zip
Previously, this project would crash upon baking lightmaps. It works now.

Preview

Lightmap editor

@reduz
Member

reduz commented Jul 12, 2024

@RandomShaper Since you are here, I think these have to be fixed too, as some of them return sizes or offsets as uint32_t (my fault).

Like this:

```cpp
static uint32_t get_image_format_required_size(DataFormat p_format, uint32_t p_width, uint32_t p_height, uint32_t p_depth, uint32_t p_mipmaps, uint32_t *r_blockw = nullptr, uint32_t *r_blockh = nullptr, uint32_t *r_depth = nullptr);
```
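A sketch of how the widened signature might look (name and parameters kept from the above; using uint64_t for the returned total size is my assumption of the intended fix, since the per-axis dimensions still fit comfortably in 32 bits):

```cpp
// The total required size can exceed 32 bits for very large images,
// so it is returned as a 64-bit value; individual dimensions stay 32-bit.
static uint64_t get_image_format_required_size(DataFormat p_format, uint32_t p_width, uint32_t p_height, uint32_t p_depth, uint32_t p_mipmaps, uint32_t *r_blockw = nullptr, uint32_t *r_blockh = nullptr, uint32_t *r_depth = nullptr);
```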

Member

@akien-mga akien-mga left a comment

Looks great!

There are some tests that need adjustments too (see diff view) and a compat method needs to be registered for get_mipmap_offset.

```diff
 	ERR_FAIL_INDEX_V(p_mipmap, get_mipmap_count() + 1, -1);

-	int ofs, w, h;
+	int64_t ofs;
+	int w, h;
 	_get_mipmap_offset_and_size(p_mipmap, ofs, w, h);
 	return ofs;
 }

 int Image::get_mipmap_byte_size(int p_mipmap) const {
```
Member

Shouldn't this one return int64_t too? Since it's a difference between two offsets.
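That is, something along these lines (a sketch of the suggested change, not the final code; the subtraction of neighboring offsets follows from the comment above):

```cpp
int64_t Image::get_mipmap_byte_size(int p_mipmap) const {
	// The difference of two 64-bit offsets can itself exceed 32 bits,
	// so the byte size needs to be 64-bit as well.
	return get_mipmap_offset(p_mipmap + 1) - get_mipmap_offset(p_mipmap);
}
```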

@passivestar
Contributor

@Calinou the provided test project has a glb with the default texel size of 0.1, and the LightmapGI texel scale is also at the default value of 1. Not knowing what large value to change it to for testing, I set it to 30, and it immediately crashed the editor when I clicked bake :D

```
-[MTLTextureDescriptorInternal validateWithDevice:]:1357: failed assertion Texture Descriptor Validation
MTLTextureDescriptor has width (210870) greater than the maximum allowed size of 16384.
MTLTextureDescriptor has height (278700) greater than the maximum allowed size of 16384.
```

@Saul2022

Not knowing what large value to change it to for testing, I set it to 30, and it immediately crashed the editor when I clicked bake :D

It might be the file size, and the fact that it's still really slow for the M1 Max to handle.

MTLTextureDescriptor has height (278700) greater than the maximum allowed size of 16384.

Why is it so big lol...

@jcostello
Contributor

I'm getting this error baking Sponza with the texel density set to 3:

```
Too many pixels for Image. Maximum is 16777216x16777216 = 268435456pixels.
core/io/image.cpp:2798 - Condition "dsize == 0" is true.
```

@SpockBauru

SpockBauru commented Jul 12, 2024

Built @Calinou's repo locally and made a test with my LightmapProbeGrid plugin demo scene. I enabled lightmaps only for the terrain (disabled for the tunnel, trees, etc.).

Here's the test project; you may have to restart on the first launch because of the probes plugin:

LightmapProbeGrid.zip

There are 2 modified scenes in this test: the Demo Scene (main scene) and the Terrain scene (GLTF inherited scene).

On the Terrain scene, unwrapping the terrain's UV2 in Godot generates a Lightmap Size Hint of 12034x12238 (the terrain is not perfectly square), which is below the 16384 max size from this PR.

Back in the demo scene, if I bake the LightmapGI, it results in a wall of errors with a pitch-black terrain, and the lightmap .exr file is also fully black. The same happens when lowering the resolution all the way down to 8192 (2^13).

By reducing the Lightmap Size Hint to 8053x8190, it freezes the editor on "Determining optimal atlas size".

With a resolution of 8023x8139, it bakes normally without errors. The resulting EXR file has a resolution of 8192x8192.

In short: any attempt at creating a lightmap file with a resolution above 8192 didn't work on this demo :'(

Edit:
System info: Godot v4.3.beta unknown - Windows 10.0.22631 - Vulkan (Forward+) - dedicated NVIDIA GeForce RTX 3080 (NVIDIA; 32.0.15.5612) - 12th Gen Intel(R) Core(TM) i7-12700KF (20 Threads), 32GB RAM

@BlueCube3310
Contributor

By reducing the Lightmap Size Hint to 8053x8190, it freezes the editor on "Determining optimal atlas size".

Could you try rebasing this on top of #94237?

@SpockBauru

SpockBauru commented Jul 13, 2024

By reducing the Lightmap Size Hint to 8053x8190, it freezes the editor on "Determining optimal atlas size".

Could you try rebasing this on top of #94237?

Sorry, I can't... I don't have enough knowledge to rebase one PR onto another :'(

Edit: Testing the Windows artifact of #94237, the editor crashes when setting the Lightmap Size Hint to 8053x8190.

@BlueCube3310
Contributor

After adding the patches from that PR, 16k lightmaps bake correctly. The first case seems to happen when the generated atlas is rounded up to 32k, though I'm not certain about that yet.

@Calinou
Member Author

Calinou commented Jul 13, 2024

Very few GPUs support 32768x32768 textures and their memory usage is obscene, so the lightmapper should clamp the size down in that case (and print a warning advising the user to reduce their texel scale property).
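A minimal sketch of what that clamp could look like in the lightmapper (the names and constant are hypothetical, chosen for illustration; WARN_PRINT and MIN are Godot's macros):

```cpp
// Hypothetical clamp before allocating the lightmap atlas (illustrative only).
const int MAX_LIGHTMAP_ATLAS_SIZE = 16384;

if (atlas_width > MAX_LIGHTMAP_ATLAS_SIZE || atlas_height > MAX_LIGHTMAP_ATLAS_SIZE) {
	WARN_PRINT("Lightmap atlas exceeds the maximum supported texture size and will be clamped. Reduce the texel scale property to avoid losing detail.");
	atlas_width = MIN(atlas_width, MAX_LIGHTMAP_ATLAS_SIZE);
	atlas_height = MIN(atlas_height, MAX_LIGHTMAP_ATLAS_SIZE);
}
```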

Textures should already not exceed 4096x4096 in the first place if you want them to show up reliably on mobile/web platforms (they could be shrunk on export).

@lyuma
Contributor

lyuma commented Jul 13, 2024

the lightmapper should clamp the size down in that case (and print a warning advising the user to reduce their texel scale property).

The entire underlying issue here is a UX problem and lack of communication with the user.

I doubt I have ever wanted Godot to create even a 16k lightmap texture. It's still better not to crash, but we certainly shouldn't make it possible to go higher.

Textures should already not exceed 4096x4096 in the first place if you want them to show up reliably on mobile/web platforms

Indeed, this is the real problem. I think Godot needs better UX here to show the user what size texture would be generated, explain the consequences with mobile and web support, diagnose which meshes are responsible for the bloated size, and give the user various options to keep the lightmap within the size budget the user asks for.

It might be nice to have a max lightmap size option and perhaps options like "downscale" or "error" if it goes over that size budget. (The problem is usually that one mesh is taking way more space in the lightmap, and it might be a relatively unimportant object or have very little detail, so if I knew which one it is, I could adjust its settings.)

@jcostello
Contributor

@lyuma sorry for my ignorance, but is there a limit to how big a scene can be (because of the texture size), or is the problem the texel density?

@Saul2022

Saul2022 commented Jul 14, 2024

the lightmapper should clamp the size down in that case (and print a warning advising the user to reduce their texel scale property).

Agreed; even UE5, which is pretty high-end, might only support 8k textures at maximum here.

Screenshot_20240714_053655_Chrome

@lyuma
Contributor

lyuma commented Jul 14, 2024

We tend to discourage posting AI responses because of their tendency to be verbose or inaccurate: the AI-generated response is wrong. UE4's resolution limit of 8192 is likely no longer present, at least in UE 5.1 (and UE 5.x in particular may be able to make use of virtual texturing in some cases to go beyond conventional resolution limits).

That said, UE5 docs contain this warning:

"Some GPUs have hardware limits in the maximum size texture they can support. For example, some GPUs may not support texture sizes larger than 8192 pixels (8k)."

My guess is most desktop GPUs support 16k (supposedly, GPUs without support for 16k will be DX10-only). Still, relying on 16k textures might make your content inaccessible to mobile users or old/low-end desktop hardware.

While an uncompressed 16k HDR lightmap texture will take 2 gigabytes of contiguous memory, which would be impractical on most GPUs, BPTC should be able to cut that down to 8 bpp (only 256 MB of memory), which should be acceptable.
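For reference, the arithmetic behind those numbers (a sketch assuming an RGBA half-float source format and BPTC/BC6H at 8 bits per pixel):

```cpp
#include <cstdint>

// 16384x16384 HDR lightmap, RGBA with 16-bit float channels = 8 bytes/pixel.
constexpr int64_t pixels = 16384LL * 16384LL;      // 268435456 pixels
constexpr int64_t uncompressed_bytes = pixels * 8; // 2147483648 bytes = 2 GiB
constexpr int64_t bptc_bytes = pixels * 1;         // 8 bpp = 1 byte/pixel = 256 MiB
```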

@SpockBauru

SpockBauru commented Jul 14, 2024

I believe the main issue is not the lightmap size, but the lack of automatic lightmap tiling, where lightmaps are split into chunks.

In Unity, you can set the max size of each tile, and several tiles are automatically generated as needed. The engine bakes them one by one, so GPUs with a low amount of memory can bake big scenes without issues. Tiles don't have a fixed size; they vary by region, each tile with its own file.

For Godot, we can take a step further and, after baking, make the game load/unload each tile as needed, but how to do this must be a topic of careful discussion since it's far from trivial.

Since Godot aims to include low-end platforms and also to simplify development as much as possible, I believe that a fully automatic tiling system without the need for user interaction would be the best approach for developing games with relatively big maps: not only full open worlds, but also things like a racing game with 10 km tracks, or something like the Gran Pulse level in Final Fantasy XIII.

@Calinou
Member Author

Calinou commented Jul 14, 2024

For Godot, we can take a step further and, after baking, make the game load/unload each tile as needed, but how to do this must be a topic of careful discussion since it's far from trivial.

Mesh splitting is needed to get decent culling (and already has its own proposal), but it's difficult to do right in a way that preserves materials and both UV layers.

@SpockBauru

Mesh splitting is needed to get decent culling (and already has its own proposal), but it's difficult to do right in a way that preserves materials and both UV layers.

As far as I remember, Unity does not perform mesh splitting for lightmaps; it only groups whole meshes.

Instead, the user defines the max size of the lightmap tile, and if a mesh exceeds it, that mesh's lightmap resolution is automatically shrunk to fit the max size. After baking, a warning is printed in the console.

Example:

If a 1 km² terrain is set to 4 samples per meter, it won't fit in a 1024x1024 lightmap tile. In this case, the engine automatically shrinks the terrain's lightmap resolution to fit the max tile size.

After that, the bake proceeds normally, and a console warning tells the user which objects had their lightmap resolution reduced.
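A minimal sketch of that per-mesh shrink-to-fit behavior (a hypothetical function for illustration, not Unity's or Godot's actual code):

```cpp
#include <algorithm>
#include <cstdio>

// Shrink a mesh's requested lightmap resolution to fit within the maximum
// tile size, preserving aspect ratio, and warn if it had to be reduced.
void fit_lightmap_to_tile(const char *p_mesh_name, int &r_width, int &r_height, int p_max_tile_size) {
	const float scale = std::min({ 1.0f,
			float(p_max_tile_size) / float(r_width),
			float(p_max_tile_size) / float(r_height) });
	if (scale < 1.0f) {
		r_width = std::max(1, int(r_width * scale));
		r_height = std::max(1, int(r_height * scale));
		std::printf("WARNING: Lightmap resolution of '%s' was reduced to %dx%d to fit the max tile size of %d.\n",
				p_mesh_name, r_width, r_height, p_max_tile_size);
	}
}
```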

@Saul2022

Alternatively, we could do what Unreal does and make the lightmap a texture atlas (I don't know if that's done already, but it's an idea for file sizes). As for resolution, I agree with lyuma about the limit probably being 16k, since with this and BlueCube3310's performance work the situation does seem to improve, though I would say a resolution limit option should be brought in.

@BlueCube3310
Contributor

The other compression modes (ASTC, CVTT, etcpak) should probably be adjusted for this as well, even though compressed images technically shouldn't require 64-bit offsets/sizes.

@akien-mga akien-mga force-pushed the image-fix-large-cowdata-size-crash branch from 6e33585 to 9982c84 Compare July 19, 2024 09:31
@akien-mga akien-mga requested a review from a team as a code owner July 19, 2024 09:31
@akien-mga
Member

akien-mga commented Jul 19, 2024

Rebased and updated the PR to fix the image unit tests.

I confirm this fixes the MRP in the OP.

I also removed `int Image::get_mipmap_byte_size(int p_mipmap) const`, which wasn't used anywhere (its only use was in 4-year-old `#if 0`'d code, which I removed too).

And added a compat binding for `get_mipmap_offset`, which is exposed to scripting.

The other compression modes (ASTC, CVTT, etcpak) should probably be adjusted for this as well, even though compressed images technically shouldn't require 64-bit offsets/sizes.

I guess we can merge this PR first and either @BlueCube3310 or @Calinou can do this as a follow-up?

@akien-mga akien-mga force-pushed the image-fix-large-cowdata-size-crash branch from 9982c84 to 7a5d462 Compare July 19, 2024 11:22
@akien-mga akien-mga requested review from a team as code owners July 19, 2024 11:22
Comment on lines 33 to 39
```cpp
int32_t Image::_get_mipmap_offset_bind_compat_94243(int p_mipmap) const {
	return get_mipmap_offset(p_mipmap);
}

void Image::_bind_compatibility_methods() {
	ClassDB::bind_compatibility_method(D_METHOD("get_mipmap_offset", "mipmap"), &Image::_get_mipmap_offset_bind_compat_94243);
}
```
Member

@raulsntos @dsnopek Not sure how to deal with this compat method, it seems GDExtension and C# have conflicting needs.

  • GDExtension (at least our CI checks) wants a compat method, as the return type changed from int32_t to int64_t.
  • C# doesn't like it and complains:
```
/home/runner/work/godot/godot/modules/mono/glue/GodotSharp/GodotSharp/Generated/GodotObjects/Image.cs(1207,17): error CS0111: Type 'Image' already defines a member called 'GetMipmapOffset' with the same parameter types [/home/runner/work/godot/godot/modules/mono/glue/GodotSharp/GodotSharp/GodotSharp.csproj]
```

Contributor

Hm. Having a compatibility method for the change in return type is the right thing to do for GDExtension. Perhaps the C# code generation hasn't yet had to take into account a compatibility method where the only change is the return type?

Member

The glue generation seems to have logic to skip methods with conflicting signatures:

```cpp
// Add compat methods that don't conflict with other methods in the type.
for (const MethodInterface &imethod : compat_methods) {
	if (_method_has_conflicting_signature(imethod, itype)) {
		WARN_PRINT("Method '" + imethod.name + "' conflicts with an already existing method in type '" + itype.name + "' and has been ignored.");
		continue;
	}
	itype.methods.push_back(imethod);
}
```

But the implementation seems to ignore the return type, and I don't really understand the comment justifying it.

```cpp
bool BindingsGenerator::_method_has_conflicting_signature(const MethodInterface &p_imethod_left, const MethodInterface &p_imethod_right) {
	// Check if a method already exists in p_itype with a method signature that would conflict with p_imethod.
	// The return type is ignored because only changing the return type is not enough to avoid conflicts.
	// The const keyword is also ignored since it doesn't generate different C# code.
```

Member

C# can't have overloads where the only thing that changes is the return type, so these kinds of changes break compat for C#.

The bindings generator avoids generating these compat methods in order to avoid generating invalid C# that won't compile. That's what the comment is trying to say: we don't compare the return type because we don't care if the return types differ; the method is still considered conflicting, so the check should still return true in this case.
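For illustration, return-type-only overloads are rejected in C++ just as in C#; a hypothetical minimal example (not the generated glue code):

```cpp
#include <cstdint>

struct Image {
	int32_t get_mipmap_offset(int p_mipmap) const { return 0; }
	// Uncommenting the overload below fails to compile in both C++ and C#:
	// "functions that differ only in their return type cannot be overloaded".
	// int64_t get_mipmap_offset(int p_mipmap) const { return 0; }
};

int main() { return Image().get_mipmap_offset(0); }
```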

I'm pretty sure we already have cases of compat methods where the only thing that changes is the return type, so this should already be handled. Not sure why it would generate invalid C# in this case; it may have something to do with the type: int32_t and int64_t are really both INT with different metadata, so maybe ClassDB gets confused and isn't properly registering the compat method.

The bindings generator only checks for conflicts with compat methods, so if it's not registered as a compat method, it won't check. We assume there are no conflicts among the non-compat methods because ClassDB doesn't support method overloading.

Member

The bindings generator only checks for conflicts with compat methods, so if it's not registered as a compat method, it won't check. We assume there are no conflicts among the non-compat methods because ClassDB doesn't support method overloading.

Can't there be conflicts between a compat method and a non-compat method though, like we have here apparently?

Contributor

The hash and the hash_compatibility are the same (923996154). @dsnopek Are you sure registering a compat method in this case is correct?

Ah, no, actually I think you're right. GDExtension always uses int64_t in its encoding anyway, so the metadata change only affects the data type that bindings generate. So, if the hash doesn't change, then we don't need the compatibility method. Sorry for the incorrect info earlier!

Member

So it's our CI script that needs fixing I guess?

Or should I just include the line in the .expected file and write that it doesn't require a compat binding?

Contributor

What is the error line? Is it actually complaining about a missing hash?

This close to 4.3, I'd say adding the line to .expected is probably fine for now - we can work on CI more later.

Member

@akien-mga akien-mga Jul 19, 2024

Hm, no, it didn't actually complain about the hash, only:

Validate extension JSON: Error: Field 'classes/Image/methods/get_mipmap_offset/return_value': meta changed value in new API, from "int32" to "int64".

I'll silence it.

After 4.3 we could review the CI script; maybe it shouldn't report changes to meta at all? Or not all of them?

Contributor

@dsnopek dsnopek Jul 19, 2024

Only the errors about missing hashes really affect GDExtension binary compatibility and show the need for a compatibility method.

The rest are good to consider for source compatibility: for example, we don't want to completely change an API signature; it's better to add new arguments at the end with defaults. And these errors help call out API changes that we might otherwise miss.

But we could certainly make it clearer in CI that it's the hash-change errors that require compatibility methods, and that those should only very, very rarely be ignored.

@akien-mga akien-mga force-pushed the image-fix-large-cowdata-size-crash branch from 7a5d462 to 9508a52 Compare July 19, 2024 12:31
This switches to 64-bit integers in select locations of the Image
class, so that image resolutions of 16384×16384 (used by
lightmap texture arrays) can be used properly. Values that are larger
should also work.

VRAM compression is also supported, although most VRAM-compressed
formats are limited to individual slices of 16384×16384. WebP
is limited to 16383×16383 due to format limitations.
@akien-mga akien-mga force-pushed the image-fix-large-cowdata-size-crash branch from 9508a52 to 0445ccf Compare July 19, 2024 14:04
Member

@akien-mga akien-mga left a comment

Changes look good to me, and I confirmed it fixes the linked issue.

@akien-mga akien-mga merged commit a0943ac into godotengine:master Jul 19, 2024
18 checks passed
@akien-mga
Member

Thanks!

@SpockBauru

Downloaded the Windows artifact and simplified the terrain scene to include only the needed parts. Now we can simply set the lightmap_size_hint on the Terrain Mesh node to get the desired lightmap file size after baking.

Terrain test scene:

TerrainLightmapTest.zip

I agree that this PR can be merged as it is, and a follow-up will be needed.

The artifact can correctly generate lightmap files up to 8192, which is better than what we had, but between this value and the proposed 16384 it still generates a black lightmap file with a wall of errors.

May I be tagged on the follow-up PR to test this issue? ^-^
