[Web] Use `-Oz` instead of `-Os` when optimizing for size #97407
base: master
LLVM has an `-Oz` optimization flag, which they define as: `Like -Os (and thus -O2), but reduces code size further.` I've tested this on release builds, and the result is ~25% smaller uncompressed and ~10% smaller when compressed.
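The uncompressed and compressed figures can be checked with a short script. This is a minimal sketch, assuming the two builds are available as local files; the helper names are hypothetical, and gzip stands in for whatever compression the web server actually applies:

```python
import gzip
from pathlib import Path


def sizes(path):
    """Return (raw_bytes, gzip_bytes) for the file at `path`."""
    data = Path(path).read_bytes()
    return len(data), len(gzip.compress(data, compresslevel=9))


def reduction_percent(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100.0


def compare(os_build, oz_build):
    """Return (raw, gzip) size reductions in percent, going from -Os to -Oz."""
    os_raw, os_gz = sizes(os_build)
    oz_raw, oz_gz = sizes(oz_build)
    return reduction_percent(os_raw, oz_raw), reduction_percent(os_gz, oz_gz)
```

Calling `compare()` with the paths of the `-Os` and `-Oz` `.wasm` files would give the two percentages side by side.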
That's a pretty significant gain! Looks great.
Any hints of tradeoffs on performance, and build times?
Build times were almost the same between the two (no noticeable difference). I didn't run any performance test yet, so we might want to look into that... maybe using something from the benchmark repository? The emscripten docs say:
https://github.com/godotengine/godot-benchmarks hasn't been tested as a web export, but it should be able to work there. Some of the rendering benchmarks won't make sense to run though, as only the Compatibility rendering method can be used. I would also try to run the 3D Platformer and Truck Town projects in very small windows with Print FPS enabled and V-Sync disabled (use Chromium's
Never mind... I didn't copy the string correctly :/
I've tested both Truck Town and the 3D Platformer. In both cases the FPS difference is very small. On the 3D Platformer there's almost no difference; I get around 240 FPS with both builds. Likewise, in Truck Town there is maybe a 1-2 FPS difference (out of 300), but it might as well be due to random fluctuations (my GPU fan starts spinning like hell when I remove the limit ;-) ).
I ran another test using the script in #70838; these are the results (
Note: they varied a bit. I tried to pick something close to the median value, but we might want to be a bit more rigorous.
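One way to be more rigorous about run-to-run variation is to repeat each measurement several times and report the median together with the spread. A minimal sketch; the function name is hypothetical, and the timings passed in would come from repeated runs of the benchmark script:

```python
import statistics


def summarize(samples):
    """Return the median and the min-to-max spread of repeated timings."""
    ordered = sorted(samples)
    return {
        "median": statistics.median(ordered),
        "spread": ordered[-1] - ordered[0],
        "runs": len(ordered),
    }
```

Comparing the medians of the two builds, and checking that their difference is larger than the spread, helps separate a real regression from noise.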
Thanks for doing some performance tests! For the record, was this with or without LTO? A 15% performance hit is pretty significant :/ but the size gain is also really worth having for the Web specifically. I wonder if we should add a new
I think that's actually a good idea...
I think we need to decide whether we care more about the ~10% size reduction or the ~15% perf hit on CPU-bound operations. We should also probably run the tests above with Godot 4 instead of Godot 3, but I expect more or less the same results...
So, I've run some more tests using the Chromium profiler, with Godot 4, LTO, and debug symbols. Please note that builds with debug symbols may be less optimized than regular builds:
Test script (noise + 1.5 MiB WebP decode + 1.5 MiB WebP encode):

    extends Node2D

    func _ready() -> void:
        var tex := NoiseTexture2D.new()
        tex.width = 10240
        var noise := FastNoiseLite.new()
        noise.noise_type = FastNoiseLite.TYPE_SIMPLEX
        tex.noise = noise
        await tex.changed
        $TextureRect.texture = tex
        await get_tree().process_frame
        var image := Image.load_from_file("res://image.webp")
        await get_tree().process_frame
        var buf := image.save_webp_to_buffer()
        await get_tree().process_frame
        print("done")

Oz:
Os:
Comparison:
Noise: 743.33 vs 741.64 (almost no difference, considering run variability)
WebP decode seems hugely affected, while encoding and noise generation seem to have the same performance.
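To put the noise figures in proportion, the relative difference can be computed directly. A small sketch; the helper is hypothetical, and the two values are the noise timings quoted in the comparison, in whatever unit the profiler reported:

```python
def relative_difference_percent(a, b):
    """Difference between a and b, expressed as a percentage of b."""
    return (a - b) / b * 100.0


# Noise timings quoted in the comparison.
noise_diff = relative_difference_percent(743.33, 741.64)
print(f"{noise_diff:.2f}%")  # prints 0.23%
```

A gap of that size is well within what run variability alone could produce.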
Can you please share what steps you took to get to ~10 MB uncompressed? I disabled 3D, disabled almost all modules, and used optimize = "size", but I get 24 MB.
It could possibly be Godot 3.x
I found out about `-Oz` while evaluating the impact of LTO builds (#96851). While with `-Os` LTO actually produces larger builds, with `-Oz` there is a (small) reduction in size (I'll add all the tests in #96851).
For now:
Oz, no LTO:
Oz LTO:
Os no LTO:
Os LTO: