Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review use of -fomit-frame-pointer and -ftree-vectorize flags for optimized builds #66296

Closed
akien-mga opened this issue Sep 23, 2022 · 2 comments · Fixed by #66297
Closed

Review use of -fomit-frame-pointer and -ftree-vectorize flags for optimized builds #66296

akien-mga opened this issue Sep 23, 2022 · 2 comments · Fixed by #66297

Comments

@akien-mga
Copy link
Member

akien-mga commented Sep 23, 2022

Godot version

4.0.beta (006e345)

System information

All

Issue description

As discussed in #66242, we have inconsistent usage of the -fomit-frame-pointer and -ftree-vectorize compiler flags in the various platform ports currently.

In current master, we apply these flags on target=release export templates on macOS, iOS and Android only, with the following logic:

  • target=release optimize=speed -> both flags
  • target=release optimize=size -> only -ftree-vectorize

macOS, iOS and Android are three platforms which rely on Clang as compiler, but these flags are also supported by GCC and the LLVM-based Emscripten so in theory we should be able to use them for all platforms aside from MSVC (where the /Oy flag, implicit in /O1 and /O2 which we use, should do something similar).

So we just need to assess what they do and whether we want them for release builds made with GCC (Linux, Windows) and Emscripten (Web) too. This may also be relevant for thirdparty platform ports (consoles, Haiku, etc.).


But first some history to know when this flags were introduced or removed:

-fomit-frame-pointer

In the first open source commit 0b806ee, it seems like all platforms used this flag, aside from iOS and JavaScript:

$ rg --sort path -l fomit-frame-pointer
platform/android/detect.py
platform/nacl/detect.py
platform/osx/detect.py
platform/server/detect.py
platform/windows/detect.py
platform/x11/detect.py

During Godot 2.0 development, it was removed from Windows/MinGW and X11 builds by two separate commits:

During Godot 3.0 development the flag was finally added to iOS by @RandomShaper in #8949 in an effort to harmonize the config between Android and iOS.

Finally during Godot 3.1 development the flag was removed for the server platform by @Calinou in #26666 while syncing it with X11.

Considerations

  • So now we have macOS, iOS and Android using the flag, and Windows/MinGW, Linux and Web not using it.
  • We can either make all platforms use it, or remove it from macOS, iOS and Android.
  • Possibly worth a read, Facebook engineers are proposing changing default compilation in Fedora to disable -fomit-frame-pointer: https://fedoraproject.org/wiki/Changes/fno-omit-frame-pointer
  • After doing all this research I finally read the GCC docs and found out that -fomit-frame-pointer is already implied by -O1, -O2, and -O3... so we don't need to specify it for GCC. For Clang, I can't find exhaustive information on what flags they enable for each optimization level, Clang docs are extremely terse.

-ftree-vectorize

When Godot was open sourced, this was used for Android and macOS:

$ rg --sort path -l ftree-vectorize
platform/android/detect.py
platform/osx/detect.py

It was added to iOS in #8949 like -fomit-frame-pointer.

@akien-mga
Copy link
Member Author

After doing all this research I finally read the GCC docs and found out that -fomit-frame-pointer is already implied by -O1, -O2, and -O3... so we don't need to specify it for GCC. For Clang, I can't find exhaustive information on what flags they enable for each optimization level, Clang docs are extremely terse.

Did some tests inspired by the MRP in this (not directly related) bug report: https://bugs.llvm.org/show_bug.cgi?id=9825

$ cat test.c 
int f() {
  return 42;
}

GCC 12.2.1

$ gcc -S test.c -O0 -o - | grep rbp
        pushq   %rbp
        movq    %rsp, %rbp
        popq    %rbp
$ gcc -S test.c -O0 -fomit-frame-pointer -o - | grep rbp
$ gcc -S test.c -O1 -o - | grep rbp
$ gcc -S test.c -O1 -fno-omit-frame-pointer -o - | grep rbp

So -O1 does include omitting the frame pointer... and -fno-omit-frame-pointer doesn't work‽ xD

Clang 15.0.0

$ clang -S test.c -O0 -o - | grep rbp
        pushq   %rbp
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register %rbp
        popq    %rbp
$ clang -S test.c -O0 -fomit-frame-pointer -o - | grep rbp
$ clang -S test.c -O1 -o - | grep rbp
$ clang -S test.c -O1 -fno-omit-frame-pointer -o - | grep rbp
        pushq   %rbp
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register %rbp
        popq    %rbp

Like GCC -O1 omits the frame pointer automatically. And here -fno-omit-frame-pointer does work as it should.

I confirmed that -O2, -O3, and -Os all include -fomit-frame-pointer for GCC and Clang.

@akien-mga
Copy link
Member Author

For -ftree-vectorize, the (current) GCC docs say:

-ftree-vectorize

    Perform vectorization on trees. This flag enables -ftree-loop-vectorize and -ftree-slp-vectorize if not explicitly specified.

-ftree-loop-vectorize

    Perform loop vectorization on trees. This flag is enabled by default at -O2 and by -ftree-vectorize, -fprofile-use, and -fauto-profile.

-ftree-slp-vectorize

    Perform basic block vectorization on trees. This flag is enabled by default at -O2 and by -ftree-vectorize, -fprofile-use, and -fauto-profile.

So it's included by default in -O2, which is the minimal optimization level at which we manually specify it currently for Android, macOS and iOS.

Back in 2007 when GCC devs did work on this -ftree-vectorize seemed to be included only in -O3, so that might be why it made its way to our build scripts: https://gcc.gnu.org/projects/tree-ssa/vectorization.html

According to LLVM docs, both the Loop and SLP vectorizers are enabled by default.

So it seems like we can also remove that one.

akien-mga added a commit to akien-mga/godot that referenced this issue Sep 23, 2022
- `-fomit-frame-pointer` is included automatically by both GCC and
  Clang in `-O1` and above.
- `-ftree-vectorize` is included automatically by GCC in `-O2` and
  beyond, and seems always enabled by Clang.

Closes godotengine#66296. See that issue for a detailed investigation.
timothyqiu pushed a commit to timothyqiu/godot that referenced this issue Dec 4, 2022
- `-fomit-frame-pointer` is included automatically by both GCC and
  Clang in `-O1` and above.
- `-ftree-vectorize` is included automatically by GCC in `-O2` and
  beyond, and seems always enabled by Clang.

Closes godotengine#66296. See that issue for a detailed investigation.

(cherry picked from commit c5c3d13)
Riordan-DC pushed a commit to Riordan-DC/godot that referenced this issue Jan 24, 2023
- `-fomit-frame-pointer` is included automatically by both GCC and
  Clang in `-O1` and above.
- `-ftree-vectorize` is included automatically by GCC in `-O2` and
  beyond, and seems always enabled by Clang.

Closes godotengine#66296. See that issue for a detailed investigation.

(cherry picked from commit c5c3d13)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant