No apparent way to build C/C++ with -O0 #11921

Closed
kmicklas opened this issue Jun 24, 2022 · 2 comments · Fixed by #11949
Labels

  • bug: Observed behavior contradicts documented or intended behavior
  • contributor friendly: This issue is limited in scope and/or knowledge of Zig internals.
  • zig cc: Zig as a drop-in C compiler feature
@kmicklas
Contributor

Zig Version

0.10.0-dev.2674+d980c6a38

Steps to Reproduce

> touch test.c
> ~/zig-linux-x86_64-0.10.0-dev.2674+d980c6a38/zig cc -v -O0 -c test.c
clang version 13.0.1 (git@github.com:ziglang/zig-bootstrap.git 81f0e6c5b902ead84753490db4f0007d08df964a)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/9
Candidate multilib: .;@m64
Selected multilib: .;@m64
 (in-process)
 "/home/kmicklas/zig-linux-x86_64-0.10.0-dev.2674+d980c6a38/zig" -cc1 -triple x86_64-unknown-linux-gnu -emit-obj --mrelax-relocations -disable-free -disable-llvm-verifier -discard-value-names -main-file-name test.c -mrelocation-model pic -pic-level 2 -fhalf-no-semantic-interposition -mframe-pointer=all -fmath-errno -fno-rounding-math -mconstructor-aliases -target-cpu x86-64 -tune-cpu generic -debug-info-kind=constructor -dwarf-version=4 -debugger-tuning=gdb -v -fcoverage-compilation-dir=/home/kmicklas/zig-bug -nostdsysteminc -nobuiltininc -resource-dir /home/kmicklas/lib/clang/13.0.1 -dependency-file /home/kmicklas/.cache/zig/tmp/23183fef7b85b6ba-test.o.d -MT /home/kmicklas/.cache/zig/tmp/23183fef7b85b6ba-test.o -sys-header-deps -MV -isystem /home/kmicklas/zig-linux-x86_64-0.10.0-dev.2674+d980c6a38/lib/include -isystem /home/kmicklas/zig-linux-x86_64-0.10.0-dev.2674+d980c6a38/lib/libc/include/x86_64-linux-gnu -isystem /home/kmicklas/zig-linux-x86_64-0.10.0-dev.2674+d980c6a38/lib/libc/include/generic-glibc -isystem /home/kmicklas/zig-linux-x86_64-0.10.0-dev.2674+d980c6a38/lib/libc/include/x86-linux-any -isystem /home/kmicklas/zig-linux-x86_64-0.10.0-dev.2674+d980c6a38/lib/libc/include/any-linux-any -isystem /usr/local/include -isystem /usr/include/x86_64-linux-gnu -isystem /usr/include -D __GLIBC_MINOR__=31 -D _DEBUG -Og -fdebug-compilation-dir=/home/kmicklas/zig-bug -ferror-limit 19 -fsanitize=alignment,array-bounds,bool,builtin,enum,float-cast-overflow,function,integer-divide-by-zero,nonnull-attribute,null,object-size,pointer-overflow,return,returns-nonnull-attribute,shift-base,shift-exponent,signed-integer-overflow,unreachable,vla-bound -fsanitize-trap=alignment,array-bounds,bool,builtin,enum,float-cast-overflow,function,integer-divide-by-zero,nonnull-attribute,null,object-size,pointer-overflow,return,returns-nonnull-attribute,shift-base,shift-exponent,signed-integer-overflow,unreachable,vla-bound -stack-protector 2 -stack-protector-buffer-size 4 
-fgnuc-version=4.2.1 -fcolor-diagnostics -fno-spell-checking -target-cpu x86-64 -target-feature -16bit-mode -target-feature -32bit-mode -target-feature -3dnow -target-feature -3dnowa -target-feature +64bit -target-feature +adx -target-feature +aes -target-feature -amx-bf16 -target-feature -amx-int8 -target-feature -amx-tile -target-feature +avx -target-feature +avx2 -target-feature -avx512bf16 -target-feature +avx512bitalg -target-feature +avx512bw -target-feature +avx512cd -target-feature +avx512dq -target-feature -avx512er -target-feature +avx512f -target-feature +avx512ifma -target-feature -avx512pf -target-feature +avx512vbmi -target-feature +avx512vbmi2 -target-feature +avx512vl -target-feature +avx512vnni -target-feature +avx512vp2intersect -target-feature +avx512vpopcntdq -target-feature -avxvnni -target-feature +bmi -target-feature +bmi2 -target-feature -branchfusion -target-feature -cldemote -target-feature +clflushopt -target-feature +clwb -target-feature -clzero -target-feature +cmov -target-feature +cx16 -target-feature +cx8 -target-feature -enqcmd -target-feature -ermsb -target-feature +f16c -target-feature -false-deps-lzcnt-tzcnt -target-feature -false-deps-popcnt -target-feature -fast-11bytenop -target-feature -fast-15bytenop -target-feature -fast-7bytenop -target-feature -fast-bextr -target-feature -fast-gather -target-feature -fast-hops -target-feature -fast-lzcnt -target-feature -fast-movbe -target-feature -fast-scalar-fsqrt -target-feature -fast-scalar-shift-masks -target-feature -fast-shld-rotate -target-feature -fast-variable-crosslane-shuffle -target-feature -fast-variable-perlane-shuffle -target-feature -fast-vector-fsqrt -target-feature -fast-vector-shift-masks -target-feature +fma -target-feature -fma4 -target-feature +fsgsbase -target-feature -fsrm -target-feature +fxsr -target-feature +gfni -target-feature -hreset -target-feature -idivl-to-divb -target-feature +idivq-to-divl -target-feature +invpcid -target-feature -kl -target-feature 
-lea-sp -target-feature -lea-uses-ag -target-feature -lvi-cfi -target-feature -lvi-load-hardening -target-feature -lwp -target-feature +lzcnt -target-feature +macrofusion -target-feature +mmx -target-feature +movbe -target-feature +movdir64b -target-feature +movdiri -target-feature -mwaitx -target-feature +nopl -target-feature -pad-short-functions -target-feature +pclmul -target-feature -pconfig -target-feature +pku -target-feature +popcnt -target-feature -prefer-128-bit -target-feature -prefer-256-bit -target-feature -prefer-mask-registers -target-feature -prefetchwt1 -target-feature +prfchw -target-feature -ptwrite -target-feature +rdpid -target-feature +rdrnd -target-feature +rdseed -target-feature -retpoline -target-feature -retpoline-external-thunk -target-feature -retpoline-indirect-branches -target-feature -retpoline-indirect-calls -target-feature -rtm -target-feature +sahf -target-feature -serialize -target-feature -seses -target-feature -sgx -target-feature +sha -target-feature +shstk -target-feature +slow-3ops-lea -target-feature +slow-incdec -target-feature -slow-lea -target-feature -slow-pmaddwd -target-feature -slow-pmulld -target-feature -slow-shld -target-feature -slow-two-mem-ops -target-feature -slow-unaligned-mem-16 -target-feature -slow-unaligned-mem-32 -target-feature -soft-float -target-feature +sse -target-feature +sse2 -target-feature +sse3 -target-feature +sse4.1 -target-feature +sse4.2 -target-feature -sse4a -target-feature -sse-unaligned-mem -target-feature +ssse3 -target-feature -tbm -target-feature -tsxldtrk -target-feature -uintr -target-feature -use-aa -target-feature -use-glm-div-sqrt-costs -target-feature +vaes -target-feature +vpclmulqdq -target-feature +vzeroupper -target-feature -waitpkg -target-feature -wbnoinvd -target-feature -widekl -target-feature +x87 -target-feature -xop -target-feature +xsave -target-feature +xsavec -target-feature +xsaveopt -target-feature +xsaves -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o 
/home/kmicklas/.cache/zig/tmp/23183fef7b85b6ba-test.o -x c test.c
clang -cc1 version 13.0.1 based upon LLVM 13.0.1 default target x86_64-linux-musl
#include "..." search starts here:
#include <...> search starts here:
 /home/kmicklas/zig-linux-x86_64-0.10.0-dev.2674+d980c6a38/lib/include
 /home/kmicklas/zig-linux-x86_64-0.10.0-dev.2674+d980c6a38/lib/libc/include/x86_64-linux-gnu
 /home/kmicklas/zig-linux-x86_64-0.10.0-dev.2674+d980c6a38/lib/libc/include/generic-glibc
 /home/kmicklas/zig-linux-x86_64-0.10.0-dev.2674+d980c6a38/lib/libc/include/x86-linux-any
 /home/kmicklas/zig-linux-x86_64-0.10.0-dev.2674+d980c6a38/lib/libc/include/any-linux-any
 /usr/local/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
LLD Link... ld.lld -r -error-limit=0 -O0 --eh-frame-hdr -znow -m elf_x86_64 -static -o test.o -L /usr/local/lib64 -L /usr/local/lib -L /usr/lib/x86_64-linux-gnu -L /lib64 -L /lib -L /usr/lib64 -L /usr/lib -L /lib/x86_64-linux-gnu /home/kmicklas/.cache/zig/o/8ddd6d697f5107df5aa0206cb6757279/test.o

Expected Behavior

Clang would be invoked with -O0.

Actual Behavior

Zig interprets this as "debug" compilation mode, and always passes -Og.
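
One way to confirm which optimization level actually reaches the compiler is to filter the -v output for optimization flags on the -cc1 line. The helper below is a minimal sketch: cc1_opt_flags is a hypothetical name, and the flag pattern is an assumption covering the common Clang spellings. It looks only at lines containing -cc1 because the ld.lld line in the log above carries its own, unrelated -O0.

```shell
# Hypothetical helper: read verbose compiler output on stdin and print every
# optimization-level flag found on lines that contain -cc1, one per line.
cc1_opt_flags() {
    grep -e '-cc1' \
        | tr -s '[:space:]' '\n' \
        | grep -E -x -e '-O([0-3]|g|s|z|fast)?' \
        || true   # no match is not an error here, just empty output
}

# Usage (assumes zig is on PATH and test.c exists):
#   zig cc -v -O0 -c test.c 2>&1 | cc1_opt_flags
```

With the log from this report, the only flag on the -cc1 line is -Og, even though -O0 was requested.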

GCC introduced -Og as a middle ground between -O0 (which produces pathologically slow code) and -O1 (which enables some optimizations that impair debuggability). Clang later adopted the flag, but currently it appears to be just an alias for -O1.

We care about this because -O1/-Og are significantly slower to compile than -O0. For CI or a local development feedback cycle, we want to be able to compile as fast as possible.

The potential options I can think of are:

  • Just don't pass -Og in debug mode, since it doesn't currently do anything different than -O1. If Clang later decides to disable some optimizations in -Og, we can reevaluate then.
  • Still default to -Og in debug mode, but let -O0 override it.
  • Create a new "fast build" mode which defaults to -O0. Getting fast compilation feedback and being able to debug effectively are conceptually distinct modes.
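
The second option amounts to a "last -O flag wins" rule layered on top of the Debug default. Below is a minimal sketch of that rule, written as a shell function purely for illustration. effective_opt_level is a hypothetical name, and this is not how zig cc parses its arguments.

```shell
# Hypothetical sketch of the second option: start from the Debug default
# (-Og, as reported in this issue) and let the last -O flag on the command
# line replace it.
effective_opt_level() {
    level="-Og"   # assumed Debug default
    for arg in "$@"; do
        case "$arg" in
            -O|-O[0-3]|-Og|-Os|-Oz|-Ofast) level="$arg" ;;
        esac
    done
    printf '%s\n' "$level"
}

effective_opt_level -v -O0 -c test.c   # user asked for -O0, so -O0 is kept
effective_opt_level -v -c test.c       # no -O flag, so the default applies
```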
@kmicklas added the bug label Jun 24, 2022
@motiejus
Contributor

motiejus commented Jun 27, 2022

I did some comparisons between:

  • clang-13:
    • -O0
    • -O0 -g -fsanitize=undefined
    • -Og -g
    • -Og -g -fsanitize=undefined
  • upstream zig cc (for which this bug is filed).
  • zig cc with a new mode FastBuild, which disables optimizations and safety checks (master...motiejus:mode-fastbuild). It is called zig-cc0 in the benchmark below. We can discuss other avenues for passing -O0 too, but this was the simplest one I could implement without breaking other things.

Prep

$ git clone https://github.com/sqlite/sqlite.git; cd sqlite
$ git checkout 54f1fc4f940b0b2cbc38d5decf8b17c1d9959f6d  # version-3.39.0
$ ./configure --disable-tcl
<...>
checking whether to support JSON functions... yes
checking whether to support MEMSYS5... no
checking whether to support MEMSYS3... no
checking whether to support FTS3... no
checking whether to support FTS4... no
checking whether to support FTS5... no
checking whether to support LIMIT on UPDATE and DELETE statements... no
checking whether to support GEOPOLY... no
checking whether to support RTREE... no
checking whether to support SESSION... no
<...>
$ make sqlite3.c
$ ls -lh sqlite3.c  shell.c
-rw-r--r-- 1 motiejus engineering 717K Jun 27 09:01 shell.c
-rw-r--r-- 1 motiejus engineering 8.2M Jun 27 09:01 sqlite3.c 
$

Benchmark

Hardware:

OS/kernel: Ubuntu 22.04 x86_64 5.15.0-35
CPU: Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz

Run:

zcc0="/code/zig/build/zig cc -target x86_64-linux-gnu.2.28"; zcc="zig cc -target x86_64-linux-gnu.2.29"; \
    time taskset --cpu-list 0 \
    hyperfine -r 3 -w 0 --export-markdown ~/all.md \
    -p : \
    -n "clang-13 -O0                         (optimizations off, debug off, safety off)" "clang-13 -O0 shell.c sqlite3.c -lpthread -ldl -lm -o sqlite3" \
    -p : \
    -n "clang-13 -O0 -g                      (optimizations off, debug on,  safety off)" "clang-13 -O0 -g shell.c sqlite3.c -lpthread -ldl -lm -o sqlite3" \
    -p "rm -fr $HOME/.cache/zig; echo | $zcc0 /dev/stdin 2>/dev/null || :" \
    -n "zig-cc0  -O0 -g                      (optimizations off, debug on,  safety off)" "$zcc0  -O0 -g sqlite3.c shell.c -o sqlite3" \
    -p : \
    -n "clang-13 -O0 -g -fsanitize=undefined (optimizations off, debug on,  safety  on)" "clang-13 -O0 -g -fsanitize=undefined shell.c sqlite3.c -lpthread -ldl -lm -o sqlite3" \
    -p : \
    -n "clang-13 -O1                         (optimizations on,  debug off, safety off)" "clang-13 -O1 shell.c sqlite3.c -lpthread -ldl -lm -o sqlite3" \
    -p : \
    -n "clang-13 -Og -g                      (optimizations on,  debug on,  safety off)" "clang-13 -Og -g shell.c sqlite3.c -lpthread -ldl -lm -o sqlite3" \
    -p "rm -fr $HOME/.cache/zig; echo | $zcc /dev/stdin 2>/dev/null || :" \
    -n "zig-cc   -Og -g                      (optimizations on,  debug on,  safety  on)" "$zcc -Og -g sqlite3.c shell.c -o sqlite3" \
    -p : \
    -n "clang-13 -Og -g -fsanitize=undefined (optimizations on,  debug on,  safety  on)" "clang-13 -Og -g -fsanitize=undefined shell.c sqlite3.c -lpthread -ldl -lm -o sqlite3"





Benchmark 1: clang-13 -O0                         (optimizations off, debug off, safety off)
  Time (mean ± σ):      1.746 s ±  0.082 s    [User: 1.642 s, System: 0.075 s]
  Range (min … max):    1.697 s …  1.840 s    3 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: clang-13 -O0 -g                      (optimizations off, debug on,  safety off)
  Time (mean ± σ):      3.087 s ±  0.343 s    [User: 2.883 s, System: 0.125 s]
  Range (min … max):    2.845 s …  3.479 s    3 runs
 
Benchmark 3: zig-cc0  -O0 -g                      (optimizations off, debug on,  safety off)
  Time (mean ± σ):      6.793 s ±  0.521 s    [User: 6.001 s, System: 0.587 s]
  Range (min … max):    6.306 s …  7.343 s    3 runs
 
Benchmark 4: clang-13 -O0 -g -fsanitize=undefined (optimizations off, debug on,  safety  on)
  Time (mean ± σ):      8.719 s ±  0.232 s    [User: 8.140 s, System: 0.350 s]
  Range (min … max):    8.490 s …  8.955 s    3 runs
 
Benchmark 5: clang-13 -O1                         (optimizations on,  debug off, safety off)
  Time (mean ± σ):     24.211 s ±  1.531 s    [User: 23.427 s, System: 0.172 s]
  Range (min … max):   22.857 s … 25.873 s    3 runs
 
Benchmark 6: clang-13 -Og -g                      (optimizations on,  debug on,  safety off)
  Time (mean ± σ):     27.785 s ±  1.395 s    [User: 26.914 s, System: 0.229 s]
  Range (min … max):   26.253 s … 28.981 s    3 runs
 
Benchmark 7: zig-cc   -Og -g                      (optimizations on,  debug on,  safety  on)
  Time (mean ± σ):     50.028 s ±  2.578 s    [User: 47.044 s, System: 1.783 s]
  Range (min … max):   47.085 s … 51.881 s    3 runs
 
Benchmark 8: clang-13 -Og -g -fsanitize=undefined (optimizations on,  debug on,  safety  on)
  Time (mean ± σ):     75.033 s ±  0.480 s    [User: 72.788 s, System: 0.515 s]
  Range (min … max):   74.479 s … 75.335 s    3 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Summary
  'clang-13 -O0                         (optimizations off, debug off, safety off)' ran
    1.77 ± 0.21 times faster than 'clang-13 -O0 -g                      (optimizations off, debug on,  safety off)'
    3.89 ± 0.35 times faster than 'zig-cc0  -O0 -g                      (optimizations off, debug on,  safety off)'
    4.99 ± 0.27 times faster than 'clang-13 -O0 -g -fsanitize=undefined (optimizations off, debug on,  safety  on)'
   13.87 ± 1.09 times faster than 'clang-13 -O1                         (optimizations on,  debug off, safety off)'
   15.92 ± 1.09 times faster than 'clang-13 -Og -g                      (optimizations on,  debug on,  safety off)'
   28.66 ± 2.00 times faster than 'zig-cc   -Og -g                      (optimizations on,  debug on,  safety  on)'
   42.98 ± 2.03 times faster than 'clang-13 -Og -g -fsanitize=undefined (optimizations on,  debug on,  safety  on)'

Results table

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| clang-13 -O0 (optimizations off, debug off, safety off) | 1.746 ± 0.082 | 1.697 | 1.840 | 1.00 |
| clang-13 -O0 -g (optimizations off, debug on, safety off) | 3.087 ± 0.343 | 2.845 | 3.479 | 1.77 ± 0.21 |
| zig-cc0 -O0 -g (optimizations off, debug on, safety off) | 6.793 ± 0.521 | 6.306 | 7.343 | 3.89 ± 0.35 |
| clang-13 -O0 -g -fsanitize=undefined (optimizations off, debug on, safety on) | 8.719 ± 0.232 | 8.490 | 8.955 | 4.99 ± 0.27 |
| clang-13 -O1 (optimizations on, debug off, safety off) | 24.211 ± 1.531 | 22.857 | 25.873 | 13.87 ± 1.09 |
| clang-13 -Og -g (optimizations on, debug on, safety off) | 27.785 ± 1.395 | 26.253 | 28.981 | 15.92 ± 1.09 |
| zig-cc -Og -g (optimizations on, debug on, safety on) | 50.028 ± 2.578 | 47.085 | 51.881 | 28.66 ± 2.00 |
| clang-13 -Og -g -fsanitize=undefined (optimizations on, debug on, safety on) | 75.033 ± 0.480 | 74.479 | 75.335 | 42.98 ± 2.03 |
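
The Relative column is each mean divided by the fastest mean (1.746 s). Here is a minimal sketch for reproducing such ratios from the rounded means shown above; relative is a hypothetical helper name. hyperfine works from the unrounded timings, so the last digit can differ from the table.

```shell
# Hypothetical helper: print mean / baseline to two decimals.
# Both arguments are times in seconds.
relative() {
    awk -v mean="$1" -v base="$2" 'BEGIN { printf "%.2f\n", mean / base }'
}

relative 6.793 1.746    # zig-cc0 against the fastest run
relative 50.028 6.793   # upstream zig cc against zig-cc0
```

The second call compares upstream zig cc with the FastBuild prototype directly, a ratio the table does not list.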

Summary

  • Our lower bound is ~3s, the clang-13 -O0 -g time (zig defaults to -g and cannot turn it off, by design).
  • We currently spend ~50s even when -O0 is specified, as @kmicklas's original report describes.
  • Naïvely, with -O FastBuild, our latency goes down from ~50s to ~6.8s.
  • I am curious why there is such a difference between the last two tests (zig cc at ~50s, clang-13 -Og -g -fsanitize=undefined at ~75s): in principle, they do the same thing.

@andrewrk added the zig cc label Jun 28, 2022
@andrewrk added this to the 0.10.0 milestone Jun 28, 2022
@andrewrk
Member

andrewrk commented Jun 28, 2022

Thank you both for the detailed breakdown and analysis.

> Just don't pass -Og in debug mode, since it doesn't currently do anything different than -O1. If Clang later decides to disable some optimizations in -Og, we can reevaluate then.

Given the information you have provided, I support this option. Zig should pass -O0 to clang for Debug mode.

This is a 1-line fix, plus some comments explaining the reasoning so that a future maintainer does not make the same mistake I did and put -Og back in place.

try argv.append("-Og");

@andrewrk added the contributor friendly label Jun 28, 2022