Add "-mcpu=native" when building for aarch64 #532
Can confirm the same on my Ampere A1 in Oracle Cloud.

Unmodified checkout:
system_info: n_threads = 4 / 4 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |
main: processing 'samples/jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, lang = en, task = transcribe, timestamps = 1 ...
[00:00:00.000 --> 00:00:11.000] And so, my fellow Americans, ask not what your country can do for you, ask what you can do for your country.
whisper_print_timings: fallbacks = 0 p / 0 h

Using -mcpu=native:
system_info: n_threads = 4 / 4 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |
main: processing 'samples/jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, lang = en, task = transcribe, timestamps = 1 ...
[00:00:00.000 --> 00:00:11.000] And so, my fellow Americans, ask not what your country can do for you, ask what you can do for your country.
whisper_print_timings: fallbacks = 0 p / 0 h
Performance tests were done in #89 (comment) and #89 (comment).
While the test was done only for Ampere A1 on Oracle Cloud, there's a recommendation from ARM to just set -mcpu=native. We might as well do it for all ARM CPUs.
https://community.arm.com/arm-community-blogs/b/tools-software-ides-blog/posts/compiler-flags-across-architectures-march-mtune-and-mcpu
On Ampere A1 on Oracle Cloud, FP16 is enabled with -mcpu=native, which results in large performance gains.
If it's not acceptable to do this for all ARM CPUs, I can add an ifdef WHISPER_AMPERE_A1 check before enabling -mcpu=native. ARM CPUs aren't very good at reporting their names, so the specific model cannot be identified easily.