twister failing locally - fails to link native_posix w/lld #32237

Closed
jharris-intel opened this issue Feb 11, 2021 · 24 comments · Fixed by #35674
Labels: area: Toolchains, area: Twister, bug, priority: low

Comments

@jharris-intel
Contributor

jharris-intel commented Feb 11, 2021

Describe the bug

While following the contribution guide to create a PR for this comment, I realized that twister is failing on my local dev setup.

To Reproduce

Steps to reproduce the behavior:

These steps are subtly incorrect, see below comments for details. Please do not use them as-is.

  1. git clone https://github.com/zephyrproject-rtos/zephyr.git
  2. cd zephyr
  3. west init
  4. west update
  5. ./scripts/twister (as an aside, is there a way to get twister to exit after the first failing test? I can't seem to find an option for it.)
  6. See errors on (seemingly) all posix tests, as well as a few other scattered failures.

Expected behavior

Testing that is documented as "please run [...] locally before submitting your pull request to GitHub." passes on trunk.

Impact

I have to guess as to if my PR changes actually break code or not, which may cause CI failures.

As per the documentation, "We highly recommend you run these tests locally to avoid any CI failures."

Logs and console output

Including boilerplate (Zephyr base): /home/jharri/r/zephyr/cmake/app/boilerplate.cmake
-- Application: /home/jharri/r/zephyr/samples/philosophers
-- Zephyr version: 2.5.0-rc3 (/home/jharri/r/zephyr)
-- Found Python3: /usr/bin/python3.8 (found suitable exact version "3.8.0") found components: Interpreter 
-- Found west (found suitable version "0.8.0", minimum required is "0.7.1")
-- Board: native_posix
-- Cache files will be written to: /home/jharri/.cache/zephyr
CMake Warning at /home/jharri/r/zephyr/cmake/host-tools.cmake:31 (message):
  Could NOT find dtc: Found unsuitable version "1.4.5", but required is at
  least "1.4.6" (found /usr/bin/dtc).  Optional devicetree error checking
  with dtc will not be performed.
Call Stack (most recent call first):
  /home/jharri/r/zephyr/cmake/app/boilerplate.cmake:516 (include)
  /home/jharri/r/zephyr/share/zephyr-package/cmake/ZephyrConfig.cmake:24 (include)
  /home/jharri/r/zephyr/share/zephyr-package/cmake/ZephyrConfig.cmake:35 (include_boilerplate)
  CMakeLists.txt:4 (find_package)


-- Found toolchain: host (gcc/ld)
-- Found BOARD.dts: /home/jharri/r/zephyr/boards/posix/native_posix/native_posix.dts
-- Generated zephyr.dts: /home/jharri/r/zephyr/twister-out/native_posix/samples/philosophers/sample.kernel.philosopher/zephyr/zephyr.dts
-- Generated devicetree_unfixed.h: /home/jharri/r/zephyr/twister-out/native_posix/samples/philosophers/sample.kernel.philosopher/zephyr/include/generated/devicetree_unfixed.h
-- Generated device_extern.h: /home/jharri/r/zephyr/twister-out/native_posix/samples/philosophers/sample.kernel.philosopher/zephyr/include/generated/device_extern.h

warning: The choice symbol LOG_MODE_IMMEDIATE (defined at subsys/logging/Kconfig.mode:15) was
selected (set =y), but no symbol ended up as the choice selection. See
http://docs.zephyrproject.org/latest/reference/kconfig/CONFIG_LOG_MODE_IMMEDIATE.html and/or look up
LOG_MODE_IMMEDIATE in the menuconfig/guiconfig interface. The Application Development Primer,
Setting Configuration Values, and Kconfig - Tips and Best Practices sections of the manual might be
helpful too.

Parsing /home/jharri/r/zephyr/Kconfig
Loaded configuration '/home/jharri/r/zephyr/boards/posix/native_posix/native_posix_defconfig'
Merged configuration '/home/jharri/r/zephyr/samples/philosophers/prj.conf'
Configuration saved to '/home/jharri/r/zephyr/twister-out/native_posix/samples/philosophers/sample.kernel.philosopher/zephyr/.config'
Kconfig header saved to '/home/jharri/r/zephyr/twister-out/native_posix/samples/philosophers/sample.kernel.philosopher/zephyr/include/generated/autoconf.h'
-- The C compiler identification is GNU 8.4.0
-- The CXX compiler identification is GNU 8.4.0
-- The ASM compiler identification is GNU
-- Found assembler: /usr/bin/gcc
CMake Deprecation Warning at ../../modules/lib/civetweb/CMakeLists.txt:2 (cmake_minimum_required):
  Compatibility with CMake < 2.8.12 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


CMake Warning at ../../CMakeLists.txt:1345 (message):
  __ASSERT() statements are globally ENABLED


-- Configuring done
-- Generating done
-- Build files have been written to: /home/jharri/r/zephyr/twister-out/native_posix/samples/philosophers/sample.kernel.philosopher
Scanning dependencies of target parse_syscalls_target
[  1%] Generating misc/generated/syscalls.json, misc/generated/struct_tags.json
[  3%] Built target parse_syscalls_target
Scanning dependencies of target kobj_types_h_target
[  4%] Generating include/generated/kobj-types-enum.h, include/generated/otype-to-str.h
[  4%] Built target kobj_types_h_target
Scanning dependencies of target syscall_list_h_target
[  5%] Generating include/generated/syscall_dispatch.c, include/generated/syscall_list.h
[  5%] Built target syscall_list_h_target
Scanning dependencies of target driver_validation_h_target
[  6%] Generating include/generated/driver-validation.h
[  6%] Built target driver_validation_h_target
Scanning dependencies of target offsets
[  8%] Building C object zephyr/CMakeFiles/offsets.dir/arch/posix/core/offsets/offsets.c.obj
[  8%] Built target offsets
Scanning dependencies of target offsets_h
[  9%] Generating include/generated/offsets.h
[  9%] Built target offsets_h
Scanning dependencies of target zephyr_generated_headers
[  9%] Built target zephyr_generated_headers
Scanning dependencies of target app
[ 10%] Building C object CMakeFiles/app.dir/src/main.c.obj
[ 11%] Linking C static library app/libapp.a
[ 11%] Built target app
Scanning dependencies of target kernel
[ 12%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/device.c.obj
[ 13%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/errno.c.obj
[ 15%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/fatal.c.obj
[ 16%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/idle.c.obj
[ 17%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/init.c.obj
[ 18%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/kheap.c.obj
[ 19%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/mailbox.c.obj
[ 20%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/mem_slab.c.obj
[ 22%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/msg_q.c.obj
[ 23%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/mutex.c.obj
[ 24%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/pipes.c.obj
[ 25%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/queue.c.obj
[ 26%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/sched.c.obj
[ 27%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/sem.c.obj
[ 29%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/stack.c.obj
[ 30%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/system_work_q.c.obj
[ 31%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/thread.c.obj
[ 32%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/thread_abort.c.obj
[ 33%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/version.c.obj
[ 34%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/work_q.c.obj
[ 36%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/condvar.c.obj
[ 37%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/smp.c.obj
[ 38%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/banner.c.obj
[ 39%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/timeout.c.obj
[ 40%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/timer.c.obj
[ 41%] Building C object zephyr/kernel/CMakeFiles/kernel.dir/mempool.c.obj
[ 43%] Linking C static library libkernel.a
[ 43%] Built target kernel
Scanning dependencies of target linker_script_target
[ 44%] Generating linker.cmd
[ 44%] Built target linker_script_target
Scanning dependencies of target zephyr
[ 45%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/cbprintf.c.obj
[ 46%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/crc32_sw.c.obj
[ 47%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/crc16_sw.c.obj
[ 48%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/crc8_sw.c.obj
[ 50%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/crc7_sw.c.obj
[ 51%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/dec.c.obj
[ 52%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/fdtable.c.obj
[ 53%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/hex.c.obj
[ 54%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/notify.c.obj
[ 55%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/printk.c.obj
[ 56%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/onoff.c.obj
[ 58%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/rb.c.obj
[ 59%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/sem.c.obj
[ 60%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/thread_entry.c.obj
[ 61%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/timeutil.c.obj
[ 62%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/work_q.c.obj
[ 63%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/heap.c.obj
[ 65%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/heap-validate.c.obj
[ 66%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/cbprintf_complete.c.obj
[ 67%] Building C object zephyr/CMakeFiles/zephyr.dir/lib/os/assert.c.obj
[ 68%] Building C object zephyr/CMakeFiles/zephyr.dir/misc/generated/configs.c.obj
[ 69%] Building C object zephyr/CMakeFiles/zephyr.dir/drivers/console/native_posix_console.c.obj
[ 70%] Building C object zephyr/CMakeFiles/zephyr.dir/drivers/timer/sys_clock_init.c.obj
[ 72%] Building C object zephyr/CMakeFiles/zephyr.dir/drivers/timer/native_posix_timer.c.obj
[ 73%] Linking C static library libzephyr.a
[ 73%] Built target zephyr
Scanning dependencies of target arch__posix__core
[ 74%] Building C object zephyr/arch/arch/posix/core/CMakeFiles/arch__posix__core.dir/cpuhalt.c.obj
[ 75%] Building C object zephyr/arch/arch/posix/core/CMakeFiles/arch__posix__core.dir/fatal.c.obj
[ 76%] Building C object zephyr/arch/arch/posix/core/CMakeFiles/arch__posix__core.dir/irq.c.obj
[ 77%] Building C object zephyr/arch/arch/posix/core/CMakeFiles/arch__posix__core.dir/posix_core.c.obj
[ 79%] Building C object zephyr/arch/arch/posix/core/CMakeFiles/arch__posix__core.dir/swap.c.obj
[ 80%] Building C object zephyr/arch/arch/posix/core/CMakeFiles/arch__posix__core.dir/thread.c.obj
[ 81%] Linking C static library libarch__posix__core.a
[ 81%] Built target arch__posix__core
Scanning dependencies of target soc__posix__inf_clock
[ 82%] Building C object zephyr/soc/posix/inf_clock/CMakeFiles/soc__posix__inf_clock.dir/soc.c.obj
[ 83%] Linking C static library libsoc__posix__inf_clock.a
[ 83%] Built target soc__posix__inf_clock
Scanning dependencies of target boards__posix__native_posix
[ 84%] Building C object zephyr/boards/posix/native_posix/CMakeFiles/boards__posix__native_posix.dir/hw_models_top.c.obj
[ 86%] Building C object zephyr/boards/posix/native_posix/CMakeFiles/boards__posix__native_posix.dir/timer_model.c.obj
[ 87%] Building C object zephyr/boards/posix/native_posix/CMakeFiles/boards__posix__native_posix.dir/native_rtc.c.obj
[ 88%] Building C object zephyr/boards/posix/native_posix/CMakeFiles/boards__posix__native_posix.dir/irq_handler.c.obj
[ 89%] Building C object zephyr/boards/posix/native_posix/CMakeFiles/boards__posix__native_posix.dir/irq_ctrl.c.obj
[ 90%] Building C object zephyr/boards/posix/native_posix/CMakeFiles/boards__posix__native_posix.dir/main.c.obj
[ 91%] Building C object zephyr/boards/posix/native_posix/CMakeFiles/boards__posix__native_posix.dir/tracing.c.obj
[ 93%] Building C object zephyr/boards/posix/native_posix/CMakeFiles/boards__posix__native_posix.dir/cmdline_common.c.obj
[ 94%] Building C object zephyr/boards/posix/native_posix/CMakeFiles/boards__posix__native_posix.dir/cmdline.c.obj
[ 95%] Building C object zephyr/boards/posix/native_posix/CMakeFiles/boards__posix__native_posix.dir/cpu_wait.c.obj
[ 96%] Building C object zephyr/boards/posix/native_posix/CMakeFiles/boards__posix__native_posix.dir/hw_counter.c.obj
[ 97%] Linking C static library libboards__posix__native_posix.a
[ 97%] Built target boards__posix__native_posix
Scanning dependencies of target zephyr_prebuilt
[ 98%] Building C object zephyr/CMakeFiles/zephyr_prebuilt.dir/misc/empty_file.c.obj
[100%] Linking C executable zephyr.elf
ld: error: unable to INSERT AFTER/BEFORE .data: section not defined
collect2: error: ld returned 1 exit status
zephyr/CMakeFiles/zephyr_prebuilt.dir/build.make:111: recipe for target 'zephyr/zephyr.elf' failed
make[2]: *** [zephyr/zephyr.elf] Error 1
CMakeFiles/Makefile2:2211: recipe for target 'zephyr/CMakeFiles/zephyr_prebuilt.dir/all' failed
make[1]: *** [zephyr/CMakeFiles/zephyr_prebuilt.dir/all] Error 2
Makefile:102: recipe for target 'all' failed
make: *** [all] Error 2

Similar failures appear on seemingly all posix tests. (And a few others, but this is the main commonality I noted.)

Also see the attached twister.zip (zipped because it's a rather giant log file).

Overall:

INFO    - 5016 of 5421 test configurations passed (92.53%), 405 failed, 9558 skipped with 0 warnings in 1920.87 seconds
INFO    - In total 55296 test cases were executed, 124133 skipped on 323 out of total 329 platforms (98.18%)
INFO    - 2920 test configurations executed on platforms, 2501 test configurations were only built.

Environment (please complete the following information):

  • OS: Ubuntu 18.04.5 LTS
  • Toolchain (e.g Zephyr SDK, ...): how do I collect this information?
  • Commit SHA or Version used:
jharri@jharri-compilebox-01 ~/r/zephyr (master) $ git log -n 1 --oneline
9f858ac1b6 (HEAD -> master, origin/master, origin/HEAD) tests: drivers: can: timing: Fix potential div by zero

Additional context

I see the dtc warning in the above; I don't think it is related (the log message does say "Optional", but famous last words?). Unfortunately, 1.4.5 is the latest version available in Ubuntu 18.04 LTS.

I'm probably just missing something obvious here. E.g. I'm using a different default ld version or something that the tests aren't compatible with but don't override.

jharris-intel added the bug label Feb 11, 2021
@galak
Collaborator

galak commented Feb 11, 2021

Can you check which version of ld you are running with ld --version?

The docker image we use in CI is based on ubuntu 18.04.5:

root@030cbe6f9579:/# ld --version
GNU ld (GNU Binutils for Ubuntu) 2.30
Copyright (C) 2018 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.

@jharris-intel
Contributor Author

jharris-intel commented Feb 11, 2021

jharri@jharri-compilebox-01 ~/r/zephyr (master) $ ld --version
LLD 10.0.0 (compatible with GNU linkers)

(which is apparently a lie :-) )

If Zephyr requires GNU ld in particular, perhaps it could explicitly use or specify ld.bfd, instead of implicitly relying on the default system linker being ld? I have that version of GNU ld installed also:

jharri@jharri-compilebox-01 ~/r/zephyr (master) $ ld.bfd --version
GNU ld (GNU Binutils for Ubuntu) 2.30
Copyright (C) 2018 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.
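
The two banners above are enough to tell the linkers apart. As a rough sketch (the function name `linker_kind` is made up for illustration, and the patterns are based only on the banners shown in this thread plus the one GNU gold prints), the first line of `--version` output can be classified like this:

```shell
# Classify a linker from the first line of its --version output.
# linker_kind is a hypothetical helper, not part of Zephyr or twister.
linker_kind() {
    case "$1" in
        LLD*)        echo lld ;;      # e.g. "LLD 10.0.0 (compatible with GNU linkers)"
        "GNU ld"*)   echo bfd ;;      # e.g. "GNU ld (GNU Binutils for Ubuntu) 2.30"
        "GNU gold"*) echo gold ;;
        *)           echo unknown ;;
    esac
}

# Usage: inspect whatever `ld` currently resolves to on this machine.
if command -v ld >/dev/null 2>&1; then
    linker_kind "$(ld --version | head -n 1)"
fi
```

On the reporter's machine this would presumably print `lld`, and on the CI image `bfd`.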

@galak
Collaborator

galak commented Feb 11, 2021

@nashif has been working on LLVM support and @tejlmand can say how we specify LD and such. The native posix target probably makes assumptions that the majority of host Linux systems are set up with GNU tools at this point.

galak changed the title from "twister failing locally" to "twister failing locally - fails to link native_posix w/lld" Feb 11, 2021
@jharris-intel
Contributor Author

Yep, turns out that I have a short-term workaround of "setting ld to bfd before running Zephyr tests".

Longer-term, can those tests please explicitly use ld.bfd if that's what they require?


There are a few other seemingly-unrelated failures still. In particular:

2 flaky tests (reran and they passed):

  1. qemu_x86_64_nokpti samples/kernel/condition_variables/condvar/sample.kernel.cond_var
    • Annoyingly, all I've got is the snippet from twister.log here, not the full output:
2021-02-11 13:13:41,043 - twister - DEBUG - QEMU (71369): inc_count: thread 3, count = 9, unlocking mutex
2021-02-11 13:13:41,044 - twister - DEBUG - QEMU (71369): inc_count: thread 2, count = 10, unlocking mutex
2021-02-11 13:13:41,044 - twister - DEBUG - QEMU (71369): E: Page fault at address 0xb9 (error code 0x10)
2021-02-11 13:13:41,044 - twister - DEBUG - QEMU (71369): E: Linear address not present in page tables
2021-02-11 13:13:41,045 - twister - DEBUG - QEMU (71369): E: Access violation: supervisor thread not allowed to execute
2021-02-11 13:13:41,045 - twister - DEBUG - QEMU (71369): E: PTE: not present
2021-02-11 13:13:41,046 - twister - DEBUG - QEMU (71369): E: RAX: 0x000000000010b370 RBX: 0x000000000010b370 RCX: 0x0000000000000001 RDX: 0x0000000000000001
2021-02-11 13:13:41,046 - twister - DEBUG - QEMU (71369): E: RSI: 0x0000000000000246 RDI: 0x000000000010f800 RBP: 0x0000000000115380 RSP: 0x0000000000115358
2021-02-11 13:13:41,047 - twister - DEBUG - QEMU (71369): E:  R8: 0x0000000000000001  R9: 0x000000007fd8cf35 R10: 0x0000000000000000 R11: 0x0000000000000000
2021-02-11 13:13:41,047 - twister - DEBUG - QEMU (71369): E: R12: 0x000000000010b370 R13: 0x0000000000000246 R14: 0x0000000000000246 R15: 0x000000000000000a
2021-02-11 13:13:41,047 - twister - DEBUG - QEMU (71369): E: RSP: 0x0000000000115358 RFLAGS: 0x0000000000000247 CS: 0x0018 CR3: 0x0000000000131000
2021-02-11 13:13:41,048 - twister - DEBUG - QEMU (71369): E: RIP: 0x00000000000000b9
2021-02-11 13:13:41,048 - twister - DEBUG - QEMU (71369): E: Attempt to resume un-suspended thread object
2021-02-11 13:13:41,048 - twister - DEBUG - QEMU (71369): E: >>> ZEPHYR FATAL ERROR 4: Kernel panic on CPU 1
2021-02-11 13:13:41,048 - twister - DEBUG - QEMU (71369): E: Current thread: 0x10b370 (unknown)
2021-02-11 13:13:41,048 - twister - DEBUG - QEMU (71369): E: Halting system
2021-02-11 13:13:41,052 - twister - DEBUG - QEMU (71369) complete (unexpected eof) after 2.793084144592285 seconds
    • Not reproducible, likely a race condition.
  2. qemu_x86_64_nokpti tests/kernel/mbox/mbox_usage/kernel.mailbox.usage
    • Ditto, on both accounts.
2021-02-11 13:36:45,568 - twister - DEBUG - QEMU (68187): Booting from ROM..*** Booting Zephyr OS build v2.5.0-rc3-79-g9f858ac1b6bc  ***
2021-02-11 13:36:45,580 - twister - DEBUG - QEMU (68187): Running test suite test_mbox
2021-02-11 13:36:45,580 - twister - DEBUG - QEMU (68187): ===================================================================
2021-02-11 13:36:45,580 - twister - DEBUG - QEMU (68187): START - test_msg_receiver
2021-02-11 13:36:45,589 - twister - DEBUG - QEMU (68187): ASSERTION FAIL [thread->base.pended_on] @ WEST_TOPDIR/kernel/sched.c:516
2021-02-11 13:36:45,590 - twister - DEBUG - QEMU (68187): E: RAX: 0x0000000000000004 RBX: 0x000000000010b040 RCX: 0x0000000000000001 RDX: 0x0000000000000001
2021-02-11 13:36:45,592 - twister - DEBUG - QEMU (68187): E: RSI: 0x0000000000000204 RDI: 0x000000000010a2cf RBP: 0x0000000000115370 RSP: 0x0000000000115358
2021-02-11 13:36:45,593 - twister - DEBUG - QEMU (68187): E:  R8: 0x0000000000000001  R9: 0x0000000000115100 R10: 0x00000000ffffffff R11: 0x0000000000000035
2021-02-11 13:36:45,595 - twister - DEBUG - QEMU (68187): E: R12: 0x000000000010b040 R13: 0x0000000000000000 R14: 0x0000000000000000 R15: 0x0000000000000000
2021-02-11 13:36:45,596 - twister - DEBUG - QEMU (68187): E: RSP: 0x0000000000115358 RFLAGS: 0x0000000000000002 CS: 0x0018 CR3: 0x000000000012e000
2021-02-11 13:36:45,596 - twister - DEBUG - QEMU (68187): E: RIP: 0x00000000001012c7
2021-02-11 13:36:45,597 - twister - DEBUG - QEMU (68187): E: >>> ZEPHYR FATAL ERROR 4: Kernel panic on CPU 1
2021-02-11 13:36:45,597 - twister - DEBUG - QEMU (68187): E: Current thread: 0x10c240 (ztest_thread)
2021-02-11 13:36:45,599 - twister - DEBUG - QEMU (68187): E: Halting system
2021-02-11 13:36:45,604 - twister - DEBUG - QEMU (68187) complete (unexpected eof) after 0.6466786861419678 seconds

And a few deterministic failures (or seemingly so), in seemingly two buckets:

Bucket A:

  1. native_posix tests/subsys/openthread/openthread.radio -> fatal error: openthread/platform/radio.h: No such file or directory native_posix_tests_subsys_openthread_openthread_radio.log

Bucket B:

  1. nrf5340dk_nrf5340_cpuappns samples/synchronization/sample.kernel.synchronization (and tests/kernel/common/kernel.common)-> [Errno 2] No such file or directory: '/home/jharri/r/zephyr/../modules/tee/tfm/trusted-firmware-m/bl2/ext/mcuboot/root-RSA-3072.pem'
  2. nrf5340pdk_nrf5340_cpuappns, same examples and same error.

Both of these buckets look like path issues. Is there additional setup I need to do beyond what I mentioned at the start? If so, what, and if so, can that please be documented better?

Also, should I make separate tickets for these, or reuse this one?

@galak
Collaborator

galak commented Feb 11, 2021

Yep, turns out that I have a short-term workaround of "setting ld to bfd before running Zephyr tests".

Longer-term, can those tests please explicitly use ld.bfd if that's what they require?

We'd need a survey of common Linux distros to see whether the majority support ld.bfd. Or possibly try ld.bfd and fall back on ld.
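
The "try ld.bfd and fall back on ld" idea could look roughly like the following in shell. This is only a sketch of the lookup order, not what Zephyr's build system does; `pick_linker` is a hypothetical name.

```shell
# Prefer ld.bfd when it is on PATH, otherwise fall back to plain ld.
# pick_linker is a hypothetical helper used only to illustrate the order.
pick_linker() {
    if command -v ld.bfd >/dev/null 2>&1; then
        echo ld.bfd
    elif command -v ld >/dev/null 2>&1; then
        echo ld
    else
        echo "no linker found on PATH" >&2
        return 1
    fi
}

# Usage: report which linker name would be chosen on this machine.
pick_linker || true
```

Note that the fallback case still cannot tell whether plain `ld` is GNU ld or something else, so a version check like the one galak mentions later would still be useful. GCC also accepts `-fuse-ld=bfd` to request the BFD linker explicitly; whether and how the Zephyr build would pass that is not settled in this thread.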

@galak
Collaborator

galak commented Feb 11, 2021

There are a few other seemingly-unrelated failures still. In particular:

2 flaky tests (reran and they passed):

1.

QEMU is known to have issues when run under load w/twister.

@galak
Collaborator

galak commented Feb 11, 2021

And a few deterministic failures (or seemingly so), in seemingly two buckets:

Bucket A:

1. `native_posix` `tests/subsys/openthread/openthread.radio` -> `fatal error: openthread/platform/radio.h: No such file or directory` [native_posix_tests_subsys_openthread_openthread_radio.log](https://github.com/zephyrproject-rtos/zephyr/files/5968710/native_posix_tests_subsys_openthread_openthread_radio.log)

Bucket B:

1. `nrf5340dk_nrf5340_cpuappns` `samples/synchronization/sample.kernel.synchronization` (and `tests/kernel/common/kernel.common`)-> `[Errno 2] No such file or directory: '/home/jharri/r/zephyr/../modules/tee/tfm/trusted-firmware-m/bl2/ext/mcuboot/root-RSA-3072.pem'`

2. `nrf5340pdk_nrf5340_cpuappns`, same examples and same error.

Both of these buckets look like path issues. Is there additional setup I need to do beyond what I mentioned at the start? If so, what, and if so, can that please be documented better?

Also, should I make separate tickets for these, or reuse this one?

Did you do a west update to get all the modules pulled down and sync'd?

@galak
Collaborator

galak commented Feb 11, 2021

And a few deterministic failures (or seemingly so), in seemingly two buckets:
Bucket A:

1. `native_posix` `tests/subsys/openthread/openthread.radio` -> `fatal error: openthread/platform/radio.h: No such file or directory` [native_posix_tests_subsys_openthread_openthread_radio.log](https://github.com/zephyrproject-rtos/zephyr/files/5968710/native_posix_tests_subsys_openthread_openthread_radio.log)

Bucket B:

1. `nrf5340dk_nrf5340_cpuappns` `samples/synchronization/sample.kernel.synchronization` (and `tests/kernel/common/kernel.common`)-> `[Errno 2] No such file or directory: '/home/jharri/r/zephyr/../modules/tee/tfm/trusted-firmware-m/bl2/ext/mcuboot/root-RSA-3072.pem'`

2. `nrf5340pdk_nrf5340_cpuappns`, same examples and same error.

Both of these buckets look like path issues. Is there additional setup I need to do beyond what I mentioned at the start? If so, what, and if so, can that please be documented better?
Also, should I make separate tickets for these, or reuse this one?

Did you do a west update to get all the modules pulled down and sync'd?

Hmm, it feels like there is something else with your setup. We do nightly builds of Zephyr and aren't seeing any similar issues.

How many tests are failing? Can you report the specific summary here?

For example, for root-RSA-3072.pem, do you have that file? If not, something else is wrong w/your setup.

@jharris-intel
Contributor Author

We'd need a survey of common Linux distros to see whether the majority support ld.bfd. Or possibly try ld.bfd and fall back on ld.

Hm. I, for one, would prefer to know I was running in an unsupported configuration up front, rather than failures later.

Did you do a west update to get all the modules pulled down and sync'd?

Hm. I can reproduce this from "scratch" with the steps at the start. That is, start with a shell in an empty folder, then do:

  1. git clone https://github.com/zephyrproject-rtos/zephyr.git
  2. cd zephyr
  3. west init
  4. west update
  5. ./scripts/twister

...which likely means that something in the above is subtly incorrect. (Or blatantly incorrect and I'm just missing something.)

QEMU is known to have issues when run under load w/twister.

Hm. Is this QEMU itself? Or just exposing race conditions in Zephyr that are more prominent under system load?

For example for the root-RSA-3072.pem do you have that file?

I do, but not in that directory (note the directory path ends up outside the repo!)

@galak
Collaborator

galak commented Feb 11, 2021

We'd need a survey of common Linux distros to see whether the majority support ld.bfd. Or possibly try ld.bfd and fall back on ld.

Hm. I, for one, would prefer to know I was running in an unsupported configuration up front, rather than failures later.

You're the first person I'm aware of to run into this issue. As I mentioned, I'm guessing the majority of distro installs probably still default to GNU tools. We check the versions of other tools, so a version check could be added to catch such an issue.

Did you do a west update to get all the modules pulled down and sync'd?

Hm. I can reproduce this from "scratch" with the steps at the start. That is, start with a shell in an empty folder, then do:

1. `git clone https://github.com/zephyrproject-rtos/zephyr.git`

2. `cd zephyr`

3. `west init`

4. `west update`

5. `./scripts/twister`

...which likely means that something in the above is subtly incorrect. (Or blatantly incorrect and I'm just missing something.)

So I think the above is subtly incorrect. As you did a git clone, I think step 3 should be something like west init -l . instead. Based on what you've got, you probably have <ROOT>/zephyr/{bunch of zephyr files like west.yml, arch/, boards/, etc.} and <ROOT>/zephyr/zephyr/{bunch of zephyr files}. Effectively, you have two clones of zephyr.
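
A quick check for the doubled-clone layout described here might look like the sketch below. It assumes, as in this thread, that a Zephyr checkout can be recognized by a `west.yml` at its root; `has_nested_zephyr` is a hypothetical helper name.

```shell
# Succeed if DIR looks like a Zephyr checkout that contains a second
# Zephyr checkout in DIR/zephyr (the result of running plain `west init`
# inside an existing clone). Hypothetical helper for illustration only.
has_nested_zephyr() {
    dir=$1
    [ -f "$dir/west.yml" ] && [ -f "$dir/zephyr/west.yml" ]
}

# Usage: run from the directory that was cloned with git.
if has_nested_zephyr .; then
    echo "nested zephyr clone found; consider re-creating the workspace with: west init -l ."
fi
```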

QEMU is known to have issues when run under load w/twister.

Hm. Is this QEMU itself? Or just exposing race conditions in Zephyr that are more prominent under system load?

It's believed to be an issue w/QEMU itself.

For example for the root-RSA-3072.pem do you have that file?

I do, but not in that directory (note the directory path ends up outside the repo!)

Sure, just confused by the specific error, as it's saying no such file or dir.

@jharris-intel
Contributor Author

So I think the above is subtly incorrect. As you did a git clone, I think step 3 should be something like west init -l . instead. Based on what you've got, you probably have <ROOT>/zephyr/{bunch of zephyr files like west.yml, arch/, boards/, etc.} and <ROOT>/zephyr/zephyr/{bunch of zephyr files}. Effectively, you have two clones of zephyr.

Yep, that's exactly the case. I'll retry with west init -l . and let you know how it goes.

@jharris-intel
Contributor Author

You're the first person I'm aware of to run into this issue. As I mentioned, I'm guessing the majority of distro installs probably still default to GNU tools. We check version of other tools, so a version check could be added to catch such an issue.

It's a little unfortunate, indeed. Linux has all of the infrastructure to have system default tools with people changing around for different implementations, but there's now too much software that assumes that e.g. cc is always gcc , which results in people not actually changing the configuration because it breaks too many things, which in turn means that more software makes those assumptions because the assumptions aren't challenged as often, and so on.

It's believed to be an issue w/QEMU itself.

Oh interesting. Is there an upstream bug report for this?

@jharris-intel
Contributor Author

Hm. With west init -l ., west update seems to be dumping a lot of stuff into the parent directory, (modules folder, etc, etc) in this case my main toplevel repository folder.

I usually have a folder structure that's essentially:

~/r/ -> parent folder for all git repositories
~/r/foo/ -> some repository that I'm working on's git root folder
~/r/bar/ -> some other repository that I'm working on's git root folder
~/r/baz/ -> some other repository that I'm working on's git root folder

etc. This makes it (relatively) straightforward to do e.g. mass fetches of repositories and avoids repos in deeply nested folders (which can be a problem in some cases).

Is there no way to do this with zephyr?

For now I suppose I can make a folder ~/r/z/ and use that as the base west folder, but then I'm "always" working in ~/r/z/zephyr, assuming I understand how this works correctly.

@galak
Collaborator

galak commented Feb 11, 2021

Hm. With west init -l ., west update seems to be dumping a lot of stuff into the parent directory, (modules folder, etc, etc) in this case my main toplevel repository folder.

I usually have a folder structure that's essentially:

~/r/ -> parent folder for all git repositories
~/r/foo/ -> some repository that I'm working on's git root folder
~/r/bar/ -> some other repository that I'm working on's git root folder
~/r/baz/ -> some other repository that I'm working on's git root folder

etc. This makes it (relatively) straightforward to do e.g. mass fetches of repositories and avoids repos in deeply nested folders (which can be a problem in some cases).

Is there no way to do this with zephyr?

For now I suppose I can make a folder ~/r/z/ and use that as the base west folder, but then I'm "always" working in ~/r/z/zephyr, assuming I understand how this works correctly.

@mbolivar-nordic is probably the best person to answer this.

@galak
Collaborator

galak commented Feb 11, 2021

It's believed to be an issue w/QEMU itself.

Oh interesting. Is there an upstream bug report for this?

Take a look at #14173

@jharris-intel
Contributor Author

@galak - yep, rerunning with the workarounds from this thread and I only get a couple of QEMU timeouts/unexpected EOF. Good catch.

INFO    - Total complete: 5740/6670  86%  skipped: 9402, failed:    0
ERROR   - qemu_x86_64               tests/kernel/mem_protect/syscalls/kernel.memory_protection.syscalls FAILED: unexpected eof
ERROR   - see: /home/jharri/r/z/zephyr/twister-out/qemu_x86_64/tests/kernel/mem_protect/syscalls/kernel.memory_protection.syscalls/handler.log
INFO    - Total complete: 6427/6670  96%  skipped: 9498, failed:    1
ERROR   - qemu_x86_64               tests/kernel/mbox/mbox_usage/kernel.mailbox.usage  FAILED: Timeout
ERROR   - see: /home/jharri/r/z/zephyr/twister-out/qemu_x86_64/tests/kernel/mbox/mbox_usage/kernel.mailbox.usage/handler.log
INFO    - Total complete: 6437/6670  96%  skipped: 9498, failed:    2
ERROR   - qemu_x86_64_nokpti        tests/kernel/mbox/mbox_usage/kernel.mailbox.usage  FAILED: Timeout
ERROR   - see: /home/jharri/r/z/zephyr/twister-out/qemu_x86_64_nokpti/tests/kernel/mbox/mbox_usage/kernel.mailbox.usage/handler.log
INFO    - Total complete: 6670/6670  100%  skipped: 9558, failed:    3

And then rerunning with --only-failed

ERROR   - qemu_x86_64               tests/kernel/mem_protect/syscalls/kernel.memory_protection.syscalls FAILED: unexpected eof
ERROR   - see: /home/jharri/r/z/zephyr/twister-out/qemu_x86_64/tests/kernel/mem_protect/syscalls/kernel.memory_protection.syscalls/handler.log
INFO    - Total complete:    3/   3  100%  skipped:    0, failed:    1

Unfortunately, this one failed the same way 3/4 runs on my box. Not the end of the world, but unfortunate.
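When comparing repeated runs like the ones above, it can help to reduce the console output to one line per failure. This is a minimal sketch, assuming the `ERROR` line layout shown in the logs above and a saved copy of the console output (`twister.log` is a hypothetical file name). `list_failures` is an illustrative helper, not part of twister.

```shell
#!/bin/sh
# list_failures: hypothetical helper, not a twister feature.
# Reads twister console output on stdin and prints one
# "platform<TAB>test<TAB>reason" line per distinct failure.
list_failures() {
    awk '$1 == "ERROR" && $5 == "FAILED:" {
        # Everything after "FAILED:" is the reason, which may be several words.
        reason = $6
        for (i = 7; i <= NF; i++) reason = reason " " $i
        printf "%s\t%s\t%s\n", $3, $4, reason
    }' | sort -u
}

# Example use, assuming the output was saved to twister.log (hypothetical):
#   list_failures < twister.log
```

Lines such as `ERROR   - see: ...` and the `INFO` progress lines are skipped because their fifth field is not `FAILED:`.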

@mbolivar-nordic
Contributor

For now I suppose I can make a folder ~/r/z/ and use that as the base west folder, but then I'm "always" working in ~/r/z/zephyr, assuming I understand how this works correctly.

You need to create a west workspace. west init -l . means . is your manifest repository directory, so .. is the workspace topdir, and ../modules is where the modules will go, etc.

Details in https://docs.zephyrproject.org/latest/guides/west/basics.html and related pages.
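To make the layout concrete: the workspace topdir is the nearest ancestor directory that contains a `.west/` folder. The sketch below walks upward from a starting directory until it finds one. `find_west_topdir` is a hypothetical helper written for illustration; west itself reports the same information with `west topdir`.

```shell
#!/bin/sh
# find_west_topdir: hypothetical helper, not part of west.
# Prints the nearest ancestor of $1 (default: current directory)
# that contains a .west/ directory; returns non-zero if none exists.
find_west_topdir() {
    dir=$(cd "${1:-.}" && pwd -P) || return 1
    while :; do
        if [ -d "$dir/.west" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        # Reached the filesystem root without finding a workspace.
        [ "$dir" = "/" ] && return 1
        dir=$(dirname "$dir")
    done
}

# Example: from inside ~/r/z/zephyr this would print the path of ~/r/z,
# assuming ~/r/z/.west exists.
#   find_west_topdir ~/r/z/zephyr
```

With `west init -l .` run inside `~/r/zephyr`, the `.west/` folder lands in `~/r/`, which is why the modules were placed next to the other repositories there.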

@carlescufi
Member

carlescufi commented Feb 13, 2021

Hm. With west init -l ., west update seems to be dumping a lot of stuff into the parent directory, (modules folder, etc, etc) in this case my main toplevel repository folder.

I was looking at your logs and thought that your r/ stood for "repos", so I immediately thought of that before seeing your comment. The Getting Started Guide does ask you to create an enclosing folder (using west, but you can do that manually too since west init defaults to pwd):

west init ~/zephyrproject
cd ~/zephyrproject
west update

For now, moving your r/zephyr folder into r/z/zephyr and then running west init -l inside it, as @mbolivar-nordic described, will do the trick. Don't forget to delete all the modules in r/, and then also r/.west/, which was created by the earlier west init and marks the top dir of the west workspace.

@carlescufi
Member

FYI @jharris-intel, GitHub/CI will run twister for you when you post the PR, so running it locally is technically optional. That said, having a working twister setup on your local machine definitely helps enormously.

@tejlmand
Collaborator

instead of implicitly relying on the default system linker being ld?

well, expecting ld to be ld doesn't seem wrong to me.
The fact that lld can work as a drop-in replacement of ld in most cases doesn't make it 100% compatible with ld.

We still have some work to do before linker scripts are updated to support lld, but it is part of our plans.
Even when compiling using clang we still have to link with ld.

set(LINKER ld) # TODO: Use lld eventually rather than GNU ld

Instead of trying to detect whether a user has lld as an ld replacement, I think it would be better to get proper lld support in Zephyr.
Thus I'm not sure the Zephyr build system should try to handle the lld-as-ld-replacement situation here.

In this case, users can tell CMake (gcc) to use ld.bfd instead.
This can be done as follows:

cmake -DCMAKE_C_FLAGS="-fuse-ld=bfd" .....

or when using twister:

./scripts/twister -p native_posix -T samples/philosophers/ -x=CMAKE_C_FLAGS="-fuse-ld=bfd"

this will tell the build system to specifically use ld.bfd, which I think is sufficient, and works.

We could try to make the documentation clearer about the fact that Zephyr cannot link using lld at the moment, and about how users who have lld as their ld replacement can still link.
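Before applying the workaround, it may be useful to check which linker `ld` actually resolves to on the host. This is a rough sketch that classifies the first line of `ld --version`. The matched strings ("LLD", "GNU gold", "GNU ld") are assumptions about how the common linkers identify themselves, and `classify_linker` is a hypothetical helper name.

```shell
#!/bin/sh
# classify_linker: hypothetical helper. Maps a linker version banner
# to a short name. The patterns are assumptions about typical banners.
classify_linker() {
    case "$1" in
        *LLD*)        echo lld ;;      # e.g. "LLD 11.0.0 (compatible with GNU linkers)"
        *"GNU gold"*) echo gold ;;
        *"GNU ld"*)   echo bfd ;;      # GNU ld from binutils (ld.bfd)
        *)            echo unknown ;;
    esac
}

# Report what the default ld is, if one is installed.
if command -v ld >/dev/null 2>&1; then
    kind=$(classify_linker "$(ld --version 2>/dev/null | head -n 1)")
    echo "default ld is: $kind"
    if [ "$kind" = "lld" ]; then
        echo 'consider passing CMAKE_C_FLAGS="-fuse-ld=bfd" as described above'
    fi
fi
```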

@jharris-intel
Contributor Author

The Getting Started Guide does ask you to create an enclosing folder (using west, but you can do that manually too since west init defaults to pwd):

FYI in terms of helping with documentation (and hopefully making it so that the next person is less confused than I was): I had seen that, but had assumed it was just for the purposes of setting up the mentioned cmake and pip stuff, not something that had to be done for every local repo copy.

FYI @jharris-intel, GitHub/CI will run twister for you when you post the PR, so running it locally is technically optional. That said, having a working twister setup on your local machine definitely helps enormously.

Good to know...

well, expecting ld to be ld doesn't seem wrong to me.

Sure. It's more expecting ld to be specifically ld.bfd that's the issue here. (Akin to someone using #!/bin/sh for a bash script - sure, it works if your default shell is bash - but if your script relies on bash then please say so instead of having strange issues down the line when you run on a machine whose default shell is e.g. dash.)

@tejlmand
Collaborator

Sure. It's more expecting ld to be specifically ld.bfd that's the issue here.

I see your point.

@nashif
Member

nashif commented Feb 17, 2021

Even when compiling using clang we still have to link with ld.

this is already moving forward, see 265532b

i9:zephyr(oneAPI): west build -b native_posix samples/hello_world/
-- west build: generating a build system
Including boilerplate (Zephyr base): /home/nashif/Work/zephyrproject/zephyr/cmake/app/boilerplate.cmake
-- Application: /home/nashif/Work/zephyrproject/zephyr/samples/hello_world
-- Zephyr version: 2.5.99 (/home/nashif/Work/zephyrproject/zephyr)
-- Found Python3: /usr/bin/python3.9 (found suitable exact version "3.9.1") found components: Interpreter
-- Found west (found suitable version "0.9.0", minimum required is "0.7.1")
-- Board: native_posix
-- Cache files will be written to: /home/nashif/.cache/zephyr
-- Found dtc: /usr/bin/dtc (found suitable version "1.6.0", minimum required is "1.4.6")
-- Found toolchain: host (clang/ld)
-- Found BOARD.dts: /home/nashif/Work/zephyrproject/zephyr/boards/posix/native_posix/native_posix.dts
-- Generated zephyr.dts: /home/nashif/Work/zephyrproject/zephyr/build/zephyr/zephyr.dts
-- Generated devicetree_unfixed.h: /home/nashif/Work/zephyrproject/zephyr/build/zephyr/include/generated/devicetree_unfixed.h
-- Generated device_extern.h: /home/nashif/Work/zephyrproject/zephyr/build/zephyr/include/generated/device_extern.h

warning: The choice symbol LOG_MODE_IMMEDIATE (defined at subsys/logging/Kconfig.mode:15) was
selected (set =y), but no symbol ended up as the choice selection. See
http://docs.zephyrproject.org/latest/reference/kconfig/CONFIG_LOG_MODE_IMMEDIATE.html and/or look up
LOG_MODE_IMMEDIATE in the menuconfig/guiconfig interface. The Application Development Primer,
Setting Configuration Values, and Kconfig - Tips and Best Practices sections of the manual might be
helpful too.

Parsing /home/nashif/Work/zephyrproject/zephyr/Kconfig
Loaded configuration '/home/nashif/Work/zephyrproject/zephyr/boards/posix/native_posix/native_posix_defconfig'
Merged configuration '/home/nashif/Work/zephyrproject/zephyr/samples/hello_world/prj.conf'
Configuration saved to '/home/nashif/Work/zephyrproject/zephyr/build/zephyr/.config'
Kconfig header saved to '/home/nashif/Work/zephyrproject/zephyr/build/zephyr/include/generated/autoconf.h'
-- The C compiler identification is Clang 11.0.0
-- The CXX compiler identification is Clang 11.0.0
-- The ASM compiler identification is Clang
-- Found assembler: /usr/lib64/ccache/clang
-- Configuring done
-- Generating done
-- Build files have been written to: /home/nashif/Work/zephyrproject/zephyr/build
-- west build: building application
[1/85] Preparing syscall dependency handling

[85/85] Linking C executable zephyr/zephyr.elf

and then...

i9:zephyr(oneAPI): grep lld build/CMakeCache.txt
CMAKE_LINKER:FILEPATH=/usr/bin/ld.lld

@nashif nashif added area: Twister Twister priority: low Low impact/importance bug labels Feb 20, 2021
@nashif nashif self-assigned this Feb 20, 2021
@nashif nashif removed their assignment Mar 8, 2021
@nashif nashif added the area: Toolchains Toolchains label Mar 8, 2021
@github-actions

github-actions bot commented May 8, 2021

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

@github-actions github-actions bot added the Stale label May 8, 2021
@tejlmand tejlmand removed the Stale label May 10, 2021
@carlescufi carlescufi added this to the v2.6.0 milestone May 19, 2021
tejlmand added a commit to tejlmand/zephyr that referenced this issue May 26, 2021
Fixes: zephyrproject-rtos#32237

When building for native_posix, then host tools are used.
This means that gcc will link using `/usr/bin/ld` per default.

If ld points to lld, then linking will fail.

This commit will first look for ld.bfd, and if found then use
-fuse-ld=bfd for linking. If ld.bfd is not found, then ld is used as
fallback as that will be assumed to be the best working candidate.

Signed-off-by: Torsten Rasmussen <Torsten.Rasmussen@nordicsemi.no>
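
The selection logic described in the commit message can be sketched in shell as follows. The actual change lives in Zephyr's CMake files, so this is only an illustration of the decision, not the committed code; `pick_linker_flag` and its optional search-path argument are hypothetical.

```shell
#!/bin/sh
# pick_linker_flag: hypothetical helper illustrating the commit's logic.
# Prints "-fuse-ld=bfd" if ld.bfd is found on the search path,
# otherwise prints an empty line (meaning: fall back to plain ld).
# $1: optional PATH-style list of directories (defaults to $PATH).
pick_linker_flag() {
    search_path=${1:-$PATH}
    # The subshell keeps the PATH override from leaking into the caller.
    if ( PATH=$search_path; command -v ld.bfd ) >/dev/null 2>&1; then
        printf '%s\n' "-fuse-ld=bfd"
    else
        printf '\n'
    fi
}

# Example use:
#   flag=$(pick_linker_flag)
#   echo "extra linker selection flag: ${flag:-<none, using default ld>}"
```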
galak pushed a commit that referenced this issue May 26, 2021
Fixes: #32237
