Unsanitized IPC input is used for memory flags #8832
Comments
Debugging this is much easier with zephyrproject-rtos/zephyr#68494 and a bit easier with #8831
Why are we not simply reverting the change?
The change looks like just the messenger to me. Commit 58a42e5 adds a … Fuzzing is great but it's not a silver bullet. In this case, fuzzing never seemed to notice the unsanitized flags because they never caused any obvious corruption?
Agreed. I thought the commit added the fallthrough case rather than merely exposing already-broken logic; I was under a bad assumption.
@marc-hb can you fix this and add validation checks around the memory types passed over IPC? Thanks.
Fixes lack of SOF_MEM_CAPS_* input validation found in thesofproject#8832 Signed-off-by: Marc Herbert <marc.herbert@intel.com>
I have a quick and dirty hack that is passing all the tests in #8850 but it will break again whenever we add a new MEM_CAPS bit.
I still don't think this L3 heap should be reverted, but @jxstelter, I think it should be reworked to better handle invalid inputs. In the meantime I submitted a major rework of fuzz.sh, which was good enough for CI but really too inconvenient and too slow for interactive use. With #8851 it's great for both; please review.
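To illustrate the concern above about a check that breaks whenever a new MEM_CAPS bit is added, here is a minimal sketch of mask-based validation at the IPC boundary. Everything in it is an assumption made for illustration: the `DEMO_*` names, the bit values, and `demo_caps_valid()` are hypothetical and are not taken from the SOF headers or from #8850.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits; the values are illustrative only. */
#define DEMO_MEM_CAPS_RAM (1u << 0)
#define DEMO_MEM_CAPS_DMA (1u << 1)
#define DEMO_MEM_CAPS_L3  (1u << 2)

/* Must be extended by hand whenever a new bit is defined. Forgetting to
 * do so makes the check reject the new, legitimate bit.
 */
#define DEMO_MEM_CAPS_KNOWN \
	(DEMO_MEM_CAPS_RAM | DEMO_MEM_CAPS_DMA | DEMO_MEM_CAPS_L3)

/* Stand-in for a build-time option; 0 models a build without an L3 heap. */
#ifndef DEMO_CONFIG_L3_HEAP
#define DEMO_CONFIG_L3_HEAP 0
#endif

/* Return true only if every requested bit is known and supported here. */
bool demo_caps_valid(uint32_t caps)
{
	if (caps & ~DEMO_MEM_CAPS_KNOWN)
		return false; /* unknown bit received from the host */

	if ((caps & DEMO_MEM_CAPS_L3) && !DEMO_CONFIG_L3_HEAP)
		return false; /* L3 requested but not built in */

	return true;
}
```

An IPC handler would call such a check before passing `caps` to any allocator and return an error to the host when it fails.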
When rballoc_align() is called with incorrect caps, it is enough to return an error; a k_panic() call is not required here. A previous change exposed this issue (thesofproject#8832), but it is sufficient to log an error and return NULL at this point. Signed-off-by: Jaroslaw Stelter <Jaroslaw.Stelter@intel.com>
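The approach in this commit message (log an error and return NULL instead of panicking) could look roughly like the sketch below. It is not the actual rballoc_align() code: `sketch_alloc_align()`, `SKETCH_CAPS_L3` and `SKETCH_HAVE_L3_HEAP` are hypothetical, and the standard C11 `aligned_alloc()` stands in for the real heap.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_CAPS_L3      (1u << 8) /* illustrative bit value */
#define SKETCH_HAVE_L3_HEAP 0         /* models a build without an L3 heap */

/* Allocate 'bytes' with the given power-of-two alignment, or return NULL. */
void *sketch_alloc_align(uint32_t caps, size_t bytes, size_t align)
{
	/* Unsupported request: report it and fail the allocation, no panic. */
	if ((caps & SKETCH_CAPS_L3) && !SKETCH_HAVE_L3_HEAP) {
		fprintf(stderr, "alloc: L3 heap not available, caps 0x%x\n",
			(unsigned int)caps);
		return NULL;
	}

	/* Reject zero and non power-of-two alignments. */
	if (align == 0 || (align & (align - 1)) != 0)
		return NULL;

	/* aligned_alloc() expects the size to be a multiple of the alignment. */
	size_t rounded = (bytes + align - 1) / align * align;

	return aligned_alloc(align, rounded);
}
```

Callers then have to handle the NULL return, which is the usual contract for an allocator anyway.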
As discussed in the alternative approach zephyrproject-rtos/zephyr#68494, k_panic() in POSIX mode has various shortcomings that do not provide a useful trace. Useless pointers to signal handlers or other cleanup routines are printed instead.

Leverage our already existing k_sys_fatal_error_handler() and dereference a NULL pointer there when in POSIX mode. This "fails fast" and provides a complete and relevant stack trace in CI when fuzzing or when using some other static analyzer.

Example of how fuzzing failure thesofproject#8832 would have looked like in CI results thanks to this commit:

```
./build-fuzz/zephyr/zephyr.exe: Running 1 inputs 1 time(s) each.
Running: ./rballoc_align_fuzz_crash
*** Booting Zephyr OS build zephyr-v3.5.0-3971-ge07de4e0a167 ***
[00:00:00.000,000] <inf> main: SOF on native_posix
[00:00:00.000,000] <inf> main: SOF initialized
@ WEST_TOPDIR/sof/zephyr/lib/alloc.c:391
[00:00:00.000,000] <err> os: >>> ZEPHYR FATAL ERROR 4: Kernel panic
[00:00:00.000,000] <err> os: Current thread: 0x891f8a0 (unknown)
[00:00:00.000,000] <err> zephyr: Halting emulation
AddressSanitizer:DEADLYSIGNAL
=================================================================
==1784402==ERROR: AddressSanitizer: SEGV on unknown address 0x00000000
==1784402==The signal is caused by a WRITE memory access.
==1784402==Hint: address points to the zero page.
    #0 0x829a77d in k_sys_fatal_error_handler zephyr/wrapper.c:352:19
    #1 0x829b8c0 in rballoc_align zephyr/lib/alloc.c:391:3
    #2 0x8281438 in buffer_alloc src/audio/buffer.c:58:16
    #3 0x826a60a in buffer_new src/ipc/ipc-helper.c:48:11
    #4 0x8262107 in ipc_buffer_new src/ipc/ipc3/helper.c:459:11
    #5 0x825944d in ipc_glb_tplg_buffer_new src/ipc/ipc3/handler.c:1305:8
    #6 0x8257036 in ipc_cmd src/ipc/ipc3/handler.c:1651:9
    #7 0x8272e59 in ipc_platform_do_cmd src/platform/posix/ipc.c:162:2
    #8 0x826a2ac in ipc_do_cmd src/ipc/ipc-common.c:328:9
    #9 0x829b0ab in task_run zephyr/include/rtos/task.h:94:9
    #10 0x829abd8 in edf_work_handler zephyr/edf_schedule.c:32:16
    #11 0x83560f7 in work_queue_main zephyr/kernel/work.c:688:3
    #12 0x82244c2 in z_thread_entry zephyr/lib/os/thread_entry.c:48:2
```

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
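The "fail fast" idea in this commit message can be sketched as follows, assuming a POSIX host. This is not the code from zephyr/wrapper.c: `demo_fatal_handler()` and `demo_handler_kills_process()` are hypothetical names, and the second function exists only so the crash can be observed from a parent process.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical fatal handler: crash at the point of failure so that a
 * sanitizer or debugger reports the caller's stack, not a cleanup path.
 */
void demo_fatal_handler(unsigned int reason)
{
	fprintf(stderr, "fatal error %u: crashing on purpose\n", reason);
	fflush(stderr);

	volatile int *crash = NULL;

	*crash = 0; /* deliberate write through a NULL pointer */
	abort();    /* fallback in case the write did not stop the process */
}

/* Demo-only helper: run the handler in a child process.
 * Returns 1 if the child terminated abnormally, 0 if it exited cleanly,
 * -1 if fork() or waitpid() failed.
 */
int demo_handler_kills_process(unsigned int reason)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		demo_fatal_handler(reason);
		_exit(0); /* only reached if the handler returned */
	}

	int status = 0;

	if (waitpid(pid, &status, 0) < 0)
		return -1;

	return WIFSIGNALED(status) ||
	       (WIFEXITED(status) && WEXITSTATUS(status) != 0);
}
```

Crashing inside the handler itself is what keeps the interesting frames, such as rballoc_align() in the trace above, on the stack when the sanitizer prints its report.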
One last open PR on this topic and then we can close:
Stable-v2.9 branched, this didn't make the cut, bumping to 2.10.
@kv2019i given this is a security issue, can we not hotfix?
@cujomalainey wrote:
I actually thought this was a follow-up and the primary issue was already fixed (and thus the P3 priority). But if not, let's indeed backport. The main PR was merged yesterday; @marc-hb, can you submit a backport to stable-v2.9? I think we can then close this issue, right? Or is anything else pending?
I think this was a potential security issue. But it's most likely not one as long as unknown flags are ignored or rejected. #8853 is already in stable-v2.9 so I think it's enough.
`./scripts/fuzz.sh -o fuzz-stdout.txt -t 600` fails systematically after L3_HEAP commit 58a42e5.
Example: https://github.com/thesofproject/sof/actions/runs/7763189541/job/21174929628
This may take a few minutes but never much longer.
P1 because this is affecting daily tests and our ability to fuzz.
Originally posted by @marc-hb in #8632 (comment)
PR #8632 is probably just the messenger but this fuzzing failure looks like a "good catch" to me.
Since #8632 was merged, one of the new k_panic() gets triggered because `caps & SOF_MEM_CAPS_L3` is true even when `CONFIG_L3_HEAP` is false.
I think the reason `caps & SOF_MEM_CAPS_L3` is true is because... `caps` comes directly from untrusted IPC input!? Why would IPCs be able to set `caps` directly?
The panic happens when `ipc_glb_tplg_buffer_new()` does this:
At this point `comp_data` looks like it came straight from the fuzzer's untrusted input:
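The data flow described above can be reproduced in miniature. The sketch below is an assumption-laden illustration, not SOF code: `struct ex_ipc_buffer`, `EX_MEM_CAPS_L3` and `ex_l3_requested()` are hypothetical, and the wire layout is invented. It only shows that when a field is copied verbatim from message bytes, the sender decides which bits are set, regardless of what the firmware was built with.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_MEM_CAPS_L3 (1u << 8) /* illustrative bit value */

/* Hypothetical wire layout of a "new buffer" request. */
struct ex_ipc_buffer {
	uint32_t size;
	uint32_t caps;
};

/* Copy the request straight out of the raw message, with no sanitizing,
 * and report whether the L3 bit is set.
 * Returns 1 if set, 0 if clear, -1 if the message is too short.
 */
int ex_l3_requested(const uint8_t *msg, size_t len)
{
	struct ex_ipc_buffer req;

	if (len < sizeof(req))
		return -1;

	memcpy(&req, msg, sizeof(req));

	return (req.caps & EX_MEM_CAPS_L3) != 0;
}
```

A fuzzer that fills the message with 0xff bytes sets every bit in `caps`, including the L3 one, so a later allocator check that assumes the bit can only appear in L3-enabled builds will fire.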