[BUG][TGL][Chrome] DSP Panic on opening Internal Mic after plugging/unplugging a headset #4648
Comments
@lancedkhou could you please also include the Linux crash/coredump file (with the FW ELF)?
Hi,
==
for i in {1..50} cras_test_client --playback_f /dev/zero --select_output 7:0 --duration_seconds 30 & done
===
localhost ~ # cras_test_client
Input Devices:
@abonislawski can you share how to use the coredump tool? For example, in the documentation at https://thesofproject.github.io/latest/developer_guides/debugability/coredump-reader/index.html?highlight=coredump,
what is the "sys dump bin"?
@keyonjie simply run ./sof-coredump-to-gdb.sh elf dump_file
The dump file is the Linux core dump file with the stack and register dump (see panic.c).
I reproduced the "DSP panic", but the exception file size is zero.
localhost /sys/kernel/debug/sof # ls -ls
Let me cross-check the CONFIG as per this comment: #3965 (comment)
Hi @bkokoszx - FYI. Thanks to @yongzhi1 for pointing out this CONFIG change. The "DSP panic" is reproduced and the exception file is attached.
Do not trust stat here; it will report 0 for files that aren't actually 0-sized.
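To illustrate the point about stat, here is a minimal sketch that counts how many bytes can actually be read from a file instead of relying on the size reported by stat or ls. The helper name `demo_count_readable_bytes` is hypothetical, and which path to pass (for example the exception file under /sys/kernel/debug/sof) is an assumption that depends on the kernel configuration.

```c
#include <stdio.h>

/*
 * Hypothetical helper: count the bytes that can really be read from `path`.
 * Pseudo-filesystem entries may report a size of 0 in stat while still
 * returning data on read, so read until EOF instead of trusting the size.
 * Returns the byte count, or -1 if the file cannot be opened.
 */
static long demo_count_readable_bytes(const char *path)
{
    FILE *f = fopen(path, "rb");
    char buf[4096];
    long total = 0;
    size_t n;

    if (!f)
        return -1;

    /* Keep reading fixed-size chunks until fread() reports no more data. */
    while ((n = fread(buf, 1, sizeof(buf), f)) > 0)
        total += (long)n;

    fclose(f);
    return total;
}
```

A non-zero return value for the exception file would mean there is data to feed to the coredump script even though ls shows a size of 0.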
I think the goal should be to update the core dump script to parse the dmesg output for the FW oops.
Detailed description of what happened in this case:
So we have two buffers with size 0x6000 and copy_bytes 0x3000; all is OK.
The data is similar in the second case, where buffer 0xbe17ae80 has size 0x180; everything works as expected.
From buffer pairs a) and c) we can easily see that this should be the case from point 2, where we have two buffers with size 0x6000 and a calculated copy_bytes of 0x3000. But something is wrong: in the buffer description the data size is 0x180, just as in the previous allocation. So the bad things start here, and the FW will end with a DSP panic several steps later. It could easily be stopped if copy_bytes were limited to the smaller buffer (even if the FW only thinks it is smaller).
This code is not prepared for the case where copy_bytes is bigger than the buffer size, so as a result it will set buffer->w_ptr > end_addr. This is the second place where we could still avoid the panic, with a simple check whether the new ptr > end_addr. In the next iteration the FW will crash in dma_buffer_copy_to() -> audio_stream_writeback() because of the faulty buffer->w_ptr (head_size 0xffffd200). From our tests, allocating the buffer description struct (with w_ptr, size, etc.) in SOF_MEM_ZONE_RUNTIME_SHARED should be enough, but please run more tests @sathya-nujella (PR #4696).
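The two guards mentioned above (limiting copy_bytes to the smaller buffer, and rejecting a write pointer that would pass end_addr) can be sketched as follows. This is not the SOF implementation: `struct demo_stream` and all `demo_*` functions are hypothetical, simplified stand-ins, and the addresses used with them are arbitrary. The sketch also shows why a w_ptr beyond end_addr produces a huge head size: the unsigned subtraction wraps around.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a ring buffer descriptor. */
struct demo_stream {
    uint32_t addr;      /* start address of the buffer */
    uint32_t end_addr;  /* one past the last byte (addr + size) */
    uint32_t w_ptr;     /* current write pointer */
    uint32_t size;      /* buffer size in bytes */
};

static uint32_t demo_min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* Guard 1: never copy more than the smaller of the two buffers can hold. */
static uint32_t demo_clamp_copy_bytes(uint32_t copy_bytes,
                                      const struct demo_stream *source,
                                      const struct demo_stream *sink)
{
    return demo_min_u32(copy_bytes,
                        demo_min_u32(source->size, sink->size));
}

/*
 * Guard 2: advance the write pointer with wrap-around, refusing requests
 * that would push it past end_addr. Returns 0 on success, -1 on rejection
 * (w_ptr is left untouched in that case).
 */
static int demo_produce(struct demo_stream *s, uint32_t bytes)
{
    uint32_t offset;

    if (bytes > s->size)
        return -1;

    offset = (s->w_ptr - s->addr + bytes) % s->size;
    s->w_ptr = s->addr + offset;
    return 0;
}

/*
 * Bytes from w_ptr to the end of the buffer. If w_ptr is already past
 * end_addr, the unsigned subtraction wraps to a very large value.
 */
static uint32_t demo_head_size(const struct demo_stream *s)
{
    return s->end_addr - s->w_ptr;
}
```

With a sink of size 0x180 and a requested copy_bytes of 0x3000, `demo_clamp_copy_bytes` returns 0x180 and `demo_produce` rejects the unclamped request. Either check alone would stop the pointer from leaving the buffer; neither addresses why the descriptor held a stale size in the first place, which is what the SOF_MEM_ZONE_RUNTIME_SHARED allocation change targets.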
@abonislawski good update and info here. Any validation news on whether #4696 will fix this issue?
Describe the bug
With IGO enabled and the PRs fixing the heap usage issue applied, a DSP panic occurred during recording when plugging/unplugging a headset.
To Reproduce
Reproduction Rate
1/10
Expected behavior
The recording is always valid.
Impact
Blocker
Environment
tgl-13 branch with IGO patches & IGO binary
Issue still reproduced with PR #4575, #4612 and #4640
Screenshots or console output
2021-08-17T11:49:06.230044Z ERR kernel: [ 900.437070] sof-audio-pci 0000:00:1f.3: error : DSP panic!
2021-08-17T11:49:06.230068Z ERR kernel: [ 900.437093] sof-audio-pci 0000:00:1f.3: status: fw entered - code 00000005
2021-08-17T11:49:06.230069Z ERR kernel: [ 900.437203] sof-audio-pci 0000:00:1f.3: error: runtime exception
2021-08-17T11:49:06.230070Z ERR kernel: [ 900.437207] sof-audio-pci 0000:00:1f.3: error: trace point 00004000
2021-08-17T11:49:06.230072Z ERR kernel: [ 900.437210] sof-audio-pci 0000:00:1f.3: error: panic at :0
2021-08-17T11:49:06.230073Z ERR kernel: [ 900.437214] sof-audio-pci 0000:00:1f.3: error: DSP Firmware Oops
2021-08-17T11:49:06.230084Z ERR kernel: [ 900.437217] sof-audio-pci 0000:00:1f.3: error: Exception Cause: LoadProhibitedCause, A load referenced a page mapped with an attribute that does not permit loads
2021-08-17T11:49:06.230087Z ERR kernel: [ 900.437222] sof-audio-pci 0000:00:1f.3: EXCCAUSE 0x0000001c EXCVADDR 0xc0000038 PS 0x00060d25 SAR 0x00000000
2021-08-17T11:49:06.230088Z ERR kernel: [ 900.437225] sof-audio-pci 0000:00:1f.3: EPC1 0xbe02e437 EPC2 0xbe02d7e6 EPC3 0x00000000 EPC4 0x00000000
2021-08-17T11:49:06.230090Z ERR kernel: [ 900.437229] sof-audio-pci 0000:00:1f.3: EPC5 0x00000000 EPC6 0x00000000 EPC7 0x00000000 DEPC 0x00000000
2021-08-17T11:49:06.230091Z ERR kernel: [ 900.437233] sof-audio-pci 0000:00:1f.3: EPS2 0x00060720 EPS3 0x00000000 EPS4 0x00000000 EPS5 0x00000000
2021-08-17T11:49:06.230093Z ERR kernel: [ 900.437236] sof-audio-pci 0000:00:1f.3: EPS6 0x00000000 EPS7 0x00000000 INTENABL 0x00000000 INTERRU 0x00000222
4575+4612+4640-DSP-panic-messages.txt