x86/sev: Fix host kdump support for SNP
With active SNP VMs, SNP_SHUTDOWN_EX invoked from the panic notifiers causes the crashkernel boot to fail with the following signature:

  [ 563.497112] sysrq: Trigger a crash
  [ 563.508415] Kernel panic - not syncing: sysrq triggered crash
  [ 563.522002] CPU: 10 UID: 0 PID: 4661 Comm: bash Kdump: loaded Not tainted 6.11.0-rc3-next-20240813-snp-host-f2a41ff576cc-dirty #61
  [ 563.549762] Hardware name: AMD Corporation ETHANOL_X/ETHANOL_X, BIOS RXM100AB 10/17/2022
  [ 563.566266] Call Trace:
  [ 563.576430] <TASK>
  [ 563.585932] dump_stack_lvl+0x2b/0x90
  [ 563.597244] dump_stack+0x14/0x20
  [ 563.608141] panic+0x3b9/0x400
  [ 563.618801] ? srso_alias_return_thunk+0x5/0xfbef5
  [ 563.631271] sysrq_handle_crash+0x19/0x20
  [ 563.642696] __handle_sysrq+0xf9/0x290
  [ 563.653691] ? srso_alias_return_thunk+0x5/0xfbef5
  [ 563.666126] write_sysrq_trigger+0x60/0x80
  ...
  ...
  [ 564.186804] in panic
  [ 564.194287] in panic_other_cpus_shutdown
  [ 564.203674] kexec: in crash_smp_send_stop
  [ 564.213205] kexec: in kdump_nmi_shootdown_cpus
  [ 564.224338] Kernel Offset: 0x35a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
  [ 564.282209] in snp_shutdown_on_panic after decommission, wbinvd + df_flush required
  [ 564.462217] ccp 0000:23:00.1: SEV-SNP DF_FLUSH failed with error 14
  [ 564.676920] kexec: in native_machine_crash_shutdown
  early console in extract_kernel
  input_data: 0x000000007410d2cc
  input_len: 0x0000000000ce98b2
  output: 0x0000000071600000
  output_len: 0x000000000379eb8c
  kernel_total_size: 0x0000000002c30000
  needed_size: 0x0000000003800000
  trampoline_32bit: 0x0000000000000000
  Invalid physical address chosen!
  Physical KASLR disabled: no suitable memory region!
  Virtual KASLR using RDRAND RDTSC...
  Decompressing Linux... Parsing ELF... Performing relocations... done.
  Booting the kernel (entry_offset: 0x0000000000000bda).
  [ 0.000000] Linux version 6.11.0-rc3-next-20240813-snp-host-f2a41ff576cc-dirty (amd@ethanolx7e2ehost) (gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils) 2.40) #61 SMP Mon Aug 19 19:59:02 UTC 2024
  [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-6.11.0-rc3-next-20240813-snp-host-f2a41ff576cc-dirty root=UUID=4b87a03b-0e78-42ca-a8ad-997e63bba4e0 ro console=tty0 console=ttyS0,115200n8 earlyprintk=ttyS0,115200n8 amd_iommu_dump=1 reset_devices systemd.unit=kdump-tools-dump.service nr_cpus=1 irqpoll nousb elfcorehdr=1916276K
  [ 0.000000] KERNEL supported cpus:
  ...
  ...
  [ 1.671804] AMD-Vi: Using global IVHD EFR:0x841f77e022094ace, EFR2:0x0
  [ 1.679835] AMD-Vi: Translation is already enabled - trying to copy translation structures
  [ 1.689363] AMD-Vi: Copied DEV table from previous kernel.
  [ 1.864369] AMD-Vi: Completion-Wait loop timed out
  [ 2.038289] AMD-Vi: Completion-Wait loop timed out
  [ 2.212215] AMD-Vi: Completion-Wait loop timed out
  [ 2.386141] AMD-Vi: Completion-Wait loop timed out
  [ 2.560068] AMD-Vi: Completion-Wait loop timed out
  [ 2.733997] AMD-Vi: Completion-Wait loop timed out
  [ 2.907927] AMD-Vi: Completion-Wait loop timed out
  [ 3.081855] AMD-Vi: Completion-Wait loop timed out
  [ 3.225500] AMD-Vi: Completion-Wait loop timed out
  [ 3.231083] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 d out
  [ 3.579592] AMD-Vi: Completion-Wait loop timed out
  [ 3.753164] AMD-Vi: Completion-Wait loop timed out
  [ 3.815762] Kernel panic - not syncing: timer doesn't work through Interrupt-remapped IO-APIC
  [ 3.825347] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.11.0-rc3-next-20240813-snp-host-f2a41ff576cc-dirty #61
  [ 3.837188] Hardware name: AMD Corporation ETHANOL_X/ETHANOL_X, BIOS RXM100AB 10/17/2022
  [ 3.846215] Call Trace:
  [ 3.848939] <TASK>
  [ 3.851277] dump_stack_lvl+0x2b/0x90
  [ 3.855354] dump_stack+0x14/0x20
  [ 3.859050] panic+0x3b9/0x400
  [ 3.862454] panic_if_irq_remap+0x21/0x30
  [ 3.866925] setup_IO_APIC+0x8aa/0xa50
  [ 3.871106] ? __pfx_amd_iommu_enable_faulting+0x10/0x10
  [ 3.877032] ? __cpuhp_setup_state+0x5e/0xd0
  [ 3.881793] apic_intr_mode_init+0x6a/0xf0
  [ 3.886360] x86_late_time_init+0x28/0x40
  [ 3.890832] start_kernel+0x6a8/0xb50
  [ 3.894914] x86_64_start_reservations+0x1c/0x30
  [ 3.900064] x86_64_start_kernel+0xbf/0x110
  [ 3.904729] ? setup_ghcb+0x12/0x130
  [ 3.908716] common_startup_64+0x13e/0x141
  [ 3.913283] </TASK>
  [ 3.915715] in panic
  [ 3.918149] in panic_other_cpus_shutdown
  [ 3.922523] ---[ end Kernel panic - not syncing: timer doesn't work through Interrupt-remapped IO-APIC ]---

This happens because SNP_SHUTDOWN_EX fails when SNP VMs are active: the firmware checks every encryption-capable ASID to verify that it is not in use by a guest and that a DF_FLUSH is not required. If a DF_FLUSH is required, the firmware returns DFFLUSH_REQUIRED.

To fix this, support was added to do SNP_DECOMMISSION of all active SNP VMs in the panic notifier before doing SNP_SHUTDOWN_EX. However, SNP_DECOMMISSION tags all CPUs on which the guest has been activated as requiring a WBINVD. This causes the SNP_DF_FLUSH command to fail with the following flow:

  SNP_DECOMMISSION -> SNP_SHUTDOWN_EX -> SNP_DF_FLUSH -> failure with WBINVD_REQUIRED

When the panic notifier is invoked, all other CPUs have already been shut down, so it is not possible to do a wbinvd_on_all_cpus() after SNP_DECOMMISSION has been executed. As a result, SNP_SHUTDOWN_EX still fails after SNP_DECOMMISSION.

Fix this by doing SNP_DECOMMISSION and the subsequent WBINVD on all CPUs during the NMI shutdown of CPUs, as part of disabling virtualization on all CPUs via cpu_emergency_disable_virtualization() -> svm_emergency_disable().
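The ordering problem described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel or firmware code: the fw_model structure, the model_* helpers, the FW_* status values and the two scenario functions are all hypothetical names invented for illustration, and the model assumes a single guest that has run on every CPU.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical status codes for this toy model, not the firmware's values. */
enum fw_status { FW_OK, FW_DFFLUSH_REQUIRED, FW_WBINVD_REQUIRED };

/* Toy firmware state: one guest that has run on every CPU (assumption). */
struct fw_model {
	bool guest_active;             /* guest still bound to its ASID */
	bool df_flush_pending;         /* ASID released, DF_FLUSH not yet done */
	bool wbinvd_pending[NR_CPUS];  /* CPUs tagged by decommission */
	bool cpu_online[NR_CPUS];      /* CPUs still able to execute WBINVD */
};

static void model_init(struct fw_model *fw)
{
	fw->guest_active = true;
	fw->df_flush_pending = false;
	for (int i = 0; i < NR_CPUS; i++) {
		fw->wbinvd_pending[i] = false;
		fw->cpu_online[i] = true;
	}
}

/* Decommission releases the ASID and tags every CPU the guest ran on. */
static void model_decommission(struct fw_model *fw)
{
	fw->guest_active = false;
	fw->df_flush_pending = true;
	for (int i = 0; i < NR_CPUS; i++)
		fw->wbinvd_pending[i] = true;
}

/* A CPU can only clear its own tag while it is still online. */
static void model_wbinvd(struct fw_model *fw, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (fw->cpu_online[cpu])
		fw->wbinvd_pending[cpu] = false;
}

/* DF_FLUSH is refused while any tagged CPU has not done its WBINVD. */
static enum fw_status model_df_flush(struct fw_model *fw)
{
	for (int i = 0; i < NR_CPUS; i++)
		if (fw->wbinvd_pending[i])
			return FW_WBINVD_REQUIRED;
	fw->df_flush_pending = false;
	return FW_OK;
}

/* Shutdown is refused while an ASID is in use or still needs a flush. */
static enum fw_status model_shutdown(const struct fw_model *fw)
{
	if (fw->guest_active || fw->df_flush_pending)
		return FW_DFFLUSH_REQUIRED;
	return FW_OK;
}

/* Host side: try shutdown, flushing once if the model asks for it. */
static enum fw_status host_shutdown(struct fw_model *fw)
{
	enum fw_status st = model_shutdown(fw);

	if (st == FW_DFFLUSH_REQUIRED) {
		st = model_df_flush(fw);
		if (st != FW_OK)
			return st;
		st = model_shutdown(fw);
	}
	return st;
}

/* Decommission from the panic notifier: other CPUs are already stopped. */
enum fw_status shutdown_from_panic_notifier(void)
{
	struct fw_model fw;

	model_init(&fw);
	for (int i = 1; i < NR_CPUS; i++)
		fw.cpu_online[i] = false;  /* stopped by the NMI shootdown */
	model_decommission(&fw);
	for (int i = 0; i < NR_CPUS; i++)
		model_wbinvd(&fw, i);      /* only CPU 0 can still comply */
	return host_shutdown(&fw);
}

/* Decommission and WBINVD while each CPU is still handling its NMI. */
enum fw_status shutdown_from_nmi_path(void)
{
	struct fw_model fw;

	model_init(&fw);
	model_decommission(&fw);
	for (int i = 0; i < NR_CPUS; i++) {
		model_wbinvd(&fw, i);      /* runs before the CPU goes offline */
		if (i != 0)
			fw.cpu_online[i] = false;
	}
	return host_shutdown(&fw);
}
```

In this model, shutdown_from_panic_notifier() ends with FW_WBINVD_REQUIRED because the stopped CPUs can never clear their tags, while shutdown_from_nmi_path() ends with FW_OK because every CPU does its WBINVD before it is taken offline.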

SNP_DECOMMISSION unbinds the ASID from the SNP context, marks the ASID as unusable, and then transitions the SNP guest context page to a firmware page. SNP_SHUTDOWN_EX transitions all pages associated with the IOMMU to the reclaim state, which the hypervisor then transitions to the hypervisor state. All of these page state changes are in the RMP table, so there is no loss of guest data, and the complete host memory is captured by the crashkernel boot. No processes are killed, and host/guest memory is not altered or modified in any way.

This fixes and enables crashkernel/kdump on an SNP host.

v2:
- rename all instances of decommision to decommission
- create a new function sev_emergency_disable(), which is exported from sev.c and calls __snp_decommission_all() to do SNP_DECOMMISSION
- add more information to the commit message

Fixes: c3b86e6 ("x86/cpufeatures: Enable/unmask SEV-SNP CPU feature")
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>