Revert "ACPI: OSL: Use a threaded interrupt handler for SCI" #580
Open
ymd-arista wants to merge 1 commit into
This reverts commit 7a36b901a6eb0e9945341db71ed3c45c7721cfa9.
After upgrading from Debian bookworm to trixie on modular systems,
the kdump kernel started hitting a soft lockup while capturing a
crash dump. The issue is reproducible by triggering a panic in the
production kernel with:
echo c | sudo tee /proc/sysrq-trigger
Once the kdump kernel boots, CPU0 gets stuck in the ACPI SCI handling
path and the soft lockup watchdog eventually panics the kdump kernel,
so no vmcore is produced.
The trace below was obtained by adding the following to the kdump
command line: debug=1, loglevel=7, softlockup_all_cpu_backtrace=1 and
softlockup_panic=1:
watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [irq/9-acpi:39]
CPU: 0 UID: 0 PID: 39 Comm: irq/9-acpi Not tainted 6.12.41+deb13-sonic-amd64 #1 Debian 6.12.41-1
Hardware name: Intel Camelback Mountain CRB, BIOS Aboot-norcal7-7.1.6-generic-22971530 06/30/2021
RIP: 0010:acpi_os_read_port+0x30/0xa0
Call Trace:
<TASK>
acpi_hw_gpe_read+0x61/0x80
acpi_ev_detect_gpe+0x74/0x180
acpi_ev_gpe_detect+0xe1/0x130
acpi_ev_sci_xrupt_handler+0x1d/0x40
acpi_irq+0x1c/0x40
irq_thread_fn+0x23/0x60
irq_thread+0x1b3/0x2f0
kthread+0xd2/0x100
ret_from_fork+0x34/0x50
ret_from_fork_asm+0x1a/0x30
</TASK>
Kernel panic - not syncing: softlockup: hung tasks
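The debug parameters listed before the trace can be appended to a kdump kernel command line without duplicating keys that are already set. The following is a minimal sketch, assuming the command line is held in a plain string; the function name `append_kdump_debug_params` is hypothetical, and where that string is actually configured depends on the platform's kdump tooling:

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): append the debug parameters
# named in the description to an existing kernel command line, skipping
# any key that is already present.
append_kdump_debug_params() {
    cmdline="$1"
    for param in debug=1 loglevel=7 softlockup_all_cpu_backtrace=1 softlockup_panic=1; do
        key="${param%%=*}"
        case " $cmdline " in
            # Key already set (with or without a value): leave it alone.
            *" $key="*|*" $key "*) ;;
            # Otherwise append it, adding a separating space if needed.
            *) cmdline="${cmdline:+$cmdline }$param" ;;
        esac
    done
    printf '%s\n' "$cmdline"
}
```

For example, `append_kdump_debug_params "quiet loglevel=3"` keeps the existing `loglevel=3` and adds only the three missing parameters.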
Comparing the bookworm and trixie kernels, the SCI handler was moved
from a hardirq handler to a threaded handler by the commit being
reverted. Moving to a threaded IRQ regressed kdump on this hardware;
reverting that commit restores the previous hardirq-based SCI handling
and the kdump kernel completes the crash dump without triggering the
soft lockup watchdog.
Signed-off-by: Mohan Yelugoti <ymd@arista.com>
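One quick way to see which SCI handling mode a running kernel uses is to look for the `irq/<N>-acpi` kernel thread, which is the task named in the soft lockup report above. The following is a minimal sketch, assuming process names arrive on standard input one per line; the function name `sci_handler_mode` is hypothetical, and a kernel booted with `threadirqs` would likely report `threaded` even with the revert applied, because that option forces threading of most handlers:

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): reads process names, one per
# line, on stdin and prints "threaded" if an irq/<N>-acpi kernel thread
# is present, otherwise "hardirq".
sci_handler_mode() {
    if grep -Eq '^irq/[0-9]+-acpi$'; then
        echo threaded
    else
        echo hardirq
    fi
}
```

It can be fed from the process list, for instance `ps -e -o comm= | sci_handler_mode`.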
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Author
@saiarcot895: This is more fallout from the move from bookworm to trixie. The soft lockup in the kdump kernel reproduced every time, on both the supervisor and the linecard.
Contributor
Please report this to the upstream list, with the people involved in the patch in Cc:, to get their feedback. PS: The commit caused another regression, which was fixed in a follow-up.