Commit 4d4c0d8
pci/aer_inject: switch inject_lock to raw_spinlock_t
When injecting AER errors under PREEMPT_RT, the kernel may trigger a
lockdep warning about an invalid wait context:
```
[ 1850.950780] [ BUG: Invalid wait context ]
[ 1850.951152] 6.17.0-11316-g7a405dbb0f03-dirty #7 Not tainted
[ 1850.951457] -----------------------------
[ 1850.951680] irq/16-PCIe PME/56 is trying to lock:
[ 1850.952004] ffff800082865238 (inject_lock){+.+.}-{3:3}, at: aer_inj_read_config+0x38/0x1dc
[ 1850.952731] other info that might help us debug this:
[ 1850.952997] context-{5:5}
[ 1850.953192] 5 locks held by irq/16-PCIe PME/56:
[ 1850.953415] #0: ffff800082647390 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x30/0x268
[ 1850.953931] #1: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
[ 1850.954453] #2: ffff000004bb6c58 (&data->lock){+...}-{3:3}, at: pcie_pme_irq+0x34/0xc4
[ 1850.954949] #3: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
[ 1850.955420] #4: ffff800082863d10 (pci_lock){....}-{2:2}, at: pci_bus_read_config_dword+0x5c/0xd8
```
This happens because the AER injection path (`aer_inj_read_config()`)
is called in the context of the PCIe PME interrupt thread, which runs
through `irq_forced_thread_fn()` under PREEMPT_RT. In this context,
`pci_lock` (a `raw_spinlock_t`) is held with interrupts disabled
(`raw_spin_lock_irqsave()`), and then `aer_inj_read_config()` tries to
acquire `inject_lock`, which is a `spinlock_t`. (Thanks Waiman Long)
Under PREEMPT_RT, a `spinlock_t` is taken via `rt_spin_lock()` and may
sleep, so acquiring it while holding a raw spinlock with IRQs disabled
violates the lock nesting rules. This leads to the “Invalid wait
context” lockdep warning.
In other words, the lock order looks like this:
```
raw_spin_lock_irqsave(&pci_lock);
↓
rt_spin_lock(&inject_lock); <-- not allowed
```
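The check that fires here can be illustrated with a small userspace model. This is only a sketch, not the kernel's lockdep code: the `wait_type` enum, `struct lock_class`, and `wait_context_ok()` are hypothetical names, and the numeric values are assumed to match the `{outer:inner}` pairs printed in the report above (2 for a raw spinlock, 3 for a `spinlock_t` under PREEMPT_RT).

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed wait types, numbered like the {outer:inner} pairs in the report. */
enum wait_type {
    WAIT_INV = 0,    /* not checked */
    WAIT_FREE = 1,   /* wait-free sections such as RCU or local_bh */
    WAIT_SPIN = 2,   /* raw_spinlock_t: never sleeps */
    WAIT_CONFIG = 3, /* spinlock_t: may sleep on PREEMPT_RT */
    WAIT_SLEEP = 4,  /* mutexes and other sleeping locks */
};

/* Hypothetical stand-in for a lock class with its two wait types. */
struct lock_class {
    const char *name;
    enum wait_type outer; /* what taking this lock requires of the context */
    enum wait_type inner; /* what is still allowed while it is held */
};

/* Locks held by irq/16-PCIe PME in the report, in acquisition order. */
static const struct lock_class held_in_report[] = {
    { "local_bh",      WAIT_FREE,   WAIT_CONFIG },
    { "rcu_read_lock", WAIT_FREE,   WAIT_CONFIG },
    { "&data->lock",   WAIT_CONFIG, WAIT_CONFIG },
    { "rcu_read_lock", WAIT_FREE,   WAIT_CONFIG },
    { "pci_lock",      WAIT_SPIN,   WAIT_SPIN   },
};

/*
 * Return true if `next` may be acquired while held[0..n) are held.
 * The most restrictive inner type among the held locks is the limit,
 * and the next lock's outer type must not exceed it.
 */
static bool wait_context_ok(const struct lock_class *held, size_t n,
                            const struct lock_class *next)
{
    enum wait_type limit = WAIT_SLEEP; /* a plain thread may sleep */

    for (size_t i = 0; i < n; i++) {
        if (held[i].inner != WAIT_INV && held[i].inner < limit)
            limit = held[i].inner;
    }
    return next->outer <= limit;
}
```

In this model, `pci_lock` lowers the limit to 2, so an `inject_lock` with wait type `{3:3}` is rejected, while the same lock with wait type `{2:2}` (a raw spinlock) is accepted.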
To fix this, convert `inject_lock` from a `spinlock_t` to a
`raw_spinlock_t`. A raw spinlock never sleeps, so it is safe to take
under `pci_lock` and consistent with the surrounding locking scheme.
This resolves the lockdep “Invalid wait context” warning observed when
injecting correctable AER errors through `/dev/aer_inject` on PREEMPT_RT.
This was discovered while testing PCIe AER error injection on an arm64
QEMU virtual machine:
```
qemu-system-aarch64 \
-nographic \
-machine virt,highmem=off,gic-version=3 \
-cpu cortex-a72 \
-kernel arch/arm64/boot/Image \
-initrd initramfs.cpio.gz \
-append "console=ttyAMA0 root=/dev/ram rdinit=/linuxrc earlyprintk nokaslr" \
-m 2G \
-smp 1 \
-netdev user,id=net0,hostfwd=tcp::2223-:22 \
-device virtio-net-pci,netdev=net0 \
-device pcie-root-port,id=rp0,chassis=1,slot=0x0 \
-device pci-testdev -s -S
```
Injecting a correctable PCIe error via /dev/aer_inject caused a BUG
report with "Invalid wait context" in the irq/PCIe thread.
```
~ # export HEX="00020000000000000100000000000000000000000000000000000000"
~ # echo -n "$HEX" | xxd -r -p | tee /dev/aer_inject >/dev/null
[ 1850.947170] pcieport 0000:00:02.0: aer_inject: Injecting errors 00000001/00000000 into device 0000:00:02.0
[ 1850.949951]
[ 1850.950479] =============================
[ 1850.950780] [ BUG: Invalid wait context ]
[ 1850.951152] 6.17.0-11316-g7a405dbb0f03-dirty #7 Not tainted
[ 1850.951457] -----------------------------
[ 1850.951680] irq/16-PCIe PME/56 is trying to lock:
[ 1850.952004] ffff800082865238 (inject_lock){+.+.}-{3:3}, at: aer_inj_read_config+0x38/0x1dc
[ 1850.952731] other info that might help us debug this:
[ 1850.952997] context-{5:5}
[ 1850.953192] 5 locks held by irq/16-PCIe PME/56:
[ 1850.953415] #0: ffff800082647390 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x30/0x268
[ 1850.953931] #1: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
[ 1850.954453] #2: ffff000004bb6c58 (&data->lock){+...}-{3:3}, at: pcie_pme_irq+0x34/0xc4
[ 1850.954949] #3: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
[ 1850.955420] #4: ffff800082863d10 (pci_lock){....}-{2:2}, at: pci_bus_read_config_dword+0x5c/0xd8
[ 1850.955932] stack backtrace:
[ 1850.956412] CPU: 0 UID: 0 PID: 56 Comm: irq/16-PCIe PME Not tainted 6.17.0-11316-g7a405dbb0f03-dirty #7 PREEMPT_{RT,(full)}
[ 1850.957039] Hardware name: linux,dummy-virt (DT)
[ 1850.957409] Call trace:
[ 1850.957727] show_stack+0x18/0x24 (C)
[ 1850.958089] dump_stack_lvl+0x40/0xbc
[ 1850.958339] dump_stack+0x18/0x24
[ 1850.958586] __lock_acquire+0xa84/0x3008
[ 1850.958907] lock_acquire+0x128/0x2a8
[ 1850.959171] rt_spin_lock+0x50/0x1b8
[ 1850.959476] aer_inj_read_config+0x38/0x1dc
[ 1850.959821] pci_bus_read_config_dword+0x80/0xd8
[ 1850.960079] pcie_capability_read_dword+0xac/0xd8
[ 1850.960454] pcie_pme_irq+0x44/0xc4
[ 1850.960728] irq_forced_thread_fn+0x30/0x94
[ 1850.960984] irq_thread+0x1ac/0x3a4
[ 1850.961308] kthread+0x1b4/0x208
[ 1850.961557] ret_from_fork+0x10/0x20
[ 1850.963088] pcieport 0000:00:02.0: AER: Correctable error message received from 0000:00:02.0
[ 1850.963330] pcieport 0000:00:02.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
[ 1850.963351] pcieport 0000:00:02.0: device [1b36:000c] error status/mask=00000001/0000e000
[ 1850.963385] pcieport 0000:00:02.0: [ 0] RxErr (First)
```
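For reference, the `HEX` payload used in the reproduction can be rebuilt programmatically. The sketch below assumes the payload follows the 28-byte legacy layout of the driver's `struct aer_error_inj` (bus, dev, fn, one padding byte, then little-endian `uncor_status`, `cor_status`, and four header log words, with no trailing domain field). That layout is an assumption taken from the driver source, not from this commit, and the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed payload size without the optional trailing domain field. */
#define AER_INJ_LEGACY_SIZE 28

/* Store a 32-bit value in little-endian byte order, independent of the host. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Fill `out` with an injection request; header log words stay zero. */
static void build_aer_payload(uint8_t bus, uint8_t dev, uint8_t fn,
                              uint32_t uncor_status, uint32_t cor_status,
                              uint8_t out[AER_INJ_LEGACY_SIZE])
{
    memset(out, 0, AER_INJ_LEGACY_SIZE);
    out[0] = bus;
    out[1] = dev;
    out[2] = fn;                     /* out[3] is structure padding */
    put_le32(out + 4, uncor_status); /* uncorrectable error status bits */
    put_le32(out + 8, cor_status);   /* correctable error status bits */
}

/* Hex-encode `len` bytes into `hex`, which must hold 2 * len + 1 chars. */
static void to_hex(const uint8_t *buf, size_t len, char *hex)
{
    for (size_t i = 0; i < len; i++)
        snprintf(hex + 2 * i, 3, "%02x", buf[i]);
}
```

Calling `build_aer_payload(0x00, 0x02, 0x00, 0, 0x00000001, buf)` and hex-encoding the result should give the string assigned to `HEX` above: device 00:02.0 with correctable status bit 0 set, which matches the `Injecting errors 00000001/00000000` line in the log.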
Signed-off-by: Guangbo Cui <jckeep.cuiguangbo@gmail.com>

Parent: fd94619
1 file changed, 11 insertions(+), 21 deletions(-)