Commit 9556318

tg3: Disable tg3 PCIe AER on system reboot
JIRA: https://issues.redhat.com/browse/RHEL-63747
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git

commit e0efe83
Author: Lenny Szubowicz <lszubowi@redhat.com>
Date: Thu Jan 30 16:57:54 2025 -0500

    tg3: Disable tg3 PCIe AER on system reboot

    Disable PCIe AER on the tg3 device on system reboot on a limited
    list of Dell PowerEdge systems. This prevents a fatal PCIe AER event
    on the tg3 device during the ACPI _PTS (prepare to sleep) method for
    S5 on those systems. The _PTS is invoked by
    acpi_enter_sleep_state_prep() as part of the kernel's reboot
    sequence as a result of commit 38f34db ("PM: ACPI: reboot: Reinstate
    S5 for reboot").

    There was an earlier fix for this problem by commit 2ca1c94 ("tg3:
    Disable tg3 device on system reboot to avoid triggering AER"). But
    it was discovered that this earlier fix caused a reboot hang when
    some Dell PowerEdge servers were booted via ipxe. To address this
    reboot hang, the earlier fix was essentially reverted by commit
    9fc3bc7 ("tg3: power down device only on SYSTEM_POWER_OFF"). This
    re-exposed the tg3 PCIe AER on reboot problem.

    This fix is not an ideal solution because the root cause of the AER
    is in system firmware. Instead, it's a targeted work-around in the
    tg3 driver.

    Note also that the PCIe AER must be disabled on the tg3 device even
    if the system is configured to use "firmware first" error handling.

    V3:
       - Fix sparse warning on improper comparison of pdev->current_state
       - Adhere to netdev comment style

    Fixes: 9fc3bc7 ("tg3: power down device only on SYSTEM_POWER_OFF")
    Signed-off-by: Lenny Szubowicz <lszubowi@redhat.com>
    Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Lenny Szubowicz <lszubowi@redhat.com>
1 parent 7fbcfd8 commit 9556318

drivers/net/ethernet/broadcom/tg3.c

Lines changed: 58 additions & 0 deletions
@@ -55,6 +55,7 @@
 #include <linux/hwmon.h>
 #include <linux/hwmon-sysfs.h>
 #include <linux/crc32poly.h>
+#include <linux/dmi.h>
 
 #include <net/checksum.h>
 #include <net/gso.h>
@@ -18151,6 +18152,50 @@ static int tg3_resume(struct device *device)
 
 static SIMPLE_DEV_PM_OPS(tg3_pm_ops, tg3_suspend, tg3_resume);
 
+/* Systems where ACPI _PTS (Prepare To Sleep) S5 will result in a fatal
+ * PCIe AER event on the tg3 device if the tg3 device is not, or cannot
+ * be, powered down.
+ */
+static const struct dmi_system_id tg3_restart_aer_quirk_table[] = {
+	{
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R440"),
+		},
+	},
+	{
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R540"),
+		},
+	},
+	{
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R640"),
+		},
+	},
+	{
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R650"),
+		},
+	},
+	{
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R740"),
+		},
+	},
+	{
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R750"),
+		},
+	},
+	{}
+};
+
 static void tg3_shutdown(struct pci_dev *pdev)
 {
 	struct net_device *dev = pci_get_drvdata(pdev);
@@ -18167,6 +18212,19 @@ static void tg3_shutdown(struct pci_dev *pdev)
 
 	if (system_state == SYSTEM_POWER_OFF)
 		tg3_power_down(tp);
+	else if (system_state == SYSTEM_RESTART &&
+		 dmi_first_match(tg3_restart_aer_quirk_table) &&
+		 pdev->current_state != PCI_D3cold &&
+		 pdev->current_state != PCI_UNKNOWN) {
+		/* Disable PCIe AER on the tg3 to avoid a fatal
+		 * error during this system restart.
+		 */
+		pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL,
+					   PCI_EXP_DEVCTL_CERE |
+					   PCI_EXP_DEVCTL_NFERE |
+					   PCI_EXP_DEVCTL_FERE |
+					   PCI_EXP_DEVCTL_URRE);
+	}
 
 	rtnl_unlock();
 
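
For readers who want to check the new branch outside the kernel, the following is a minimal user-space sketch of the decision it makes. It is not driver code. Every `sketch_*` name and both enums are hypothetical stand-ins for `dmi_first_match()`, `system_state` and `pdev->current_state`. The `PCI_EXP_DEVCTL_*` values are copied from include/uapi/linux/pci_regs.h, and `strstr()` is used on the assumption that `DMI_MATCH` compares by substring. The power-off path through `tg3_power_down()` is not modeled; the sketch only returns the Device Control value unchanged in that case.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Error-reporting enable bits in the PCIe Device Control register
 * (same values as include/uapi/linux/pci_regs.h).
 */
#define PCI_EXP_DEVCTL_CERE	0x0001	/* Correctable Error Reporting Enable */
#define PCI_EXP_DEVCTL_NFERE	0x0002	/* Non-Fatal Error Reporting Enable */
#define PCI_EXP_DEVCTL_FERE	0x0004	/* Fatal Error Reporting Enable */
#define PCI_EXP_DEVCTL_URRE	0x0008	/* Unsupported Request Reporting Enable */

/* Hypothetical stand-ins for kernel state, for illustration only. */
enum sketch_sys_state { SKETCH_RUNNING, SKETCH_POWER_OFF, SKETCH_RESTART };
enum sketch_pci_state { SKETCH_D0, SKETCH_D3COLD, SKETCH_PCI_UNKNOWN };

struct sketch_quirk {
	const char *vendor;
	const char *product;
};

/* Same platforms as tg3_restart_aer_quirk_table; a NULL entry terminates. */
static const struct sketch_quirk sketch_quirk_table[] = {
	{ "Dell Inc.", "PowerEdge R440" },
	{ "Dell Inc.", "PowerEdge R540" },
	{ "Dell Inc.", "PowerEdge R640" },
	{ "Dell Inc.", "PowerEdge R650" },
	{ "Dell Inc.", "PowerEdge R740" },
	{ "Dell Inc.", "PowerEdge R750" },
	{ NULL, NULL }
};

/* Return true if both platform strings match any table entry.
 * Assumption: substring comparison, as DMI_MATCH is understood to do.
 */
static bool sketch_quirk_match(const char *vendor, const char *product)
{
	const struct sketch_quirk *q;

	for (q = sketch_quirk_table; q->vendor; q++) {
		if (strstr(vendor, q->vendor) && strstr(product, q->product))
			return true;
	}
	return false;
}

/* Compute the Device Control value the new restart branch would leave. */
static uint16_t sketch_devctl_on_shutdown(uint16_t devctl,
					  enum sketch_sys_state sys,
					  enum sketch_pci_state pci,
					  const char *vendor,
					  const char *product)
{
	const uint16_t aer_enables = PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE |
				     PCI_EXP_DEVCTL_FERE | PCI_EXP_DEVCTL_URRE;

	if (sys == SKETCH_RESTART &&
	    sketch_quirk_match(vendor, product) &&
	    pci != SKETCH_D3COLD && pci != SKETCH_PCI_UNKNOWN)
		return devctl & (uint16_t)~aer_enables;

	return devctl;	/* every other case leaves the register untouched */
}
```

Calling `sketch_devctl_on_shutdown(0x201f, SKETCH_RESTART, SKETCH_D0, "Dell Inc.", "PowerEdge R740")` clears only the low four enable bits and returns `0x2010`. The same call with an unlisted product name, a device in D3cold, or a power-off state returns `0x201f` unchanged.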
