group.of.nepali.translators team mailing list archive
Message #10127
[Bug 1652018] Re: PowerNV: PCI Slot is invalid after fencedPHB Error injection
** Also affects: linux (Ubuntu Xenial)
Importance: Undecided
Status: New
** Also affects: linux (Ubuntu Yakkety)
Importance: Undecided
Status: New
** Changed in: linux (Ubuntu Xenial)
Status: New => Fix Committed
** Changed in: linux (Ubuntu Yakkety)
Status: New => Fix Committed
--
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1652018
Title:
PowerNV: PCI Slot is invalid after fencedPHB Error injection
Status in linux package in Ubuntu:
In Progress
Status in linux source package in Xenial:
Fix Committed
Status in linux source package in Yakkety:
Fix Committed
Bug description:
== Comment: #0 - Pridhiviraj Paidipeddi <ppaidipe@xxxxxxxxxx> - 2016-12-21 01:16:41 ==
---Problem Description---
The PCI slot is left in an invalid state after the fencedPHB error injection test.
Contact Information = ppaidipe@xxxxxxxxxx
---uname output---
Linux brigstrat1p1 4.4.0-57-generic #78-Ubuntu SMP Fri Dec 9 23:46:13 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux
Machine Type = PowerNV CSE-829U
---Debugger---
A debugger is not configured
---Steps to Reproduce---
1. Boot the system to runtime.
2. Inject fencedPHB Error.
echo 0x8000000000000000 > /sys/kernel/debug/powerpc/PCI0002/err_injct_outbound
dmesg:
[42725.641368] EEH: PHB#2 failure detected, location: N/A
[42725.641450] CPU: 8 PID: 898 Comm: kworker/u320:1 Not tainted 4.4.0-57-generic #78-Ubuntu
[42725.641461] Workqueue: i40e i40e_service_task [i40e]
[42725.641464] Call Trace:
[42725.641469] [c00000000407f9e0] [c000000000b13b4c] dump_stack+0xb0/0xf0 (unreliable)
[42725.641474] [c00000000407fa20] [c0000000000376e0] eeh_dev_check_failure+0x200/0x580
[42725.641477] [c00000000407fac0] [c000000000037ae4] eeh_check_failure+0x84/0xd0
[42725.641485] [c00000000407fb00] [d000000035845710] i40e_service_task+0x17b0/0x1a30 [i40e]
[42725.641489] [c00000000407fc50] [c0000000000dde10] process_one_work+0x1e0/0x5a0
[42725.641492] [c00000000407fce0] [c0000000000de364] worker_thread+0x194/0x680
[42725.641496] [c00000000407fd80] [c0000000000e6e60] kthread+0x110/0x130
[42725.641499] [c00000000407fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
[42725.641509] EEH: Detected error on PHB#2
[42725.641514] EEH: This PCI device has failed 1 times in the last hour
[42725.641516] EEH: Notify device drivers to shutdown
[42725.641523] i40e 0002:01:00.0: i40e_pci_error_detected: error 2
[42725.641907] i40e 0002:01:00.0: VSI seid 396 Tx ring 0 disable timeout
[42725.642144] i40e 0002:01:00.0: VSI seid 396 Rx ring 0 disable timeout
[42725.666205] i40e 0002:01:00.1: i40e_pci_error_detected: error 2
[42725.666499] i40e 0002:01:00.2: i40e_pci_error_detected: error 2
[42725.666533] i40e 0002:01:00.0: ARQ event error -32
[42725.666601] i40e 0002:01:00.3: i40e_pci_error_detected: error 2
[42725.666700] EEH: Collect temporary log
[42725.666702] PHB3 PHB#2 Diag-data (Version: 1)
[42725.666703] brdgCtl: 0000ffff
[42725.666704] UtlSts: 00100000 00000000 00000000
[42725.666706] RootSts: ffffffff ffffffff ffffffff ffffffff 0000ffff
[42725.666707] RootErrSts: ffffffff ffffffff ffffffff
[42725.666708] RootErrLog: ffffffff ffffffff ffffffff ffffffff
[42725.666709] RootErrLog1: ffffffff 0000000000000000 0000000000000000
[42725.666711] nFir: 0000808000000000 0030006e00000000 0000800000000000
[42725.666712] PhbSts: 0000001800000000 0000001800000000
[42725.666713] Lem: 8000020000800000 42498e367f502eae 8000000000000000
[42725.666715] OutErr: 8000002000000000 8000000000000000 120800600003fffe 402002a800000000
[42725.666716] InBErr: 0000000040000000 0000000040000000 0000080000000000 000c10c010010000
[42725.666718] EEH: Reset without hotplug activity
[42730.052455] EEH: Notify device drivers the completion of reset
[42730.053334] EEH: Notify device driver to resume
[42730.184457] i40e 0002:01:00.0 enP2p1s0f0: NIC Link is Down
[42731.568230] i40e 0002:01:00.0 enP2p1s0f0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
OPAL LOG:
[42990.475630456,7] PHB#0002: CRESET: Starts
[42990.482717333,7] PHB#0002: CRESET: No pending transactions
[42991.023963215,7] PHB#0002: CRESET: Reinitialization
[42991.023964143,7] PHB#0002: Initializing PHB...
[42991.075167078,7] PHB#0002: Core revision 0xa30005
[42991.075171529,7] PHB#0002: Default system config: 0x421100fc30000000
[42991.075172655,7] PHB#0002: New system config : 0x421000fc30000000
[42991.075174000,7] PHB#0002: PHB_RESET is 0x2000000000000000
[42991.075410938,7] PHB#0002: Waiting for DLP PG reset to complete...
[42991.083713914,7] PHB#0002: Initialization complete
[42991.136599535,7] PHB#0002: FRESET: Starts
[42991.136600954,7] PHB#0002: FRESET: Prepare for link down
[42991.136602933,7] PHB#0002: FRESET: Assert
[42992.138625290,7] PHB#0002: FRESET: Deassert
[42993.140657592,7] PHB#0002: LINK: Start polling
[42993.193893558,7] PHB#0002: LINK: Electrical link detected
[42993.247138072,7] PHB#0002: LINK: Link is up
[42993.247174237,3] PCI-SLOT-0000000000000002 Invalid state 00000000
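The failing entry in the OPAL log above is the only one logged at level 3; the reset progress messages are all at level 7. A minimal sketch of picking such entries out of a captured log follows. It assumes the `[timestamp,level] message` layout seen in this excerpt and assumes that a lower level number means a more severe message; `find_opal_errors` and `OPAL_LINE` are hypothetical names used only for illustration.

```python
import re

# Matches lines shaped like "[42993.247174237,3] message", as in the excerpt.
OPAL_LINE = re.compile(r"^\[(?P<ts>\d+\.\d+),(?P<level>\d+)\]\s+(?P<msg>.*)$")


def find_opal_errors(lines, max_level=3):
    """Return (timestamp, level, message) for entries at or below max_level.

    Assumption: lower numbers are more severe, so max_level=3 keeps the
    "Invalid state" line and drops the level-7 progress messages.
    Lines that do not match the expected layout are ignored.
    """
    hits = []
    for line in lines:
        match = OPAL_LINE.match(line.strip())
        if match and int(match.group("level")) <= max_level:
            # Keep the timestamp as text to avoid float rounding.
            hits.append((match.group("ts"), int(match.group("level")), match.group("msg")))
    return hits


sample = [
    "[42993.193893558,7] PHB#0002: LINK: Electrical link detected",
    "[42993.247138072,7] PHB#0002: LINK: Link is up",
    "[42993.247174237,3] PCI-SLOT-0000000000000002 Invalid state 00000000",
]

for ts, level, msg in find_opal_errors(sample):
    print(f"{ts} level={level} {msg}")
```

On a running PowerNV system the same function could be fed the lines of the firmware log, which is typically exposed at /sys/firmware/opal/msglog.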
== Comment: #2 - VIPIN K. PARASHAR <viparash@xxxxxxxxxx> - 2016-12-22 04:57:28 ==
$ git log fbce44d0ed42e465317 -1
commit fbce44d0ed42e4653172376f4dfeaa5710f06a27
Author: Gavin Shan <gwshan@xxxxxxxxxxxxxxxxxx>
Date:   Fri Jun 24 16:44:19 2016 +1000

    powerpc/powernv: Call opal_pci_poll() if needed

    When issuing PHB reset, OPAL API opal_pci_poll() is called to drive
    the state machine in OPAL forward. However, we needn't always call
    the function under some circumstances like reset deassert.

    This avoids calling opal_pci_poll() when OPAL_SUCCESS is returned
    from opal_pci_reset(). Except the overhead introduced by additional
    one unnecessary OPAL call, I didn't run into real issue because of
    this.

    Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@xxxxxxxxxx>
    Signed-off-by: Gavin Shan <gwshan@xxxxxxxxxxxxxxxxxx>
    Signed-off-by: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
$ git tag --contains fbce44d0e
v4.9
v4.9-rc1
v4.9-rc2
v4.9-rc3
v4.9-rc4
v4.9-rc5
v4.9-rc6
v4.9-rc7
v4.9-rc8
$
This issue is fixed by commit fbce44d0ed4, which is available in kernel version 4.9 (first tagged in v4.9-rc1).
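The rule stated in the commit message (only poll when the reset call has not already reported success) can be sketched as below. This is a Python simulation, not the kernel change itself: `FakeFirmware`, `poll_until_done` and `reset_phb` are hypothetical names, and the return-code convention (0 for done, a positive value for "poll again", a negative value for an error) is an assumption made for illustration.

```python
OPAL_SUCCESS = 0  # assumed convention: 0 = done, >0 = poll again, <0 = error


class FakeFirmware:
    """Hypothetical stand-in for the firmware; counts how often it is polled."""

    def __init__(self, reset_rc, poll_rcs):
        self.reset_rc = reset_rc
        self.poll_rcs = list(poll_rcs)
        self.poll_calls = 0

    def pci_reset(self):
        return self.reset_rc

    def pci_poll(self):
        self.poll_calls += 1
        # Once the scripted answers run out, report completion.
        return self.poll_rcs.pop(0) if self.poll_rcs else OPAL_SUCCESS


def poll_until_done(fw):
    """Drive the state machine forward until it reports done or an error."""
    while True:
        rc = fw.pci_poll()
        if rc <= 0:
            return rc
        # A real caller would wait here before polling again.


def reset_phb(fw):
    """Issue a reset and poll only if the reset is still in progress."""
    rc = fw.pci_reset()
    if rc > 0:
        rc = poll_until_done(fw)
    return rc


already_done = FakeFirmware(reset_rc=OPAL_SUCCESS, poll_rcs=[])
in_progress = FakeFirmware(reset_rc=10, poll_rcs=[10, 10, OPAL_SUCCESS])

print(reset_phb(already_done), already_done.poll_calls)  # prints "0 0"
print(reset_phb(in_progress), in_progress.poll_calls)    # prints "0 3"
```

In the first case the reset reports success immediately, so no poll call is made; in the second, polling continues until the simulated firmware reports completion.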
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652018/+subscriptions