kernel-packages team mailing list archive

[Bug 1584935] [NEW] Backport cxlflash patch related to EEH recovery into Xenial SRU stream

 

Public bug reported:

---Problem Description---
Request to backport cxlflash EEH patch to Xenial SRU
  
---System Hang---
 When a cxlflash adapter goes into EEH recovery and multiple processes
(each having established its own context) are active, the EEH recovery
can hang if the processes attempt to recover in parallel.
After debugging the problem, a patch for this issue has been upstreamed,
and the system has been rebooted with the fix applied.
  
---Steps to Reproduce---
 Injecting an EEH error while multiple processes are active can trigger
 the issue.
 
Stack trace output:
     Call Trace:
    __switch_to+0x2f0/0x410
    __schedule+0x300/0x980
    schedule+0x48/0xc0
    rwsem_down_write_failed+0x294/0x410
    down_write+0x88/0xb0
    cxlflash_pci_error_detected+0x100/0x1c0 [cxlflash]
    cxl_vphb_error_detected+0x88/0x110 [cxl]
    cxl_pci_error_detected+0xb0/0x1d0 [cxl]
    eeh_report_error+0xbc/0x130
    eeh_pe_dev_traverse+0x94/0x160
    eeh_handle_normal_event+0x17c/0x450
    eeh_handle_event+0x184/0x370
    eeh_event_handler+0x1c8/0x1d0
    kthread+0x110/0x130
    ret_from_kernel_thread+0x5c/0xa4
    INFO: task blockio:33215 blocked for more than 120 seconds.
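
The trace above shows the EEH handler thread sleeping in down_write(),
called from cxlflash_pci_error_detected(). As a rough illustration of
why a writer on a reader/writer semaphore can wait indefinitely, here is
a toy user-space model in Python. It is not the cxlflash code: the
ToyRWSem class, the demo() helper and the timeout parameter are all
hypothetical, and the idea that the recovering processes hold the read
side is an assumption made only for this sketch.

```python
import threading


class ToyRWSem:
    """Toy reader/writer semaphore (hypothetical; not the kernel rwsem)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def down_read(self):
        # Readers may enter as long as no writer holds the semaphore.
        with self._cond:
            self._cond.wait_for(lambda: not self._writer)
            self._readers += 1

    def up_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def down_write(self, timeout):
        # A writer needs exclusive access: no readers and no other writer.
        # The kernel's down_write() has no timeout; one is used here only
        # so the sketch terminates instead of hanging.
        with self._cond:
            free = self._cond.wait_for(
                lambda: self._readers == 0 and not self._writer, timeout)
            if free:
                self._writer = True
            return free

    def up_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


def demo():
    sem = ToyRWSem()
    # Assumption: two processes are mid-recovery and hold the read side.
    sem.down_read()
    sem.down_read()
    # The error handler asks for the write side and cannot get it.
    blocked = not sem.down_write(timeout=0.05)
    # Once both readers drop the semaphore, the writer gets through.
    sem.up_read()
    sem.up_read()
    acquired = sem.down_write(timeout=0.05)
    sem.up_write()
    return blocked, acquired
```

In this model the writer only proceeds after every reader has released
the semaphore. If the readers are themselves waiting on something the
writer must do first, neither side makes progress, which matches the
"blocked for more than 120 seconds" symptom in the trace.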

 
The upstream patch we need backported to the Xenial SRU stream is:

635f6b0893cff193a1774881ebb1e4a4b9a7fead
    cxlflash: Fix to resolve dead-lock during EEH recovery

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Taco Screen team (taco-screen-team)
         Status: New


** Tags: architecture-ppc64le bugnameltc-141713 severity-medium targetmilestone-inin16041

** Tags added: architecture-ppc64le bugnameltc-141713 severity-medium
targetmilestone-inin16041

** Changed in: ubuntu
     Assignee: (unassigned) => Taco Screen team (taco-screen-team)

** Package changed: ubuntu => linux (Ubuntu)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1584935

Title:
  Backport cxlflash patch related to EEH recovery into Xenial SRU stream

Status in linux package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1584935/+subscriptions

