
kernel-packages team mailing list archive

[Bug 1584935] Re: Backport cxlflash patch related to EEH recovery into Xenial SRU stream

 

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Yakkety)
   Importance: Undecided
     Assignee: Taco Screen team (taco-screen-team)
       Status: New

** Changed in: linux (Ubuntu Yakkety)
       Status: New => Fix Released

** Changed in: linux (Ubuntu Yakkety)
     Assignee: Taco Screen team (taco-screen-team) => (unassigned)

** Changed in: linux (Ubuntu Xenial)
       Status: New => In Progress

** Changed in: linux (Ubuntu Xenial)
     Assignee: (unassigned) => Tim Gardner (timg-tpi)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1584935

Title:
  Backport cxlflash patch related to EEH recovery into Xenial SRU stream

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Yakkety:
  Fix Released

Bug description:
  ---Problem Description---
  Request to backport cxlflash EEH patch to Xenial SRU
    
  ---System Hang---
   When a cxlflash adapter goes into EEH recovery and multiple processes
  (each having established its own context) are active, the EEH recovery
  can hang if the processes attempt to recover in parallel.
  After debugging, a patch for this issue was upstreamed, and the system
  has been rebooted with the fix applied.
    
  ---Steps to Reproduce---
   Injecting an EEH error while multiple processes are active can trigger the issue.
   
  Stack trace output:
       Call Trace:
      __switch_to+0x2f0/0x410
      __schedule+0x300/0x980
      schedule+0x48/0xc0
      rwsem_down_write_failed+0x294/0x410
      down_write+0x88/0xb0
      cxlflash_pci_error_detected+0x100/0x1c0 [cxlflash]
      cxl_vphb_error_detected+0x88/0x110 [cxl]
      cxl_pci_error_detected+0xb0/0x1d0 [cxl]
      eeh_report_error+0xbc/0x130
      eeh_pe_dev_traverse+0x94/0x160
      eeh_handle_normal_event+0x17c/0x450
      eeh_handle_event+0x184/0x370
      eeh_event_handler+0x1c8/0x1d0
      kthread+0x110/0x130
      ret_from_kernel_thread+0x5c/0xa4
      INFO: task blockio:33215 blocked for more than 120 seconds.

   
  The upstream patch we need backported to Xenial SRU stream is

  635f6b0893cff193a1774881ebb1e4a4b9a7fead
      cxlflash: Fix to resolve dead-lock during EEH recovery

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1584935/+subscriptions

