group.of.nepali.translators team mailing list archive

[Bug 1603449] Re: [LTCTest][Opal][OP820] Machine crashed with Oops: Kernel access of bad area, sig: 11 [#1] while executing Froze PE Error injection

 

This bug was fixed in the package linux - 4.4.0-36.55

---------------
linux (4.4.0-36.55) xenial; urgency=low

  [ Stefan Bader ]

  * Release Tracking Bug
    - LP: #1612305

  * I2C touchpad does not work on AMD platform (LP: #1612006)
    - SAUCE: pinctrl/amd: Remove the default de-bounce time

  * CVE-2016-5696
    - tcp: make challenge acks less predictable

linux (4.4.0-35.54) xenial; urgency=low

  [ Stefan Bader ]

  * Release Tracking Bug
    - LP: #1611215

  * [i915_bpo] Sync with v4.7 (LP: #1609742)
    - SAUCE: i915_bpo: Sync with v4.7

  * s390/cio: fix reset of channel measurement block (LP: #1609415)
    - s390/cio: allow to reset channel measurement block

  * in Ubuntu16.10: Hit on Call traces  and system goes down when transactional
    memory  tests are running in 32TB Brazos system (LP: #1606786)
    - powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0
    - powerpc/tm: Fix stack pointer corruption in __tm_recheckpoint()

  *  Power Menu does not display after press the Power Button (LP: #1609204)
    - intel-vbtn: new driver for Intel Virtual Button
    - [config] enable CONFIG_INTEL_VBTN=m

  * OptiPlex 7450 AIO hangs when rebooting (LP: #1608762)
    - x86/reboot: Add Dell Optiplex 7450 AIO reboot quirk

  * virtualbox+usb 3.0 breaks boot, -28 kernel works (LP: #1604058)
    - SAUCE: xhci: Fix soft lockup in xhci_pci_probe path when XHCI_STATE_HALTED

  * linux-kernel: Freeing IRQ from IRQ context (LP: #1597908)
    - block: defer timeouts to a workqueue

  * Tunnel offload indications not stripped from encapsulated packets, causing
    performance overhead (LP: #1602755)
    - tunnels: Remove encapsulation offloads on decap.

  * lm-sensors is throwing "ERROR: Can't get value of subfeature temp1_input:
    I/O error" for be2net driver (LP: #1607387)
    - be2net: perform temperature query in adapter regardless of its interface
      state

  * Dell dock MAC Address pass through doesn't work in Ubuntu (LP: #1579984)
    - r8152: Add support for setting pass through MAC address on RTL8153-AD

  * vmxnet3 LRO IPv6 performance issues (stalling TCP) (LP: #1605494)
    - Driver: Vmxnet3: set CHECKSUM_UNNECESSARY for IPv6 packets

  * ISST-LTE:pVM:monklp5:Ubuntu16.04.1:system crashed at
    lpfc_sli4_scmd_to_wqidx_distr (LP: #1597974)
    - SAUCE: lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from
      lpfc_send_taskmgmt()

  * Backport cxlflash shutdown patch to Xenial SRU (LP: #1605405)
    - SAUCE: cxlflash: Verify problem state area is mapped before notifying
      shutdown

  * Xenial update to v4.4.16 stable release (LP: #1607404)
    - mac80211: fix fast_tx header alignment
    - mac80211: mesh: flush mesh paths unconditionally
    - mac80211_hwsim: Add missing check for HWSIM_ATTR_SIGNAL
    - mac80211: Fix mesh estab_plinks counting in STA removal case
    - EDAC, sb_edac: Fix rank lookup on Broadwell
    - IB/cm: Fix a recently introduced locking bug
    - IB/mlx4: Properly initialize GRH TClass and FlowLabel in AHs
    - powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added
    - powerpc/tm: Always reclaim in start_thread() for exec() class syscalls
    - usb: dwc2: fix regression on big-endian PowerPC/ARM systems
    - USB: EHCI: declare hostpc register as zero-length array
    - usb: common: otg-fsm: add license to usb-otg-fsm
    - mnt: fs_fully_visible test the proper mount for MNT_LOCKED
    - mnt: Account for MS_RDONLY in fs_fully_visible
    - mnt: If fs_fully_visible fails call put_filesystem.
    - of: fix autoloading due to broken modalias with no 'compatible'
    - of: irq: fix of_irq_get[_byname]() kernel-doc
    - locking/ww_mutex: Report recursive ww_mutex locking early
    - locking/qspinlock: Fix spin_unlock_wait() some more
    - locking/static_key: Fix concurrent static_key_slow_inc()
    - x86, build: copy ldlinux.c32 to image.iso
    - kprobes/x86: Clear TF bit in fault on single-stepping
    - x86/amd_nb: Fix boot crash on non-AMD systems
    - Revert "gpiolib: Split GPIO flags parsing and GPIO configuration"
    - uvc: Forward compat ioctls to their handlers directly
    - thermal: cpu_cooling: fix improper order during initialization
    - writeback: use higher precision calculation in domain_dirty_limits()
    - nfsd4/rpc: move backchannel create logic into rpc code
    - nfsd: Always lock state exclusively.
    - nfsd: Extend the mutex holding region around in nfsd4_process_open2()
    - posix_acl: Add set_posix_acl
    - nfsd: check permissions when setting ACLs
    - make nfs_atomic_open() call d_drop() on all ->open_context() errors.
    - NFS: Fix another OPEN_DOWNGRADE bug
    - ARM: imx6ul: Fix Micrel PHY mask
    - ARM: 8578/1: mm: ensure pmd_present only checks the valid bit
    - ARM: 8579/1: mm: Fix definition of pmd_mknotpresent
    - MIPS: KVM: Fix modular KVM under QEMU
    - mm: Export migrate_page_move_mapping and migrate_page_copy
    - UBIFS: Implement ->migratepage()
    - sched/fair: Fix cfs_rq avg tracking underflow
    - packet: Use symmetric hash for PACKET_FANOUT_HASH.
    - net_sched: fix mirrored packets checksum
    - cdc_ncm: workaround for EM7455 "silent" data interface
    - ipv6: Fix mem leak in rt6i_pcpu
    - ARCv2: Check for LL-SC livelock only if LLSC is enabled
    - ARCv2: LLSC: software backoff is NOT needed starting HS2.1c
    - kvm: Fix irq route entries exceeding KVM_MAX_IRQ_ROUTES
    - KVM: nVMX: VMX instructions: fix segment checks when L1 is in long mode.
    - HID: elo: kill not flush the work
    - HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES commands
    - tracing: Handle NULL formats in hold_module_trace_bprintk_format()
    - base: make module_create_drivers_dir race-free
    - iommu/arm-smmu: Wire up map_sg for arm-smmu-v3
    - iommu/vt-d: Enable QI on all IOMMUs before setting root entry
    - iommu/amd: Fix unity mapping initialization race
    - drm/mgag200: Black screen fix for G200e rev 4
    - ipmi: Remove smi_msg from waiting_rcv_msgs list before handle_one_recv_msg()
    - arm64: Rework valid_user_regs
    - vfs: add d_real_inode() helper
    - af_unix: fix hard linked sockets on overlay
    - btrfs: account for non-CoW'd blocks in btrfs_abort_transaction
    - drm/radeon: fix asic initialization for virtualized environments
    - drm/amdgpu/gfx7: fix broken condition check
    - ubi: Make recover_peb power cut aware
    - drm/amdkfd: unbind only existing processes
    - drm/amdkfd: destroy dbgmgr in notifier release
    - drm/dp/mst: Always clear proposed vcpi table for port.
    - drm/nouveau/disp/sor/gf119: both links use the same training register
    - drm/nouveau/gr/gf100-: update sm error decoding from gk20a nvgpu headers
    - drm/nouveau/fbcon: fix out-of-bounds memory accesses
    - drm/nouveau: fix for disabled fbdev emulation
    - drm/nouveau/disp/sor/gf119: select correct sor when poking training pattern
    - drm/i915/ilk: Don't disable SSC source if it's in use
    - drm/i915: Refresh cached DP port register value on resume
    - drm/i915: Update ifdeffery for mutex->owner
    - drm/i915: Update CDCLK_FREQ register on BDW after changing cdclk frequency
    - drm: add missing drm_mode_set_crtcinfo call
    - drm: make drm_atomic_set_mode_prop_for_crtc() more reliable
    - drm: atmel-hlcdc: actually disable scaling when no scaling is required
    - drm/ttm: Make ttm_bo_mem_compat available
    - drm/vmwgfx: Add an option to change assumed FB bpp
    - drm/vmwgfx: Work around mode set failure in 2D VMs
    - drm/vmwgfx: Check pin count before attempting to move a buffer
    - drm/vmwgfx: Delay pinning fbdev framebuffer until after mode set
    - drm/vmwgfx: Fix error paths when mapping framebuffer
    - memory: omap-gpmc: Fix omap gpmc EXTRADELAY timing
    - perf/x86: Fix undefined shift on 32-bit kernels
    - xen/balloon: Fix declared-but-not-defined warning
    - iio: Fix error handling in iio_trigger_attach_poll_func
    - iio:st_pressure: fix sampling gains (bring inline with ABI)
    - iio: light apds9960: Add the missing dev.parent
    - iio: proximity: as3935: correct IIO_CHAN_INFO_RAW output
    - iio: proximity: as3935: remove triggered buffer processing
    - iio: proximity: as3935: fix buffer stack trashing
    - iio: humidity: hdc100x: correct humidity integration time mask
    - iio: humidity: hdc100x: fix IIO_TEMP channel reporting
    - iio: hudmidity: hdc100x: fix incorrect shifting and scaling
    - staging: iio: accel: fix error check
    - iio: accel: kxsd9: fix the usage of spi_w8r8()
    - iio:ad7266: Fix broken regulator error handling
    - iio:ad7266: Fix support for optional regulators
    - iio:ad7266: Fix probe deferral for vref
    - tty/vt/keyboard: fix OOB access in do_compute_shiftstate()
    - hwmon: (dell-smm) Restrict fan control and serial number to CAP_SYS_ADMIN by
      default
    - hwmon: (dell-smm) Disallow fan_type() calls on broken machines
    - hwmon: (dell-smm) Cache fan_type() calls and change fan detection
    - ALSA: dummy: Fix a use-after-free at closing
    - ALSA: hda - Fix the headset mic jack detection on Dell machine
    - ALSA: hda / realtek - add two more Thinkpad IDs (5050,5053) for tpt460 fixup
    - ALSA: au88x0: Fix calculation in vortex_wtdma_bufshift()
    - ALSA: echoaudio: Fix memory allocation
    - ALSA: timer: Fix negative queue usage by racy accesses
    - ALSA: hda/realtek: Add Lenovo L460 to docking unit fixup
    - ALSA: hda - Add PCI ID for Kabylake-H
    - ALSA: hda - fix read before array start
    - ALSA: hda/realtek - add new pin definition in alc225 pin quirk table
    - ALSA: pcm: Free chmap at PCM free callback, too
    - ALSA: ctl: Stop notification after disconnection
    - ALSA: hda - fix use-after-free after module unload
    - ALSA: hda: add AMD Stoney PCI ID with proper driver caps
    - ARM: sunxi/dt: make the CHIP inherit from allwinner,sun5i-a13
    - ARM: dts: armada-38x: fix MBUS_ID for crypto SRAM on Armada 385 Linksys
    - ARM: mvebu: fix HW I/O coherency related deadlocks
    - ovl: Copy up underlying inode's ->i_mode to overlay inode
    - ovl: verify upper dentry in ovl_remove_and_whiteout()
    - scsi: fix race between simultaneous decrements of ->host_failed
    - 53c700: fix BUG on untagged commands
    - Fix reconnect to not defer smb3 session reconnect long after socket
      reconnect
    - cifs: dynamic allocation of ntlmssp blob
    - File names with trailing period or space need special case conversion
    - xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7
    - crypto: qat - make qat_asym_algs.o depend on asn1 headers
    - tmpfs: don't undo fallocate past its last page
    - tmpfs: fix regression hang in fallocate undo
    - drm/i915: Revert DisplayPort fast link training feature
    - ovl: verify upper dentry before unlink and rename
    - Linux 4.4.16

  * Regression caused by `fuse: Add support for pid namespaces` in 4.4.0-6.21
    (LP: #1605344)
    - SAUCE: (namespace) fuse: Permit requests from other pid namespaces

  * CVE-2016-5400
    - media: fix airspy usb probe error path

  * Cannot mount proc in unprivileged containers if /proc/xen is mounted
    (LP: #1607374)
    - SAUCE: xenbus: Use proc_create_mount_point() to create /proc/xen

  * Mic mute key does not work for Ideapad laptops (LP: #1607153)
    - ideapad_laptop: Add an event for mic mute hotkey

  * NVMe stress test fails after 12 hours on Ubuntu 16.04 (LP: #1604995)
    - block: atari: Return early for unsupported sector size

  * Console extremely slow with 4.4 kernels for servers with Matrox G200er2 or
    similar (LP: #1605662)
    - SAUCE: vesafb: Set mtrr:3 (write-combining) as default

  * Ubuntu 16.04 - Full EEH Recovery Support for NVMe devices (LP: #1602724)
    - nvme: use a work item to submit async event requests
    - nvme: don't poll the CQ from the kthread
    - nvme: replace the kthread with a per-device watchdog timer
    - NVMe: Fix reset/remove race
    - nvme: Avoid reset work on watchdog timer function during error recovery
    - NVMe: Always use MSI/MSI-x interrupts

  * [LTC-Test] - NMI watchdog Bug and call traces when trinity is executed.
    (LP: #1602524)
    - ext4: factor out determining of hole size
    - ext4: return hole from ext4_map_blocks()
    - ext4: more efficient SEEK_DATA implementation

  * changelog: add CVEs as first class citizens (LP: #1604344)
    - avoid duplicate CVE numbers in changelog

  * [LTCTest][Opal][OP820] Machine crashed with Oops: Kernel access of bad area,
    sig: 11 [#1] while executing Froze PE Error injection (LP: #1603449)
    - powerpc/eeh: Fix invalid cached PE primary bus

  * Hotplug remove and re-add adds PCI adapter to next PCI domain (PCI)
    (LP: #1603574)
    - powerpc/pci: Assign fixed PHB number based on device-tree properties

  * nvme - reset_controller is not working after adapter's firmware upgrade
    (adapter quirk is needed) (LP: #1602726)
    - NVMe: Create discard zero quirk white list
    - nvme/quirk: Add a delay before checking for adapter readiness

  * ovs nat: conntrack netlink event are missing (LP: #1603468)
    - openvswitch: fix conntrack netlink event delivery

  * FlashGT - In Tuleta 8284-22A with card in card slot P1-C9, system Fails to
    boot operating system (LP: #1602785)
    - cxl: Ignore CAPI adapters misplaced in switched slots

  * CVE-2016-5728
    - misc: mic: Fix for double fetch security bug in VOP driver

  * CVE-2016-5244 (LP: #1589041)
    - rds: fix an infoleak in rds_inc_info_copy

  * Miscellaneous Ubuntu changes
    - Added Snapcraft files
    - SAUCE: snapcraft: cleanup and remove unnecessary elements

 -- Stefan Bader <stefan.bader@xxxxxxxxxxxxx>  Thu, 11 Aug 2016 17:34:14 +0200

** Changed in: linux (Ubuntu Xenial)
       Status: Fix Committed => Fix Released

** CVE added: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2016-5244

** CVE added: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2016-5400

** CVE added: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2016-5696

** CVE added: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2016-5728

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1603449

Title:
  [LTCTest][Opal][OP820] Machine crashed with Oops: Kernel access of bad
  area, sig: 11 [#1] while executing Froze PE Error injection

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Xenial:
  Fix Released

Bug description:
  == Comment: #0 - PAVAMAN SUBRAMANIYAM <pavsubra@xxxxxxxxxx> - 2016-07-13 01:28:56 ==
  ---Problem Description---
  Machine crashed with Oops: Kernel access of bad area, sig: 11 [#1]
   
  ---uname output---
  Linux ltc-garri2 4.4.0-30-generic #49-Ubuntu SMP Fri Jul 1 10:00:36 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux
   
  ---Additional Hardware Info---
  root@ltc-garri2:~# lspci
  0000:00:00.0 PCI bridge: IBM Device 03dc
  0000:01:00.0 Infiniband controller: Mellanox Technologies MT27600 [Connect-IB]
  0001:00:00.0 PCI bridge: IBM Device 03dc
  0002:00:00.0 PCI bridge: IBM Device 03dc
  0002:01:00.0 3D controller: NVIDIA Corporation Device 15fe (rev a1)
  0003:00:00.0 PCI bridge: IBM Device 03dc
  0004:00:00.0 PCI bridge: IBM Device 03dc
  0005:00:00.0 PCI bridge: IBM Device 03dc
  0005:01:00.0 PCI bridge: PLX Technology, Inc. PEX 8718 16-Lane, 5-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ab)
  0005:02:01.0 PCI bridge: PLX Technology, Inc. PEX 8718 16-Lane, 5-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ab)
  0005:02:02.0 PCI bridge: PLX Technology, Inc. PEX 8718 16-Lane, 5-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ab)
  0005:02:03.0 PCI bridge: PLX Technology, Inc. PEX 8718 16-Lane, 5-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ab)
  0005:02:04.0 PCI bridge: PLX Technology, Inc. PEX 8718 16-Lane, 5-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ab)
  0005:03:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02)
  0005:04:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9235 PCIe 2.0 x2 4-port SATA 6 Gb/s Controller (rev 11)
  0005:05:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 03)
  0005:06:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 30)
  0005:07:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5718 Gigabit Ethernet PCIe (rev 10)
  0005:07:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5718 Gigabit Ethernet PCIe (rev 10)
  0006:00:00.0 PCI bridge: IBM Device 03dc
  0006:01:00.0 3D controller: NVIDIA Corporation Device 15fe (rev a1)
  0007:00:00.0 PCI bridge: IBM Device 03dc
  0008:00:00.0 Bridge: IBM Device 04ea
  0008:00:00.1 Bridge: IBM Device 04ea
  0008:00:01.0 Bridge: IBM Device 04ea
  0008:00:01.1 Bridge: IBM Device 04ea
  0009:00:00.0 Bridge: IBM Device 04ea
  0009:00:00.1 Bridge: IBM Device 04ea
  0009:00:01.0 Bridge: IBM Device 04ea
  0009:00:01.1 Bridge: IBM Device 04ea
   

   
  Machine Type = P8 
   
  ---Debugger---
  A debugger is not configured
   
  ---Steps to Reproduce---
  Install Ubuntu 16.04.1 on a P8 OpenPOWER 8335-GTB machine,
  then execute the Frozen PE error injection test as shown below:

  root@ltc-garri2:~# lspci | grep -i 0004:00:00.0
  0004:00:00.0 PCI bridge: IBM Device 03dc
  root@ltc-garri2:~# cat /proc/powerpc/eeh | tail -n 1
  eeh_slot_resets=0
  root@ltc-garri2:~# echo 0:0:4:0:0 > /sys/kernel/debug/powerpc/PCI0004/err_injct && lspci -ns 0004:00:00.0; echo $?
  0004:00:00.0 0604: 1014:03dc
  0

  The kernel immediately crashes with an Oops message.
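
The steps above amount to: note the eeh_slot_resets counter, write an injection request to err_injct for PHB 4, then read the device's config space so the frozen PE is detected. As a reading aid, here is a minimal Python sketch that parses the two strings involved. The field order pe_no:type:func:addr:mask is an assumption about how the powernv debugfs handler interprets err_injct; it is not stated in this report. The names InjectSpec, parse_inject_spec and parse_eeh_counters are hypothetical helpers.

```python
from typing import Dict, NamedTuple


class InjectSpec(NamedTuple):
    # Field names and order are an assumption (see the note above).
    pe_no: int
    err_type: int
    func: int
    addr: int
    mask: int


def parse_inject_spec(text: str) -> InjectSpec:
    """Split a colon-separated err_injct request into five hex fields."""
    parts = text.strip().split(":")
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return InjectSpec(*(int(p, 16) for p in parts))


def parse_eeh_counters(text: str) -> Dict[str, int]:
    """Collect 'name=value' lines such as 'eeh_slot_resets=0'."""
    counters: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and value.strip().isdigit():
            counters[name.strip()] = int(value)
    return counters


# The request string used in the report.
spec = parse_inject_spec("0:0:4:0:0")
# The last line of /proc/powerpc/eeh as shown in the report.
counters = parse_eeh_counters("eeh_slot_resets=0\n")
```

Under that assumption, the request in the report targets PE 0, which matches the "Frozen PE#0 on PHB#4" message in the Oops output.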
   
  Contact Information = pavsubra@xxxxxxxxxx 
   
  Stack trace output:
  [  289.297946] Call Trace:
  [  289.297969] [c000000feeb8b9e0] [c000000000083c78] pnv_eeh_reset+0x58/0x170 (unreliable)
  [  289.298042] [c000000feeb8ba60] [c000000000038250] eeh_reset_pe+0xb0/0x1c0
  [  289.298105] [c000000feeb8bb00] [c000000000af444c] eeh_reset_device+0xd8/0x228
  [  289.298165] [c000000feeb8bba0] [c00000000003c520] eeh_handle_normal_event+0x390/0x440
  [  289.298234] [c000000feeb8bc20] [c00000000003c9c4] eeh_handle_event+0x184/0x370
  [  289.298304] [c000000feeb8bcd0] [c00000000003cd88] eeh_event_handler+0x1d8/0x1e0
  [  289.298374] [c000000feeb8bd80] [c0000000000e6420] kthread+0x110/0x130
  [  289.298434] [c000000feeb8be30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
  [  289.298501] Instruction dump:
  [  289.298531] 60000000 813f0000 ebdf0010 792affe3 408200d4 e95e0250 812a000c 2f890002
  [  289.298630] 419e0054 7fe3fb78 4bfb70c5 60000000 <e9230010> 2fa90000 419e00dc e9290010

   
  Oops output:
  [  289.294622] EEH: Frozen PE#0 on PHB#4 detected
  [  289.294785] EEH: PE location: N/A, PHB location: N/A
  [  289.295598] EEH: This PCI device has failed 1 times in the last hour
  [  289.295600] EEH: Notify device drivers to shutdown
  [  289.295605] EEH: Collect temporary log
  [  289.295632] EEH: of node=0004:00:00:0
  [  289.295635] EEH: PCI device/vendor: 03dc1014
  [  289.295638] EEH: PCI cmd/status register: 00100106
  [  289.295641] EEH: Bridge secondary status: 0000
  [  289.295644] EEH: Bridge control: 0002
  [  289.295645] EEH: PCI-E capabilities and status follow:
  [  289.295654] EEH: PCI-E 00: 00420010 00008002 00000040 00300103
  [  289.295661] EEH: PCI-E 10: 01010008 00000000 00000000 00010010
  [  289.295664] EEH: PCI-E 20: 00000000
  [  289.295665] EEH: PCI-E AER capability register set follows:
  [  289.295674] EEH: PCI-E AER 00: 14810001 00000000 0008d000 00000000
  [  289.295680] EEH: PCI-E AER 10: 00000000 00000000 000001e0 00000000
  [  289.295687] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000
  [  289.295690] EEH: PCI-E AER 30: 00000000 00000000
  [  289.295693] PHB3 PHB#4 Diag-data (Version: 1)
  [  289.295695] brdgCtl:     00000002
  [  289.295697] UtlSts:      00080000 00000000 00000000
  [  289.295699] RootSts:     00000040 00000000 01010008 00100102 00000000
  [  289.295701] PhbSts:      0000001c00000000 0000001c00000000
  [  289.295704] Lem:         0000000000100000 42498e367f502eae 0000000000000000
  [  289.295706] InAErr:      4000000000000000 4000000000000000 0202000000000000 0000000000000000
  [  289.295708] PE[  0] A/B: 8440002b00000000 8000000000000000
  [  289.295711] EEH: Reset with hotplug activity
  [  289.295726] pci_bus 0004:01: busn_res: [bus 01] is released
  [  289.295868] Unable to handle kernel paging request for data at address 0x00000010
  [  289.295937] Faulting instruction address: 0xc000000000083c7c
  [  289.295997] Oops: Kernel access of bad area, sig: 11 [#1]
  [  289.296043] SMP NR_CPUS=2048 NUMA PowerNV
  [  289.296098] Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables x_tables ipmi_devintf input_leds joydev mac_hid hid_generic usbhid hid nvidia(POE) opal_prd ofpart cmdlinepart ibmpowernv at24 powernv_flash uio_pdrv_genirq ipmi_powernv mtd ipmi_msghandler powernv_rng uio autofs4 uas usb_storage ast i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ahci libahci mlx5_core
  [  289.296657] CPU: 1 PID: 651 Comm: eehd Tainted: P           OE   4.4.0-30-generic #49-Ubuntu
  [  289.296726] task: c000000feeb02a20 ti: c000000feeb88000 task.ti: c000000feeb88000
  [  289.296787] NIP: c000000000083c7c LR: c000000000083c78 CTR: c000000000083c20
  [  289.296848] REGS: c000000feeb8b760 TRAP: 0300   Tainted: P           OE    (4.4.0-30-generic)
  [  289.296915] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 28008822  XER: 00000000
  [  289.297065] CFAR: c000000000008468 DAR: 0000000000000010 DSISR: 40000000 SOFTE: 1
                 GPR00: c000000000083c78 c000000feeb8b9e0 c0000000015b5d00 0000000000000000
                 GPR04: 0000000000000001 c000000feeb8bac0 c000001e4e693540 0000000000000ff7
                 GPR08: 0000000000000000 0000000000000000 0000000000000000 000000000000001c
                 GPR12: c000000000083c20 c000000007b20980 c0000000000e6318 c000001e4e7a0340
                 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
                 GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000000d42468
                 GPR24: c000000000d42440 0000000000000100 c000000000036460 0000000000000000
                 GPR28: c00000000161a3f0 0000000000000001 c000001fff764480 c000001e4e744000
  [  289.297867] NIP [c000000000083c7c] pnv_eeh_reset+0x5c/0x170
  [  289.297907] LR [c000000000083c78] pnv_eeh_reset+0x58/0x170
  [  289.297946] Call Trace:
  [  289.297969] [c000000feeb8b9e0] [c000000000083c78] pnv_eeh_reset+0x58/0x170 (unreliable)
  [  289.298042] [c000000feeb8ba60] [c000000000038250] eeh_reset_pe+0xb0/0x1c0
  [  289.298105] [c000000feeb8bb00] [c000000000af444c] eeh_reset_device+0xd8/0x228
  [  289.298165] [c000000feeb8bba0] [c00000000003c520] eeh_handle_normal_event+0x390/0x440
  [  289.298234] [c000000feeb8bc20] [c00000000003c9c4] eeh_handle_event+0x184/0x370
  [  289.298304] [c000000feeb8bcd0] [c00000000003cd88] eeh_event_handler+0x1d8/0x1e0
  [  289.298374] [c000000feeb8bd80] [c0000000000e6420] kthread+0x110/0x130
  [  289.298434] [c000000feeb8be30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
  [  289.298501] Instruction dump:
  [  289.298531] 60000000 813f0000 ebdf0010 792affe3 408200d4 e95e0250 812a000c 2f890002
  [  289.298630] 419e0054 7fe3fb78 4bfb70c5 60000000 <e9230010> 2fa90000 419e00dc e9290010
  [  289.298731] ---[ end trace 393da961db41eff1 ]---
  [  289.452447]

   
  System Dump Info:
    The system is not configured to capture a system dump.
   
  *Additional Instructions for pavsubra@xxxxxxxxxx: 
  -Post a private note with access information to the machine that the bug is occurring on. 
  -Attach sysctl -a output to the bug.

  == Comment: #2 - Guo Wen Shan <gwshan@xxxxxxxxxxx> - 2016-07-15 09:42:09 ==
  The following two patches are needed:

  https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=cca0e542e02e48cce541a49c4046ec094ec27c1e
  ("powerpc/eeh: Fix wrong argument passed to eeh_rmv_device()")

  https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=a3aa256b7258b3d19f8b44557cc64525a993b941
  ("powerpc/eeh: Fix invalid cached PE primary bus")
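
As a reading aid for the Oops output: the word in angle brackets in the instruction dump, e9230010, is the instruction at the faulting NIP. The Python sketch below decodes it as a PowerPC DS-form load, using the standard Power ISA field layout. The function name decode_ds_form is hypothetical, and the register value is copied from the GPR dump in the report.

```python
def decode_ds_form(word: int):
    """Decode a 32-bit PowerPC DS-form load (primary opcode 58)."""
    opcode = (word >> 26) & 0x3F
    if opcode != 58:
        raise ValueError("not a DS-form load (primary opcode 58)")
    rt = (word >> 21) & 0x1F          # target register
    ra = (word >> 16) & 0x1F          # base register
    disp = word & 0xFFFC              # DS field, already shifted left by 2
    if disp & 0x8000:                 # sign-extend the 16-bit displacement
        disp -= 0x10000
    mnemonics = {0: "ld", 1: "ldu", 2: "lwa"}
    xo = word & 0x3
    if xo not in mnemonics:
        raise ValueError("reserved extended opcode")
    return mnemonics[xo], rt, ra, disp


mnemonic, rt, ra, disp = decode_ds_form(0xE9230010)

# GPR03 is 0000000000000000 in the register dump from the report.
gpr3 = 0x0
effective_address = gpr3 + disp
```

The decoded instruction is ld r9,16(r3). With r3 equal to zero the effective address is 0x10, which matches the DAR value and the "data at address 0x00000010" message, so the crash is a load through a NULL pointer at offset 0x10 inside pnv_eeh_reset. That this pointer is the cached PE primary bus is an inference from the title of the second patch, not something the trace itself shows.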

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1603449/+subscriptions