kernel-packages team mailing list archive

[Bug 1532914] Re: Surelock GA2 SP1: capiredp01: cxl_init_adapter fails for CAPI devices 0000:01:00.0 and 0005:01:00.0 after upgrading to 840.10 Platform firmware build fips840/b1208b_1604.840

 

This bug was fixed in the package linux - 4.2.0-34.39

---------------
linux (4.2.0-34.39) wily; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1555821

  [ Florian Westphal ]

  * SAUCE: [nf] netfilter: x_tables: check for size overflow
    - LP: #1555353
  * SAUCE: [nf,v2] netfilter: x_tables: don't rely on well-behaving
    userspace
    - LP: #1555338

linux (4.2.0-33.38) wily; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1554649

  [ Upstream Kernel Changes ]

  * Revert "drm/radeon: call hpd_irq_event on resume"
    - LP: #1554608
  * cxl: Fix PSL timebase synchronization detection
    - LP: #1532914

linux (4.2.0-32.37) wily; urgency=low

  [ Kamal Mostafa ]

  * Release Tracking Bug
    - LP: #1550045

  [ Kamal Mostafa ]

  * Merged back Ubuntu-4.2.0-31.36

linux (4.2.0-31.36) wily; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1548579

  [ Andy Whitcroft ]

  * [Debian] hv: hv_set_ifconfig -- convert to python3
    - LP: #1506521
  * [Debian] hv: hv_set_ifconfig -- switch to approved indentation
    - LP: #1540586
  * [Debian] hv: hv_set_ifconfig -- fix numerous parameter handling issues
    - LP: #1540586

  [ Carol L Soto ]

  * SAUCE: IB/IPoIB: Do not set skb truesize since using one linearskb
    - LP: #1541326

  [ Dan Streetman ]

  * SAUCE: nbd: ratelimit error msgs after socket close
    - LP: #1505564

  [ Tim Gardner ]

  * Revert "SAUCE: (noup) cxlflash: Fix to avoid virtual LUN failover
    failure"
    - LP: #1541635
  * Revert "SAUCE: (noup) cxlflash: Fix to escalate LINK_RESET also on port
    1"
    - LP: #1541635
  * [Config] ARMV8_DEPRECATED=y
    - LP: #1545542

  [ Upstream Kernel Changes ]

  * x86/xen/p2m: hint at the last populated P2M entry
    - LP: #1542941
  * mm: add dma_pool_zalloc() call to DMA API
    - LP: #1543737
  * sctp: Prevent soft lockup when sctp_accept() is called during a timeout
    event
    - LP: #1543737
  * xen-netback: respect user provided max_queues
    - LP: #1543737
  * xen-netfront: respect user provided max_queues
    - LP: #1543737
  * xen-netfront: update num_queues to real created
    - LP: #1543737
  * iio: adis_buffer: Fix out-of-bounds memory access
    - LP: #1543737
  * KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8
    - LP: #1543737
  * KVM: PPC: Fix ONE_REG AltiVec support
    - LP: #1543737
  * x86/irq: Call chip->irq_set_affinity in proper context
    - LP: #1543737
  * drm/amdgpu: fix tonga smu resume
    - LP: #1543737
  * perf kvm record/report: 'unprocessable sample' error while
    recording/reporting guest data
    - LP: #1543737
  * hrtimer: Handle remaining time proper for TIME_LOW_RES
    - LP: #1543737
  * timerfd: Handle relative timers with CONFIG_TIME_LOW_RES proper
    - LP: #1543737
  * posix-timers: Handle relative timers with CONFIG_TIME_LOW_RES proper
    - LP: #1543737
  * itimers: Handle relative timers with CONFIG_TIME_LOW_RES proper
    - LP: #1543737
  * drm/amdgpu: Use drm_calloc_large for VM page_tables array
    - LP: #1543737
  * drm/amdgpu: fix amdgpu_bo_pin_restricted VRAM placing v2
    - LP: #1543737
  * drm/radeon: properly byte swap vce firmware setup
    - LP: #1543737
  * ACPI: Revert "ACPI / video: Add Dell Inspiron 5737 to the blacklist"
    - LP: #1543737
  * ACPI / PCI / hotplug: unlock in error path in acpiphp_enable_slot()
    - LP: #1543737
  * hwmon: (dell-smm) Blacklist Dell Studio XPS 8000
    - LP: #1543737
  * usb: cdc-acm: handle unlinked urb in acm read callback
    - LP: #1543737
  * usb: cdc-acm: send zero packet for intel 7260 modem
    - LP: #1543737
  * cdc-acm:exclude Samsung phone 04e8:685d
    - LP: #1543737
  * usb: hub: do not clear BOS field during reset device
    - LP: #1543737
  * USB: cp210x: add ID for IAI USB to RS485 adaptor
    - LP: #1543737
  * USB: visor: fix null-deref at probe
    - LP: #1543737
  * USB: serial: visor: fix crash on detecting device without write_urbs
    - LP: #1543737
  * USB: serial: option: Adding support for Telit LE922
    - LP: #1543737
  * ALSA: seq: Fix incorrect sanity check at snd_seq_oss_synth_cleanup()
    - LP: #1543737
  * ALSA: seq: Degrade the error message for too many opens
    - LP: #1543737
  * USB: serial: ftdi_sio: add support for Yaesu SCU-18 cable
    - LP: #1543737
  * arm64: kernel: fix architected PMU registers unconditional access
    - LP: #1543737
  * USB: option: fix Cinterion AHxx enumeration
    - LP: #1543737
  * ALSA: compress: Disable GET_CODEC_CAPS ioctl for some architectures
    - LP: #1543737
  * ALSA: usb-audio: Fix TEAC UD-501/UD-503/NT-503 usb delay
    - LP: #1543737
  * virtio_pci: fix use after free on release
    - LP: #1543737
  * ALSA: bebob: Use a signed return type for get_formation_index
    - LP: #1543737
  * arm64: errata: Add -mpc-relative-literal-loads to build flags
    - LP: #1533009, #1543737
  * arm64: mm: avoid calling apply_to_page_range on empty range
    - LP: #1543737
  * x86/mm: Fix types used in pgprot cacheability flags translations
    - LP: #1543737
  * powerpc/eeh: Fix PE location code
    - LP: #1543737
  * SCSI: fix crashes in sd and sr runtime PM
    - LP: #1543737
  * tty: Fix unsafe ldisc reference via ioctl(TIOCGETD)
    - LP: #1543737
  * n_tty: Fix unsafe reference to "other" ldisc
    - LP: #1543737
  * staging/speakup: Use tty_ldisc_ref() for paste kworker
    - LP: #1543737
  * tick/nohz: Set the correct expiry when switching to nohz/lowres mode
    - LP: #1543737
  * irqchip/atmel-aic: Fix wrong bit operation for IRQ priority
    - LP: #1543737
  * seccomp: always propagate NO_NEW_PRIVS on tsync
    - LP: #1543737
  * drm/radeon: cleaned up VCO output settings for DP audio
    - LP: #1543737
  * drm/radeon: Add a common function for DFS handling
    - LP: #1543737
  * drm/radeon: fix DP audio support for APU with DCE4.1 display engine
    - LP: #1543737
  * cpufreq: Fix NULL reference crash while accessing policy->governor_data
    - LP: #1543737
  * cpufreq: pxa2xx: fix pxa_cpufreq_change_voltage prototype
    - LP: #1543737
  * ALSA: dummy: Disable switching timer backend via sysfs
    - LP: #1543737
  * drm/vmwgfx: respect 'nomodeset'
    - LP: #1543737
  * Staging: speakup: Fix getting port information
    - LP: #1543737
  * x86/mm/pat: Avoid truncation when converting cpa->numpages to address
    - LP: #1543737
  * serial: 8250_pci: Add Intel Broadwell ports
    - LP: #1543737
  * perf annotate browser: Fix behaviour of Shift-Tab with nothing focussed
    - LP: #1543737
  * perf hists: Fix HISTC_MEM_DCACHELINE width setting
    - LP: #1543737
  * powerpc/perf: Remove PPMU_HAS_SSLOT flag for Power8
    - LP: #1543737
  * Linux 4.2.8-ckt4
    - LP: #1543737
  * cxlflash: Resolve oops in wait_port_offline
    - LP: #1541635
  * cxlflash: Fix to resolve cmd leak after host reset
    - LP: #1541635
  * cxlflash: Removed driver date print
    - LP: #1541635
  * cxlflash: drop unlikely before IS_ERR_OR_NULL
    - LP: #1541635
  * powerpc/powernv: Panic on unhandled Machine Check
    - LP: #1541635
  * cxlflash: Fix to avoid virtual LUN failover failure
    - LP: #1541635
  * cxlflash: Fix to escalate LINK_RESET also on port 1
    - LP: #1541635
  * IB/ipoib: Suppress warning for send only join failures
    - LP: #1542444
  * IB/ipoib: Expire sendonly multicast joins
    - LP: #1542444
  * IB/ipoib: increase the max mcast backlog queue
    - LP: #1542444
  * IB/ipoib: For sendonly join free the multicast group on leave
    - LP: #1542444
  * qeth: initialize net_device with carrier off
    - LP: #1541907
  * mwifiex: remove USB8897 chipset support
    - LP: #1494593
  * powerpc/powernv: Fix stale PE primary bus
    - LP: #1546145
  * ALSA: usb-audio: avoid freeing umidi object twice
    - LP: #1546177
    - CVE-2016-2384

 -- Brad Figg <brad.figg@xxxxxxxxxxxxx>  Thu, 10 Mar 2016 13:46:44 -0800

** Changed in: linux (Ubuntu Wily)
       Status: Fix Committed => Fix Released

** CVE added: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2016-2384

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1532914

Title:
  Surelock GA2 SP1: capiredp01: cxl_init_adapter fails for CAPI devices
  0000:01:00.0 and 0005:01:00.0 after upgrading to 840.10 Platform
  firmware build fips840/b1208b_1604.840

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Wily:
  Fix Released
Status in linux source package in Xenial:
  Fix Released

Bug description:
  Problem Description
  ++++++++++++++++++++
  I upgraded the Platform firmware to the 840.10 build (b1208b_1604.840) to prepare for Surelock GA2 SP1 testing. After the upgrade, I used ipmitool to power on capiredfsp.aus.stglabs.ibm.com and boot the Ubuntu 15.10 partition (capiredp01.aus.stglabs.ibm.com) in OPAL firmware mode. In petitboot, I saw the messages "cxl-pci 0000:01:00.0: cxl_init_adapter failed: -5" and "cxl-pci 0005:01:00.0: cxl_init_adapter failed: -5". After the partition started running, I saw no AFU devices in /dev/cxl/ or /sys/class/cxl/, although lspci still listed the PCI devices for the hardware accelerators (0000:01:00.0 and 0005:01:00.0).

  ubuntu@capiredp01:~$ ls -l /dev/cxl/
  ls: cannot access /dev/cxl/: No such file or directory
  ubuntu@capiredp01:~$ ls -l /sys/class/cxl/
  total 0
  ubuntu@capiredp01:~$ sudo lscfg | grep -i afu
  ubuntu@capiredp01:~$ sudo lspci|egrep -i "04cf|0477"
  0000:01:00.0 Processing accelerators: IBM Device 04cf (rev 01)
  0005:01:00.0 Processing accelerators: IBM Device 04cf (rev 01)
  ubuntu@capiredp01:~$ lsscsi -g
  [0:0:0:0]    enclosu IBM      VSBPD12M1 6GSAS    03  -          /dev/sg1 
  [0:0:1:0]    cd/dvd  IBM.     RMBO0140512      RA65  /dev/sr0   /dev/sg2 
  [0:3:0:0]    no dev  IBM      57D7001SISIOA    0150  -          /dev/sg0 
  [1:0:0:0]    enclosu IBM      VSBPD12M1 6GSAS    03  -          /dev/sg4 
  [1:0:1:0]    disk    IBM      HUC109030CSS600  E5C6  /dev/sda   /dev/sg5 
  [1:0:2:0]    disk    IBM      HUC101212CSS600  A5AA  /dev/sdb   /dev/sg6 
  [1:0:3:0]    disk    IBM      HUC101212CSS600  A5AA  /dev/sdc   /dev/sg7 
  [1:0:4:0]    disk    IBM      HUC101212CSS600  A5AA  /dev/sdd   /dev/sg8 
  [1:0:5:0]    disk    IBM      ST1200MM0007     BF04  /dev/sde   /dev/sg9 
  [1:0:6:0]    disk    IBM      ST1200MM0007     BF04  /dev/sdf   /dev/sg10
  [1:3:0:0]    no dev  IBM      57D7001SISIOA    0150  -          /dev/sg3 

  
  This is a regression: the Linux kernel has failed to synchronize the PSL timebase.
  The corresponding error message is in the dmesg log attached in comment #4:

  [    1.687586] PSL: Timebase sync: giving up!

  Because of this failure, the CAPI devices are not enabled.
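
A quick way to check a machine for this symptom is to search the kernel log for the message quoted above. The following is a minimal sketch and not part of the original report: the function name `psl_timebase_failed` is hypothetical, and the match string is copied from the dmesg line in this bug.

```shell
# Hypothetical helper: succeeds (exit status 0) when the kernel log read
# from stdin contains the PSL timebase sync failure quoted in this report.
psl_timebase_failed() {
    grep -q 'PSL: Timebase sync: giving up!'
}
```

For example, `dmesg | psl_timebase_failed && echo "PSL timebase sync failed"` prints the notice only when the message is present in the log.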

  PSL Timebase sync should not be a requirement for CAPI initialization,
  nor should it make an initialized card become unavailable.  Currently,
  timebase is an unused function of CAPI with hopes of adoption in the
  future.  Support of this feature should be considered optional at this
  time.

  I'm not sure what the fastest way to fix this is, but it needs to be
  fixed as quickly as possible.  CAPI is broken in Ubuntu 15.10.

  I can reproduce the bug, regardless of the skiboot level, with recent kernels.
  Older kernels behave as expected, regardless of the skiboot level.

  Firmware is not the cause of the regression; the kernel probably is.

  I sent this out to the capi-linux distro too, but I'll comment here as well.  I'm not sure what is being checked to determine that the PSL timebase sync failed.  As far as I know, all PSL versions should support timebase.  The only timebase error the PSL logs is when CAPP returns a status saying that timebase has an error.  If that is the issue, I'd think timebase has not been enabled or sequenced correctly in the host CAPP.  The PSL can't be enabled for timebase until the CAPP unit in the host has been enabled.

  I have installed a recent mainline Linux kernel (4.4.0-rc8) on
  capiredp01. I have rebooted this kernel and verified that the PSL
  timebase syncs without problem.

  I will now compare the source code of Ubuntu kernel 4.2.0-19 (that
  hits the bug) with the source of mainline kernel 4.4.0-rc8 (that
  operates as expected).

  I have updated the Ubuntu kernel and modules with:

  $ sudo apt-get install linux-image-4.2.0-23-generic
  $ sudo apt-get install linux-image-extra-4.2.0-23-generic

  I have rebooted Ubuntu kernel linux-image-4.2.0-23-generic, and found that the cxl driver hits the bug.
  I have also downloaded the source for this Ubuntu kernel (and modules) with:

  $ sudo apt-get source linux-image-4.2.0-23-generic

  I have recompiled and installed, and noticed that the resulting kernel
  bears the version 4.2.6 (??). I have rebooted this Ubuntu kernel 4.2.6
  built from the Ubuntu source for 4.2.0-23-generic, and found that the
  timebase sync occurs normally.

  In short, the kernels linux-4.2.6 and linux-4.4.0-rc8 that I built from
  source (provided by Ubuntu and Linus respectively) operate normally,
  whereas all kernels compiled by, and downloaded from, Ubuntu hit the
  timebase sync bug.

  I will try to investigate possible differences between kernel config
  files or toolchain and build procedures.

  I have found that the bug can be activated or prevented via the Linux kernel config file.
  I have compiled the Ubuntu kernel source downloaded with

  $ sudo apt-get source linux-image-4.2.0-23-generic

  1. with my own config file => PSL timebase sync works fine
  2. with the config file supplied by Ubuntu => PSL timebase sync fails

  I will now diff the config files, and try to identify the set of
  config parameters that change the kernel behavior regarding timebase
  sync.
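
A plain diff of two config files can be noisy when the options appear in a different order. The sketch below, which is an illustration and not the procedure used in this report, keeps only the option lines, sorts them, and then diffs the result. The function name `config_delta` and its temporary-file handling are assumptions.

```shell
# Hypothetical helper: show kernel config options that differ between two
# config files, ignoring line order and free-form comments.
# Exit status follows diff: 0 when identical, 1 when they differ.
config_delta() {
    _a=$(mktemp) || return 2
    _b=$(mktemp) || { rm -f "$_a"; return 2; }
    # Keep "CONFIG_X=..." and "# CONFIG_X is not set" lines only.
    grep -E '^(# )?CONFIG_' "$1" | LC_ALL=C sort > "$_a"
    grep -E '^(# )?CONFIG_' "$2" | LC_ALL=C sort > "$_b"
    diff "$_a" "$_b"
    _status=$?
    rm -f "$_a" "$_b"
    return "$_status"
}
```

It would be called as `config_delta config-4.2.0-23-generic .config`. Lines prefixed with `<` come from the first file and lines prefixed with `>` from the second.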

  Got it. Here is the difference between config-4.2.0-23-generic (that
  hits the bug) and .config (that operates normally):

  $ diff config-4.2.0-23-generic .config
  130,131c130,132
  < CONFIG_TICK_CPU_ACCOUNTING=y
  < # CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set
  ---
  > CONFIG_VIRT_CPU_ACCOUNTING=y
  > # CONFIG_TICK_CPU_ACCOUNTING is not set
  > CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y

  For some reason, setting CONFIG_TICK_CPU_ACCOUNTING breaks PSL
  Timebase sync on ppc64le.  Investigating further.
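
To see which CPU-accounting mode a given kernel was built with, the relevant options can be pulled out of its config file. This is a minimal sketch under the assumption that the config file is available on disk; the function name `show_cpu_accounting` is hypothetical, and the option names are those from the diff in this report.

```shell
# Hypothetical helper: print the CPU-accounting options found in a kernel
# config file. Both enabled ("CONFIG_X=y") and disabled
# ("# CONFIG_X is not set") forms are matched.
show_cpu_accounting() {
    grep -E '^(# )?CONFIG_(TICK|VIRT)_CPU_ACCOUNTING' "$1"
}
```

On Ubuntu the config of the running kernel is installed as `/boot/config-$(uname -r)`, so `show_cpu_accounting "/boot/config-$(uname -r)"` reports the settings for the booted kernel.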

  Canonical, can you please replace

  CONFIG_TICK_CPU_ACCOUNTING=y
  # CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set

  by
   
  CONFIG_VIRT_CPU_ACCOUNTING=y
  # CONFIG_TICK_CPU_ACCOUNTING is not set
  CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y

  in the default ppc64le Linux kernel configuration file?

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1532914/+subscriptions