group.of.nepali.translators team mailing list archive

[Bug 1832622] Re: QEMU - count cache flush Spectre v2 mitigation (CVE) (required for POWER9 DD2.3)

To: group.of.nepali.translators@xxxxxxxxxxxxxxxxxxx
From: Andrew Cloke <andrew.cloke@xxxxxxxxxxxxx>
Date: Fri, 04 Oct 2019 12:21:47 -0000
Reply-to: Bug 1832622 <1832622@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx
** Changed in: ubuntu-power-systems
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1832622

Title:
  QEMU - count cache flush Spectre v2 mitigation (CVE) (required for
  POWER9 DD2.3)

Status in The Ubuntu-power-systems project:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Won't Fix
Status in linux source package in Bionic:
  Fix Released
Status in qemu source package in Bionic:
  Fix Released
Status in qemu source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  Fix Released
Status in qemu source package in Disco:
  Fix Released
Status in qemu source package in Eoan:
  Fix Released

Bug description:
  [Impact]

   * This belongs to the overall context of Spectre mitigations and,
     more specifically, to the effort to minimize their performance
     impact. On ppc64el there is a new chip revision (DD 2.3) which
     provides a facility that helps to mitigate some of this better.

   * Backport the patches that pass the capability (if supported by the
     HW) to the guest, so that guests which support the improved
     mitigation can use it.

  [Test Case]

   * Start guests with and without this capability
     * Check if the capability is guest visible as intended
     * Check if there are any issues on pre DD2.3 HW
   * Test migrations (IBM outlined the intended paths that will work
     below)
   * The problem with the above (and also the reason I didn't add a list
     of commands this time) is that it needs special HW (the mentioned
     DD2.3 revision of the chips) which isn't available to us right now.
     Because of that, testing / verification of this on all releases is
     up to IBM.

  [Regression Potential]

   * Adding new capabilities usually works fine, but there are three
     common pitfalls, which are the regression potential here.
     - (severe) the code would announce a capability that isn't really
       available. The guest tries to use it and crashes
     - (medium) several migration paths, especially from systems with the
       new cap to older (un-updated) systems, will fail. But that applies
       to any "from machine with feature to machine without that feature"
       migration and isn't really a new regression. As outlined by IBM
       below they even tried to make it somewhat compatible (by being a
       new value in an existing cap)
     - (low) the guest will see new caps and/or facilities. A really odd
       guest could stumble due to that (would actually be a guest bug
       then)
    Overall all of the above was considered by IBM when developing this
    and should be ok. For archive wide SRU considerations, this has NO
    effect on non ppc64el.

  [Other Info]
   
   * n/a

  ---

  Power9 DD 2.3 CPUs running updated firmware will use a new Spectre
  v2 mitigation. The new mitigation improves performance of branch heavy
  workloads, but also requires kernel support in order to be fully
  secure.

  Without the kernel support there is a risk of a Spectre v2 attack
  across a process context switch, though it has not been demonstrated
  in practice.

  QEMU portion: the platform definition needs to account for this new
  mitigation action, so an attribute for it needs to be added.

  In terms of support for virtualisation there are 2 sides, KVM and QEMU
  support. The patch list for each:

  KVM:
  2b57ecd0208f KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char()
  This is part of LP1822870 already.

  QEMU:
  8ff43ee404 target/ppc/spapr: Add SPAPR_CAP_CCF_ASSIST
  399b2896d4 target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS

  The KVM side is upstream as of v5.1-rc1.
  The QEMU side is upstream as of v4.0.0-rc0.

  In terms of migration the state is as follows.

  In order to specify to the guest to use the count cache flush
  workaround we use the spapr-cap cap-ibs (indirect branch speculation)
  with the value workaround. Previously the only valid values were
  broken, fixed-ibs (indirect branch serialisation) and fixed-ccd (count
  cache disabled). We also add a new cap, cap-ccf-assist (count cache
  flush assist), to specify the availability of the hardware assisted
  flush variant.

  Note that, given the way spapr caps work, you can migrate to a host
  that supports a higher value, but not to one which doesn't support the
  current value (i.e. only supports lower values). For cap-ibs the
  values are defined as:
  0 - broken
  1 - workaround
  2 - fixed-ibs
  3 - fixed-ccd

  So the following migrations would be valid for example:
  broken -> fixed-ccd, broken -> workaround, workaround -> fixed-ccd

  While the following would be invalid:
  fixed-ccd -> workaround, workaround -> broken, fixed-ccd -> broken

  This is done to maintain at least the level of protection specified on
  the command line on migration.
  Since the workaround must be communicated to the guest kernel at boot,
  we cannot migrate a guest from a host with fixed-ccd to one with
  workaround: the guest wouldn't know to do the flush and so would be
  wholly unprotected.
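
  The ordering rule above can be sketched as follows. This is only an
  illustration in Python, not QEMU code: the names CAP_IBS_LEVELS and
  migration_allowed are made up for this example, and only the level
  numbers are taken from the list above.

```python
# cap-ibs levels as listed in the bug description.
CAP_IBS_LEVELS = {
    "broken": 0,
    "workaround": 1,
    "fixed-ibs": 2,
    "fixed-ccd": 3,
}


def migration_allowed(source: str, destination: str) -> bool:
    """Return True when the destination offers at least the source's level.

    Hypothetical helper: it models only the ordering rule described in
    the text, not any other check a real migration performs.
    """
    return CAP_IBS_LEVELS[destination] >= CAP_IBS_LEVELS[source]


if __name__ == "__main__":
    for src, dst in [("broken", "fixed-ccd"), ("fixed-ccd", "workaround")]:
        verdict = "valid" if migration_allowed(src, dst) else "invalid"
        print(f"{src} -> {dst}: {verdict}")
```

  Under this model a migration between hosts with the same value is
  also allowed, since the destination still supports the current value.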

  This means that migrating a guest from DD 2.2 and before to DD 2.3
  requires the guest either to have been booted with broken previously,
  or to be rebooted with workaround specified on the command line, which
  would allow the migration to a DD 2.3 host to succeed.

  == MICHAEL D. ROTH ==
  I've tested a backport of count-cache-flush support consisting of the following patches applied (cleanly) on top of bionic's QEMU 2.11+dfsg-1ubuntu7.14 source:

    target/ppc/spapr: Add SPAPR_CAP_CCF_ASSIST
    ppc/spapr-caps: Change migration macro to take full spapr-cap name
    target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS
    target/ppc: Factor out the parsing in kvmppc_get_cpu_characteristics()

  The following tests were done using a DD 2.3 Witherspoon machine and
  the results seem to align with what's expected in the original
  summary:

  == enablement tests (using 4.15.0-51-generic in both host and guests) ==

  with cap-ibs=workaround,cap-ccf-assist=on:
    mdroth@ubuntu:~$ dmesg | grep cache-flush
    [    0.000000] count-cache-flush: hardware assisted flush sequence enabled

  with cap-ibs=workaround,cap-ccf-assist=off:
    mdroth@ubuntu:~$ dmesg | grep cache-flush
    [    0.000000] count-cache-flush: full software flush sequence enabled.

  with cap-ibs=broken:
    mdroth@ubuntu:~$ dmesg | grep cache-flush
    [    0.000000] count-cache-flush: software flush disabled.
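
  When repeating the enablement check on several guests, the dmesg line
  can be classified with a small script. The following is a sketch
  only: the message strings are copied from the output above, while the
  function name flush_mode and the labels "hardware", "software" and
  "disabled" are assumptions made for this example.

```python
import re

# Message text as quoted in the dmesg output above, without the
# trailing period; the labels on the right are illustrative.
FLUSH_MODES = {
    "hardware assisted flush sequence enabled": "hardware",
    "full software flush sequence enabled": "software",
    "software flush disabled": "disabled",
}


def flush_mode(dmesg_line):
    """Map a count-cache-flush dmesg line to a label, or None if unknown."""
    match = re.search(r"count-cache-flush:\s*(.+?)\.?\s*$", dmesg_line)
    if match is None:
        return None
    return FLUSH_MODES.get(match.group(1))


if __name__ == "__main__":
    line = "[    0.000000] count-cache-flush: full software flush sequence enabled."
    print(flush_mode(line))
```

  Lines that do not mention count-cache-flush, or that carry a message
  not listed here, yield None rather than a guess.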

  == migration tests (using 4.15.0-51-generic in both host and guests) ==

  Note that pseries-2.11-sxxm/bionic-sxxm defaults to:

      smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_WORKAROUND;
      smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_WORKAROUND;
      smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_FIXED_CCD;

  but SPAPR_CAP_FIXED_CCD is not available on the DD 2.3 system I tested
  on (no fw-count-cache-disabled/enabled in host fw-features device
  tree), so I used pseries-2.11-sxxm,cap-ibs=broken as the base level.

  cross-migration: qemu 2.11+dfsg-1ubuntu7.14 -> 2.11+dfsg-1ubuntu7.14+ccf-backport

  source: -M bionic-sxxm,cap-ibs=broken
    target: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=off
      expected: warning
      actual: warning
        "cap-ibs lower level (0) in incoming stream than on destination (1)"
      software ccf enabled after reboot? yes
    target: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=on
      expected: warning
      actual: warning
        "cap-ccf-assist lower level (0) in incoming stream than on destination (1)"
      hardware ccf enabled after reboot? yes
    target: -M bionic-sxxm,cap-ibs=broken
      expected: success
      actual: success

  migration: 2.11+dfsg-1ubuntu7.14+ccf-backport -> 2.11+dfsg-1ubuntu7.14+ccf-backport

  source: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=off
    target: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=off
      expected: success
      actual: success
    target: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=on
      expected: warning
      actual: warning
        "cap-ccf-assist lower level (0) in incoming stream than on destination (1)"
      hardware ccf enabled after reboot? yes
    target: -M bionic-sxxm,cap-ibs=broken
      expected: fail
      actual: fail
        "cap-ibs higher level (1) in incoming stream than on destination (0)"

  source: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=on
    target: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=on
      expected: success
      actual: success
    target: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=off
      expected: fail
      actual: fail
        "cap-ccf-assist higher level (1) in incoming stream than on destination (0)"
    target: -M bionic-sxxm,cap-ibs=broken
      expected: fail
      actual: fail
        "cap-ibs higher level (1) in incoming stream than on destination (0)"
        "cap-ccf-assist higher level (1) in incoming stream than on destination (0)"

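  The expected outcomes in the migration tests above follow a simple
  per-cap comparison, which can be sketched as below. This is an
  illustration in Python and not the QEMU implementation: the names
  CAP_LEVELS and check_migration are made up, cap-ccf-assist is assumed
  to be off where a command line omits it, and the sketch reports a
  message for every cap that differs, which may be more than the
  excerpts quoted above.

```python
# cap-ibs levels are from the bug description; off/on as 0/1 for
# cap-ccf-assist matches the levels shown in the quoted messages.
CAP_LEVELS = {
    "cap-ibs": {"broken": 0, "workaround": 1, "fixed-ibs": 2, "fixed-ccd": 3},
    "cap-ccf-assist": {"off": 0, "on": 1},
}


def check_migration(source, target):
    """Compare source and target cap settings.

    Returns (outcome, messages), where outcome is "fail" if any cap is
    higher on the source, "warning" if any cap is only lower on the
    source, and "success" if all caps match.
    """
    outcome = "success"
    messages = []
    for cap, levels in CAP_LEVELS.items():
        src = levels[source[cap]]
        dst = levels[target[cap]]
        if src > dst:
            outcome = "fail"
            relation = "higher"
        elif src < dst:
            if outcome == "success":
                outcome = "warning"
            relation = "lower"
        else:
            continue
        messages.append(
            f"{cap} {relation} level ({src}) in incoming stream "
            f"than on destination ({dst})"
        )
    return outcome, messages


if __name__ == "__main__":
    src = {"cap-ibs": "workaround", "cap-ccf-assist": "on"}
    dst = {"cap-ibs": "broken", "cap-ccf-assist": "off"}
    print(check_migration(src, dst))
```

  Feeding it the source/target pairs from the tables above reproduces
  the expected success, warning and fail results listed there.
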
  Sorry, I forgot that I needed some fix-ups for the 4th/last patch,
  "target/ppc/spapr: Add SPAPR_CAP_CCF_ASSIST".

  I've gone ahead and posted my git tree, which is based on top of the
  qemu_2.11+dfsg-1ubuntu7.14 source, so the 4 patches there should apply
  cleanly. There are notes in the commit messages on what changes were
  needed for patch 4.

  https://github.com/mdroth/qemu/commits/spectre-ccf-ubuntu-bionic-1ubuntu7.14

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1832622/+subscriptions