yahoo-eng-team team mailing list archive

[Bug 1613423] Re: Mitaka + Trusty (kernel 3.13) not using apparmor capability by default, when it does, live migration doesn't work (/tmp/memfd-XXX can't be created)

 

The preferred fix for this was the following commit:

commit 0d34fbabc13891da41582b0823867dc5733fffef
Author: Rafael David Tinoco <rafael.tinoco@xxxxxxxxxxxxx>
Date:   Mon Oct 24 15:35:03 2016

    vhost: migration blocker only if shared log is used

    Commit 31190ed7 added a migration blocker in vhost_dev_init() to
    check if memfd would succeed. It is better if this blocker first
    checks if vhost backend requires shared log. This will avoid a
    situation where a blocker is added inappropriately (e.g. shared
    log allocation fails when vhost backend doesn't support it).

    Signed-off-by: Rafael David Tinoco <rafael.tinoco@xxxxxxxxxxxxx>
    Reviewed-by: Marc-André Lureau <marcandre.lureau@xxxxxxxxxx>
    Reviewed-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
    Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>

It was accepted upstream, so I'm closing this bug.
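
The decision described in the commit message can be sketched roughly as
follows. This is a simplified illustration, not qemu code: the function
and parameter names are hypothetical, and only the error string is taken
from the nova log in the bug description.

```python
from typing import Optional


def migration_blocker_reason(needs_shared_log: bool,
                             memfd_works: bool) -> Optional[str]:
    """Return a blocker message, or None if migration may proceed.

    needs_shared_log: hypothetical flag, True if the vhost backend
                      requires a shared log.
    memfd_works:      hypothetical flag, True if shared memory could
                      be allocated.
    """
    # Only consult the memfd check when the backend needs a shared log,
    # so a failed allocation cannot block backends that never use it.
    if needs_shared_log and not memfd_works:
        return "Migration disabled: failed to allocate shared memory"
    return None
```

With this ordering, a backend that does not need a shared log is never
blocked, whether or not the allocation check would have succeeded.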

** Changed in: libvirt (Ubuntu)
       Status: New => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1613423

Title:
  Mitaka + Trusty (kernel 3.13) not using apparmor capability by
  default, when it does, live migration doesn't work (/tmp/memfd-XXX
  can't be created)

Status in OpenStack Compute (nova):
  Invalid
Status in OpenStack Security Advisory:
  Invalid
Status in libvirt package in Ubuntu:
  Incomplete

Bug description:
  In my environment: Trusty (3.13) + Juju (1.25) w/ latest charms + Kilo
  upgraded to Mitaka (already using non-tunnelled live migrations, after
  the latest SRU to disable tunnelled live migrations)

  BUG #1

  My compute nodes are NOT loading "apparmor" libvirt capability by
  default:

  inaddy@tkcompute01:~$ virsh capabilities | grep apparmor | echo $?
  1
  inaddy@tkcompute02:~$ virsh capabilities | grep apparmor | echo $?
  1
  inaddy@tkcompute03:~$ virsh capabilities | grep apparmor | echo $?
  1

  This happens because libvirt is started before the apparmor profiles
  are loaded, and qemu.conf doesn't specify 'security_driver =
  "apparmor"'. If you add the security driver to qemu.conf, libvirt and
  nova-compute won't start, because apparmor isn't running yet when they
  start. On Trusty, apparmor is started as a legacy SysV init script at
  the end of initialisation, which causes this problem.

  After restarting the libvirt-bin service, apparmor starts being used:

  inaddy@tkcompute01:~$ sudo service libvirt-bin restart
  libvirt-bin stop/waiting
  libvirt-bin start/running, process 7031
  inaddy@tkcompute01:~$ virsh capabilities | grep apparmor | echo $?
  0

  inaddy@tkcompute02:~$ sudo service libvirt-bin restart
  libvirt-bin stop/waiting
  libvirt-bin start/running, process 7031
  inaddy@tkcompute02:~$ virsh capabilities | grep apparmor | echo $?
  0

  inaddy@tkcompute03:~$ sudo service libvirt-bin restart
  libvirt-bin stop/waiting
  libvirt-bin start/running, process 7031
  inaddy@tkcompute03:~$ virsh capabilities | grep apparmor | echo $?
  0
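
  Note that in a pipeline ending in "| echo $?" the shell expands $?
  before the pipeline runs, so the printed value is the status of the
  previous command rather than of grep. A more direct check is to look
  for the security model in the capabilities XML itself. The following
  is a minimal sketch, assuming the usual <host><secmodel><model> layout
  of "virsh capabilities" output; the sample documents are hypothetical
  and heavily trimmed.

```python
import xml.etree.ElementTree as ET


def has_apparmor_secmodel(capabilities_xml: str) -> bool:
    """Return True if the capabilities XML lists an apparmor security model."""
    root = ET.fromstring(capabilities_xml.strip())
    # libvirt reports each active security driver as <host><secmodel><model>.
    models = [m.text for m in root.findall("./host/secmodel/model")]
    return "apparmor" in models


# Hypothetical, trimmed capabilities documents for illustration only.
WITH_APPARMOR = """
<capabilities><host>
  <secmodel><model>apparmor</model><doi>0</doi></secmodel>
</host></capabilities>
"""
WITHOUT_APPARMOR = """
<capabilities><host>
  <secmodel><model>none</model><doi>0</doi></secmodel>
</host></capabilities>
"""

print(has_apparmor_secmodel(WITH_APPARMOR))
print(has_apparmor_secmodel(WITHOUT_APPARMOR))
```

  In practice the XML string would come from running "virsh
  capabilities" on each compute node.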

  BUG #2 (after fixing BUG #1)

  When libvirt starts using apparmor and creates an apparmor profile
  for every virtual machine on the compute nodes, Mitaka's qemu (2.5)
  uses a fallback mechanism to create shared memory for live
  migrations. On 3.13 kernels, which don't have the memfd_create()
  system call, this fallback tries to create files in the /tmp/
  directory and fails, so live migration doesn't work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.

  In qemu 2.5, the logic is in:

  void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int *fd)
  {
      if (memfd_create)... ### only works with HWE kernels

      else                 ### 3.13 kernels, gets blocked by apparmor
         tmpdir = g_get_tmp_dir
         ...
         mfd = mkstemp(fname)
  }
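
  The same fallback pattern can be sketched in Python to make the two
  paths explicit. This is an illustration only, not the qemu
  implementation: alloc_shared_fd is a hypothetical name, sealing is
  omitted, and os.memfd_create is only present on Linux builds of
  Python 3.8 or later.

```python
import os
import tempfile


def alloc_shared_fd(name: str, size: int) -> int:
    """Return an fd backed by `size` bytes, preferring memfd_create()."""
    fd = -1
    if hasattr(os, "memfd_create"):
        try:
            fd = os.memfd_create(name)
        except OSError:
            # The system call is unavailable or refused (old kernel, etc.).
            fd = -1
    if fd < 0:
        # Fallback path: create a file in the temp directory. This is the
        # step that the per-VM apparmor profile denies in this bug.
        fd, path = tempfile.mkstemp(prefix="memfd-", dir=tempfile.gettempdir())
        os.unlink(path)  # keep only the open descriptor
    os.ftruncate(fd, size)
    return fd
```

  Either path returns a descriptor of the requested size; only the
  fallback touches the filesystem, which is why confinement matters
  on kernels without memfd_create().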

  And you can see the errors:

  From the host trying to send the virtual machine:

  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver [req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 133ebc3585c041aebaead8c062cd6511 - - -] [instance: 2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver [req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 133ebc3585c041aebaead8c062cd6511 - - -] [instance: 2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: unable to execute QEMU command 'migrate': Migration disabled: failed to allocate shared memory

  From the host trying to receive the virtual machine:

  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 audit(1471289780.343:75): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="qemu_bridge_helper" pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 audit(1471289780.407:76): apparmor="DENIED" operation="mknod" profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/memfd-tNpKSj" pid=12625 comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=107 ouid=107
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979871] type=1400 audit(1471289780.411:77): apparmor="DENIED" operation="open" profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/" pid=12625 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 ouid=0
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979881] type=1400 audit(1471289780.411:78): apparmor="DENIED" operation="open" profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/var/tmp/" pid=12625 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 ouid=0
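
  To pull the relevant fields out of audit lines like the ones above, a
  small parser helps. This is a minimal sketch: parse_audit_fields and
  is_denial are hypothetical helper names, and the sample line is
  copied from the log above.

```python
import re

# Matches key="quoted value" or key=bare_value pairs.
_FIELD = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_audit_fields(line: str) -> dict:
    """Extract key=value pairs (quoted or bare) from an audit log line."""
    return {key: (quoted if quoted else bare)
            for key, quoted, bare in _FIELD.findall(line)}


def is_denial(line: str) -> bool:
    """Return True if the line records an apparmor DENIED event."""
    return parse_audit_fields(line).get("apparmor") == "DENIED"


SAMPLE = ('Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 '
          'audit(1471289780.407:76): apparmor="DENIED" operation="mknod" '
          'profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" '
          'name="/tmp/memfd-tNpKSj" pid=12625 comm="qemu-system-x86" '
          'requested_mask="c" denied_mask="c" fsuid=107 ouid=107')

fields = parse_audit_fields(SAMPLE)
print(fields["operation"], fields["name"], fields["denied_mask"])
```

  For the sample line this reports the mknod operation on
  /tmp/memfd-tNpKSj with denied mask "c", which is the file creation
  attempted by the fallback path.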

  When libvirt is left without the apparmor capability (and so does not
  confine virtual machines on the compute nodes), live migration works
  as expected, so apparmor is clearly interfering with the live
  migration. I'm sure that virtual machines have to be confined and
  that this isn't the desired behaviour...

  I'm still trying to figure out which rules in
  /etc/apparmor.d/abstractions/libvirt-qemu are preventing the live
  migration from working.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1613423/+subscriptions