[Bug 1413540] Re: issues with KSM enabled for nested KVM VMs

I think you'll find that nested KVM and KSM are a worst-case scenario
with respect to memory swap-out.  KSM actively scans the host's memory
for pages it can merge, unmapping the duplicates, which means that when
a guest (level 1) or nested guest (level 2) needs to write to that
memory it incurs a fault and a page table walk (guest virtual to host
physical), which is quite costly.  This is made worse by memory
overcommit, which ends up having the hypervisor swap memory to disk to
fit the working set.  KSM has no view into the page tables or swapping
activity of the guest (L1 or L2), so over time it becomes increasingly
likely that memory needed by either the L1 or the L2 guest will have
been swapped out.  Swapping out L1 memory that is used to run the L2
guest is likely to be the most painful, since two levels of swap-in
occur (host to L1, and then L1 to L2).  Swap in/out is IO intensive,
and blocking on IO is also more likely to trigger soft lockups in
either the L1 or L2 guest.
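
As a rough way to watch this, on the host and in the L1 guest, the KSM
counters under /sys/kernel/mm/ksm/ and the swap columns of vmstat show
how much merging and swap activity is going on (a sketch; the exact
counters available depend on the kernel version):

  $ grep . /sys/kernel/mm/ksm/pages_shared /sys/kernel/mm/ksm/pages_sharing /sys/kernel/mm/ksm/full_scans
  $ vmstat 5    # the si/so columns are pages swapped in/out per second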

I suggest looking at the OpenStack environment and disabling or turning
down memory overcommit to reduce the memory pressure on the host.
Given that KSM isn't optimized at all for nested KVM, it's certainly
worth disabling KSM when running nested guests (certainly in the L1
guest, possibly on the host as well), unless one wants to invest in
tuning KSM and in closely monitoring memory pressure on the host and on
any L1 guest that also runs an L2 guest.
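
For reference, a minimal sketch of turning KSM off in the L1 guest (and
on the host, if desired): stop the ksmd scanner and unmerge the
already-shared pages, and keep the qemu-kvm package from re-enabling it
at boot.  The KSM_ENABLED setting below assumes the /etc/default/qemu-kvm
file as shipped on trusty; check the file on the affected system:

  $ echo 2 | sudo tee /sys/kernel/mm/ksm/run    # 2 = stop ksmd and unmerge all shared pages
  $ sudo sed -i 's/^KSM_ENABLED=1/KSM_ENABLED=0/' /etc/default/qemu-kvm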

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1413540

Title:
  issues with KSM enabled for nested KVM VMs

Status in linux package in Ubuntu:
  Incomplete
Status in qemu package in Ubuntu:
  Confirmed

Bug description:
  When installing qemu-kvm on a VM, KSM is enabled.

  I have encountered this problem in trusty:
  $ lsb_release -a
  Distributor ID: Ubuntu
  Description:    Ubuntu 14.04.1 LTS
  Release:        14.04
  Codename:       trusty
  $ uname -a
  Linux juju-gema-machine-2 3.13.0-40-generic #69-Ubuntu SMP Thu Nov 13 17:53:56 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

  The way to see the behaviour:
  1) $ more /sys/kernel/mm/ksm/run
  0
  2) $ sudo apt-get install qemu-kvm
  3) $ more /sys/kernel/mm/ksm/run
  1

  To see the soft lockups, deploy a cloud on a virtualised env like ctsstack and run tempest on it (at least 2 times); the compute nodes of the virtualised deployment will eventually stop responding with:
  [24096.072003] BUG: soft lockup - CPU#0 stuck for 23s! [qemu-system-x86:24791]
  [24124.072003] BUG: soft lockup - CPU#0 stuck for 23s! [qemu-system-x86:24791]
  [24152.072002] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]
  [24180.072003] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]
  [24208.072004] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]
  [24236.072004] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]
  [24264.072003] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]

  I am not sure whether the problem is that we are enabling KSM on a VM
  or that nested KSM is not behaving properly. Either way I can easily
  reproduce it; please contact me if you need further details.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1413540/+subscriptions