yahoo-eng-team team mailing list archive
  
    Message #25480
  
 [Bug 1401647] [NEW] Huge pages: Compute driver fails to set appropriate page size when using flavor extra spec -- 'hw:mem_page_size=any'
  
Public bug reported:
Description of problem
----------------------
From the proposed Nova specification "Virt driver large page allocation
for guest RAM"[*], if you set the Nova flavor extra_spec for huge pages
to 'any' ('nova flavor-key m1.hugepages set hw:mem_page_size=any'),
it means: "leave policy upto the compute driver implementation to
decide. When seeing 'any' the libvirt driver might try to find large
pages, but fallback to small pages"
However, booting a guest with a Nova flavor defined with huge pages size
set to 'any', results in:
    libvirtError: internal error: Unable to find any usable hugetlbfs mount for 4 KiB
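
The fallback that the spec describes for 'any' can be sketched roughly as follows. This is an illustration only, not Nova's implementation: the function name, the `host_free_pages` mapping, and the 4 KiB base page size are assumptions made for the example.

```python
SMALL_PAGE_KIB = 4  # assumed base page size; not taken from the spec


def pick_page_size_kib(requested, guest_ram_kib, host_free_pages):
    """Return the page size (in KiB) to back guest RAM with.

    requested       -- the hw:mem_page_size value; only 'any' is modelled here
    guest_ram_kib   -- guest RAM in KiB
    host_free_pages -- hypothetical mapping {page_size_kib: free_page_count}
    """
    if requested != "any":
        raise ValueError("this sketch only models the 'any' policy")
    # Try the largest page size first, as the spec suggests the driver might.
    for size_kib in sorted(host_free_pages, reverse=True):
        if size_kib <= SMALL_PAGE_KIB:
            continue
        pages_needed, remainder = divmod(guest_ram_kib, size_kib)
        if remainder == 0 and pages_needed <= host_free_pages[size_kib]:
            return size_kib
    # No large page size can hold the guest: fall back to small pages.
    return SMALL_PAGE_KIB
```

For example, `pick_page_size_kib("any", 2048 * 1024, {2048: 512})` falls back to 4, because a 2048 MiB guest would need 1024 pages of 2048 KiB and only 512 are free in that hypothetical host description.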
From Nova Conductor logs:
. . .
2014-12-11 13:06:34.738 ERROR nova.scheduler.utils [req-7812c740-ec60-461e-a6b7-66b4bd4359ee admin admin] [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb] Error from last host: fedvm1 (node fedvm1): [u'Traceback (most recent call last):\n', u'  File "/home/kashyapc/src/cloud/nova/nova/compute/manager.py", line 2060, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/home/kashyapc/src/cloud/nova/nova/compute/manager.py", line 2200, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance c8e1093b-81d6-4bc8-a319-7a8ea384c9fb was re-scheduled: internal error: Unable to find any usable hugetlbfs mount for 4 KiB\n']
. . .
 
[*] http://specs.openstack.org/openstack/nova-specs/specs/kilo/approved/virt-driver-large-pages.html#proposed-change
Version
-------
Apply the virt-driver-large-pages patch series to Nova git, and test via
DevStack:
https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/virt-driver-large-pages,n,z
    $ git log | grep "commit\ " | head -8
    commit c0c5d6a497c0e275e6f2037c1f7d45983a077cbc
    commit 9d1d59bd82a7f2747487884d5880270bfdc9734a
    commit eda126cce41fd5061b630a1beafbf5c37292946e
    commit 6980502683bdcf514b386038ca0e0ef8226c27ca
    commit b1ddc34efdba271f406a6db39c8deeeeadcb8cc9
        This commit also add a new exceptions MemoryPageSizeInvalid and
    commit 2fcfc675aa04ef2760f0e763697c73b6d90a4fca
    commit 567987035bc3ef685ea09ac2b82be55aa5e23ca5
    $ git describe
    2014.2-1358-gc0c5d6a
libvirt version: libvirt-1.2.11 (built from libvirt git)
    $ git log | head -1 
    commit a2a35d0164f4244b9c6f143f54e9bb9f3c9af7d3a
    $ git describe
    CVE-2014-7823-247-ga2a35d0
Steps to Reproduce
------------------
Test environment: I was testing Nova huge pages in a DevStack VM with KVM
nested virtualization, i.e. the Nova instances will be the nested guests.
Check whether 'hugetlbfs' is listed in /proc/filesystems:
    $ cat /proc/filesystems  | grep hugetlbfs
    nodev   hugetlbfs
Get the number of total huge pages:
    $ grep HugePages_Total /proc/meminfo
    HugePages_Total:     512
Get the number of free huge pages:
    $ grep HugePages_Free /proc/meminfo
    HugePages_Free:      512
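
The same counters can be read programmatically. The sketch below parses /proc/meminfo-style text and computes how much memory the free huge pages cover. The two HugePages_* values are taken from the output above; the Hugepagesize line is an assumed value, since the report does not show it.

```python
import re


def parse_hugepage_info(meminfo_text):
    """Extract the HugePages_* counters and Hugepagesize (kB) from meminfo text."""
    info = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"(HugePages_\w+|Hugepagesize):\s+(\d+)", line)
        if match:
            info[match.group(1)] = int(match.group(2))
    return info


def free_hugepage_mib(info):
    """Memory covered by free huge pages, in MiB (Hugepagesize is in kB)."""
    return info["HugePages_Free"] * info["Hugepagesize"] // 1024


# Counters from the report; the Hugepagesize line is an ASSUMED value.
SAMPLE = """\
HugePages_Total:     512
HugePages_Free:      512
Hugepagesize:       2048 kB
"""

if __name__ == "__main__":
    print(free_hugepage_mib(parse_hugepage_info(SAMPLE)))
```

On a real host, pass `open("/proc/meminfo").read()` instead of `SAMPLE`. Under the assumed 2048 kB page size, 512 free pages cover 1024 MiB, which is less than the 2048 MB of RAM in the flavor created in the next step; whether that shortfall is what pushes the driver to small pages is not confirmed by the report.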
Create flavor:
    nova flavor-create m1.hugepages 999 2048 1 4
Set extra_spec values for NUMA and huge pages, with the page size set to 'any':
    nova flavor-key m1.hugepages set hw:numa_nodes=1
    nova flavor-key m1.hugepages set hw:mem_page_size=any
Enumerate the newly created flavor properties:
    $ nova flavor-show m1.hugepages
    +----------------------------+-----------------------------------------------------+
    | Property                   | Value                                               |
    +----------------------------+-----------------------------------------------------+
    | OS-FLV-DISABLED:disabled   | False                                               |
    | OS-FLV-EXT-DATA:ephemeral  | 0                                                   |
    | disk                       | 1                                                   |
    | extra_specs                | {"hw:mem_page_size": "any", "hw:numa_nodes": "1"}   |
    | id                         | 999                                                 |
    | name                       | m1.hugepages                                        |
    | os-flavor-access:is_public | True                                                |
    | ram                        | 2048                                                |
    | rxtx_factor                | 1.0                                                 |
    | swap                       |                                                     |
    | vcpus                      | 4                                                   |
    +----------------------------+-----------------------------------------------------+
Boot a guest with the above flavor:
Actual results
--------------
(1) Contextual error messages from Nova Compute log (screen-n-cpu.log):
. . .
2014-12-11 13:06:34.141 ERROR nova.compute.manager [-] [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb] Instance failed to spawn
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb] Traceback (most recent call last):
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]   File "/home/kashyapc/src/cloud/nova/nova/compute/manager.py", line 2282, in _build_resources
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]     yield resources
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]   File "/home/kashyapc/src/cloud/nova/nova/compute/manager.py", line 2152, in _build_and_run_instance
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]     flavor=flavor)
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]   File "/home/kashyapc/src/cloud/nova/nova/virt/libvirt/driver.py", line 2384, in spawn
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]     block_device_info=block_device_info)
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]   File "/home/kashyapc/src/cloud/nova/nova/virt/libvirt/driver.py", line 4278, in _create_domain_and_network
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]     power_on=power_on)
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]   File "/home/kashyapc/src/cloud/nova/nova/virt/libvirt/driver.py", line 4211, in _create_domain
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]     LOG.error(err)
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]   File "/usr/lib/python2.7/site-packages/oslo/utils/excutils.py", line 82, in __exit__
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]     six.reraise(self.type_, self.value, self.tb)
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]   File "/home/kashyapc/src/cloud/nova/nova/virt/libvirt/driver.py", line 4201, in _create_domain
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]     domain.createWithFlags(launch_flags)
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]     rv = execute(f, *args, **kwargs)
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]     six.reraise(c, e, tb)
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]     rv = meth(*args, **kwargs)
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1033, in createWithFlags
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb]     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2014-12-11 13:06:34.141 TRACE nova.compute.manager [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb] libvirtError: internal error: Unable to find any usable hugetlbfs mount for 4 KiB
. . .
Expected results
----------------
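As specified in the spec, the Compute driver should pick a usable page size
(falling back to small pages if huge pages cannot be used) and the guest
should boot.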
Additional info
---------------
(2) Contextual error messages from Nova Conductor log (screen-n-cond.log):
----------------------------------------
. . .
2014-12-11 13:06:34.738 ERROR nova.scheduler.utils [req-7812c740-ec60-461e-a6b7-66b4bd4359ee admin admin] [instance: c8e1093b-81d6-4bc8-a319-7a8ea384c9fb] Error from last host: fedvm1 (node fedvm1): [u'Traceback (most recent call last):\n', u'  File "/home/kashyapc/src/cloud/nova/nova/compute/manager.py", line 2060, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/home/kashyapc/src/cloud/nova/nova/compute/manager.py", line 2200, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance c8e1093b-81d6-4bc8-a319-7a8ea384c9fb was re-scheduled: internal error: Unable to find any usable hugetlbfs mount for 4 KiB\n']
. . .
----------------------------------------
(3) The error message comes from libvirt; it was added in this commit:
    http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=281f70013e
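
The report does not include the domain XML that Nova generated. One plausible reading, stated here as an assumption, is that the driver requested hugepage backing with a 4 KiB page size, which libvirt cannot satisfy from a hugetlbfs mount. The sketch below shows one way a driver could guard against that by emitting the `<memoryBacking>` element only for page sizes larger than the base page. It is not Nova's code; the function name and the 4 KiB constant are hypothetical.

```python
import xml.etree.ElementTree as ET

SMALL_PAGE_KIB = 4  # assumed host base page size


def memory_backing_xml(page_size_kib):
    """Return a <memoryBacking> XML string, or None when small pages are used.

    Small pages need no hugetlbfs mount, so no element is emitted for them.
    """
    if page_size_kib is None or page_size_kib <= SMALL_PAGE_KIB:
        return None
    backing = ET.Element("memoryBacking")
    hugepages = ET.SubElement(backing, "hugepages")
    ET.SubElement(hugepages, "page", size=str(page_size_kib), unit="KiB")
    return ET.tostring(backing, encoding="unicode")
```

With this guard, `memory_backing_xml(4)` yields no element at all, while `memory_backing_xml(2048)` yields a `<hugepages>` block with a single `<page>` child requesting 2048 KiB pages.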
** Affects: nova
     Importance: Undecided
         Status: New
-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1401647
Title:
  Huge pages: Compute driver fails to set appropriate page size when
  using flavor extra spec --  'hw:mem_page_size=any'
Status in OpenStack Compute (Nova):
  New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1401647/+subscriptions