yahoo-eng-team team mailing list archive

[Bug 1884231] [NEW] 'hw:realtime_mask' extra spec is not validated

 

Public bug reported:

The 'hw:cpu_realtime_mask' extra spec is (currently) used to specify which
of an instance's cores should *not* be part of the realtime set of cores.
Currently, this is mandatory and omitting it will cause an HTTP 400
error. For example:

  $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
      --property hw:cpu_policy=dedicated \
      --property hw:cpu_realtime=yes \
      test.rt

will fail with:

  Realtime policy needs vCPU(s) mask configured with at least 1 RT vCPU
  and 1 ordinary vCPU. See hw:cpu_realtime_mask or hw_cpu_realtime_mask

Similarly, attempting to mask *all* cores will result in a failure. For
example:

  $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
      --property hw:cpu_policy=dedicated \
      --property hw:cpu_realtime=yes \
      --property hw:cpu_realtime_mask='^0-1' \
      test.rt

will also fail with:

  Realtime policy needs vCPU(s) mask configured with at least 1 RT vCPU
  and 1 ordinary vCPU. See hw:cpu_realtime_mask or hw_cpu_realtime_mask

However, the value is otherwise unvalidated by nova, which can cause
libvirt to explode when specific values are passed. For example,
consider the following flavor:

  $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
      --property hw:cpu_policy=dedicated \
      --property hw:cpu_realtime=yes \
      --property hw:cpu_realtime_mask='^2' \
      test.rt

This states that the instance should have two cores, and that some
imaginary third core (masks are 0-indexed) will be the non-realtime one.
This is clearly nonsensical and, sure enough, creating an instance using
this flavor causes things to go bang:

  Failed to build and run instance: libvirt.libvirtError: invalid argument: Failed to parse bitmap ''
  Traceback (most recent call last):
    File "/opt/stack/nova/nova/compute/manager.py", line 2378, in _build_and_run_instance
      accel_info=accel_info)
    File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3702, in spawn
      cleanup_instance_disks=created_disks)
    File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6664, in _create_domain_and_network
      cleanup_instance_disks=cleanup_instance_disks)
    File "/usr/local/lib/python3.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "/usr/local/lib/python3.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "/usr/local/lib/python3.7/site-packages/six.py", line 703, in reraise
      raise value
    File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6633, in _create_domain_and_network
      post_xml_callback=post_xml_callback)
    File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6559, in _create_domain
      guest = libvirt_guest.Guest.create(xml, self._host)
    File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 127, in create
      encodeutils.safe_decode(xml))
    File "/usr/local/lib/python3.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "/usr/local/lib/python3.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "/usr/local/lib/python3.7/site-packages/six.py", line 703, in reraise
      raise value
    File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 123, in create
      guest = host.write_instance_config(xml)
    File "/opt/stack/nova/nova/virt/libvirt/host.py", line 1141, in write_instance_config
      domain = self.get_connection().defineXML(xml)
    File "/usr/local/lib/python3.7/site-packages/eventlet/tpool.py", line 190, in doit
      result = proxy_call(self._autowrap, f, *args, **kwargs)
    File "/usr/local/lib/python3.7/site-packages/eventlet/tpool.py", line 148, in proxy_call
      rv = execute(f, *args, **kwargs)
    File "/usr/local/lib/python3.7/site-packages/eventlet/tpool.py", line 129, in execute
      six.reraise(c, e, tb)
    File "/usr/local/lib/python3.7/site-packages/six.py", line 703, in reraise
      raise value
    File "/usr/local/lib/python3.7/site-packages/eventlet/tpool.py", line 83, in tworker
      rv = meth(*args, **kwargs)
    File "/usr/local/lib64/python3.7/site-packages/libvirt.py", line 4048, in defineXML
      if ret is None:raise libvirtError('virDomainDefineXML() failed', conn=self)
  libvirt.libvirtError: invalid argument: Failed to parse bitmap ''

The error happens because libvirt is attempting to configure the set of
CPUs on which to pin emulator threads, which in the realtime case are
all the non-realtime cores. However, since there are no cores set aside
for non-realtime purposes - due to the invalid mask - we end up with an
empty emulator thread set [1]. One *could* work around this by
configuring an emulator thread policy. For example:

  openstack flavor create --ram 512 --disk 1 --vcpus 2 \
    --property 'hw:cpu_policy=dedicated' \
    --property 'hw:emulator_threads_policy=isolate' \
    --property 'hw:cpu_realtime=true' \
    --property 'hw:cpu_realtime_mask=^2' \
    test.rt

Similarly, one could ensure at least one core in the range is valid:

  openstack flavor create --ram 512 --disk 1 --vcpus 2 \
    --property 'hw:cpu_policy=dedicated' \
    --property 'hw:emulator_threads_policy=isolate' \
    --property 'hw:cpu_realtime=true' \
    --property 'hw:cpu_realtime_mask=^1-5' \
    test.rt

However, both cases are still wrong and the 'hw:cpu_realtime_mask' value
is almost certainly user error. Nova should be validating things
properly and rejecting invalid values. We could probably also look at
dropping the requirement to specify 'hw:cpu_realtime_mask' if
'hw:emulator_threads_policy' is configured; however, that's more of a
feature than a bug.
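
As a rough illustration of the kind of check being asked for, here is a
minimal Python sketch. It is not nova's actual code: the function names
(parse_excluded_vcpus, validate_realtime_mask) are hypothetical, and it
assumes a simplified mask grammar that only accepts comma-separated
exclusions such as '^2' or '^0-1'. It rejects any mask that names a core
outside the flavor's vCPU range, as well as masks that leave no realtime
or no ordinary core.

```python
from typing import Set


def parse_excluded_vcpus(mask: str) -> Set[int]:
    """Parse an exclusion-only mask such as '^0-1' or '^0,^3'.

    Hypothetical helper: only '^N' and '^N-M' elements are accepted.
    """
    excluded: Set[int] = set()
    for part in mask.split(','):
        part = part.strip()
        if not part.startswith('^'):
            raise ValueError(f"unsupported mask element {part!r}")
        bounds = part[1:].split('-')
        if len(bounds) > 2 or not all(b.isdigit() for b in bounds):
            raise ValueError(f"malformed mask element {part!r}")
        start, end = int(bounds[0]), int(bounds[-1])
        if start > end:
            raise ValueError(f"descending range in {part!r}")
        excluded.update(range(start, end + 1))
    return excluded


def validate_realtime_mask(mask: str, vcpus: int) -> Set[int]:
    """Return the realtime vCPU set, or raise ValueError if the mask is invalid."""
    all_vcpus = set(range(vcpus))  # masks are 0-indexed
    excluded = parse_excluded_vcpus(mask)

    # The missing check: every masked core must exist in the instance.
    out_of_range = excluded - all_vcpus
    if out_of_range:
        raise ValueError(
            f"mask refers to vCPUs {sorted(out_of_range)} but the flavor "
            f"only has vCPUs 0-{vcpus - 1}")

    # The existing rule: at least 1 RT vCPU and 1 ordinary vCPU.
    realtime = all_vcpus - excluded
    if not realtime or not excluded:
        raise ValueError("need at least 1 RT vCPU and 1 ordinary vCPU")
    return realtime
```

Under these assumptions, validate_realtime_mask('^1', 2) returns {0},
while '^2', '^1-5' and '^0-1' with two vCPUs all raise ValueError at
flavor validation time rather than failing later inside libvirt.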

** Affects: nova
     Importance: Medium
     Assignee: Stephen Finucane (stephenfinucane)
         Status: Confirmed

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
     Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

** Changed in: nova
       Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1884231

Title:
  'hw:realtime_mask' extra spec is not validated

Status in OpenStack Compute (nova):
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1884231/+subscriptions

