← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1766301] [NEW] ironic baremetal node ownership not checked with early vif plugging

 

Public bug reported:

It is possible for scheduling to tell nova that a baremetal node can
support multiple instances when in reality that is not the case. The
issue is that the virt driver for ironic ironic does not check nor
assert that the node is in use. This is not an issue, except when before
the virt driver has claimed the node. Due to needing the vif plugging
information completed for block device mapping,
https://github.com/openstack/nova/blob/stable/queens/nova/virt/ironic/driver.py#L1809
can cause resource exhaustion without checking to see if a node is
locked.

Depending on scheduling, we can end up having multiple vif plugging
requests for the same node. All actions beyond the first vif plugging
will fail if only one port is assigned to the node.

This demonstrates itself as:

Message: Build of instance c7c5191b-59ed-44a0-8b2a-0f68e48e9a52 aborted:
Failure prepping block device., Code: 500

With logging from nova-compute:
2018-04-19 19:49:06.832 18246 ERROR nova.virt.ironic.driver [req-90f1e5e7-1ee0-4f1d-af88-a42f74b0a8e0 e9b4e6ab60ae40cc84ee5689c38608ef f3dccd2210514e3695c4d087d81a65a7 - default default] Can
40-50b3-489b-ae1e-0840e0608253 to the node 64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8 due to error: Unable to attach VIF 7d0a6b40-50b3-489b-ae1e-0840e0608253, not enough free physical ports. (HTTP
able to attach VIF 7d0a6b40-50b3-489b-ae1e-0840e0608253, not enough free physical ports. (HTTP 400)
2018-04-19 19:49:06.833 18246 ERROR nova.virt.ironic.driver [req-90f1e5e7-1ee0-4f1d-af88-a42f74b0a8e0 e9b4e6ab60ae40cc84ee5689c38608ef f3dccd2210514e3695c4d087d81a65a7 - default default] [in
-4067-af8e-cd6e25fb6b59] Error preparing deploy for instance 964f0d93-e9c4-4067-af8e-cd6e25fb6b59 on baremetal node 64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8.: VirtualInterfacePlugException: Cann
0-50b3-489b-ae1e-0840e0608253 to the node 64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8 due to error: Unable to attach VIF 7d0a6b40-50b3-489b-ae1e-0840e0608253, not enough free physical ports. (HTTP 
2018-04-19 19:49:06.833 18246 ERROR nova.compute.manager [req-90f1e5e7-1ee0-4f1d-af88-a42f74b0a8e0 e9b4e6ab60ae40cc84ee5689c38608ef f3dccd2210514e3695c4d087d81a65a7 - default default] [insta
67-af8e-cd6e25fb6b59] Failure prepping block device: VirtualInterfacePlugException: Cannot attach VIF 7d0a6b40-50b3-489b-ae1e-0840e0608253 to the node 64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8 du
 attach VIF 7d0a6b40-50b3-489b-ae1e-0840e0608253, not enough free physical ports. (HTTP 400)

HTTP logging:

192.168.24.1 - - [19/Apr/2018:19:49:00 -0400] "POST /v1/nodes/64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8/vifs HTTP/1.1" 204 - "-" "python-ironicclient"
192.168.24.1 - - [19/Apr/2018:19:49:04 -0400] "POST /v1/nodes/64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8/vifs HTTP/1.1" 400 177 "-" "python-ironicclient"
192.168.24.1 - - [19/Apr/2018:19:49:05 -0400] "PATCH /v1/nodes/64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8 HTTP/1.1" 200 1239 "-" "python-ironicclient"
192.168.24.1 - - [19/Apr/2018:19:49:06 -0400] "DELETE /v1/nodes/64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8/vifs/7d0a6b40-50b3-489b-ae1e-0840e0608253 HTTP/1.1" 400 225 "-" "python-ironicclient"
192.168.24.1 - - [19/Apr/2018:19:49:20 -0400] "POST /v1/nodes/67eeb698-cfd7-4f30-83c7-c6748f78da60/vifs HTTP/1.1" 204 - "-" "python-ironicclient"

RedHat Bugzilla:

https://bugzilla.redhat.com/show_bug.cgi?id=1560690

How to reproduce:

Bulk schedule nodes, ideally in most cases with resource class
scheduling by flavor disabled which will result in a high liklihood that
the same physical baremetal node will be selected. This can be
preproduced fairly easily with TripleO and a lack of a resource class
defined on the flavor.

** Affects: nova
     Importance: Undecided
     Assignee: Julia Kreger (juliaashleykreger)
         Status: New


** Tags: ironic

** Changed in: nova
     Assignee: (unassigned) => Julia Kreger (juliaashleykreger)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1766301

Title:
  ironic baremetal node ownership not checked with early vif plugging

Status in OpenStack Compute (nova):
  New

Bug description:
  It is possible for scheduling to tell nova that a baremetal node can
  support multiple instances when in reality that is not the case. The
  issue is that the virt driver for ironic ironic does not check nor
  assert that the node is in use. This is not an issue, except when
  before the virt driver has claimed the node. Due to needing the vif
  plugging information completed for block device mapping,
  https://github.com/openstack/nova/blob/stable/queens/nova/virt/ironic/driver.py#L1809
  can cause resource exhaustion without checking to see if a node is
  locked.

  Depending on scheduling, we can end up having multiple vif plugging
  requests for the same node. All actions beyond the first vif plugging
  will fail if only one port is assigned to the node.

  This demonstrates itself as:

  Message: Build of instance c7c5191b-59ed-44a0-8b2a-0f68e48e9a52
  aborted: Failure prepping block device., Code: 500

  With logging from nova-compute:
  2018-04-19 19:49:06.832 18246 ERROR nova.virt.ironic.driver [req-90f1e5e7-1ee0-4f1d-af88-a42f74b0a8e0 e9b4e6ab60ae40cc84ee5689c38608ef f3dccd2210514e3695c4d087d81a65a7 - default default] Can
  40-50b3-489b-ae1e-0840e0608253 to the node 64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8 due to error: Unable to attach VIF 7d0a6b40-50b3-489b-ae1e-0840e0608253, not enough free physical ports. (HTTP
  able to attach VIF 7d0a6b40-50b3-489b-ae1e-0840e0608253, not enough free physical ports. (HTTP 400)
  2018-04-19 19:49:06.833 18246 ERROR nova.virt.ironic.driver [req-90f1e5e7-1ee0-4f1d-af88-a42f74b0a8e0 e9b4e6ab60ae40cc84ee5689c38608ef f3dccd2210514e3695c4d087d81a65a7 - default default] [in
  -4067-af8e-cd6e25fb6b59] Error preparing deploy for instance 964f0d93-e9c4-4067-af8e-cd6e25fb6b59 on baremetal node 64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8.: VirtualInterfacePlugException: Cann
  0-50b3-489b-ae1e-0840e0608253 to the node 64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8 due to error: Unable to attach VIF 7d0a6b40-50b3-489b-ae1e-0840e0608253, not enough free physical ports. (HTTP 
  2018-04-19 19:49:06.833 18246 ERROR nova.compute.manager [req-90f1e5e7-1ee0-4f1d-af88-a42f74b0a8e0 e9b4e6ab60ae40cc84ee5689c38608ef f3dccd2210514e3695c4d087d81a65a7 - default default] [insta
  67-af8e-cd6e25fb6b59] Failure prepping block device: VirtualInterfacePlugException: Cannot attach VIF 7d0a6b40-50b3-489b-ae1e-0840e0608253 to the node 64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8 du
   attach VIF 7d0a6b40-50b3-489b-ae1e-0840e0608253, not enough free physical ports. (HTTP 400)

  HTTP logging:

  192.168.24.1 - - [19/Apr/2018:19:49:00 -0400] "POST /v1/nodes/64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8/vifs HTTP/1.1" 204 - "-" "python-ironicclient"
  192.168.24.1 - - [19/Apr/2018:19:49:04 -0400] "POST /v1/nodes/64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8/vifs HTTP/1.1" 400 177 "-" "python-ironicclient"
  192.168.24.1 - - [19/Apr/2018:19:49:05 -0400] "PATCH /v1/nodes/64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8 HTTP/1.1" 200 1239 "-" "python-ironicclient"
  192.168.24.1 - - [19/Apr/2018:19:49:06 -0400] "DELETE /v1/nodes/64eea9c8-b69f-4a30-a05f-2c7b09f5b1d8/vifs/7d0a6b40-50b3-489b-ae1e-0840e0608253 HTTP/1.1" 400 225 "-" "python-ironicclient"
  192.168.24.1 - - [19/Apr/2018:19:49:20 -0400] "POST /v1/nodes/67eeb698-cfd7-4f30-83c7-c6748f78da60/vifs HTTP/1.1" 204 - "-" "python-ironicclient"

  RedHat Bugzilla:

  https://bugzilla.redhat.com/show_bug.cgi?id=1560690

  How to reproduce:

  Bulk schedule nodes, ideally in most cases with resource class
  scheduling by flavor disabled which will result in a high liklihood
  that the same physical baremetal node will be selected. This can be
  preproduced fairly easily with TripleO and a lack of a resource class
  defined on the flavor.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1766301/+subscriptions


Follow ups