yahoo-eng-team team mailing list archive
Message #38274
[Bug 1493714] [NEW] nova allows booting two instances with the same neutron port in parallel
Public bug reported:
Reproducing the problem seems to require a multi-node deployment
with at least two nova-compute services.
To reproduce it, do the following:
1) Create a neutron port.
2) Boot two instances in parallel with that port.
Sometimes both instances become ACTIVE in nova, which is clearly wrong.
vagrant@controller:~/devstack$ neutron net-list
+--------------------------------------+---------+----------------------------------------------------------+
| id                                   | name    | subnets                                                  |
+--------------------------------------+---------+----------------------------------------------------------+
| fc257a00-d3bf-47e6-b91f-e2cef985c414 | public  | a610715c-614f-492c-8810-f51e03c5d383 2001:db8::/64       |
|                                      |         | a923871f-90ad-4354-935d-db24861a5890 172.24.4.0/24       |
| 7a057b12-0e69-4c31-859e-098263abeeba | private | 04f3b138-d7c6-48e1-98e3-7f70eb7ab4fe fda4:10b7:acaa::/64 |
|                                      |         | ee70023c-f273-471a-8b84-cb25bb64fcf9 10.0.0.0/24         |
+--------------------------------------+---------+----------------------------------------------------------+
vagrant@controller:~/devstack$ neutron port-create 7a057b12-0e69-4c31-859e-098263abeeba
Created a new port:
+-----------------------+-------------------------------------------------------------------------------------------------------------+
| Field                 | Value                                                                                                       |
+-----------------------+-------------------------------------------------------------------------------------------------------------+
| admin_state_up        | True                                                                                                        |
| allowed_address_pairs |                                                                                                             |
| binding:host_id       |                                                                                                             |
| binding:profile       | {}                                                                                                          |
| binding:vif_details   | {}                                                                                                          |
| binding:vif_type      | unbound                                                                                                     |
| binding:vnic_type     | normal                                                                                                      |
| device_id             |                                                                                                             |
| device_owner          |                                                                                                             |
| fixed_ips             | {"subnet_id": "ee70023c-f273-471a-8b84-cb25bb64fcf9", "ip_address": "10.0.0.4"}                             |
|                       | {"subnet_id": "04f3b138-d7c6-48e1-98e3-7f70eb7ab4fe", "ip_address": "fda4:10b7:acaa:0:f816:3eff:fe0c:285d"} |
| id                    | f2da8f78-8ae4-49f0-bca0-5820588d33ea                                                                        |
| mac_address           | fa:16:3e:0c:28:5d                                                                                           |
| name                  |                                                                                                             |
| network_id            | 7a057b12-0e69-4c31-859e-098263abeeba                                                                        |
| port_security_enabled | True                                                                                                        |
| security_groups       | 73853e74-a6c7-4b71-ba45-5b82b5e1ad81                                                                        |
| status                | DOWN                                                                                                        |
| tenant_id             | 16f8c1dbfa2d472da0d1335b8a70aee0                                                                            |
+-----------------------+-------------------------------------------------------------------------------------------------------------+
vagrant@controller:~/devstack$ nova boot --image cirros-0.3.4-x86_64-uec --flavor 42 --nic port-id=f2da8f78-8ae4-49f0-bca0-5820588d33ea vm1 & nova boot --image cirros-0.3.4-x86_64-uec --flavor 42 --nic port-id=f2da8f78-8ae4-49f0-bca0-5820588d33ea vm2 &
[1] 18785
[2] 18786
vagrant@controller:~/devstack$ nova list
+--------------------------------------+------+--------+------------+-------------+--------------------------------------------------------+
| ID                                   | Name | Status | Task State | Power State | Networks                                               |
+--------------------------------------+------+--------+------------+-------------+--------------------------------------------------------+
| 24e67346-18fe-4cfa-8b01-f017c915a111 | vm1  | ACTIVE | -          | Running     | private=fda4:10b7:acaa:0:f816:3eff:fe0c:285d, 10.0.0.4 |
| d90292a2-5b8f-4566-a621-defdf7aa0246 | vm2  | ACTIVE | -          | Running     |                                                        |
+--------------------------------------+------+--------+------------+-------------+--------------------------------------------------------+
Based on the code, we have an atomicity problem on the neutron REST API:
At https://github.com/openstack/nova/blob/1db33ca6c248613cc8a76dcbbf78758001ee02d8/nova/network/neutronv2/api.py#L611 nova checks whether the device_id of the neutron port is empty. Later, at https://github.com/openstack/nova/blob/1db33ca6c248613cc8a76dcbbf78758001ee02d8/nova/network/neutronv2/api.py#L692, nova sets the device_id to the current instance_uuid.
So it is possible for two nova-compute processes to check the port
status before either of them sets the device_id. As a result, both
nova-compute processes think that the port is free to use, and the
slower one overwrites the device_id of the port.
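The check-then-set race described above can be illustrated with a small, self-contained Python sketch. It does not use nova or neutron code: FakePortStore, racy_claim, atomic_claim and compare_and_set_device_id are hypothetical names made up for this illustration, and the compare-and-set variant is only one possible way to close the window, not something the neutron REST API is claimed to provide. A barrier forces both workers to finish the check before either one writes, which makes the interleaving deterministic.

```python
import threading


class FakePortStore:
    """In-memory stand-in for a single port record (hypothetical, not neutron)."""

    def __init__(self):
        self.device_id = ""
        self._lock = threading.Lock()

    def get_device_id(self):
        return self.device_id

    def set_device_id(self, instance_uuid):
        # Unconditional write: the last writer wins.
        self.device_id = instance_uuid

    def compare_and_set_device_id(self, expected, instance_uuid):
        # Check and write happen under one lock, so they cannot interleave.
        with self._lock:
            if self.device_id != expected:
                return False
            self.device_id = instance_uuid
            return True


def racy_claim(port, instance_uuid, barrier, results):
    free = port.get_device_id() == ""      # step 1: check
    barrier.wait()                         # both workers have checked by now
    if free:
        port.set_device_id(instance_uuid)  # step 2: set, possibly overwriting
    results[instance_uuid] = free


def atomic_claim(port, instance_uuid, barrier, results):
    barrier.wait()
    results[instance_uuid] = port.compare_and_set_device_id("", instance_uuid)


def run(claim):
    """Run two workers against one port; return (final device_id, per-worker success)."""
    port = FakePortStore()
    barrier = threading.Barrier(2)
    results = {}
    threads = [
        threading.Thread(target=claim, args=(port, name, barrier, results))
        for name in ("vm1", "vm2")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return port.device_id, results


if __name__ == "__main__":
    print("racy:  ", run(racy_claim))
    print("atomic:", run(atomic_claim))
```

With racy_claim both workers report success while the port ends up owned by only one of them, which mirrors the two ACTIVE instances in the nova list output. With atomic_claim exactly one worker succeeds.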
** Affects: nova
Importance: Undecided
Assignee: Balazs Gibizer (balazs-gibizer)
Status: New
** Changed in: nova
Assignee: (unassigned) => Balazs Gibizer (balazs-gibizer)
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1493714
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1493714/+subscriptions