yahoo-eng-team team mailing list archive
Message #82414
[Bug 1874860] [NEW] After removing a compute (scalein) , openstack compute service list shows compute enabled / on , after user powers on removed compute
Public bug reported:
Description
===========
After a compute node is removed (scale-in), the user powers the removed
node back ON via ipmitool / console.
Running 'openstack compute service list' on the undercloud with
overcloudrc sourced then shows the removed compute as enabled / up,
even though it was removed via the scaling procedure.
Steps to reproduce
==================
A chronological list of steps:
* Deploy a containerized Rocky overcloud with 3 controllers and 3 ovs-compute nodes.
* Scale in (remove) ovs-compute-2, following
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/9/html/director_installation_and_usage/sect-scaling_the_overcloud
* Observe the output of 'openstack compute service list --service nova-compute': the node is removed.
[stack@undercloud (overcloudrc) ~]$ openstack compute service list --service nova-compute
+-----+--------------+------------------------------------+-------+---------+-------+----------------------------+
| ID | Binary | Host | Zone | Status | State | Updated At |
+-----+--------------+------------------------------------+-------+---------+-------+----------------------------+
| 113 | nova-compute | overcloud-ovscompute-0.localdomain | zone1 | enabled | up | 2020-04-24T16:57:44.000000 |
| 116 | nova-compute | overcloud-ovscompute-1.localdomain | zone1 | enabled | up | 2020-04-24T16:57:37.000000 |
+-----+--------------+------------------------------------+-------+---------+-------+----------------------------+
* Power on the removed compute via ipmitool or the console.
* Observe the output of 'openstack compute service list --service nova-compute': the removed node is back in the list with State up.
[stack@undercloud (overcloudrc) ~]$ openstack compute service list --service nova-compute
+-----+--------------+------------------------------------+-------+---------+-------+----------------------------+
| ID | Binary | Host | Zone | Status | State | Updated At |
+-----+--------------+------------------------------------+-------+---------+-------+----------------------------+
| 113 | nova-compute | overcloud-ovscompute-0.localdomain | zone1 | enabled | up | 2020-04-24T17:07:34.000000 |
| 116 | nova-compute | overcloud-ovscompute-1.localdomain | zone1 | enabled | up | 2020-04-24T17:07:27.000000 |
| 143 | nova-compute | overcloud-ovscompute-2.localdomain | nova | enabled | up | 2020-04-24T17:07:28.000000 |
+-----+--------------+------------------------------------+-------+---------+-------+----------------------------+
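The before/after comparison in the steps above can be automated. The following is a minimal sketch, assuming the table text has already been captured from 'openstack compute service list --service nova-compute'. The helper names parse_service_table and unexpected_hosts are hypothetical, not part of any OpenStack client; the sample rows are copied from the output above.

```python
def parse_service_table(text):
    """Parse the ASCII table printed by the service list command into dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +---+ borders, shell prompts, and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first data line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def unexpected_hosts(before, after):
    """Return rows in `after` whose Host was not present in `before`."""
    known = {row["Host"] for row in before}
    return [row for row in after if row["Host"] not in known]


BEFORE = """
| ID | Binary | Host | Zone | Status | State | Updated At |
| 113 | nova-compute | overcloud-ovscompute-0.localdomain | zone1 | enabled | up | 2020-04-24T16:57:44.000000 |
| 116 | nova-compute | overcloud-ovscompute-1.localdomain | zone1 | enabled | up | 2020-04-24T16:57:37.000000 |
"""

AFTER = """
| ID | Binary | Host | Zone | Status | State | Updated At |
| 113 | nova-compute | overcloud-ovscompute-0.localdomain | zone1 | enabled | up | 2020-04-24T17:07:34.000000 |
| 116 | nova-compute | overcloud-ovscompute-1.localdomain | zone1 | enabled | up | 2020-04-24T17:07:27.000000 |
| 143 | nova-compute | overcloud-ovscompute-2.localdomain | nova | enabled | up | 2020-04-24T17:07:28.000000 |
"""

reappeared = unexpected_hosts(parse_service_table(BEFORE), parse_service_table(AFTER))
for row in reappeared:
    print(row["ID"], row["Host"], row["Zone"], row["Status"], row["State"])
```

Note that the row that comes back carries a new ID (143) and the zone 'nova' instead of 'zone1', which suggests a fresh service record rather than the old one being restored.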
Other observations:
Watching nova-compute.log on the removed node during power-up, we see
that it communicates with nova and reports itself as available. This
occurs as soon as the nova_compute container becomes healthy after
power on. Please see the attached screenshot. If the procedure is
missing a step, such as disabling the nova_compute container on the
removed node, it should be documented. I would hope that deleting the
nova service via "nova service-delete [service-id]", as the procedure
describes, would be enough, but it appears it is not.
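One possible explanation, offered as an assumption and not confirmed by this report: if a compute service creates its own record at startup whenever none exists for its host, then deleting the record while the service can still start only removes it until the next boot. The toy model below illustrates that idea. ServiceRegistry and its methods are hypothetical and are not nova code.

```python
import itertools


class ServiceRegistry:
    """Hypothetical stand-in for a table of service records; not nova code."""

    def __init__(self, start_id=1):
        self._ids = itertools.count(start_id)
        self.records = {}  # host name -> record dict

    def delete(self, host):
        # Models deleting the service record for a host.
        self.records.pop(host, None)

    def register_on_start(self, host):
        # Assumed behaviour: reuse the existing record, otherwise create one.
        record = self.records.get(host)
        if record is None:
            record = {"id": next(self._ids), "host": host, "status": "enabled"}
            self.records[host] = record
        return record


registry = ServiceRegistry()
first = registry.register_on_start("overcloud-ovscompute-2.localdomain")

# Scale-in: the record is deleted, but the service itself is left installed.
registry.delete("overcloud-ovscompute-2.localdomain")
gone = "overcloud-ovscompute-2.localdomain" not in registry.records

# The node is powered on again and the service starts: a new record appears.
second = registry.register_on_start("overcloud-ovscompute-2.localdomain")
print(first["id"], second["id"], second["status"])
```

Under this assumption, the record can only stay deleted if the service is prevented from starting on the removed node.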
Expected result
===============
There should be no trace of the removed compute.
We expect to see the following output:
[stack@undercloud (overcloudrc) ~]$ openstack compute service list --service nova-compute
+-----+--------------+------------------------------------+-------+---------+-------+----------------------------+
| ID | Binary | Host | Zone | Status | State | Updated At |
+-----+--------------+------------------------------------+-------+---------+-------+----------------------------+
| 113 | nova-compute | overcloud-ovscompute-0.localdomain | zone1 | enabled | up | 2020-04-24T17:07:34.000000 |
| 116 | nova-compute | overcloud-ovscompute-1.localdomain | zone1 | enabled | up | 2020-04-24T17:07:27.000000 |
+-----+--------------+------------------------------------+-------+---------+-------+----------------------------+
Actual result
=============
Instead, the actual result (after the user powers on the scaled-in node) shows the removed compute:
[stack@undercloud (overcloudrc) ~]$ openstack compute service list --service nova-compute
+-----+--------------+------------------------------------+-------+---------+-------+----------------------------+
| ID | Binary | Host | Zone | Status | State | Updated At |
+-----+--------------+------------------------------------+-------+---------+-------+----------------------------+
| 113 | nova-compute | overcloud-ovscompute-0.localdomain | zone1 | enabled | up | 2020-04-24T17:07:34.000000 |
| 116 | nova-compute | overcloud-ovscompute-1.localdomain | zone1 | enabled | up | 2020-04-24T17:07:27.000000 |
| 143 | nova-compute | overcloud-ovscompute-2.localdomain | nova | enabled | up | 2020-04-24T17:07:28.000000 |
+-----+--------------+------------------------------------+-------+---------+-------+----------------------------+
Environment
===========
OpenStack Rocky
OS CentOS Linux 7.7.1908
kernel 4.18.0
qemu-kvm-ev 2.12.0
libvirt 4.5.0
openvswitch 2.11.0
dpdk 18.11.2
OpenStack Services
aodh 7.0.1
barbican 7.0.1
ceilometer 11.0.2
cinder 13.0.6
glance 17.0.1
gnocchi 4.3.3
heatclient 1.16.2
heat-engine 11.0.3
horizon 14.0.3
keystone 14.1.1
manila 7.3.1
neutron 13.0.4
nova 18.2.1
panko 5.0.1
pacemaker 1.1.20
corosync 2.4.3
haproxy 1.5.18
mariadb 10.1.20
rabbitmq-server 3.6.16
Ceph Storage
ceph-base 12.2.11
ceph-ansible 3.1.6
Monitoring
elasticsearch 6.2.1
kibana 6.2.1
logstash 6.2.4
** Affects: nova
Importance: Undecided
Status: New
** Tags: compute nova rocky scale-in
** Attachment added: "Screen Shot 2020-04-24 at 10.39.37 AM.png"
https://bugs.launchpad.net/bugs/1874860/+attachment/5359654/+files/Screen%20Shot%202020-04-24%20at%2010.39.37%20AM.png
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1874860
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1874860/+subscriptions