yahoo-eng-team team mailing list archive
Message #70334
[Bug 1742827] [NEW] nova-scheduler reports dead compute nodes but nova-compute is enabled and up
Public bug reported:
(originally reported by David Manchado in
https://bugzilla.redhat.com/show_bug.cgi?id=1533196 )
Description of problem:
We are seeing nova-scheduler remove compute nodes because it considers them dead, even though "openstack compute service list" reports nova-compute as up and running.
In the nova-scheduler log we see entries with the following pattern:
- Removing dead compute node XXX from scheduler
- Filter ComputeFilter returned 0 hosts
- Filtering removed all hosts for the request with instance ID '11feeba9-f46c-416d-a97e-7c0c9d565b5a'. Filter results: ['AggregateInstanceExtraSpecsFilter: (start: 19, end: 2)', 'AggregateCoreFilter: (start: 2, end: 2)', 'AggregateDiskFilter: (start: 2, end: 2)', 'AggregateRamFilter: (start: 2, end: 2)', 'RetryFilter: (start: 2, end: 2)', 'AvailabilityZoneFilter: (start: 2, end: 2)', 'ComputeFilter: (start: 2, end: 0)']
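The "Filter results" entry above can be read mechanically to see which filters actually reduced the host count. The following is a minimal sketch only, assuming the log line has exactly the format quoted in this report; the function names are hypothetical helpers and are not part of nova.

```python
import ast
import re

# Hypothetical helpers; the line format is taken from the log entry quoted above.
FILTER_RESULTS_RE = re.compile(r"Filter results: (\[.*\])")
ENTRY_RE = re.compile(r"^(\w+): \(start: (\d+), end: (\d+)\)$")


def parse_filter_results(log_line):
    """Return a list of (filter_name, start, end) tuples from one log line."""
    match = FILTER_RESULTS_RE.search(log_line)
    if match is None:
        return []
    # The bracketed part is a Python-style list of plain strings.
    entries = ast.literal_eval(match.group(1))
    results = []
    for entry in entries:
        m = ENTRY_RE.match(entry)
        if m:
            results.append((m.group(1), int(m.group(2)), int(m.group(3))))
    return results


def filters_that_dropped_hosts(results):
    """Keep only the filters whose host count went down, with the drop size."""
    return [(name, start - end) for name, start, end in results if end < start]


line = (
    "Filtering removed all hosts for the request with instance ID "
    "'11feeba9-f46c-416d-a97e-7c0c9d565b5a'. Filter results: "
    "['AggregateInstanceExtraSpecsFilter: (start: 19, end: 2)', "
    "'AggregateCoreFilter: (start: 2, end: 2)', "
    "'AggregateDiskFilter: (start: 2, end: 2)', "
    "'AggregateRamFilter: (start: 2, end: 2)', "
    "'RetryFilter: (start: 2, end: 2)', "
    "'AvailabilityZoneFilter: (start: 2, end: 2)', "
    "'ComputeFilter: (start: 2, end: 0)']"
)

results = parse_filter_results(line)
dropped = filters_that_dropped_hosts(results)
print(dropped)
# [('AggregateInstanceExtraSpecsFilter', 17), ('ComputeFilter', 2)]
```

For this request, AggregateInstanceExtraSpecsFilter narrowed 19 hosts to 2, and ComputeFilter then rejected the remaining 2, which is what leaves the request with no valid host.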
Version-Release number of selected component (if applicable):
Ocata
How reproducible:
N/A
Actual results:
Instances are not being spawned; requests report 'no valid host found' because of
Additional info:
This has been happening for a week.
We did an upgrade from Newton three weeks ago.
We have also done a minor update and the issue still persists.
Nova-related RPMs:
openstack-nova-scheduler-15.1.1-0.20180103153502.ff2231f.el7.centos.noarch
python2-novaclient-7.1.2-1.el7.noarch
openstack-nova-novncproxy-15.1.1-0.20180103153502.ff2231f.el7.centos.noarch
openstack-nova-cert-15.1.1-0.20180103153502.ff2231f.el7.centos.noarch
openstack-nova-console-15.1.1-0.20180103153502.ff2231f.el7.centos.noarch
openstack-nova-conductor-15.1.1-0.20180103153502.ff2231f.el7.centos.noarch
openstack-nova-common-15.1.1-0.20180103153502.ff2231f.el7.centos.noarch
openstack-nova-compute-15.1.1-0.20180103153502.ff2231f.el7.centos.noarch
openstack-nova-placement-api-15.1.1-0.20180103153502.ff2231f.el7.centos.noarch
puppet-nova-10.4.2-0.20180102233330.f4bc1f0.el7.centos.noarch
openstack-nova-api-15.1.1-0.20180103153502.ff2231f.el7.centos.noarch
python-nova-15.1.1-0.20180103153502.ff2231f.el7.centos.noarch
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1742827
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1742827/+subscriptions