yahoo-eng-team team mailing list archive

[Bug 1799892] [NEW] Placement API crashes with 500s in Rocky upgrade with downed compute nodes

 

Public bug reported:

I ran into this while upgrading another environment to Rocky and worked
around it by deleting the problematic resource provider, but I just hit
it again while upgrading a second environment, so something is wrong.
Here's the traceback:

=============
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap [req-8ad1c999-7646-4b0a-91c0-cd26a3581766 b61d42657d364008bfdc6fa715e67daf a894e8109af3430aa7ae03e0c49a0aa0 - default default] Placement API unexpected error: 19: KeyError: 19
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap Traceback (most recent call last):
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/fault_wrap.py", line 40, in __call__
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return self.application(environ, start_response)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     resp = self.call_func(req, *args, **kw)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in call_func
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return self.func(req, *args, **kwargs)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/microversion_parse/middleware.py", line 80, in __call__
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     response = req.get_response(self.application)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/webob/request.py", line 1313, in send
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     application, catch_exc_info=False)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/webob/request.py", line 1277, in call_application
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     app_iter = application(self.environ, start_response)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/handler.py", line 209, in __call__
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return dispatch(environ, start_response, self._map)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/handler.py", line 146, in dispatch
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return handler(environ, start_response)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     resp = self.call_func(req, *args, **kw)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/wsgi_wrapper.py", line 29, in call_func
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     super(PlacementWsgify, self).call_func(req, *args, **kwargs)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in call_func
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return self.func(req, *args, **kwargs)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/microversion.py", line 164, in decorated_func
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return _find_method(f, version, status_code)(req, *args, **kwargs)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/util.py", line 81, in decorated_function
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return f(req)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/handlers/allocation_candidate.py", line 316, in list_allocation_candidates
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     context, requests, limit=limit, group_policy=group_policy)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/objects/resource_provider.py", line 3965, in get_by_requests
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     context, requests, limit=limit, group_policy=group_policy)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 993, in wrapper
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return fn(*args, **kwargs)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/objects/resource_provider.py", line 4071, in _get_by_requests
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     context, request, sharing, has_trees)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/objects/resource_provider.py", line 4045, in _get_by_one_request
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return _alloc_candidates_single_provider(context, resources, rp_ids)
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/objects/resource_provider.py", line 3490, in _alloc_candidates_single_provider
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     rp_summary = summaries[rp_id]
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap KeyError: 19
2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap 
=============
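
The last frame of the traceback is a plain dictionary lookup,
`summaries[rp_id]`, failing for key 19. A minimal sketch of that failure
mode follows. It is not nova's code: the function names and the data are
made up for illustration, and the tolerant variant is only a hypothetical
way such a lookup could avoid the 500, not the project's fix.

```python
# Illustrative only: not nova's code. Variable names mirror the last
# traceback frame (summaries, rp_id); everything else is hypothetical.


def build_candidates(rp_ids, summaries):
    """Mimic the failing lookup: assumes every id has a summary entry."""
    return [summaries[rp_id] for rp_id in rp_ids]


def build_candidates_tolerant(rp_ids, summaries):
    """Hypothetical defensive variant: skip ids that have no summary."""
    return [summaries[rp_id] for rp_id in rp_ids if rp_id in summaries]


# Made-up data: provider 19 matched the request but has no summary entry.
summaries = {7: "summary-for-7", 12: "summary-for-12"}
rp_ids = [7, 12, 19]

try:
    build_candidates(rp_ids, summaries)
except KeyError as exc:
    print("KeyError:", exc)

print(build_candidates_tolerant(rp_ids, summaries))
```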

The resource provider (nova-compute) with ID 19 was down during the
upgrade (it had been taken down a long time ago).  The only oddity I
found was in the database: `root_provider_id` was set to NULL for that
record.  After I deleted the resource provider, the placement API
stopped returning 500s when it tried to schedule new VMs.
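
A check for other records in the same state could look roughly like the
sketch below. It runs against a toy in-memory SQLite table rather than a
real placement database; the table name `resource_providers`, the `name`
column, and the sample rows are assumptions for illustration, and only
`root_provider_id` comes from the observation above.

```python
import sqlite3

# Toy stand-in for the placement database (assumed schema, made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resource_providers ("
    " id INTEGER PRIMARY KEY, name TEXT, root_provider_id INTEGER)"
)
conn.executemany(
    "INSERT INTO resource_providers (id, name, root_provider_id)"
    " VALUES (?, ?, ?)",
    [(7, "compute-a", 7), (12, "compute-b", 12), (19, "compute-down", None)],
)


def providers_without_root(connection):
    """Return (id, name) for providers whose root_provider_id is NULL."""
    rows = connection.execute(
        "SELECT id, name FROM resource_providers"
        " WHERE root_provider_id IS NULL ORDER BY id"
    )
    return rows.fetchall()


suspects = providers_without_root(conn)
print(suspects)
```

Any row this returns would be a candidate for the same KeyError; whether
deleting it is appropriate depends on the deployment.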

In the other environment that hit this problem, the offending resource
provider was also the downed compute node.

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1799892

Title:
  Placement API crashes with 500s in Rocky upgrade with downed compute
  nodes

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1799892/+subscriptions

