Message #54191
[Bug 1605804] [NEW] Instance creation sometimes fails after host aggregate deletion
Public bug reported:
Instance creation starts failing if the nova scheduler gets into an inconsistent state with respect to host aggregates. If the remove_host_from_aggregate operation is invoked for multiple hosts in quick succession, followed by deletion of the aggregate, the nova scheduler host_manager maps (host_aggregates_map and aggs_by_id) get out of sync: stale references are left behind in host_aggregates_map for an aggregate that has already been deleted from aggs_by_id.
This happens because the host_manager cleans up state on deletion based on aggregate.hosts, which is empty by the time the aggregate is deleted, while the earlier aggregate updates that removed individual hosts may have added an incorrect list of hosts to host_aggregates_map.
Once the scheduler is in this state, instance creation fails with the following error:
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher [req-7f29701b-0272-444c-8650-a1035777e642 d2c755daa21e451e86c1d2b5be705aa2 0546d7f9c747456aa0ffb306cfe5627d - - -] Exception during message handling: 1
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher File "/opt/pf9/nova/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher incoming.message))
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher File "/opt/pf9/nova/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _dispatch
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args)
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher File "/opt/pf9/nova/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher result = func(ctxt, **new_args)
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher File "/opt/pf9/nova/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 150, in inner
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher return func(*args, **kwargs)
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher File "/opt/pf9/nova/lib/python2.7/site-packages/nova/scheduler/manager.py", line 84, in select_destinations
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher filter_properties)
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher File "/opt/pf9/nova/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 72, in select_destinations
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher filter_properties)
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher File "/opt/pf9/nova/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 164, in _schedule
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher hosts = self._get_all_host_states(elevated)
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher File "/opt/pf9/nova/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 222, in _get_all_host_states
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher return self.host_manager.get_all_host_states(context)
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher File "/opt/pf9/nova/lib/python2.7/site-packages/nova/scheduler/host_manager.py", line 585, in get_all_host_states
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher host_state.host]]
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher KeyError: 1
2016-07-21 18:20:16.780 15692 ERROR oslo_messaging.rpc.dispatcher
2016-07-21 18:20:16.784 15692 ERROR oslo_messaging._drivers.common [req-7f29701b-0272-444c-8650-a1035777e642 d2c755daa21e451e86c1d2b5be705aa2 0546d7f9c747456aa0ffb306cfe5627d - - -] Returning exception 1 to caller
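To illustrate the failure mode described above, here is a minimal sketch of how the two maps could drift apart. It is a simplified toy model, not the actual nova host_manager code: the class ToyHostManager, its method signatures, and the host names are hypothetical, and it assumes that the two per-host update notifications are applied out of order, so that a stale host list is processed last.

```python
from collections import defaultdict


class ToyHostManager:
    """Hypothetical, simplified model of the two scheduler maps."""

    def __init__(self):
        self.aggs_by_id = {}                         # aggregate id -> set of hosts
        self.host_aggregates_map = defaultdict(set)  # host -> set of aggregate ids

    def update_aggregate(self, agg_id, hosts):
        # Record the aggregate and sync per-host membership to `hosts`.
        self.aggs_by_id[agg_id] = set(hosts)
        for host in hosts:
            self.host_aggregates_map[host].add(agg_id)
        for host, agg_ids in self.host_aggregates_map.items():
            if host not in hosts:
                agg_ids.discard(agg_id)

    def delete_aggregate(self, agg_id, hosts):
        # Cleanup is driven only by the host list carried by the deleted
        # aggregate, which is empty once every host has been removed.
        self.aggs_by_id.pop(agg_id, None)
        for host in hosts:
            self.host_aggregates_map[host].discard(agg_id)

    def aggregates_for(self, host):
        # Lookup of the kind that fails in get_all_host_states.
        return [self.aggs_by_id[a] for a in self.host_aggregates_map[host]]


def reproduce():
    mgr = ToyHostManager()
    mgr.update_aggregate(1, ["host-a", "host-b"])
    # Two removals in quick succession. Assumption: the updates arrive
    # out of order, so the stale list (only host-a removed) is applied last.
    mgr.update_aggregate(1, [])
    mgr.update_aggregate(1, ["host-b"])
    # The aggregate is deleted with an empty host list, so host-b keeps
    # a reference to aggregate id 1 that no longer exists in aggs_by_id.
    mgr.delete_aggregate(1, [])
    try:
        mgr.aggregates_for("host-b")
    except KeyError as exc:
        return exc.args[0]
    return None
```

In this model, reproduce() returns the missing aggregate id, which corresponds to the "KeyError: 1" in the traceback. A delete path that scanned every entry in host_aggregates_map, rather than only aggregate.hosts, would not leave the stale reference behind in this sketch.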
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1605804
Title:
Instance creation sometimes fails after host aggregate deletion
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1605804/+subscriptions