yahoo-eng-team team mailing list archive
Message #62620
[Bug 1675732] [NEW] [Ironic] Nova compute will fail to start if it can not talk to the Ironic API
Public bug reported:
This can happen during an upgrade. The Ironic driver in nova will try to
reach the Ironic API a certain number of times and, if the API does not
become available after that, the whole service will stop with:
4210>: Failed to establish a new connection: [Errno 111] ECONNREFUSED',)). Attempt 60 of 61 from (pid=14540) wrapper /usr/local/lib/python2.7/dist-packages/ironicclient/common/http.py:201
2017-03-24 10:28:48.703 ERROR ironicclient.common.http [req-90c91d04-d6c6-47e1-98e9-3d03a63d63ca None None] Error contacting Ironic server: Unable to establish connection to http://192.168.122.5:6385/v1/nodes/detail: HTTPConnectionPool(host='192.168.122.5', port=6385): Max retries exceeded with url: /v1/nodes/detail (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7eff4aa9bf50>: Failed to establish a new connection: [Errno 111] ECONNREFUSED',)). Attempt 61 of 61
2017-03-24 10:28:48.704 ERROR oslo_service.service [req-90c91d04-d6c6-47e1-98e9-3d03a63d63ca None None] Error starting thread.
2017-03-24 10:28:48.704 TRACE oslo_service.service Traceback (most recent call last):
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/oslo_service/service.py", line 722, in run_service
2017-03-24 10:28:48.704 TRACE oslo_service.service service.start()
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/opt/stack/nova/nova/service.py", line 162, in start
2017-03-24 10:28:48.704 TRACE oslo_service.service self.manager.pre_start_hook()
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/opt/stack/nova/nova/compute/manager.py", line 1166, in pre_start_hook
2017-03-24 10:28:48.704 TRACE oslo_service.service startup=True)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/opt/stack/nova/nova/compute/manager.py", line 6608, in update_available_resource
2017-03-24 10:28:48.704 TRACE oslo_service.service nodenames = set(self.driver.get_available_nodes())
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/opt/stack/nova/nova/virt/ironic/driver.py", line 610, in get_available_nodes
2017-03-24 10:28:48.704 TRACE oslo_service.service self._refresh_cache()
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/opt/stack/nova/nova/virt/ironic/driver.py", line 566, in _refresh_cache
2017-03-24 10:28:48.704 TRACE oslo_service.service for node in self._get_node_list(detail=True, limit=0):
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/opt/stack/nova/nova/virt/ironic/driver.py", line 485, in _get_node_list
2017-03-24 10:28:48.704 TRACE oslo_service.service node_list = self.ironicclient.call("node.list", **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/opt/stack/nova/nova/virt/ironic/client_wrapper.py", line 146, in call
2017-03-24 10:28:48.704 TRACE oslo_service.service return self._multi_getattr(client, method)(*args, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/ironicclient/v1/node.py", line 137, in list
2017-03-24 10:28:48.704 TRACE oslo_service.service limit=limit)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/ironicclient/common/base.py", line 149, in _list_pagination
2017-03-24 10:28:48.704 TRACE oslo_service.service resp, body = self.api.json_request('GET', url)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/ironicclient/common/http.py", line 552, in json_request
2017-03-24 10:28:48.704 TRACE oslo_service.service resp = self._http_request(url, method, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/ironicclient/common/http.py", line 190, in wrapper
2017-03-24 10:28:48.704 TRACE oslo_service.service return func(self, url, method, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/ironicclient/common/http.py", line 525, in _http_request
2017-03-24 10:28:48.704 TRACE oslo_service.service raise_exc=False, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/positional/__init__.py", line 101, in inner
2017-03-24 10:28:48.704 TRACE oslo_service.service return wrapped(*args, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/keystoneauth1/session.py", line 616, in request
2017-03-24 10:28:48.704 TRACE oslo_service.service resp = send(**kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/keystoneauth1/session.py", line 690, in _send_request
2017-03-24 10:28:48.704 TRACE oslo_service.service raise exceptions.ConnectFailure(msg)
2017-03-24 10:28:48.704 TRACE oslo_service.service ConnectFailure: Unable to establish connection to http://192.168.122.5:6385/v1/nodes/detail: HTTPConnectionPool(host='192.168.122.5', port=6385): Max retries exceeded with url: /v1/nodes/detail (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7eff4aa9bf50>: Failed to establish a new connection: [Errno 111] ECONNREFUSED',))
2017-03-24 10:28:48.704 TRACE oslo_service.service
---
I don't believe that is the right behavior. If the Ironic nova driver
tries to fetch the nodes from the Ironic service but the service is not
available, I think it should log the error and just return an empty list
of nodes.
This happens in the get_available_nodes() call of the driver, which runs
periodically in nova so it will be retried later once the Ironic API is
available again.
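
To illustrate the proposal, here is a minimal sketch of the "log the error
and return an empty list" behavior. It is not the actual nova code or a
proposed patch: FakeIronicClient, SketchDriver and list_nodes() are
hypothetical names made up for this example, and the local ConnectFailure
class only stands in for the keystoneauth1 exception seen in the traceback.

```python
import logging

LOG = logging.getLogger(__name__)


class ConnectFailure(Exception):
    """Local stand-in for the ConnectFailure raised in the traceback."""


class FakeIronicClient(object):
    """Hypothetical client; the API is unreachable until `up` is True."""

    def __init__(self):
        self.up = False

    def list_nodes(self):
        if not self.up:
            raise ConnectFailure("Unable to establish connection")
        return [{"uuid": "node-1"}, {"uuid": "node-2"}]


class SketchDriver(object):
    """Illustrative driver only; not nova.virt.ironic.driver."""

    def __init__(self, client):
        self.client = client

    def get_available_nodes(self):
        try:
            nodes = self.client.list_nodes()
        except ConnectFailure as exc:
            # Log and report no nodes instead of letting the exception
            # propagate and stop the service. The caller runs this
            # periodically, so the next run retries.
            LOG.error("Could not reach the Ironic API: %s", exc)
            return []
        return [node["uuid"] for node in nodes]


client = FakeIronicClient()
driver = SketchDriver(client)

first = driver.get_available_nodes()   # API down: empty list, no crash
client.up = True
second = driver.get_available_nodes()  # API back: nodes are reported
```

In this sketch the first call returns an empty list while the API is down,
and the second call returns the node UUIDs once the API is reachable again,
which mirrors the periodic retry described above.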
[UPDATE]
Apparently we had a similar bug in the past:
https://bugs.launchpad.net/nova/+bug/1430616
** Affects: nova
Importance: Undecided
Assignee: Lucas Alvares Gomes (lucasagomes)
Status: New
** Changed in: nova
Assignee: (unassigned) => Lucas Alvares Gomes (lucasagomes)
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1675732
Title:
[Ironic] Nova compute will fail to start if it can not talk to the
Ironic API
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1675732/+subscriptions