yahoo-eng-team team mailing list archive

[Bug 1675732] [NEW] [Ironic] Nova compute will fail to start if it can not talk to the Ironic API

 

Public bug reported:

This can happen during an upgrade. The Ironic driver in nova retries the
Ironic API a fixed number of times (61 attempts in the log below) and,
if the API still has not become available, the whole nova-compute
service stops with:

4210>: Failed to establish a new connection: [Errno 111] ECONNREFUSED',)). Attempt 60 of 61 from (pid=14540) wrapper /usr/local/lib/python2.7/dist-packages/ironicclient/common/http.py:201
2017-03-24 10:28:48.703 ERROR ironicclient.common.http [req-90c91d04-d6c6-47e1-98e9-3d03a63d63ca None None] Error contacting Ironic server: Unable to establish connection to http://192.168.122.5:6385/v1/nodes/detail: HTTPConnectionPool(host='192.168.122.5', port=6385): Max retries exceeded with url: /v1/nodes/detail (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7eff4aa9bf50>: Failed to establish a new connection: [Errno 111] ECONNREFUSED',)). Attempt 61 of 61
2017-03-24 10:28:48.704 ERROR oslo_service.service [req-90c91d04-d6c6-47e1-98e9-3d03a63d63ca None None] Error starting thread.
2017-03-24 10:28:48.704 TRACE oslo_service.service Traceback (most recent call last):
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/usr/local/lib/python2.7/dist-packages/oslo_service/service.py", line 722, in run_service
2017-03-24 10:28:48.704 TRACE oslo_service.service     service.start()
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/opt/stack/nova/nova/service.py", line 162, in start
2017-03-24 10:28:48.704 TRACE oslo_service.service     self.manager.pre_start_hook()
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/opt/stack/nova/nova/compute/manager.py", line 1166, in pre_start_hook
2017-03-24 10:28:48.704 TRACE oslo_service.service     startup=True)
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/opt/stack/nova/nova/compute/manager.py", line 6608, in update_available_resource
2017-03-24 10:28:48.704 TRACE oslo_service.service     nodenames = set(self.driver.get_available_nodes())
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/opt/stack/nova/nova/virt/ironic/driver.py", line 610, in get_available_nodes
2017-03-24 10:28:48.704 TRACE oslo_service.service     self._refresh_cache()
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/opt/stack/nova/nova/virt/ironic/driver.py", line 566, in _refresh_cache
2017-03-24 10:28:48.704 TRACE oslo_service.service     for node in self._get_node_list(detail=True, limit=0):
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/opt/stack/nova/nova/virt/ironic/driver.py", line 485, in _get_node_list
2017-03-24 10:28:48.704 TRACE oslo_service.service     node_list = self.ironicclient.call("node.list", **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/opt/stack/nova/nova/virt/ironic/client_wrapper.py", line 146, in call
2017-03-24 10:28:48.704 TRACE oslo_service.service     return self._multi_getattr(client, method)(*args, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/usr/local/lib/python2.7/dist-packages/ironicclient/v1/node.py", line 137, in list
2017-03-24 10:28:48.704 TRACE oslo_service.service     limit=limit)
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/usr/local/lib/python2.7/dist-packages/ironicclient/common/base.py", line 149, in _list_pagination
2017-03-24 10:28:48.704 TRACE oslo_service.service     resp, body = self.api.json_request('GET', url)
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/usr/local/lib/python2.7/dist-packages/ironicclient/common/http.py", line 552, in json_request
2017-03-24 10:28:48.704 TRACE oslo_service.service     resp = self._http_request(url, method, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/usr/local/lib/python2.7/dist-packages/ironicclient/common/http.py", line 190, in wrapper
2017-03-24 10:28:48.704 TRACE oslo_service.service     return func(self, url, method, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/usr/local/lib/python2.7/dist-packages/ironicclient/common/http.py", line 525, in _http_request
2017-03-24 10:28:48.704 TRACE oslo_service.service     raise_exc=False, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/usr/local/lib/python2.7/dist-packages/positional/__init__.py", line 101, in inner
2017-03-24 10:28:48.704 TRACE oslo_service.service     return wrapped(*args, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/usr/local/lib/python2.7/dist-packages/keystoneauth1/session.py", line 616, in request
2017-03-24 10:28:48.704 TRACE oslo_service.service     resp = send(**kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service   File "/usr/local/lib/python2.7/dist-packages/keystoneauth1/session.py", line 690, in _send_request
2017-03-24 10:28:48.704 TRACE oslo_service.service     raise exceptions.ConnectFailure(msg)
2017-03-24 10:28:48.704 TRACE oslo_service.service ConnectFailure: Unable to establish connection to http://192.168.122.5:6385/v1/nodes/detail: HTTPConnectionPool(host='192.168.122.5', port=6385): Max retries exceeded with url: /v1/nodes/detail (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7eff4aa9bf50>: Failed to establish a new connection: [Errno 111] ECONNREFUSED',))
2017-03-24 10:28:48.704 TRACE oslo_service.service
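
For readers who want the failure mode in miniature, here is a rough
sketch of a retry-then-raise wrapper. It is not the ironicclient code.
The function name call_with_retries, its parameters, and the local
ConnectFailure class (a stand-in for keystoneauth1's exception of the
same name) are made up for illustration. The sketch assumes one initial
attempt plus 60 retries, which would match "Attempt 61 of 61" in the log
above.

```python
import time


class ConnectFailure(Exception):
    """Stand-in for keystoneauth1.exceptions.ConnectFailure."""


def call_with_retries(func, max_retries=60, retry_interval=0):
    """Call func, retrying on ConnectFailure.

    Hypothetical helper: after the last attempt fails, the exception is
    re-raised, so an uncaught failure propagates to whoever called us.
    """
    attempts = max_retries + 1  # assumption: first try plus the retries
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except ConnectFailure:
            print("Attempt %d of %d" % (attempt, attempts))
            if attempt == attempts:
                raise
            time.sleep(retry_interval)
```

If nothing between this wrapper and the service start-up code catches
the exception, the final re-raise is what ends the start-up.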

---

I don't believe that is the right behavior. If the Ironic driver in nova
tries to fetch the nodes from the Ironic service and the service is not
available, I think it should log the error and just return an empty list
of nodes.

This happens in the get_available_nodes() call of the driver, which runs
periodically in nova, so it will be retried later once the Ironic API is
available again.
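
A minimal sketch of the behavior proposed above, assuming the connection
error can be caught inside get_available_nodes(). This is not nova's
IronicDriver. SketchIronicDriver, the list_nodes callable, and the local
ConnectFailure class (a stand-in for keystoneauth1's exception) are
hypothetical names used only for illustration.

```python
import logging

LOG = logging.getLogger(__name__)


class ConnectFailure(Exception):
    """Stand-in for keystoneauth1.exceptions.ConnectFailure."""


class SketchIronicDriver(object):
    """Hypothetical, heavily simplified driver for illustration only."""

    def __init__(self, list_nodes):
        # list_nodes: callable returning an iterable of node UUIDs, or
        # raising ConnectFailure when the Ironic API is unreachable.
        self._list_nodes = list_nodes

    def get_available_nodes(self):
        try:
            return list(self._list_nodes())
        except ConnectFailure as exc:
            # Log and report no nodes instead of letting the error
            # propagate; the next periodic run will try again.
            LOG.error("Ironic API unavailable, reporting no nodes: %s", exc)
            return []
```

With this shape, a call made while the API is down returns [] and a
later call returns the real node list once the API is back. The sketch
does not consider how the caller treats an empty result.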


[UPDATE]

Apparently we had a similar bug in the past:
https://bugs.launchpad.net/nova/+bug/1430616

** Affects: nova
     Importance: Undecided
     Assignee: Lucas Alvares Gomes (lucasagomes)
         Status: New

** Changed in: nova
     Assignee: (unassigned) => Lucas Alvares Gomes (lucasagomes)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1675732

Title:
  [Ironic] Nova compute will fail to start if it can not talk to the
  Ironic API

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1675732/+subscriptions