yahoo-eng-team team mailing list archive

[Bug 1850291] [NEW] CI jobs fails due to instances in ERROR state

 

Public bug reported:

In various CI jobs, some tests are failing because the instance ends up in ERROR state.
After some checking, it seems to me that the issue is in the scheduler, as I see errors there like:

Oct 27 12:29:31.361318 ubuntu-bionic-ovh-bhs1-0012520618 nova-scheduler[19272]: WARNING nova.context [None req-9e056bb6-787f-49fe-8896-41285d7418b0 tempest-ServersTestManualDisk-1766481780 tempest-ServersTestManualDisk-1766481780] Timed out waiting for response from cell 43118bd8-e32a-4aa4-b93a-37969e41dba6

or

Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: WARNING nova.context [None req-37a8fab9-7d64-4ef4-9464-f70e5ed35d53 tempest-ServersTestJSON-1383648109 tempest-ServersTestJSON-1383648109] Timed out waiting for response from cell: CellTimeout: Timeout waiting for response from cell
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context Traceback (most recent call last):
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context   File "/opt/stack/new/nova/nova/context.py", line 443, in scatter_gather_cells
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context     cell_uuid, result = queue.get()
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context   File "/usr/local/lib/python2.7/dist-packages/eventlet/queue.py", line 322, in get
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context     return waiter.wait()
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context   File "/usr/local/lib/python2.7/dist-packages/eventlet/queue.py", line 141, in wait
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context     return get_hub().switch()
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context   File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 298, in switch
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context     return self.greenlet.switch()
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context CellTimeout: Timeout waiting for response from cell
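
For readers unfamiliar with the pattern, the traceback shows scatter_gather_cells blocked in queue.get() until the wait timed out. Below is a minimal sketch of that general scatter/gather-with-timeout idea, written with the standard library (threading and queue) rather than eventlet. It is not nova's code: scatter_gather, fake_query, DID_NOT_RESPOND and the cell names are all hypothetical and exist only for illustration.

```python
import queue
import threading
import time

# Hypothetical sentinel marking a cell that did not answer in time.
DID_NOT_RESPOND = object()


def scatter_gather(cells, fn, timeout):
    """Run fn(cell) for every cell in parallel and gather the results.

    Cells that have not answered when the shared deadline expires are
    reported with the DID_NOT_RESPOND sentinel instead of a result.
    """
    results_queue = queue.Queue()

    def worker(cell):
        # Each worker pushes (cell, result) once its query finishes.
        results_queue.put((cell, fn(cell)))

    for cell in cells:
        threading.Thread(target=worker, args=(cell,), daemon=True).start()

    results = {}
    deadline = time.monotonic() + timeout
    while len(results) < len(cells):
        remaining = deadline - time.monotonic()
        try:
            if remaining <= 0:
                raise queue.Empty
            # This is the blocking wait that times out in the log above.
            cell, result = results_queue.get(timeout=remaining)
        except queue.Empty:
            # Deadline hit: mark every cell that is still missing.
            for cell in cells:
                results.setdefault(cell, DID_NOT_RESPOND)
            break
        results[cell] = result
    return results


def fake_query(cell):
    # Hypothetical per-cell work; "cell-slow" simulates a slow database.
    if cell == "cell-slow":
        time.sleep(2)
    return "ok"


results = scatter_gather(["cell-fast", "cell-slow"], fake_query, timeout=0.2)
timed_out = [cell for cell, value in results.items() if value is DID_NOT_RESPOND]
print("Timed out waiting for response from:", timed_out)
```

In this sketch a slow cell does not fail the whole call. It is only flagged, and the caller decides what to do with it. If the scheduler treats a flagged cell as having no usable hosts, that could explain instances going to ERROR, but that link is an assumption and not something the logs above confirm.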

Looking at logstash it seems that this happens quite often on various
jobs:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Timed%20out%20waiting%20for%20response%20from%20cell%5C%22
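
The query in that URL is percent-encoded, which makes it hard to read. A small sketch using only Python's urllib.parse shows the search string it carries. The variable names are illustrative.

```python
from urllib.parse import unquote

# Query part copied from the logstash URL above.
fragment = (
    "query=message%3A%5C%22Timed%20out%20waiting%20for%20"
    "response%20from%20cell%5C%22"
)

# unquote() turns %3A into ":", %5C into "\", %22 into '"' and %20 into " ".
decoded = unquote(fragment)
print(decoded)
```

So the search matches log messages containing the phrase "Timed out waiting for response from cell", which is the WARNING line from both excerpts.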

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1850291

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1850291/+subscriptions