← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1478450] [NEW] can not nova reboot vm

 

Public bug reported:

I have 10 vms, run nova reboot the 10 vms. After reboot 285 times,can
not nova reboot again.The detail information:

nova list:
+--------------------------------------+-------------------------------------------+--------+----------------+-------------+--------------------+
| ID                                   | Name                                      | Status | Task State     | Power State | Networks           |
+--------------------------------------+-------------------------------------------+--------+----------------+-------------+--------------------+
| 26d74bba-2e9a-4594-b6a1-1b27f337759b | test-26d74bba-2e9a-4594-b6a1-1b27f337759b | REBOOT | reboot_started | Running     | net01=192.168.0.8  |
| 7d129f4d-0af7-4ea8-9de5-3483cdcb7a68 | test-7d129f4d-0af7-4ea8-9de5-3483cdcb7a68 | REBOOT | reboot_started | Running     | net01=192.168.0.13 |
| 8d2e097b-e478-4a7f-b06d-7b578dfaf7c0 | test-8d2e097b-e478-4a7f-b06d-7b578dfaf7c0 | REBOOT | reboot_started | Running     | net01=192.168.0.9  |
| a2d264f0-c1d0-449c-9ba3-32041247490c | test-a2d264f0-c1d0-449c-9ba3-32041247490c | REBOOT | reboot_started | Running     | net01=192.168.0.12 |
| a3d9448d-25d8-4c01-925c-825a19164970 | test-a3d9448d-25d8-4c01-925c-825a19164970 | REBOOT | reboot_started | Running     | net01=192.168.0.11 |
| a993222f-4e41-483e-8e91-300be0d525a4 | test-a993222f-4e41-483e-8e91-300be0d525a4 | REBOOT | reboot_started | Running     | net01=192.168.0.5  |
| ace0736a-0371-4a3d-8a7f-525e299a924b | test-ace0736a-0371-4a3d-8a7f-525e299a924b | REBOOT | reboot_started | Running     | net01=192.168.0.6  |
| e5515b31-8edb-4558-bb9f-f6f8b1142db2 | test-e5515b31-8edb-4558-bb9f-f6f8b1142db2 | REBOOT | reboot_started | Running     | net01=192.168.0.4  |
| f380346a-bc24-4d63-a6e0-e36b5a508a59 | test-f380346a-bc24-4d63-a6e0-e36b5a508a59 | REBOOT | reboot_started | Running     | net01=192.168.0.10 |
| fc396066-7ec6-4fe9-b7bf-70f162e293fe | test-fc396066-7ec6-4fe9-b7bf-70f162e293fe | REBOOT | reboot_started | Running     | net01=192.168.0.7  |
+--------------------------------------+-------------------------------------------+--------+----------------+-------------+--------------------+

The nova-compute.log:

2015-07-22 14:51:43.808 15940 WARNING nova.openstack.common.deadlock_monitor [-] service function nova.servicegroup.drivers.db._report_state is deadlocked!
2015-07-22 14:51:43.984 15940 WARNING nova.openstack.common.deadlock_monitor [-] deadlock traceback is ['  File "/usr/lib/python2.7/site-packages/nova/openstack/common/loopingcall.py", line 78, in _inner\n    self.f(*self.args, **self.kw)\n', '  File "/usr/lib/python2.7/site-packages/nova/openstack/common/deadlock_monitor.py", line 58, in __deco\n    ret = func(*args, **kwargs)\n', '  File "/usr/lib/python2.7/site-packages/nova/servicegroup/drivers/db.py", line 116, in _report_state\n    service.service_ref, state_catalog)\n', '  File "/usr/lib/python2.7/site-packages/nova/conductor/api.py", line 218, in service_update\n    return self._manager.service_update(context, service, values)\n', '  File "/usr/lib/python2.7/site-packages/nova/conductor/rpcapi.py", line 330, in service_update\n    service=service_p, values=values)\n', '  File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/client.py", line 150, in call\n    wait_for_reply=True, timeout=timeout)\n', '  File "/usr/lib/python2.7/site-packages/oslo/messaging/transport.py", line 90, in _send\n    timeout=timeout)\n', '  File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 412, in send\n    return self._send(target, ctxt, message, wait_for_reply, timeout)\n', '  File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 400, in _send\n    conn.topic_send(topic, msg, timeout=timeout)\n', '  File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqp.py", line 147, in __exit__\n    self._done()\n', '  File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqp.py", line 136, in _done\n    self.connection.reset()\n', '  File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 656, in reset\n    self.channel.close()\n', '  File "/usr/lib/python2.7/site-packages/amqp/channel.py", line 163, in close\n    (20, 41),  # Channel.close_ok\n', '  File "/usr/lib/python2.7/site-packages/amqp/abstract_channel.py", line 69, in wait\n    self.channel_id, allowed_methods)\n', '  File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 204, in _wait_method\n    self.method_reader.read_method()\n', '  File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 189, in read_method\n    self._next_method()\n', '  File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 112, in _next_method\n    frame_type, channel, payload = read_frame()\n', '  File "/usr/lib/python2.7/site-packages/amqp/transport.py", line 147, in read_frame\n    frame_type, channel, size = unpack(\'>BHI\', read(7, True))\n', '  File "/usr/lib/python2.7/site-packages/amqp/transport.py", line 245, in _read\n    s = recv(131072)\n', '  File "/usr/lib/python2.7/site-packages/eventlet/greenio.py", line 264, in recv\n    timeout_exc=socket.timeout("timed out"))\n', '  File "/usr/lib/python2.7/site-packages/eventlet/hubs/__init__.py", line 155, in trampoline\n    return hub.switch()\n', '  File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 187, in switch\n    return self.greenlet.switch()\n']
2015-07-22 14:51:44.007 15940 WARNING nova.openstack.common.deadlock_monitor [-] nova compute exit!

But the openstack-nova-compute status is always active:
[root@slot13 drivers(keystone_admin)]# systemctl status openstack-nova-compute.service 
openstack-nova-compute.service - OpenStack Nova Compute Server
   Loaded: loaded (/usr/lib/systemd/system/openstack-nova-compute.service; enabled)
   Active: active (running) since Tue 2015-07-21 22:18:38 CST; 19h ago
 Main PID: 15940 (nova-compute)
   CGroup: /system.slice/openstack-nova-compute.service
           忖15940 /usr/bin/python /usr/bin/nova-compute

Jul 22 14:44:25 slot13 sudo[22639]: nova : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/nova-rootwrap /etc/nova/rootwrap.conf ovs-vsctl --tim...8af615-3b
Jul 22 14:44:25 slot13 sudo[22642]: nova : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/nova-rootwrap /etc/nova/rootwrap.conf ip link set qbr...1fa-53 up
Jul 22 14:44:25 slot13 sudo[22646]: nova : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/nova-rootwrap /etc/nova/rootwrap.conf ovs-vsctl --tim...48a15b-ae
Jul 22 14:44:25 slot13 sudo[22666]: nova : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/nova-rootwrap /etc/nova/rootwrap.conf ovs-vsctl --tim...2021fa-53

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1478450

Title:
  can not nova reboot vm

Status in OpenStack Compute (nova):
  New

Bug description:
  I have 10 vms, run nova reboot the 10 vms. After reboot 285 times,can
  not nova reboot again.The detail information:

  nova list:
  +--------------------------------------+-------------------------------------------+--------+----------------+-------------+--------------------+
  | ID                                   | Name                                      | Status | Task State     | Power State | Networks           |
  +--------------------------------------+-------------------------------------------+--------+----------------+-------------+--------------------+
  | 26d74bba-2e9a-4594-b6a1-1b27f337759b | test-26d74bba-2e9a-4594-b6a1-1b27f337759b | REBOOT | reboot_started | Running     | net01=192.168.0.8  |
  | 7d129f4d-0af7-4ea8-9de5-3483cdcb7a68 | test-7d129f4d-0af7-4ea8-9de5-3483cdcb7a68 | REBOOT | reboot_started | Running     | net01=192.168.0.13 |
  | 8d2e097b-e478-4a7f-b06d-7b578dfaf7c0 | test-8d2e097b-e478-4a7f-b06d-7b578dfaf7c0 | REBOOT | reboot_started | Running     | net01=192.168.0.9  |
  | a2d264f0-c1d0-449c-9ba3-32041247490c | test-a2d264f0-c1d0-449c-9ba3-32041247490c | REBOOT | reboot_started | Running     | net01=192.168.0.12 |
  | a3d9448d-25d8-4c01-925c-825a19164970 | test-a3d9448d-25d8-4c01-925c-825a19164970 | REBOOT | reboot_started | Running     | net01=192.168.0.11 |
  | a993222f-4e41-483e-8e91-300be0d525a4 | test-a993222f-4e41-483e-8e91-300be0d525a4 | REBOOT | reboot_started | Running     | net01=192.168.0.5  |
  | ace0736a-0371-4a3d-8a7f-525e299a924b | test-ace0736a-0371-4a3d-8a7f-525e299a924b | REBOOT | reboot_started | Running     | net01=192.168.0.6  |
  | e5515b31-8edb-4558-bb9f-f6f8b1142db2 | test-e5515b31-8edb-4558-bb9f-f6f8b1142db2 | REBOOT | reboot_started | Running     | net01=192.168.0.4  |
  | f380346a-bc24-4d63-a6e0-e36b5a508a59 | test-f380346a-bc24-4d63-a6e0-e36b5a508a59 | REBOOT | reboot_started | Running     | net01=192.168.0.10 |
  | fc396066-7ec6-4fe9-b7bf-70f162e293fe | test-fc396066-7ec6-4fe9-b7bf-70f162e293fe | REBOOT | reboot_started | Running     | net01=192.168.0.7  |
  +--------------------------------------+-------------------------------------------+--------+----------------+-------------+--------------------+

  The nova-compute.log:

  2015-07-22 14:51:43.808 15940 WARNING nova.openstack.common.deadlock_monitor [-] service function nova.servicegroup.drivers.db._report_state is deadlocked!
  2015-07-22 14:51:43.984 15940 WARNING nova.openstack.common.deadlock_monitor [-] deadlock traceback is ['  File "/usr/lib/python2.7/site-packages/nova/openstack/common/loopingcall.py", line 78, in _inner\n    self.f(*self.args, **self.kw)\n', '  File "/usr/lib/python2.7/site-packages/nova/openstack/common/deadlock_monitor.py", line 58, in __deco\n    ret = func(*args, **kwargs)\n', '  File "/usr/lib/python2.7/site-packages/nova/servicegroup/drivers/db.py", line 116, in _report_state\n    service.service_ref, state_catalog)\n', '  File "/usr/lib/python2.7/site-packages/nova/conductor/api.py", line 218, in service_update\n    return self._manager.service_update(context, service, values)\n', '  File "/usr/lib/python2.7/site-packages/nova/conductor/rpcapi.py", line 330, in service_update\n    service=service_p, values=values)\n', '  File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/client.py", line 150, in call\n    wait_for_reply=True, timeout=timeout)\n', '  File "/usr/lib/python2.7/site-packages/oslo/messaging/transport.py", line 90, in _send\n    timeout=timeout)\n', '  File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 412, in send\n    return self._send(target, ctxt, message, wait_for_reply, timeout)\n', '  File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 400, in _send\n    conn.topic_send(topic, msg, timeout=timeout)\n', '  File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqp.py", line 147, in __exit__\n    self._done()\n', '  File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqp.py", line 136, in _done\n    self.connection.reset()\n', '  File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 656, in reset\n    self.channel.close()\n', '  File "/usr/lib/python2.7/site-packages/amqp/channel.py", line 163, in close\n    (20, 41),  # Channel.close_ok\n', '  File "/usr/lib/python2.7/site-packages/amqp/abstract_channel.py", line 69, in wait\n    self.channel_id, allowed_methods)\n', '  File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 204, in _wait_method\n    self.method_reader.read_method()\n', '  File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 189, in read_method\n    self._next_method()\n', '  File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 112, in _next_method\n    frame_type, channel, payload = read_frame()\n', '  File "/usr/lib/python2.7/site-packages/amqp/transport.py", line 147, in read_frame\n    frame_type, channel, size = unpack(\'>BHI\', read(7, True))\n', '  File "/usr/lib/python2.7/site-packages/amqp/transport.py", line 245, in _read\n    s = recv(131072)\n', '  File "/usr/lib/python2.7/site-packages/eventlet/greenio.py", line 264, in recv\n    timeout_exc=socket.timeout("timed out"))\n', '  File "/usr/lib/python2.7/site-packages/eventlet/hubs/__init__.py", line 155, in trampoline\n    return hub.switch()\n', '  File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 187, in switch\n    return self.greenlet.switch()\n']
  2015-07-22 14:51:44.007 15940 WARNING nova.openstack.common.deadlock_monitor [-] nova compute exit!

  But the openstack-nova-compute status is always active:
  [root@slot13 drivers(keystone_admin)]# systemctl status openstack-nova-compute.service 
  openstack-nova-compute.service - OpenStack Nova Compute Server
     Loaded: loaded (/usr/lib/systemd/system/openstack-nova-compute.service; enabled)
     Active: active (running) since Tue 2015-07-21 22:18:38 CST; 19h ago
   Main PID: 15940 (nova-compute)
     CGroup: /system.slice/openstack-nova-compute.service
             忖15940 /usr/bin/python /usr/bin/nova-compute

  Jul 22 14:44:25 slot13 sudo[22639]: nova : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/nova-rootwrap /etc/nova/rootwrap.conf ovs-vsctl --tim...8af615-3b
  Jul 22 14:44:25 slot13 sudo[22642]: nova : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/nova-rootwrap /etc/nova/rootwrap.conf ip link set qbr...1fa-53 up
  Jul 22 14:44:25 slot13 sudo[22646]: nova : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/nova-rootwrap /etc/nova/rootwrap.conf ovs-vsctl --tim...48a15b-ae
  Jul 22 14:44:25 slot13 sudo[22666]: nova : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/nova-rootwrap /etc/nova/rootwrap.conf ovs-vsctl --tim...2021fa-53

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1478450/+subscriptions


Follow ups