yahoo-eng-team team mailing list archive
Message #22857
[Bug 1239864] Re: nova-api fails to query ServiceGroup status from Zookeeper
** Changed in: nova
Status: Fix Committed => Fix Released
** Changed in: nova
Milestone: None => juno-rc1
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1239864
Title:
nova-api fails to query ServiceGroup status from Zookeeper
Status in OpenStack Compute (Nova):
Fix Released
Bug description:
I am running with the ZooKeeper servicegroup driver on CentOS 6.4
(Python 2.6) with the RDO distro of Grizzly.
All nova services are successfully connecting to ZooKeeper, which I've
verified using zkCli.
However, when I run `nova service-list` I get an HTTP 500 error from
nova-api. The nova-api log (/var/log/nova/api.log) shows:
2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack   File "/usr/lib/python2.6/site-packages/nova/servicegroup/api.py", line 93, in service_is_up
2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack     return self._driver.is_up(member)
2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack   File "/usr/lib/python2.6/site-packages/nova/servicegroup/drivers/zk.py", line 116, in is_up
2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack     all_members = self.get_all(group_id)
2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack   File "/usr/lib/python2.6/site-packages/nova/servicegroup/drivers/zk.py", line 141, in get_all
2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack     raise exception.ServiceGroupUnavailable(driver="ZooKeeperDriver")
2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack ServiceGroupUnavailable: The service from servicegroup driver ZooKeeperDriver is temporarily unavailable.
The problem appears to be in evzookeeper (I am using version 0.4.0).
To isolate the problem, I added some evzookeeper.ZKSession synchronous
get() calls to test the roundtrip communication to ZooKeeper. When I
do a `self._session.get(CONF.zookeeper.sg_prefix)` in the zk.py
ZooKeeperDriver __init__() method it works fine. The logs show that
this is immediately before the wsgi server starts up.
When I do the get() operation from within the ZooKeeperDriver
get_all() method, the web request hangs indefinitely. However, if I
recreate the evzookeeper.ZKSession within the get_all() method (after
the wsgi server has started) the nova-api request is successful.
diff --git a/nova/servicegroup/drivers/zk.py b/nova/servicegroup/drivers/zk.py
index 2a3edae..7de2488 100644
--- a/nova/servicegroup/drivers/zk.py
+++ b/nova/servicegroup/drivers/zk.py
@@ -122,7 +122,14 @@ class ZooKeeperDriver(api.ServiceGroupDriver):
         monitor = self._monitors.get(group_id, None)
         if monitor is None:
             path = "%s/%s" % (CONF.zookeeper.sg_prefix, group_id)
-            monitor = membership.MembershipMonitor(self._session, path)
+
+            null = open(os.devnull, "w")
+            local_session = evzookeeper.ZKSession(CONF.zookeeper.address,
+                                                  recv_timeout=
+                                                    CONF.zookeeper.recv_timeout,
+                                                  zklog_fd=null)
+
+            monitor = membership.MembershipMonitor(local_session, path)
             self._monitors[group_id] = monitor
             # Note(maoy): When initialized for the first time, it takes a
             # while to retrieve all members from zookeeper. To prevent
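
The diff above moves session creation out of __init__() and into
get_all(), so the session is first created after the wsgi server has
started. The same idea can be sketched as a lazily created, cached
session. This is only an illustration of the pattern, not the fix that
was released: FakeSession is a hypothetical stand-in and does not model
the evzookeeper.ZKSession API, LazySessionDriver is an invented name,
and the "/servicegroups" prefix is an assumed example value for
CONF.zookeeper.sg_prefix. The report does not establish why the early
session stops responding, so the sketch assumes only that a session
created on first use works.

```python
class FakeSession(object):
    """Hypothetical stand-in for a ZooKeeper session (not the real API)."""

    created = 0  # counts how many sessions have been constructed

    def __init__(self, address):
        FakeSession.created += 1
        self.address = address

    def get(self, path):
        # A real session would do a network round trip here.
        return "data-for-%s" % path


class LazySessionDriver(object):
    """Defers session creation until the first request needs it."""

    def __init__(self, address, session_factory=FakeSession):
        self._address = address
        self._session_factory = session_factory
        self._session = None  # nothing is connected at construction time

    def _get_session(self):
        # Create the session on first use, then reuse it for later calls.
        if self._session is None:
            self._session = self._session_factory(self._address)
        return self._session

    def get_all(self, group_id):
        # "/servicegroups" is an assumed prefix used for illustration.
        return self._get_session().get("/servicegroups/%s" % group_id)
```

Constructing LazySessionDriver opens no session; the first get_all()
call creates one and later calls reuse it, which avoids opening a new
session for every group as the diagnostic diff does.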
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1239864/+subscriptions