
yahoo-eng-team team mailing list archive

[Bug 1389782] [NEW] Servicegroups: Multi process nova-conductor is unable to join servicegroups when zk driver is used

 

Public bug reported:

I have found that when nova-conductor is run as a multi-process service
(the default), the child processes share the parent's ZooKeeper handle,
which probably causes a lock inside zookeeper.c. Some internal ZooKeeper
structures, such as sockets, are probably shared across the fork, and
ZooKeeper does not allow this.

See the "Consequences" section below.

There is a similar, complementary bug, though with different effects -
multiple unnecessary registrations and over-use of resources:

https://bugs.launchpad.net/nova/+bug/1382153

How to reproduce:
-----------------

devstack + ubuntu 14.04 + zookeeper 3.4.5

nova.conf:

[DEFAULT]
servicegroup_driver = zk

[conductor]
workers = 2

then run nova-conductor.

We can observe in the logs (with debug=True):

DEBUG evzookeeper.membership [req-xxx None None] Membership._join on
/servicegroups/conductor/somehost

but the expected follow-up line never appears:

DEBUG evzookeeper.membership [req-xxx None None ] created zknode
/servicegroups/conductor/somehost

We can verify that the ZooKeeper conductor node was not created:

/usr/share/zookeeper/bin/zkCli.sh ls /servicegroups

I investigated this and found that the problem lies in the ZooKeeper C
library implementation itself and is not caused by the Python ZooKeeper
bindings (evzookeeper).

Here is a small snippet showing that the program blocks when a ZooKeeper
handle created in the parent is used by a child process (it requires
only a ZooKeeper server and Python):

http://paste.openstack.org/show/129636/ (attached)
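
(The paste is not reproduced here; the following is only a rough sketch
of the same idea, assuming the legacy zkpython bindings - module name
`zookeeper` - rather than the exact code in the attachment. The handle
is created in the parent and then used from the child, where the create
call never completes.)

import os
import time
import zookeeper

ACL = [{"perms": zookeeper.PERM_ALL, "scheme": "world", "id": "anyone"}]

# Connect in the parent process, before forking, and wait for the session.
handle = zookeeper.init("127.0.0.1:2181")
while zookeeper.state(handle) != zookeeper.CONNECTED_STATE:
    time.sleep(0.1)

pid = os.fork()
if pid == 0:
    # Child: reuse the handle inherited from the parent. The request never
    # reaches the server, so this call blocks indefinitely.
    zookeeper.create(handle, "/test-child", "", ACL, zookeeper.EPHEMERAL)
    print("child: created /test-child")  # never reached
    os._exit(0)
else:
    os.waitpid(pid, 0)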

We can check the zookeeper-server logs and observe that the client's
create request is never sent to the server at all.

I tried to dig into the internals of zookeeper.c but couldn't find a
clue as to why it isn't working.

From the point of view of evzookeeper (the zk driver), the callback is
never called, so the green thread waits indefinitely for a response.

Consequences
------------

Nova-conductor itself keeps working (because communication with
ZooKeeper happens in a background green thread), but:

a) the ZooKeeper namespace /servicegroups/conductor isn't created (if the namespace didn't exist before)
b) the ephemeral node for conductors in the namespace isn't created (if the namespace already exists)

The effects from the perspective of the OpenStack cluster are:

* Effect a) causes internal exceptions in the nova-api service, so 'novaclient service-list' and Horizon's "System Information"/"Compute services" page don't work because of
  the exceptions 'NoNodeException: no node' followed by 'ServiceGroupUnavailable: The service from servicegroup driver ZooKeeperDriver is temporarily unavailable.'
  So it isn't possible to list any working services, only because the namespace for conductors was never prepared (in reality all services, and ZooKeeper itself, are working).

  Additionally, it causes an internal 500 TemplateSyntaxError in Horizon
when trying to list all hypervisors at /admin/hypervisors/.

* Effect b) causes service-list or "System Information" to give a false
negative: it shows the service as down when in reality the service is
working.

AFAIK only nova-conductor is affected by this for now, because it is the only nova service that passes the `workers` argument to openstack.common.service.launch(server, workers) and that is based on service.Service (not WSGIService).
If workers > 1, the `launch` function starts the service via ProcessLauncher, which is responsible for forking. The problem is that the service object is created, with its zk driver object already initialized, in the parent process.
The zk driver object therefore holds a connection (handle) that is shared by the child processes. Then, in Service.start (inside the fork), the attempt to join the servicegroup never completes.

I checked how sharing a common resource (a socket) affects the other
drivers. It's not a problem for the memcache or db driver, because the
connection to memcache/db is created lazily (the connection/socket isn't
created until it is required by the child process).

Possible solutions:
1. Simple but not clean: initialize the ZooKeeper driver lazily (like db/memcache), so that each process creates its own handle to ZooKeeper, ignoring the problem that each process then tries to create the same node in ZooKeeper (a minimal sketch of this approach follows below).
2. Refactor the base nova.service.Service so that only the parent process is responsible for joining the servicegroups - this requires a lot of work and maybe even a blueprint.
3. Based on the first solution, but with the difference that the parent process registers the parent node (host) and each subprocess registers a subnode (pid), for example /servicegroups/conductor/HOST/PID - then get_all should check not whether the HOST node exists but whether it is empty.
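
A minimal sketch of what solution 1 could look like. This is
hypothetical illustration code, not the actual nova zk driver;
`_connect()` is only a placeholder for whatever really opens the
ZooKeeper session (evzookeeper in the real driver):

import os

class LazyZKDriver(object):
    # Hypothetical driver that opens its ZooKeeper handle lazily, once
    # per process, instead of in the parent before ProcessLauncher forks.

    def __init__(self, hosts):
        self._hosts = hosts
        self._handle = None
        self._owner_pid = None

    def _get_handle(self):
        # Create the handle on first use, and recreate it if we are running
        # in a different process than the one that created it (i.e. after a
        # fork).
        if self._handle is None or self._owner_pid != os.getpid():
            self._handle = self._connect(self._hosts)
            self._owner_pid = os.getpid()
        return self._handle

    def _connect(self, hosts):
        raise NotImplementedError("placeholder for the real ZK connection setup")

    def join(self, member_id, group_id):
        # Each process gets its own handle here, so the create call is never
        # issued on a handle inherited across fork().
        handle = self._get_handle()
        path = "/servicegroups/%s/%s" % (group_id, member_id)
        return handle, path  # the real driver would create an ephemeral znode at `path`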

The problem with ZooKeeper and forking isn't new for OpenStack:

http://qnalist.com/questions/27169/how-to-deal-with-fork-properly-when-using-the-zkc-mt-lib

but the right solution wasn't found there.

** Affects: nova
     Importance: Undecided
     Assignee: Pawel Palucki (pawel-palucki-q)
         Status: New


** Tags: conductor zookeeper

** Attachment added: "Example how zookeeper is blocking access from multiple processes with the same handle"
   https://bugs.launchpad.net/bugs/1389782/+attachment/4253990/+files/deadlock_zk.py

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1389782

Title:
  Servicegroups: Multi process nova-conductor is unable to join
  servicegroups when zk driver is used

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1389782/+subscriptions
