yahoo-eng-team team mailing list archive
Message #93899
[Bug 2063451] [NEW] Using the number of CPUs as a default for the workers leads to problems with big setups
Public bug reported:
Nova uses the CPU count as a default for the following worker settings,
which is problematic for people deploying on machines with a large
number of CPUs:
[DEFAULT]
osapi_compute_workers=
metadata_workers=
[conductor]
workers=
[scheduler]
workers=
In our case, on a setup with more than 100 CPUs, the huge number of
workers led to a lot of traffic to the cell1 database (MariaDB Galera)
for an otherwise empty OpenStack cluster, which in turn quickly filled
the database filesystem because of the growing MariaDB binlog. These
problems disappeared as soon as we explicitly configured the workers
for nova-scheduler and nova-conductor with a count of 8 each (we also
lowered the other workers for the sake of consistency).
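To give a feel for the scale, here is a rough sketch of the arithmetic.
It assumes that each of the four settings listed above falls back to the
CPU count when unset and that a single host runs all four services; the
128-CPU host and the helper name total_workers are made up for
illustration and are not taken from nova.

```python
# The four worker settings named in this report, as (section, option).
WORKER_SETTINGS = [
    ("DEFAULT", "osapi_compute_workers"),
    ("DEFAULT", "metadata_workers"),
    ("conductor", "workers"),
    ("scheduler", "workers"),
]


def total_workers(cpu_count, overrides=None):
    """Sum the worker counts for one host (hypothetical helper).

    Any setting missing from `overrides` is assumed to default to
    `cpu_count`, as described in this report.
    """
    overrides = overrides or {}
    return sum(overrides.get(key, cpu_count) for key in WORKER_SETTINGS)


# Hypothetical 128-CPU host with everything left at the default.
unset = total_workers(128)

# Same host with only conductor and scheduler pinned to 8.
pinned = total_workers(
    128,
    {("conductor", "workers"): 8, ("scheduler", "workers"): 8},
)
```

Under these assumptions, the unset case adds up to four times the CPU
count, which is why the effect only becomes visible on large machines.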
I suggest that nova should apply an upper limit to the default. I
couldn't find guidelines for the worker counts in the nova docs.
However, judging by other OpenStack projects, there seems to be some
kind of consensus on using a worker count well below 20:
* Kolla Ansible sets a maximum of 5 workers [1]
* puppet-openstacklib sets a maximum of 12 workers [2]
[1] https://github.com/openstack/kolla-ansible/blob/5a663aec1dc6ede45a860eecab84af05cd06b67f/ansible/group_vars/all.yml#L742
[2] https://github.com/openstack/puppet-openstacklib/blob/master/lib/facter/os_workers.rb#L45
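One possible shape for such a limit is sketched below. This is not
nova's code; the function name default_workers and the cap of 8 are
assumptions (8 is simply the value that worked in our setup), and the
right cap would be up to the nova maintainers.

```python
import os


def default_workers(cap=8, cpu_count=None):
    """Return the CPU count, limited to `cap` (hypothetical helper).

    `cap` is an assumed value, not an existing nova option. When
    `cpu_count` is not given, it is read from the running system.
    """
    if cpu_count is None:
        # os.cpu_count() may return None if the count is unknown.
        cpu_count = os.cpu_count() or 1
    # Always start at least one worker, never more than the cap.
    return max(1, min(cpu_count, cap))
```

With this sketch, small hosts keep their current behaviour (one worker
per CPU), while large hosts stop at the cap unless the operator sets the
worker count explicitly.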
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2063451