yahoo-eng-team team mailing list archive
Message #70317
[Bug 1742747] [NEW] RT overrides default allocation_ratios for ram cpu and disk
Public bug reported:
Description
===========
The resource tracker (RT) overrides the default allocation ratio values with
values from the configuration files without checking whether those values are
"valid ones".
Allocation ratio values are taken directly from the configuration files. This is a good approach unless the allocation ratios in the configuration file are set to 0.0, and that is where the problem starts. The default for these configuration options is 0.0:
https://github.com/openstack/nova/blob/master/nova/conf/compute.py#L397
So if an allocation ratio is set to 0.0 (or not set at all, because 0.0 is the default value), we run into trouble when the RT update sends this ratio to placement.
*BUT here comes the solution*:
https://github.com/openstack/nova/blob/master/nova/objects/compute_node.py#L198
When we read a ComputeNode object from the DB, we also check whether the ratios
are 0.0; if they are, we override them (CPU-16x, RAM-1.5x, DISK-1x).
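The fallback described above can be sketched roughly as follows. This is only an illustration, not the actual nova code: the names effective_ratio and DEFAULT_RATIOS are hypothetical, and only the default values (16.0, 1.5, 1.0) are taken from the description above.

```python
# Hypothetical stand-in for the ComputeNode fallback; names are illustrative.
DEFAULT_RATIOS = {
    "cpu_allocation_ratio": 16.0,
    "ram_allocation_ratio": 1.5,
    "disk_allocation_ratio": 1.0,
}


def effective_ratio(name, stored_value):
    """Return stored_value, or the built-in default if it is 0.0 or None."""
    if stored_value is None or stored_value == 0.0:
        return DEFAULT_RATIOS[name]
    return stored_value
```

With this kind of fallback, a ratio of 0.0 read from the DB never reaches the caller; an explicitly set value such as 4.0 is returned unchanged.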
But just after initialization of ComputeNode object here:
https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py?utf8=✓#L539
We copy the actual resources to it (via _copy_resources).
This overrides the allocation ratios on the ComputeNode with the ones taken
from the configuration file. That part is fine: if an operator wants to change
the ratios, they change them in the conf file and restart the service.
But what if the operator leaves those parameters untouched in the config? Here comes the problem!
Those params would then always be set to 0.0. The placement API does not accept that and raises:
InvalidInventoryCapacity: Invalid inventory for 'VCPU' on resource provider '52559824-5fb1-424b-a4cf-79da9199447d'. The reserved value is greater than or equal to total.
The exception is raised here:
https://github.com/openstack/nova/blob/master/nova/objects/resource_provider.py#L228
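To see why a ratio of 0.0 ends up rejected, here is a minimal sketch of the arithmetic. It assumes that placement derives the usable capacity as (total - reserved) * allocation_ratio and rejects inventories whose capacity is not positive; the function names and the use of ValueError are hypothetical and stand in for the real check and for InvalidInventoryCapacity.

```python
def capacity(total, reserved, allocation_ratio):
    # Assumed formula: usable capacity scales with the allocation ratio.
    return int((total - reserved) * allocation_ratio)


def check_inventory(total, reserved, allocation_ratio):
    """Raise ValueError when the inventory would have no usable capacity."""
    if capacity(total, reserved, allocation_ratio) <= 0:
        # Stand-in for InvalidInventoryCapacity in this sketch.
        raise ValueError("invalid inventory: capacity is not positive")
```

Under that assumption, a host with 8 VCPUs and nothing reserved has a capacity of 128 at a ratio of 16.0, but a capacity of 0 at a ratio of 0.0. That would also explain why the error message talks about the reserved value even though the real culprit is the ratio.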
Some code around the problem (pdb session):
> /opt/stack/nova/nova/compute/resource_tracker.py(610)
602      def _copy_resources(self, compute_node, resources):
603          """Copy resource values to supplied compute_node."""
604          # purge old stats and init with anything passed in by the driver
605          self.stats.clear()
606          self.stats.digest_stats(resources.get('stats'))
607          compute_node.stats = copy.deepcopy(self.stats)
608
609          # update the allocation ratios for the related ComputeNode object
610  ->      compute_node.ram_allocation_ratio = self.ram_allocation_ratio
611          compute_node.cpu_allocation_ratio = self.cpu_allocation_ratio
612          compute_node.disk_allocation_ratio = self.disk_allocation_ratio
613
614          # now copy rest to compute_node
615          compute_node.update_from_virt_driver(resources)
(Pdb++) self.cpu_allocation_ratio
0.0
self.cpu_allocation_ratio comes directly from config:
https://github.com/openstack/nova/blob/master/nova/conf/compute.py#L397
https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L148
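For illustration, the missing validity check could look something like the sketch below. This is not the actual fix and not nova code: copy_ratios and RATIO_FIELDS are hypothetical names, and SimpleNamespace objects stand in for the resource tracker and the ComputeNode. The only idea shown is to skip the copy when the configured value is 0.0, so the defaults already on the ComputeNode survive.

```python
from types import SimpleNamespace

RATIO_FIELDS = (
    "ram_allocation_ratio",
    "cpu_allocation_ratio",
    "disk_allocation_ratio",
)


def copy_ratios(tracker, compute_node):
    """Copy only explicitly configured (non-zero) ratios onto compute_node."""
    for field in RATIO_FIELDS:
        configured = getattr(tracker, field)
        # 0.0 or None is treated as "not configured": keep the existing value.
        if configured:
            setattr(compute_node, field, configured)


# Usage: only the disk ratio was set by the operator.
tracker = SimpleNamespace(ram_allocation_ratio=0.0,
                          cpu_allocation_ratio=0.0,
                          disk_allocation_ratio=2.0)
node = SimpleNamespace(ram_allocation_ratio=1.5,
                       cpu_allocation_ratio=16.0,
                       disk_allocation_ratio=1.0)
copy_ratios(tracker, node)
```

After the call, node keeps 1.5 and 16.0 for RAM and CPU and picks up 2.0 for disk.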
Environment
===========
Latest master
How to reproduce
===========
1. Spawn devstack
2. Leave configuration files untouched
3. Observe overrides in
https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py?utf8=✓#L611
4. Watch the RT send the ratios to placement and placement respond with 400 (Bad Request).
** Affects: nova
Importance: Undecided
Assignee: Maciej Jozefczyk (maciej.jozefczyk)
Status: New
** Changed in: nova
Assignee: (unassigned) => Maciej Jozefczyk (maciej.jozefczyk)
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1742747
Title:
RT overrides default allocation_ratios for ram cpu and disk
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1742747/+subscriptions