yahoo-eng-team team mailing list archive

[Bug 1790204] [NEW] Allocations are "doubled up" on same host resize even though there is only 1 server on the host

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date: Fri, 31 Aug 2018 19:43:15 -0000
Reply-to: Bug 1790204 <1790204@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx
Public bug reported:

This is a long-standing known issue dating back to at least Pike, when
the nova FilterScheduler started using placement to create allocations
during server create and move (e.g. resize) operations.

In Pike, a resize to the same host resulted in allocations against the
compute node provider in placement for both the old and new flavors,
and both were tied to the instance as the resource consumer.

Move operations and allocation handling were improved in Queens with
this blueprint:

https://specs.openstack.org/openstack/nova-specs/specs/queens/implemented/migration-allocations.html

With that change, the source node allocations are moved to the migration
record as the consumer, and the target node allocations are held against
the instance record as the consumer.
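
As a rough illustration of that consumer model (not nova or placement
code), the sketch below keeps allocations in a plain dict keyed by
consumer and sums them per provider. The provider and server UUIDs are
the ones from the recreate in this report; the migration UUID, the dict
layout and the provider_usage() helper are assumptions made up for the
example.

```python
from collections import Counter

# Provider and server UUIDs come from the recreate in this report.
# The migration UUID is a made-up placeholder.
PROVIDER = "e2bc5091-b7fd-4d18-80a8-aeecb87b0fd0"
INSTANCE = "d7d743d8-7561-4c9c-a7bf-e9fe1e89dea1"
MIGRATION = "hypothetical-migration-uuid"

# consumer -> provider -> {resource class: amount}
allocations = {
    # Old flavor (m1.tiny), held by the migration record during the resize.
    MIGRATION: {PROVIDER: {"VCPU": 1, "MEMORY_MB": 512, "DISK_GB": 1}},
    # New flavor (m1.small), held by the instance.
    INSTANCE: {PROVIDER: {"VCPU": 1, "MEMORY_MB": 2048, "DISK_GB": 20}},
}


def provider_usage(allocations, provider):
    """Sum every consumer's allocation against one provider."""
    usage = Counter()
    for per_provider in allocations.values():
        usage.update(per_provider.get(provider, {}))
    return dict(usage)


print(provider_usage(allocations, PROVIDER))
```

On a resize between two different hosts the two consumers point at two
different providers, so neither provider is over-counted. On a same-host
resize both consumers point at the same provider, and the sum counts the
server twice.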

That is also true of a resize to the same host. However, the compute
node resource provider usage is still effectively "doubled up" during
the resize, because it shows the usage of two flavors in total when
only one is really being used.

The reported resource usage on the compute node provider during a same
host resize should be the *maximum* of the old and new flavors per
resource class, not their sum.
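
A minimal sketch of that expected calculation, assuming each flavor is
reduced to a dict of resource class amounts. The expected_usage() helper
is hypothetical and only illustrates the arithmetic; it is not how nova
or placement computes usage today.

```python
def expected_usage(old_flavor, new_flavor):
    """Per-resource-class maximum of two flavors on the same host."""
    classes = old_flavor.keys() | new_flavor.keys()
    return {
        rc: max(old_flavor.get(rc, 0), new_flavor.get(rc, 0))
        for rc in classes
    }


# Flavor values as listed by "openstack flavor list" in this report.
m1_tiny = {"VCPU": 1, "MEMORY_MB": 512, "DISK_GB": 1}
m1_small = {"VCPU": 1, "MEMORY_MB": 2048, "DISK_GB": 20}

print(expected_usage(m1_tiny, m1_small))
```

Taking the maximum rather than only the new flavor also covers a resize
down to a smaller flavor, where the old flavor is the larger of the two
until the resize is confirmed.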

Here is a simple recreate with devstack (created from master today):

1. we start with no resource usage on the single node provider

stack@stein:~$ openstack resource provider usage show e2bc5091-b7fd-4d18-80a8-aeecb87b0fd0
+----------------+-------+
| resource_class | usage |
+----------------+-------+
| VCPU           |     0 |
| MEMORY_MB      |     0 |
| DISK_GB        |     0 |
+----------------+-------+

2. create a server and show there is usage:

stack@stein:~$ openstack flavor list
+----+-----------+-------+------+-----------+-------+-----------+
| ID | Name      |   RAM | Disk | Ephemeral | VCPUs | Is Public |
+----+-----------+-------+------+-----------+-------+-----------+
| 1  | m1.tiny   |   512 |    1 |         0 |     1 | True      |
| 2  | m1.small  |  2048 |   20 |         0 |     1 | True      |
| 3  | m1.medium |  4096 |   40 |         0 |     2 | True      |
| 4  | m1.large  |  8192 |   80 |         0 |     4 | True      |
| 5  | m1.xlarge | 16384 |  160 |         0 |     8 | True      |
| c1 | cirros256 |   256 |    0 |         0 |     1 | True      |
| d1 | ds512M    |   512 |    5 |         0 |     1 | True      |
| d2 | ds1G      |  1024 |   10 |         0 |     1 | True      |
| d3 | ds2G      |  2048 |   10 |         0 |     2 | True      |
| d4 | ds4G      |  4096 |   20 |         0 |     4 | True      |
+----+-----------+-------+------+-----------+-------+-----------+


stack@stein:~$ openstack server create --flavor m1.tiny --image cirros-0.3.5-x86_64-disk resize-same-host

stack@stein:~$ openstack resource provider usage show e2bc5091-b7fd-4d18-80a8-aeecb87b0fd0
+----------------+-------+
| resource_class | usage |
+----------------+-------+
| VCPU           |     1 |
| MEMORY_MB      |   512 |
| DISK_GB        |     1 |
+----------------+-------+

3. resize the server and check usage:

stack@stein:~$ openstack server resize resize-same-host --flavor m1.small
stack@stein:~$ openstack server list
+--------------------------------------+------------------+---------------+--------------------------------------------------------+--------------------------+----------+
| ID                                   | Name             | Status        | Networks                                               | Image                    | Flavor   |
+--------------------------------------+------------------+---------------+--------------------------------------------------------+--------------------------+----------+
| d7d743d8-7561-4c9c-a7bf-e9fe1e89dea1 | resize-same-host | VERIFY_RESIZE | private=fdde:1239:d41d:0:f816:3eff:fe1f:a19, 10.0.0.13 | cirros-0.3.5-x86_64-disk | m1.small |
+--------------------------------------+------------------+---------------+--------------------------------------------------------+--------------------------+----------+
stack@stein:~$ openstack resource provider usage show e2bc5091-b7fd-4d18-80a8-aeecb87b0fd0
+----------------+-------+
| resource_class | usage |
+----------------+-------+
| VCPU           |     2 |
| MEMORY_MB      |  2560 |
| DISK_GB        |    21 |
+----------------+-------+


Here we see that the old and new flavor usages are cumulative on the
single node provider: VCPU 1 + 1 = 2, MEMORY_MB 512 + 2048 = 2560 and
DISK_GB 1 + 20 = 21.

4. confirm the resize; the usage is then just that of the new m1.small flavor:

stack@stein:~$ openstack server resize resize-same-host --confirm
stack@stein:~$ openstack server list
+--------------------------------------+------------------+--------+--------------------------------------------------------+--------------------------+----------+
| ID                                   | Name             | Status | Networks                                               | Image                    | Flavor   |
+--------------------------------------+------------------+--------+--------------------------------------------------------+--------------------------+----------+
| d7d743d8-7561-4c9c-a7bf-e9fe1e89dea1 | resize-same-host | ACTIVE | private=fdde:1239:d41d:0:f816:3eff:fe1f:a19, 10.0.0.13 | cirros-0.3.5-x86_64-disk | m1.small |
+--------------------------------------+------------------+--------+--------------------------------------------------------+--------------------------+----------+
stack@stein:~$ openstack resource provider usage show e2bc5091-b7fd-4d18-80a8-aeecb87b0fd0
+----------------+-------+
| resource_class | usage |
+----------------+-------+
| VCPU           |     1 |
| MEMORY_MB      |  2048 |
| DISK_GB        |    20 |
+----------------+-------+
stack@stein:~$ 


===

Same-host resize is disabled by default but can be important in at least
two cases:

1. Servers in an affinity (same-host) group cannot resize if they are
not allowed to resize on the same host.

2. "Edge" deployment scenarios with only 1 or 2 compute hosts, where
being able to resize on the same host is critical. Probably even more
critical in those scenarios is not reporting resource usage that is not
really there, since it could cause scheduling failures for requests
that would otherwise have fit on that host.
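
To make the scheduling concern concrete, here is a small sketch with a
hypothetical edge host. The capacity numbers are invented for the
example, and the fits() check ignores reserved amounts and allocation
ratios, so it is a simplification of what placement actually evaluates.
The two usage dicts are the doubled-up values from step 3 above and the
per-class maximum of m1.tiny and m1.small.

```python
def fits(capacity, usage, request):
    """True if the request fits in what is left of every resource class."""
    return all(
        usage.get(rc, 0) + amount <= capacity.get(rc, 0)
        for rc, amount in request.items()
    )


# Hypothetical small edge host (assumed values, not from this report).
capacity = {"VCPU": 4, "MEMORY_MB": 4096, "DISK_GB": 40}

# Usage reported during the same-host resize (step 3 of the recreate).
reported = {"VCPU": 2, "MEMORY_MB": 2560, "DISK_GB": 21}
# Usage if only the maximum of the two flavors were counted.
actual = {"VCPU": 1, "MEMORY_MB": 2048, "DISK_GB": 20}

# A second m1.small server requested while the resize is unconfirmed.
request = {"VCPU": 1, "MEMORY_MB": 2048, "DISK_GB": 20}

print("fits with reported usage:", fits(capacity, reported, request))
print("fits with actual usage:  ", fits(capacity, actual, request))
```

With these assumed numbers the request is rejected against the reported
usage (2560 + 2048 MB exceeds 4096 MB) but would fit against the usage
the server really has.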

** Affects: nova
     Importance: Medium
         Status: Triaged


** Tags: placement resize

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1790204

