yahoo-eng-team team mailing list archive

[Bug 1847367] [NEW] Images with hw:vif_multiqueue_enabled can be limited to 8 queues even if more are supported

 

Public bug reported:

Nova version: 18.2.3
Release: Rocky
Compute node OS: CentOS 7.3
Compute node kernel: 3.10.0-327.13.1.el7.x86_64

The fix for https://bugs.launchpad.net/nova/+bug/1570631 (commit
https://review.opendev.org/#/c/332660/) assumes that the kernel version
alone dictates the max number of queues on the tap interface when
hw:vif_multiqueue_enabled=True is set. It treats every 3.x kernel as
having a max queue count of 8. Unfortunately not all distributions
follow this: CentOS/RHEL has supported up to 256 queues since at least
7.2, even with a 3.x kernel.

As a result, a 20-core VM created in Mitaka gets 20 queues (the limit
of 8 had not yet been added). On the very same host after an upgrade to
Rocky, the VM gets only 8 queues even though the kernel supports 256.
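
To make the behaviour concrete, here is a minimal sketch of a
version-based cap like the one described above. It is not the actual
Nova code: the function names are hypothetical, and the branches other
than the 3.x rule are assumptions added for illustration.

```python
from typing import Optional


def max_tap_queues_for_kernel(release: str) -> Optional[int]:
    """Guess the tap queue limit from the kernel major version alone."""
    major = int(release.split(".")[0])
    if major <= 2:
        return 1      # assumption: no multiqueue tap support
    if major == 3:
        return 8      # the 3.x rule described in this report
    if major == 4:
        return 256    # assumption: newer kernels allow 256
    return None       # assumption: unknown version, apply no cap


def effective_queues(vcpus: int, release: str) -> int:
    """Queues given to the guest: one per vCPU, clamped to the guessed cap."""
    cap = max_tap_queues_for_kernel(release)
    return vcpus if cap is None else min(vcpus, cap)


# The compute node from this report: a 3.x version string, so the
# heuristic clamps a 20 vCPU guest to 8 queues regardless of what the
# distribution kernel really supports.
print(effective_queues(20, "3.10.0-327.13.1.el7.x86_64"))
```

The version string only identifies the upstream base, so a distribution
kernel with a backported higher limit is indistinguishable from a stock
3.x kernel in this check.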

Could a workaround option be implemented to disable this check, or
manually define the max queue count?
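
One possible shape for such a workaround, as a rough sketch only: an
operator-supplied maximum that takes precedence over the version-based
guess. The "override" parameter and both function names are
hypothetical and do not correspond to any existing Nova option.

```python
from typing import Optional


def kernel_based_cap(release: str) -> Optional[int]:
    """Only the 3.x rule from this report; other versions are left uncapped."""
    major = int(release.split(".")[0])
    return 8 if major == 3 else None


def resolve_queue_count(vcpus: int, release: str,
                        override: Optional[int] = None) -> int:
    """Pick the queue count for a guest.

    override is a hypothetical operator-set maximum; None keeps the
    version-based heuristic.
    """
    cap = override if override is not None else kernel_based_cap(release)
    return vcpus if cap is None else min(vcpus, cap)


release = "3.10.0-327.13.1.el7.x86_64"
print(resolve_queue_count(20, release))                # heuristic only
print(resolve_queue_count(20, release, override=256))  # operator knows better
```

With the override set to 256, the 20-core VM would again get 20 queues
on this host, matching the Mitaka behaviour.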

Snippet of drivers/net/tun.c from CentOS 7.2 kernel source code:
/* MAX_TAP_QUEUES 256 is chosen to allow rx/tx queues to be equal
 * to max number of VCPUs in guest. */
#define MAX_TAP_QUEUES 256
#define MAX_TAP_FLOWS  4096

Snippet from the 3.10.0 kernel code from https://elixir.bootlin.com/linux/v3.10/source/drivers/net/tun.c:
/* DEFAULT_MAX_NUM_RSS_QUEUES were choosed to let the rx/tx queues allocated for
 * the netdevice to be fit in one page. So we can make sure the success of
 * memory allocation. TODO: increase the limit. */
#define MAX_TAP_QUEUES DEFAULT_MAX_NUM_RSS_QUEUES
#define MAX_TAP_FLOWS  4096

In the upstream 3.10 snippet above, DEFAULT_MAX_NUM_RSS_QUEUES is set to 8.

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1847367

Title:
  Images with hw:vif_multiqueue_enabled can be limited to 8 queues even
  if more are supported

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1847367/+subscriptions

