
yahoo-eng-team team mailing list archive

[Bug 1420662] [NEW] VMware: InstanceList.get_by_host raise rpc timeout error

 

Public bug reported:

I deployed OpenStack with the VMware driver. A single nova-compute
service connects to the VMware deployment, which contains about 3000
VMs. The database is MySQL.

InstanceList.get_by_host raises an RPC timeout error when
ComputeManager.init_host() and the _sync_power_states periodic task
execute.

This looks like a performance issue. Currently, in the nova VMware
driver, one nova-compute host maps to the whole VMware deployment, which
may contain several clusters. When InstanceList.get_by_host executes in
ComputeManager, nova-compute makes an RPC call to nova-conductor, and
nova-conductor fetches every instance in the whole VMware deployment at
once; in my case, that is 3000 instances. The long-running SQL query may
cause the nova-conductor RPC call to time out.
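
One possible direction, sketched here only as an illustration and not as
the actual nova code, is to fetch the instances in bounded pages rather
than in a single call, so that no individual RPC round trip has to wait
for the full result set. Every name below (iter_instances_by_host,
fetch_page, make_fake_fetch, the "uuid" and "host" keys) is hypothetical;
fetch_page stands in for whatever conductor call would accept a limit and
a marker, and the in-memory fake exists only to make the sketch runnable.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Hypothetical record type; a real deployment would use Instance objects.
Instance = Dict[str, str]
FetchPage = Callable[[str, int, Optional[str]], List[Instance]]


def iter_instances_by_host(
    fetch_page: FetchPage, host: str, page_size: int = 500
) -> Iterator[Instance]:
    """Yield every instance on `host`, requesting at most `page_size` per call.

    `fetch_page(host, limit, marker)` is assumed to return instances sorted
    by uuid, starting strictly after `marker` (or from the start if None).
    """
    marker: Optional[str] = None
    while True:
        page = fetch_page(host, page_size, marker)
        if not page:
            return
        yield from page
        if len(page) < page_size:
            # A short page means there is nothing left to fetch.
            return
        marker = page[-1]["uuid"]


def make_fake_fetch(instances: List[Instance]) -> Tuple[FetchPage, List[int]]:
    """Build an in-memory stand-in for the paged call (illustration only).

    Returns the fetch function and a list that records the size of each
    page it served.
    """
    calls: List[int] = []

    def fetch_page(host: str, limit: int, marker: Optional[str]) -> List[Instance]:
        rows = sorted(
            (i for i in instances if i["host"] == host),
            key=lambda i: i["uuid"],
        )
        if marker is not None:
            rows = [i for i in rows if i["uuid"] > marker]
        page = rows[:limit]
        calls.append(len(page))
        return page

    return fetch_page, calls


if __name__ == "__main__":
    # 3000 instances on one host, as in the deployment described above.
    data = [{"uuid": "uuid-%05d" % n, "host": "vmware-compute"} for n in range(3000)]
    fetch, calls = make_fake_fetch(data)
    total = sum(1 for _ in iter_instances_by_host(fetch, "vmware-compute"))
    print(total, "instances fetched in", len(calls), "calls")
```

With pages of 500, each call returns a bounded amount of data, at the
cost of more round trips. Whether this would actually avoid the timeout
depends on where the time is spent, which has not been measured here.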

PS:
vSphere 5.1 allows 100 hosts and 3000 powered-on VMs.
vSphere 6 allows 1000 hosts and 10,000 powered-on VMs.

** Affects: nova
     Importance: Undecided
     Assignee: Rui Chen (kiwik-chenrui)
         Status: New


** Tags: vmware

** Changed in: nova
     Assignee: (unassigned) => Rui Chen (kiwik-chenrui)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1420662

Title:
  VMware: InstanceList.get_by_host raise rpc timeout error

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1420662/+subscriptions

