yahoo-eng-team team mailing list archive

[Bug 1292583] [NEW] VMware: nova nodes crash on 503 errors, control of vSphere infrastructure completely lost

 

Public bug reported:

Related to https://bugs.launchpad.net/nova/+bug/1262288: when the Nova
node shuts down, it leaves its vSphere SOAP session open. This session
remains open, blocking connections by administrators and by OpenStack.

The result is that administrative control over the entire vSphere,
vCenter, and ESX infrastructure is temporarily lost.
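
For illustration, here is a minimal pyVmomi sketch of the kind of cleanup that is missing: explicitly closing the vSphere SOAP session when the process exits, rather than leaving it open on the vCenter side. This is not the nova driver's code; the host and credentials are placeholders.

import atexit

from pyVim.connect import SmartConnect, Disconnect

# Open a vSphere SOAP session (placeholder host and credentials).
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="secret")

# Log the session out when the interpreter shuts down, so it is not
# left open on the server and counted against vCenter's session limits.
atexit.register(Disconnect, si)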

Python CLI users experience:
pyVmomi.VmomiSupport.HostConnectFault: (vim.fault.HostConnectFault) {
   dynamicType = <unset>,
   dynamicProperty = (vmodl.DynamicProperty) [],
   msg = '503 Service Unavailable',
   faultCause = <unset>,
   faultMessage = (vmodl.LocalizableMessage) []
}


VpxClient users experience:

Call "ServiceInstance.RetrieveContent" for object "ServiceInstance" on
Server "<server_name>" failed.

OpenStack users experience:
2014-03-14 08:55:06.579 ERROR suds.client [-] <?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:ns0="urn:vim25" xmlns:ns1="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
   <ns1:Body>
      <ns0:RetrieveServiceContent>
         <ns0:_this type="ServiceInstance">ServiceInstance</ns0:_this>
      </ns0:RetrieveServiceContent>
   </ns1:Body>
</SOAP-ENV:Envelope>
2014-03-14 08:55:06.581 CRITICAL nova.virt.vmwareapi.driver [-] Unable to connect to server at 192.168.2.36, sleeping for 2 seconds
2014-03-14 08:55:06.581 TRACE nova.virt.vmwareapi.driver Traceback (most recent call last):
2014-03-14 08:55:06.581 TRACE nova.virt.vmwareapi.driver   File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 753, in _create_session
2014-03-14 08:55:06.581 TRACE nova.virt.vmwareapi.driver     self.vim = self._get_vim_object()
2014-03-14 08:55:06.581 TRACE nova.virt.vmwareapi.driver   File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 742, in _get_vim_object
2014-03-14 08:55:06.581 TRACE nova.virt.vmwareapi.driver     return vim.Vim(protocol=self._scheme, host=self._host_ip)
2014-03-14 08:55:06.581 TRACE nova.virt.vmwareapi.driver   File "/opt/stack/nova/nova/virt/vmwareapi/vim.py", line 117, in __init__
2014-03-14 08:55:06.581 TRACE nova.virt.vmwareapi.driver     self._service_content = self.retrieve_service_content()
2014-03-14 08:55:06.581 TRACE nova.virt.vmwareapi.driver   File "/opt/stack/nova/nova/virt/vmwareapi/vim.py", line 120, in retrieve_service_content
2014-03-14 08:55:06.581 TRACE nova.virt.vmwareapi.driver     return self.RetrieveServiceContent("ServiceInstance")
2014-03-14 08:55:06.581 TRACE nova.virt.vmwareapi.driver   File "/opt/stack/nova/nova/virt/vmwareapi/vim.py", line 223, in vim_request_handler
2014-03-14 08:55:06.581 TRACE nova.virt.vmwareapi.driver     _("Exception in %s ") % (attr_name), excep)
2014-03-14 08:55:06.581 TRACE nova.virt.vmwareapi.driver VimException: Exception in RetrieveServiceContent : (503, u'Service Unavailable')
2014-03-14 08:55:06.581 TRACE nova.virt.vmwareapi.driver 


All administrative control of the cloud is temporarily lost for up to 30 minutes.
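
For reference, a rough pyVmomi sketch of how an operator might clear the stale sessions once a connection to vCenter can be re-established; the OpenStack service account name used in the filter is only an example.

from pyVim.connect import SmartConnect, Disconnect

si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="secret")
sm = si.content.sessionManager
current_key = sm.currentSession.key

# Collect every other session owned by the OpenStack service account
# (example user name) and terminate it, keeping the current session.
stale = [s.key for s in sm.sessionList
         if s.key != current_key and s.userName == "openstack-svc"]
if stale:
    sm.TerminateSession(sessionId=stale)

Disconnect(si)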

** Affects: nova
     Importance: High
     Assignee: Shawn Hartsock (hartsock)
         Status: In Progress

** Affects: openstack-vmwareapi-team
     Importance: Critical
     Assignee: Shawn Hartsock (hartsock)
         Status: In Progress


** Tags: vmware

** Changed in: nova
       Status: New => In Progress

** Changed in: nova
   Importance: Undecided => High

** Changed in: nova
    Milestone: None => icehouse-rc1

** Also affects: openstack-vmwareapi-team
   Importance: Undecided
       Status: New

** Changed in: openstack-vmwareapi-team
       Status: New => In Progress

** Changed in: openstack-vmwareapi-team
   Importance: Undecided => Critical

** Changed in: openstack-vmwareapi-team
     Assignee: (unassigned) => Shawn Hartsock (hartsock)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1292583

Title:
  VMware: nova nodes crash on 503 errors, control of vSphere
  infrastructure completely lost

Status in OpenStack Compute (Nova):
  In Progress
Status in The OpenStack VMwareAPI subTeam:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1292583/+subscriptions

