yahoo-eng-team team mailing list archive

[Bug 1447490] [NEW] Deletion of instances will be stuck forever if any of deletion hung in 'multipath -r'

 

Public bug reported:

I created about 25 VMs from bootable volumes. After that finished,
I ran a script to delete all of them within a very short time.

What I saw was that all of the VMs stayed in 'deleting' status and were
still not deleted after hours of waiting.

Output of ps:
stack@ubuntu-server13:/var/log/libvirt$ ps aux | grep multipath
root       8205  0.0  0.0 504988  5560 ?        SLl  Apr22   0:01 /sbin/multipathd
root     115515  0.0  0.0  64968  2144 pts/3    S+   Apr22   0:00 sudo nova-rootwrap /etc/nova/rootwrap.conf multipath -r
root     115516  0.0  0.0  42240  9488 pts/3    S+   Apr22   0:00 /usr/bin/python /usr/local/bin/nova-rootwrap /etc/nova/rootwrap.conf multipath -r
root     115525  0.0  0.0  41792  2592 pts/3    S+   Apr22   0:00 /sbin/multipath -r
stack    151825  0.0  0.0  11744   936 pts/0    S+   02:10   0:00 grep --color=auto multipath

Then I killed the 'multipath -r' commands, and all of the VMs went into
ERROR status.

After digging into the nova code, I found that nova always tries to
acquire a global file lock in disconnect_volume:
@utils.synchronized('connect_volume')
def disconnect_volume(self, connection_info, disk_dev):
    """Detach the volume from instance_name."""
    iscsi_properties = connection_info['data']

    ......
    if self.use_multipath and multipath_device:
        return self._disconnect_volume_multipath_iscsi(iscsi_properties,
                                                       multipath_device)

While holding that lock, it rescans iSCSI and then runs 'multipath -r':

def _disconnect_volume_multipath_iscsi(self, iscsi_properties,
                                       multipath_device):
    self._rescan_iscsi()
    self._rescan_multipath()  # ---> self._run_multipath('-r', check_exit_code=[0, 1, 21])


In my case, 'multipath -r' hung and did not exit for several hours.
Because it hung while the lock was held, it also blocked the deletion of
every other VM instance on the same Nova node.
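
To illustrate why one hung call stalls every other deletion, here is a
minimal sketch in plain Python. It is not nova code: threading.Lock is only
a stand-in for the 'connect_volume' lock, and the names hung_disconnect,
try_disconnect, lock_held and release_hung_call are hypothetical.

```python
import threading

# Stand-in for the lock taken by @utils.synchronized('connect_volume').
connect_volume_lock = threading.Lock()

# Set once the "hung" caller owns the lock; set release_hung_call to let it go.
lock_held = threading.Event()
release_hung_call = threading.Event()


def hung_disconnect():
    """Take the lock and then block, like a disconnect stuck in 'multipath -r'."""
    with connect_volume_lock:
        lock_held.set()
        release_hung_call.wait()  # stands in for the command that never exits


def try_disconnect(wait_seconds):
    """Return True if the lock could be obtained within wait_seconds."""
    got_lock = connect_volume_lock.acquire(timeout=wait_seconds)
    if got_lock:
        connect_volume_lock.release()
    return got_lock
```

While hung_disconnect is running in one thread, every call to
try_disconnect from another thread returns False, no matter which volume it
is for. That matches what I saw: the other deletions were queued behind the
one that was stuck.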

IMO, Nova should not wait forever on a command that can block. At the
least, a timeout is needed for commands such as 'multipath -r' and
'multipath -ll'.
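
A rough sketch of what I mean by a timeout, using only the Python standard
library. This is not how nova runs commands today; run_with_timeout and
CommandTimeout are hypothetical names used only for illustration.

```python
import subprocess


class CommandTimeout(Exception):
    """Raised when the command does not finish within the allowed time."""


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd (a list of arguments) and return (exit_code, stdout_text).

    Raises CommandTimeout if the command is still running after
    timeout_seconds. subprocess.run kills the child process it started
    before the timeout exception reaches us.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        raise CommandTimeout(
            "%s did not exit within %s seconds" % (cmd, timeout_seconds))
    return result.returncode, result.stdout.decode()
```

With something like this, a call such as
run_with_timeout(['multipath', '-r'], 60) would raise instead of holding the
lock for hours (the 60 seconds is an arbitrary value). One caveat: only the
direct child is killed, so when the command is started through sudo and
nova-rootwrap, the real 'multipath' process may need extra handling.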

Or is there any other solution for my case?


MY ENVIRONMENT:
Ubuntu Server 14:
multipath-tools
multipath enabled in Nova node

Thanks
Peter

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1447490
