[Bug 1629114] [NEW] Ceph RBD live-migration failure due to wrong rbd_user/rbd_secret
Public bug reported:
Description:
During a Ceph RBD live migration, Nova does not update the rbd_user and
libvirt secret UUID to the receiving host's values, which causes the
migration to fail.
Steps to reproduce:
Compute node A:
/etc/nova/nova.conf:
rbd_user=compute_node_A
rbd_secret_uuid = secretA
Secret file:
/etc/libvirt/secrets/secretA.xml
Compute node B:
/etc/nova/nova.conf:
rbd_user=compute_node_B
rbd_secret_uuid = secretB
Secret file:
/etc/libvirt/secrets/secretB.xml
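A quick way to confirm the mismatch is to list which libvirt secrets each
compute node actually has defined. The snippet below is only an illustrative
diagnostic, assuming the libvirt-python bindings and the placeholder UUIDs
from this report (secretA/secretB); run it once on each node:
# Illustrative diagnostic only: list the secrets libvirt has defined on this
# host and check whether specific UUIDs (passed as arguments) exist locally.
import sys

import libvirt


def secret_exists(conn, uuid_str):
    # True if a libvirt secret with this UUID is defined on the local host.
    try:
        conn.secretLookupByUUIDString(uuid_str)
        return True
    except libvirt.libvirtError:
        return False


if __name__ == '__main__':
    conn = libvirt.open('qemu:///system')
    print('Secrets defined on this host: %s' % conn.listSecrets())
    for uuid_str in sys.argv[1:]:
        print('%s defined locally: %s' % (uuid_str,
                                          secret_exists(conn, uuid_str)))
    conn.close()
Running it with both UUIDs as arguments on node B should show that secretA is
not defined there, which is exactly the secret the migrated domain ends up
referencing.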
Expected result:
Live migration completes
Current result:
Live migration fails because Nova sets the secret/key/id to the
values from compute_node_A instead of compute_node_B.
Sep 29 18:50:40 compute_node_A nova-compute[175448]: 2016-09-29 18:50:40.613 175448 ERROR nova.virt.libvirt.driver [req-77ce1a5a-6588-420d-8c77-7b106e4ca3f0 4c8a770be6c54c23bbf20e8a63803d63 2d98cd4d4fdf43f5b9db5e39846922d8 - - -] [instance: b4407d16-8946-45a0-8e58-3a1bf8b0edfc] Live Migration failure: internal error: process exited while connecting to monitor: 2016-09-29T18:50:40.220091Z qemu-system-x86_64: -drive file=rbd:nova/b4407d16-8946-45a0-8e58-3a1bf8b0edfc_disk:id=nova-compute-c07:keysomecephkey:auth_supported=cephx\;none:mon_host=[fd2d\:dec4\:cf59\:3c12\:0\:1\:\:]\:6789\;[fd2d\:dec4\:cf59\:3c13\:0\:1\:\:]\:6789\;[fd2d\:dec4\:cf59\:3c14\:0\:1\:\:]\:6789,format=raw,if=none,id=drive-virtio-disk0,cache=none: error connecting
Sep 29 18:50:40 compute_node_A nova-compute[175448]: 2016-09-29T18:50:40.246712Z qemu-system-x86_64: network script /etc/qemu-ifdown failed with status 256
Sep 29 18:50:40 compute_node_A nova-compute[175448]: 2016-09-29T18:50:40.274406Z qemu-system-x86_64: network script /etc/qemu-ifdown failed with status 256
Sep 29 18:50:40 compute_node_A nova-compute[175448]: Traceback (most recent call last):
Sep 29 18:50:40 compute_node_A nova-compute[175448]: File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
Sep 29 18:50:40 compute_node_A nova-compute[175448]: timer()
Sep 29 18:50:40 compute_node_A nova-compute[175448]: File "/usr/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
Sep 29 18:50:40 compute_node_A nova-compute[175448]: cb(*args, **kw)
Sep 29 18:50:40 compute_node_A nova-compute[175448]: File "/usr/lib/python2.7/dist-packages/eventlet/event.py", line 168, in _do_send
Sep 29 18:50:40 compute_node_A nova-compute[175448]: waiter.switch(result)
Sep 29 18:50:40 compute_node_A nova-compute[175448]: File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
Sep 29 18:50:40 compute_node_A nova-compute[175448]: result = function(*args, **kwargs)
Sep 29 18:50:40 compute_node_A nova-compute[175448]: File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 1145, in context_wrapper
Sep 29 18:50:40 compute_node_A nova-compute[175448]: return func(*args, **kwargs)
Sep 29 18:50:40 compute_node_A nova-compute[175448]: File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 6104, in _live_migration_operation
Sep 29 18:50:40 compute_node_A nova-compute[175448]: instance=instance)
Sep 29 18:50:40 compute_node_A nova-compute[175448]: File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
Sep 29 18:50:40 compute_node_A nova-compute[175448]: self.force_reraise()
Sep 29 18:50:40 compute_node_A nova-compute[175448]: File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
Sep 29 18:50:40 compute_node_A nova-compute[175448]: six.reraise(self.type_, self.value, self.tb)
Sep 29 18:50:40 compute_node_A nova-compute[175448]: File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 6064, in _live_migration_operation
Sep 29 18:50:40 compute_node_A nova-compute[175448]: migration_flags)
Sep 29 18:50:40 compute_node_A nova-compute[175448]: File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
Sep 29 18:50:40 compute_node_A nova-compute[175448]: result = proxy_call(self._autowrap, f, *args, **kwargs)
Sep 29 18:50:40 compute_node_A nova-compute[175448]: File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
Sep 29 18:50:40 compute_node_A nova-compute[175448]: rv = execute(f, *args, **kwargs)
Sep 29 18:50:40 compute_node_A nova-compute[175448]: File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
Sep 29 18:50:40 compute_node_A nova-compute[175448]: six.reraise(c, e, tb)
Sep 29 18:50:40 compute_node_A nova-compute[175448]: File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
Sep 29 18:50:40 compute_node_A nova-compute[175448]: rv = meth(*args, **kwargs)
Sep 29 18:50:40 compute_node_A nova-compute[175448]: File "/usr/lib/python2.7/dist-packages/libvirt.py", line 1833, in migrateToURI3
Sep 29 18:50:40 compute_node_A nova-compute[175448]: if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
Sep 29 18:50:40 compute_node_A nova-compute[175448]: libvirtError: internal error: process exited while connecting to monitor: 2016-09-29T18:50:40.220091Z qemu-system-x86_64: -drive file=rbd:nova/b4407d16-8946-45a0-8e58-3a1bf8b0edfc_disk:id=nova-compute-c07:key=somecephkey:auth_supported=cephx\;none:mon_host=[fd2d\:dec4\:cf59\:3c12\:0\:1\:\:]\:6789\;[fd2d\:dec4\:cf59\:3c13\:0\:1\:\:]\:6789\;[fd2d\:dec4\:cf59\:3c14\:0\:1\:\:]\:6789,format=raw,if=none,id=drive-virtio-disk0,cache=none: error connecting
Sep 29 18:50:40 compute_node_A nova-compute[175448]: 2016-09-29T18:50:40.246712Z qemu-system-x86_64: network script /etc/qemu-ifdown failed with status 256
Sep 29 18:50:40 compute_node_A nova-compute[175448]: 2016-09-29T18:50:40.274406Z qemu-system-x86_64: network script /etc/qemu-ifdown failed with status 256
Sep 29 18:50:40 compute_node_A nova-compute[175448]: 2016-09-29 18:50:40.703 175448 ERROR nova.virt.libvirt.driver [req-77ce1a5a-6588-420d-8c77-7b106e4ca3f0 4c8a770be6c54c23bbf20e8a63803d63 2d98cd4d4fdf43f5b9db5e39846922d8 - - -] [instance: b4407d16-8946-45a0-8e58-3a1bf8b0edfc] Migration operation has aborted
Environment
===========
1. OpenStack Mitaka from Ubuntu
2. KVM
3. Ceph
4. Neutron with Calico
----
More information:
The issue occurs when the new libvirt XML config is generated for the
disk configuration:
https://github.com/openstack/nova/blob/643aed652d0e51e36dbe7cb106285b51e3b5941b/nova/virt/libvirt/volume/net.py#L67
if (conf.source_protocol == 'rbd' and
        CONF.libvirt.rbd_secret_uuid):
    conf.auth_secret_uuid = CONF.libvirt.rbd_secret_uuid
    auth_enabled = True  # Force authentication locally
    if CONF.libvirt.rbd_user:
        conf.auth_username = CONF.libvirt.rbd_user
When live migrating from node A to node B, this code pulls rbd_user and
rbd_secret_uuid from the local /etc/nova/nova.conf (via the CONF object)
on the source host instead of using the values configured on the
destination host the VM is being migrated to.
Node A's nova.conf does not match node B's when it comes to
"rbd_user"/"rbd_secret_uuid", so the migrated domain references a Ceph
user and libvirt secret that do not exist on compute_node_B. Ceph
therefore rejects the authentication attempt and the migration aborts.
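For illustration only, the following sketch shows the kind of
destination-aware substitution that would be needed; it is not the actual
Nova fix, and dst_rbd_user/dst_rbd_secret_uuid are hypothetical values the
source host would have to obtain from the destination (for example through
the migrate_data exchanged during pre_live_migration) before sending the
domain XML:
# Hypothetical sketch: rewrite the rbd <auth> elements in the domain XML with
# values supplied by the destination host. dst_rbd_user and
# dst_rbd_secret_uuid are invented parameters, not real Nova migrate_data
# attributes.
from lxml import etree


def update_rbd_auth_for_destination(domain_xml, dst_rbd_user,
                                    dst_rbd_secret_uuid):
    # Return domain XML whose rbd disks authenticate with the destination
    # host's Ceph user and libvirt secret UUID.
    doc = etree.fromstring(domain_xml)
    for disk in doc.findall('./devices/disk'):
        source = disk.find('source')
        if source is None or source.get('protocol') != 'rbd':
            continue
        auth = disk.find('auth')
        if auth is None:
            auth = etree.SubElement(disk, 'auth')
        auth.set('username', dst_rbd_user)
        secret = auth.find('secret')
        if secret is None:
            secret = etree.SubElement(auth, 'secret')
        secret.set('type', 'ceph')
        secret.set('uuid', dst_rbd_secret_uuid)
    return etree.tostring(doc)
The source host could then hand the rewritten XML to migrateToURI3 (for
example via libvirt's destination-XML migration parameter) so the
destination's QEMU is started with credentials that actually exist there.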
** Affects: nova
Importance: Undecided
Status: New