yahoo-eng-team team mailing list archive
Message #71527
[Bug 1753676] [NEW] Live migration not working as Expected when Restarting nova-compute service while migration
Public bug reported:
Description
===========
Environment: Ubuntu 16.04
Openstack Version: Pike
I am trying to live-migrate a VM (block migration) from one compute
node to another. Everything works unless I restart the nova-compute
service during the migration: the live migration keeps running
underneath in libvirt, but once the VM reaches the destination, the
database is not updated properly.
Steps to reproduce:
===================
nova.conf (libvirt settings on both compute nodes):
[libvirt]
live_migration_bandwidth=1200
live_migration_downtime=100
live_migration_downtime_steps =3
live_migration_downtime_delay=10
live_migration_flag = VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE
virt_type = kvm
inject_password = False
disk_cachemodes = network=writeback
live_migration_uri = "qemu+tcp://nova@%s/system"
live_migration_tunnelled = False
block_migration_flag = VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_NON_SHARED_INC
(default OpenStack live migration configuration: pre-copy with no tunneling)
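To double-check these settings on a node, something like the following could be used. This is only a rough sketch: it reads the section with Python's standard configparser, whereas nova itself uses oslo.config, and the helper names load_libvirt_section and migration_uri are made up for illustration. Interpolation is switched off because configparser would otherwise reject the bare %s in live_migration_uri. The sketch assumes the %s placeholder stands for the destination host name.

```python
import configparser

# A subset of the [libvirt] section quoted above.
SAMPLE = """\
[libvirt]
live_migration_bandwidth=1200
live_migration_downtime=100
live_migration_downtime_steps =3
live_migration_downtime_delay=10
live_migration_uri = "qemu+tcp://nova@%s/system"
live_migration_tunnelled = False
"""


def load_libvirt_section(text):
    """Return the [libvirt] options of a nova.conf-style string as a dict."""
    # interpolation=None keeps the literal '%s' in live_migration_uri.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser["libvirt"])


def migration_uri(options, dest_host):
    """Fill the destination host into the URI template (assumed meaning of %s)."""
    # configparser keeps the surrounding quotes, so strip them first.
    template = options["live_migration_uri"].strip('"')
    return template % dest_host


opts = load_libvirt_section(SAMPLE)
print(migration_uri(opts, "compute2"))
print(int(opts["live_migration_downtime_steps"]))
```

Note that configparser returns every value as a string, so numeric options such as live_migration_downtime_steps need an explicit int() conversion.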
Source VM root disk: boot from volume, with one ephemeral disk (160GB).
Trying to migrate the VM from compute1 to compute2; below is my source VM.
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | testcase1-all-ephemernal-boot-from-vol |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-00000153 |
1) nova live-migration --block-migrate <vm-id> compute2
[req-48a3df61-3974-46ac-8019-c4c4a0f8a8c8 4a8150eb246a4450829331e993f8c3fd f11a5d3631f14c4f879a2e7dddb96c06 - default default] pre_live_migration data is LibvirtLiveMigrateData(bdms=<?>,block_migration=True,disk_available_mb=6900736,disk_over_commit=<?>,filename='tmpW5ApOS',graphics_listen_addr_spice=x.x.x.x,graphics_listen_addr_vnc=127.0.0.1,image_type='default',instance_relative_path='504028fc-1381-42ca-ad7c-def7f749a722',is_shared_block_storage=False,is_shared_instance_path=False,is_volume_backed=True,migration=<?>,serial_listen_addr=None,serial_listen_ports=<?>,supported_perf_events=<?>,target_connect_addr=<?>) pre_live_migration /openstack/venvs/nova-16.0.6/lib/python2.7/site-packages/nova/compute/manager.py:5437
The migration started; I can see the data and memory transfer between
the compute nodes using iftop:
<= 4.94Gb 4.99Gb 5.01Gb
Restarted the nova-compute service on the source compute node (the one
the VM is migrating from).
The live migration keeps going. Once it completes, below is my total
data transfer (using iftop):
TX: cum: 17.3MB peak: 2.50Mb rates: 11.1Kb 7.11Kb 463Kb
RX: 97.7GB 4.97Gb 3.82Kb 1.93Kb 1.87Gb
TOTAL: 97.7GB 4.97Gb
Once the migration completes, the virsh domain can be seen running on
the destination compute node:
root@compute2:~# virsh list --all
Id Name State
----------------------------------------------------
3 instance-00000153 running
From the nova-compute.log:
Instance <id> has been moved to another host compute1(compute1). There are allocations remaining against the source host that might need to be removed: {u'resources': {u'VCPU': 8, u'MEMORY_MB': 23808, u'DISK_GB': 180}}. _remove_deleted_instances_allocations /openstack/venvs/nova-16.0.6/lib/python2.7/site-packages/nova/compute/resource_tracker.py:123
nova-compute still shows 0 allocated vCPUs (although the 8-core VM was there):
Total usable vcpus: 56, total allocated vcpus: 0 _report_final_resource_view /openstack/venvs/nova-16.0.6/lib/python2.7/site-packages/nova/compute/resource_tracker.py:792
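The leftover allocations in the first log message can be pulled out mechanically, which is handy when grepping many compute logs for the same symptom. The snippet below is a minimal sketch that assumes the message keeps exactly the wording quoted above; leftover_allocations is a hypothetical helper, not part of nova.

```python
import ast
import re

# The resource tracker message quoted above, without the trailing file path.
LOG_LINE = (
    "Instance <id> has been moved to another host compute1(compute1). "
    "There are allocations remaining against the source host that might "
    "need to be removed: {u'resources': {u'VCPU': 8, u'MEMORY_MB': 23808, "
    "u'DISK_GB': 180}}. _remove_deleted_instances_allocations"
)


def leftover_allocations(line):
    """Return the 'resources' dict from the message, or None if not present."""
    match = re.search(r"might need to be removed: (\{.*\}\})\.", line)
    if match is None:
        return None
    # literal_eval safely parses the Python 2 style repr (u'' prefixes included).
    return ast.literal_eval(match.group(1))["resources"]


print(leftover_allocations(LOG_LINE))
```

Here the extracted values (8 VCPU, 23808 MB of memory, 180 GB of disk) are still held against the source host, while the second message reports 0 allocated vCPUs.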
nova show <vm-id> (the Nova DB still shows the source hostname; it is
not updated with the new compute node):
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | testcase1-all-ephemernal-boot-from-vol |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-00000153 |
Entire vm data is still present on both compute nodes.
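The mismatch between the virsh output and the nova show output can be checked with a small script. The sketch below works only on pasted text in the format shown above; running_domains and host_mismatch are hypothetical helper names, and how the two outputs are collected from the nodes is left out.

```python
# `virsh list --all` output as captured on compute2 above.
VIRSH_ON_COMPUTE2 = """\
 Id    Name                           State
----------------------------------------------------
 3     instance-00000153              running
"""


def running_domains(virsh_output):
    """Return the names of domains listed as 'running'."""
    names = []
    for line in virsh_output.splitlines():
        fields = line.split()
        # Data rows are "<id> <name> <state...>"; the id is '-' for inactive
        # domains. The header and the dashed separator do not match.
        if len(fields) >= 3 and (fields[0].isdigit() or fields[0] == "-"):
            if " ".join(fields[2:]) == "running":
                names.append(fields[1])
    return names


def host_mismatch(db_host, local_host, instance_name, virsh_output):
    """True if the domain runs on local_host while Nova records another host."""
    runs_here = instance_name in running_domains(virsh_output)
    return runs_here and db_host != local_host


# Nova records compute1, but the domain is running on compute2.
print(host_mismatch("compute1", "compute2", "instance-00000153", VIRSH_ON_COMPUTE2))
```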
After restarting the nova-compute service on the destination machine, I
got the warning below from nova-compute:
2018-03-05 11:19:05.942 5791 WARNING nova.compute.manager [-] [instance: 18d63c06-b124-4ec4-9e36-afcadccaf23e] Instance is unexpectedly not found. Ignore.: InstanceNotFound: Instance 18d63c06-b124-4ec4-9e36-afcadccaf23e could not be found.
Expected result
===============
The DB should be updated accordingly, or the migration should be aborted.
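Put as a decision rule, the two acceptable outcomes could look roughly like this. It is purely an illustration of the expectation stated here, not how nova recovers from a restart; the function name, the returned labels and the idea of passing in the set of hosts where libvirt reports the domain running are all assumptions.

```python
def expected_action(db_host, source_host, dest_host, running_on):
    """Pick the outcome asked for above, given where the domain is running.

    running_on is the set of hosts on which libvirt reports the domain
    in the 'running' state.
    """
    if dest_host in running_on and source_host not in running_on:
        # The migration finished underneath: the record should follow the VM.
        return "no-op" if db_host == dest_host else "update-db-to-destination"
    if source_host in running_on and dest_host not in running_on:
        # The VM has not moved: the migration should be aborted cleanly.
        return "abort-migration"
    # Running on both hosts or on neither: needs an operator to look at it.
    return "manual-check"


# The situation in this report: DB says compute1, domain runs on compute2.
print(expected_action("compute1", "compute1", "compute2", {"compute2"}))
```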
Actual result
=============
nova show <vm-id> (the Nova DB still shows the source hostname; it is
not updated with the new compute node):
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | testcase1-all-ephemernal-boot-from-vol |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-00000153 |
virsh list on the destination compute node shows the output below:
root@compute2:~# virsh list --all
Id Name State
----------------------------------------------------
3 instance-00000153 running
Entire vm data is still present on both compute nodes.
ls /var/lib/nova/instances/18d63c06-b124-4ec4-9e36-afcadccaf23e
After restarting the nova-compute service on the destination machine, I
got the warning below from nova-compute:
2018-03-05 11:19:05.942 5791 WARNING nova.compute.manager [-] [instance: 18d63c06-b124-4ec4-9e36-afcadccaf23e] Instance is unexpectedly not found. Ignore.: InstanceNotFound: Instance 18d63c06-b124-4ec4-9e36-afcadccaf23e could not be found.
** Affects: nova
Importance: Undecided
Status: New
** Summary changed:
- Live migration not working as Expected when Restarting nova-compute service
+ Live migration not working as Expected when Restarting nova-compute service while migration
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1753676
Title:
Live migration not working as Expected when Restarting nova-compute
service while migration
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1753676/+subscriptions