Message #87743
[Bug 1950894] Re: live_migration_permit_post_copy mode does not work
** Project changed: nova => charm-nova-compute
** Summary changed:
- live_migration_permit_post_copy mode does not work
+ live-migration-permit-post-copy mode does not work
** Description changed:
Description
===========
Some customers have noted that some VMs never complete a
live migration. The VM's memory copy keeps oscillating
- around 1-10% but never completes. After changing
- live_migration_permit_post_copy = True, we expected this to
+ around 1-10% but never completes. After changing
+ live-migration-permit-post-copy = True, we expected this to
converge and migrate successfully as this feature describes it
should.
Workaround 1: It's possible to complete the process if you log into the source
host and run the QMP command[1]:
virsh qemu-monitor-command instance-00000026 '{"execute":"migrate-start-postcopy"}'
-
- Workaround 2: The migration finishes if you run 'nova live-migration-force-complete'
-
+ Workaround 2: The migration finishes if you run 'nova live-migration-
+ force-complete'
I believe this can also be a libvirt bug given that I don't see any "migrate-start-postcopy"
coming from nova/libvirt logs[4], but only after I manually triggered it via the execute
command above, at 2021-11-12 19:14:08.053+0000[4].
-
Steps to reproduce
==================
* Set up an OpenStack deployment with live_migration_permit_post_copy=False
* Create a large VM (8+ CPUs) and install stress-ng
* Run stress-ng:
- nohup stress-ng --vm 4 --vm-bytes 10% --vm-method write64 --vm-addr-method pwr2 -t 1h &
+ nohup stress-ng --vm 4 --vm-bytes 10% --vm-method write64 --vm-addr-method pwr2 -t 1h &
* Migrate the VM, and check for the source host logs messages like:
- 'Migration running for \d+ secs, memory \d+% remaining'
- This should be oscillating like describing and migration not completing
+ 'Migration running for \d+ secs, memory \d+% remaining'
+ This should be oscillating like describing and migration not completing
* Complete or cancel the above migration, set live_migration_permit_post_copy=True,
- restart nova services on the computes, and re-do the operation
-
+ restart nova services on the computes, and re-do the operation
Expected result
===============
Migration should complete 100% of times
Actual result
=============
The migration does not complete and VM's memory is never copied.
Environment
===========
1. Exact version of OpenStack you are running[8]
21.2.1-0ubuntu1
-
2. Which hypervisor did you use[8]?
qemu-kvm: 4.2-3ubuntu6.18
libvirt-daemon: 6.0.0-0ubuntu8.14
-
2. Which storage type did you use?
Shared Ceph
-
3. Which networking type did you use?
OpenvSwitch L3HA
Logs & Configs
==============
-
[1] QMP Commands: https://gist.github.com/sombrafam/5e8e991058001c2b3843c0d08b4cd7d1
[2] Migration (completed manually with workaround 1) logs: https://gist.github.com/sombrafam/b74497150ae4ae32494ac5735189e149
[3] nova-compute.log src: https://gist.github.com/sombrafam/b74497150ae4ae32494ac5735189e149
[4] libvirt.log src: https://gist.github.com/sombrafam/69f05404d7097265140e1578ea50c00c
[5] Migration list: https://gist.github.com/sombrafam/39b72e242e27b6a3123603db1faa7b19
[6] Nova.conf dst host: https://gist.github.com/sombrafam/ad43b268e7f4b69e7da513a0f7a0095f
[7] Nova.conf src host: https://gist.github.com/sombrafam/ab27b40e577fbe56d741f01e811f3a18
[8] Package versions: https://gist.github.com/sombrafam/0622792d82750b2141b45580b625b69f
[9] VM info: https://gist.github.com/sombrafam/57eaa4c4ba4b141dec9659ee01f25b6d
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1950894
Title:
live-migration-permit-post-copy mode does not work
Status in OpenStack Nova Compute Charm:
New
Bug description:
Description
===========
Some customers have reported that some VMs never complete a
live migration. The VM's memory copy keeps oscillating
around 1-10% but never completes. After setting
live-migration-permit-post-copy = True, we expected the migration to
converge and complete successfully, as the feature is described
to do.
Workaround 1: It's possible to complete the process if you log into the source
host and run the QMP command[1]:
virsh qemu-monitor-command instance-00000026 '{"execute":"migrate-start-postcopy"}'
Workaround 2: The migration finishes if you run 'nova live-migration-force-complete'
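For anyone who needs to apply Workaround 1 to several instances, here is a minimal Python sketch that builds the same virsh invocation. The function names build_postcopy_command and start_postcopy are hypothetical helpers, not part of nova or libvirt; the sketch assumes it runs on the source host with permission to call virsh.

```python
import json
import subprocess
from typing import List


def build_postcopy_command(domain: str) -> List[str]:
    """Return the virsh argv that sends the migrate-start-postcopy QMP command.

    `domain` is the libvirt domain name, e.g. "instance-00000026".
    """
    qmp = json.dumps({"execute": "migrate-start-postcopy"})
    return ["virsh", "qemu-monitor-command", domain, qmp]


def start_postcopy(domain: str) -> str:
    """Run the command on the source host and return virsh's output.

    Hypothetical wrapper: raises CalledProcessError if virsh exits non-zero.
    """
    result = subprocess.run(
        build_postcopy_command(domain),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with the JSON payload.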
I believe this could also be a libvirt bug: no "migrate-start-postcopy" command appears
in the nova/libvirt logs[4] until I triggered it manually with the command above,
at 2021-11-12 19:14:08.053+0000[4].
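The log check described above could be automated roughly as follows. This is only a sketch: it assumes each libvirt log line starts with a timestamp in the same format as the one quoted above, and first_postcopy_timestamp is a hypothetical helper name.

```python
import re
from typing import Iterable, Optional

# Timestamp format taken from the value quoted in this report,
# e.g. "2021-11-12 19:14:08.053+0000" (assumed to prefix each log line).
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}\+\d{4})")


def first_postcopy_timestamp(lines: Iterable[str]) -> Optional[str]:
    """Return the timestamp of the first line mentioning migrate-start-postcopy.

    Returns None if the command never appears, or if the first matching
    line does not begin with a timestamp.
    """
    for line in lines:
        if "migrate-start-postcopy" in line:
            match = TIMESTAMP_RE.match(line)
            return match.group(1) if match else None
    return None
```

If this returns None for a migration that ran with post-copy permitted, the switch to post-copy was never requested during that migration.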
Steps to reproduce
==================
* Set up an OpenStack deployment with live_migration_permit_post_copy=False
* Create a large VM (8+ CPUs) and install stress-ng
* Run stress-ng:
nohup stress-ng --vm 4 --vm-bytes 10% --vm-method write64 --vm-addr-method pwr2 -t 1h &
* Migrate the VM and check the source host logs for messages like:
'Migration running for \d+ secs, memory \d+% remaining'
The remaining memory should oscillate as described, and the migration should not complete
* Complete or cancel the above migration, set live_migration_permit_post_copy=True,
restart nova services on the computes, and re-do the operation
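To make the "oscillating" check in the steps above less subjective, a small Python sketch can extract the remaining-memory percentages from the log and count how often the value goes back up. The regular expression is the one quoted in the steps; the helper names and the min_rebounds threshold are assumptions chosen for illustration.

```python
import re
from typing import Iterable, List

# Pattern quoted in the reproduction steps, with capture groups added.
PROGRESS_RE = re.compile(r"Migration running for (\d+) secs, memory (\d+)% remaining")


def remaining_samples(lines: Iterable[str]) -> List[int]:
    """Collect the 'memory N% remaining' values in log order."""
    samples = []
    for line in lines:
        match = PROGRESS_RE.search(line)
        if match:
            samples.append(int(match.group(2)))
    return samples


def is_oscillating(samples: List[int], min_rebounds: int = 2) -> bool:
    """Return True if the remaining percentage increased at least min_rebounds times.

    A converging migration should show mostly decreasing values; repeated
    increases suggest the guest dirties memory faster than it is copied.
    """
    rebounds = sum(1 for prev, cur in zip(samples, samples[1:]) if cur > prev)
    return rebounds >= min_rebounds
```

Feeding the nova-compute log lines from the source host through remaining_samples and then is_oscillating gives a yes/no answer for each migration attempt.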
Expected result
===============
The migration should complete 100% of the time.
Actual result
=============
The migration does not complete and the VM's memory is never copied.
Environment
===========
1. Exact version of OpenStack you are running[8]
21.2.1-0ubuntu1
2. Which hypervisor did you use[8]?
qemu-kvm: 4.2-3ubuntu6.18
libvirt-daemon: 6.0.0-0ubuntu8.14
3. Which storage type did you use?
Shared Ceph
4. Which networking type did you use?
OpenvSwitch L3HA
Logs & Configs
==============
[1] QMP Commands: https://gist.github.com/sombrafam/5e8e991058001c2b3843c0d08b4cd7d1
[2] Migration (completed manually with workaround 1) logs: https://gist.github.com/sombrafam/b74497150ae4ae32494ac5735189e149
[3] nova-compute.log src: https://gist.github.com/sombrafam/b74497150ae4ae32494ac5735189e149
[4] libvirt.log src: https://gist.github.com/sombrafam/69f05404d7097265140e1578ea50c00c
[5] Migration list: https://gist.github.com/sombrafam/39b72e242e27b6a3123603db1faa7b19
[6] Nova.conf dst host: https://gist.github.com/sombrafam/ad43b268e7f4b69e7da513a0f7a0095f
[7] Nova.conf src host: https://gist.github.com/sombrafam/ab27b40e577fbe56d741f01e811f3a18
[8] Package versions: https://gist.github.com/sombrafam/0622792d82750b2141b45580b625b69f
[9] VM info: https://gist.github.com/sombrafam/57eaa4c4ba4b141dec9659ee01f25b6d