yahoo-eng-team mailing list archive, Message #38728
[Bug 1498196] [NEW] Live migration's assigned ports conflicts
Public bug reported:
It looks like during live migration the port generated for the incoming
migration is not checked for already being in use, and there is no retry
to obtain a new port if the chosen one has been taken by another process.
Here is an example of this behavior in nova-compute log files from the
source compute node:
2015-09-20T06:25:21.701157+00:00 info: 2015-09-20 06:25:21.700 17037 INFO nova.virt.libvirt.driver [-] [instance: 60493be7-4b00-4f4f-a785-9d4aa6e74f58] Instance spawned successfully.
2015-09-20T06:25:21.828941+00:00 info: 2015-09-20 06:25:21.828 17037 INFO nova.compute.manager [req-8fdf447a-48c4-4b41-8276-9459ae9e5a65 - - - - -] [instance: 60493be7-4b00-4f4f-a785-9d4aa6e74f58] VM
Resumed (Lifecycle Event)
2015-09-20T06:25:37.349069+00:00 err: 2015-09-20 06:25:37.348 17037 ERROR nova.virt.libvirt.driver [req-8150b87f-f87b-4bec-8bab-561dd37605d5 820904596e1d422e9460f472b7b9672f 04ce0fe8f21a4a6b8535c5cefd9f8594 - - -] [instance: 60493be7-4b00-4f4f-a785-9d4aa6e74f58] Live Migration failure: internal error: early end of file from monitor: possible problem:
2015-09-20T06:25:37.116947Z qemu-system-x86_64: -incoming tcp:[::]:49152: Failed to bind socket: Address already in use
2015-09-20T06:25:37.354837+00:00 info: 2015-09-20 06:25:37.354 17037 INFO nova.virt.libvirt.driver [req-8150b87f-f87b-4bec-8bab-561dd37605d5 820904596e1d422e9460f472b7b9672f 04ce0fe8f21a4a6b8535c5cefd9f8594 - - -] [instance: 60493be7-4b00-4f4f-a785-9d4aa6e74f58] Migration running for 0 secs, memory 0% remaining; (bytes processed=0, remaining=0, total=0)
2015-09-20T06:25:37.856147+00:00 err: 2015-09-20 06:25:37.855 17037 ERROR nova.virt.libvirt.driver [req-8150b87f-f87b-4bec-8bab-561dd37605d5 820904596e1d422e9460f472b7b9672f 04ce0fe8f21a4a6b8535c5cefd9f8594 - - -] [instance: 60493be7-4b00-4f4f-a785-9d4aa6e74f58] Migration operation has aborted
Some env description:
root@node-169:~# nova-compute --version
2015.1.1
root@node-169:~# dpkg -l |grep 'nova-compute '|awk '{print $3}'
1:2015.1.1-1~u14.04+mos19662
Steps to reproduce:
This actually happens during Rally testing of a fairly large environment (~200 nodes), roughly once per 200 iterations, so the chances of hitting it at scale are high. It should be easily reproducible under the following circumstances:
1. Very high rate of migrations.
2. A lot of running VMs/other services with large amount of used TCP ports.
Both of these conditions increase the likelihood of a collision in the
QEMU migration port allocation procedure.
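For illustration, here is a minimal Python sketch of the check-and-retry allocation the report asks for. The function name, port range, and bind address are hypothetical and are not nova's actual code; the idea is simply to probe each candidate port before handing it to QEMU, instead of assuming a pre-computed port is free.

```python
import socket


def pick_free_migration_port(start=49152, end=49215):
    """Return the first port in [start, end] that can currently be bound.

    Hypothetical sketch (not nova's real implementation): probe each
    candidate port and skip any that are already occupied, rather than
    passing QEMU a port that was never checked.
    """
    for port in range(start, end + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # Binding fails with EADDRINUSE if another process holds the port.
            sock.bind(("", port))
        except OSError:
            continue  # port already in use; try the next candidate
        finally:
            sock.close()
        return port
    raise RuntimeError("no free port in the migration range")
```

Note that even with this probe there is still a window between the check and QEMU's own bind in which another process can grab the port, so a complete fix would also catch the "Address already in use" failure and retry the migration with a fresh port.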
** Affects: mos
Importance: Undecided
Status: New
** Affects: nova
Importance: Undecided
Status: New
** Tags: scale
** Project changed: nova => mos
** Also affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1498196
Title:
Live migration's assigned ports conflicts
Status in Mirantis OpenStack:
New
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/mos/+bug/1498196/+subscriptions