yahoo-eng-team team mailing list archive
Message #62897
[Bug 1678577] [NEW] nova live migration failed in some case
Public bug reported:
env: nova 15.0.2 + libvirt + kvm + centos
In some situations, the nova request spec becomes:
{"nova_object.version": "1.8", "nova_object.changes": ["instance_uuid",
"requested_destination", "retry", "num_instances", "pci_requests",
"limits", "availability_zone", "force_nodes", "image", "instance_group",
"force_hosts", "numa_topology", "flavor", "project_id",
"scheduler_hints", "ignore_hosts"], "nova_object.name": "RequestSpec",
"nova_object.data": {"requested_destination": null, "instance_uuid":
"ca01b22b-d2d4-4291-96bd-ff6111f1f88b", "retry": {"nova_object.version":
"1.1", "nova_object.changes": ["num_attempts", "hosts"],
"nova_object.name": "SchedulerRetries", "nova_object.data":
{"num_attempts": 1, "hosts": {"nova_object.version": "1.16",
"nova_object.changes": ["objects"], "nova_object.name":
"ComputeNodeList", "nova_object.data": {"objects":
[{"nova_object.version": "1.16", "nova_object.changes": ["host",
"hypervisor_hostname"], "nova_object.name": "ComputeNode",
"nova_object.data": {"host": "control01", "hypervisor_hostname":
"control01"}, "nova_object.namespace": "nova"}]},
"nova_object.namespace": "nova"}}, "nova_object.namespace": "nova"},
"num_instances": 1, "pci_requests": {"nova_object.version": "1.1",
"nova_object.name": "InstancePCIRequests", "nova_object.data":
{"instance_uuid": "ca01b22b-d2d4-4291-96bd-ff6111f1f88b", "requests":
[]}, "nova_object.namespace": "nova"}, "limits": {"nova_object.version":
"1.0", "nova_object.changes": ["memory_mb", "vcpu", "disk_gb",
"numa_topology"], "nova_object.name": "SchedulerLimits",
"nova_object.data": {"vcpu": null, "memory_mb": 245427, "disk_gb": 8371,
"numa_topology": null}, "nova_object.namespace": "nova"},
"availability_zone": null, "force_nodes": null, "image":
{"nova_object.version": "1.8", "nova_object.changes": ["min_disk",
"container_format", "min_ram", "disk_format", "properties"],
"nova_object.name": "ImageMeta", "nova_object.data": {"min_disk": 1,
"container_format": "bare", "min_ram": 0, "disk_format": "raw",
"properties": {"nova_object.version": "1.16", "nova_object.name":
"ImageMetaProps", "nova_object.data": {}, "nova_object.namespace":
"nova"}}, "nova_object.namespace": "nova"}, "instance_group": null,
"force_hosts": null, "numa_topology": null, "ignore_hosts": null,
"flavor": {"nova_object.version": "1.1", "nova_object.name": "Flavor",
"nova_object.data": {"disabled": false, "root_gb": 1, "name": "m1.tiny",
"flavorid": "a70249ef-5ea9-49cb-b35f-ab4732064981", "deleted": false,
"created_at": "2017-03-22T08:13:48Z", "ephemeral_gb": 0, "updated_at":
null, "memory_mb": 256, "vcpus": 1, "extra_specs": {}, "swap": 0,
"rxtx_factor": 1.0, "is_public": true, "deleted_at": null,
"vcpu_weight": 0, "id": 119}, "nova_object.namespace": "nova"},
"project_id": "f3c6d500b267432c858c588800b49653", "scheduler_hints":
{}}, "nova_object.namespace": "nova"}
Look at the retry part:
"retry": {"nova_object.version": "1.1", "nova_object.changes":
["num_attempts", "hosts"], "nova_object.name": "SchedulerRetries",
"nova_object.data": {"num_attempts": 1, "hosts": {"nova_object.version":
"1.16", "nova_object.changes": ["objects"], "nova_object.name":
"ComputeNodeList", "nova_object.data": {"objects":
[{"nova_object.version": "1.16", "nova_object.changes": ["host",
"hypervisor_hostname"], "nova_object.name": "ComputeNode",
"nova_object.data": {"host": "control01", "hypervisor_hostname":
"control01"}, "nova_object.namespace": "nova"}]}
The retry data lists control01 as a previously tried host even though the instance is on control02.
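To check other instances for the same stale data, the retry hosts can be pulled out of a serialized request spec with a few lines of Python. This is only a sketch: it assumes the spec is available as a JSON string with the layout shown above, the function name `retry_hosts` is made up for illustration, and the sample is a trimmed-down copy of the spec from this report.

```python
import json


def retry_hosts(spec_json):
    """Return the [host, hypervisor_hostname] pairs stored in the retry field."""
    spec = json.loads(spec_json)
    retry = spec["nova_object.data"].get("retry")
    if not retry:
        # No retry information recorded in this spec.
        return []
    node_list = retry["nova_object.data"]["hosts"]
    nodes = node_list["nova_object.data"]["objects"]
    return [
        [n["nova_object.data"]["host"],
         n["nova_object.data"]["hypervisor_hostname"]]
        for n in nodes
    ]


# Trimmed-down copy of the request spec above (only the retry part is kept).
sample = json.dumps({
    "nova_object.name": "RequestSpec",
    "nova_object.data": {
        "retry": {
            "nova_object.name": "SchedulerRetries",
            "nova_object.data": {
                "num_attempts": 1,
                "hosts": {
                    "nova_object.name": "ComputeNodeList",
                    "nova_object.data": {"objects": [{
                        "nova_object.name": "ComputeNode",
                        "nova_object.data": {
                            "host": "control01",
                            "hypervisor_hostname": "control01",
                        },
                    }]},
                },
            },
        },
    },
})

print(retry_hosts(sample))
```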
When live-migrating this VM from control02 to control01, an error shows up in
"migration-list". The nova-scheduler log shows:
2017-04-02 14:01:47.010 6 DEBUG nova.filters [req-191c8f6e-010b-42f6-acc6-c84c689f649c 2442cfcb9d5c4daf8d90af8bcfe30df7 8eb03bbcdfd84f68b88a7fbaa74e2327 - - -] Starting with 1 host(s) get_filtered_objects /var/lib/kolla/venv/lib/python2.7/site-packages/nova/filters.py:70
2017-04-02 14:01:47.010 6 INFO nova.scheduler.filters.retry_filter [req-191c8f6e-010b-42f6-acc6-c84c689f649c 2442cfcb9d5c4daf8d90af8bcfe30df7 8eb03bbcdfd84f68b88a7fbaa74e2327 - - -] Host [u'control01', u'control01'] fails. Previously tried hosts: [[u'control01', u'control01']]
I think the root cause is the retry part, but I still do not know how it got into this state.
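The check implied by the retry_filter log line can be modelled roughly as below. This is a simplified sketch based only on the log message, not nova's actual code; `host_passes` here is a standalone function written for illustration. A candidate is rejected when its [host, node] pair is already in the list of previously tried hosts, so with control01 left over in the retry data, control01 can never be selected as the destination.

```python
def host_passes(candidate, previously_tried):
    """Return True if the candidate [host, node] pair has not been tried yet."""
    return candidate not in previously_tried


# Values taken from the scheduler log above.
tried = [["control01", "control01"]]

# The only candidate is the requested destination, and it is already "tried".
print(host_passes(["control01", "control01"], tried))

# With an empty retry list the same host would pass.
print(host_passes(["control01", "control01"], []))
```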
** Affects: nova
Importance: Undecided
Status: New
** Attachment added: "nova-scheduler.log"
https://bugs.launchpad.net/bugs/1678577/+attachment/4852537/+files/nova-scheduler.log
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1678577
Title:
nova live migration failed in some case
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1678577/+subscriptions