← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1824858] Re: nova instance remnant left behind after cold migration completes

 

Fails in the check to see that the source host no longer has instance
files after the cold migration

Hosts have the following labels assigned ie. includes remote label
	openstack-compute-node=enabled
	openvswitch=enabled 
        sriov=enabled
        remote-storage=enabled

Recent load on 2+3 lab  also retested. These test continue to fail for
the same reason

BUILD_TYPE="Formal"
BUILD_ID="20190613T013000Z"

test_cold_migrate_vm[remote-0-0-None-2-volume-confirm] 
test_cold_migrate_vm[remote-1-0-None-1-volume-confirm]
test_cold_migrate_vm[remote-1-512-None-1-image-confirm] 
test_cold_migrate_vm[remote-0-0-None-2-image_with_vol-confirm]

** Changed in: nova
       Status: Incomplete => Invalid

** Changed in: nova
       Status: Invalid => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1824858

Title:
  nova instance remnant left behind after cold migration completes

Status in OpenStack Compute (nova):
  Confirmed
Status in StarlingX:
  Incomplete

Bug description:
  Brief Description
  -----------------
  After cold migration to a new worker node, instances remnants are left behind

  
  Severity
  --------
  standard

  
  Steps to Reproduce
  ------------------
  worker nodes compute-1 and compute-2 have label	remote-storage enabled
  1. Launch instance on compute-1
  2. cold migrate to compute-2
  3. confirm cold migration to complete

  
  Expected Behavior
  ------------------
  Migration to compute-2 and cleanup on files on compute-1

  
  Actual Behavior
  ----------------
  At 16:35:24 cold migration for instance a416ead6-a17f-4bb9-9a96-3134b426b069  completed to compute-2 but the following path is left behind on compute-1
  compute-1:/var/lib/nova/instances/a416ead6-a17f-4bb9-9a96-3134b426b069

  compute-1:/var/lib/nova/instances$ ls
  a416ead6-a17f-4bb9-9a96-3134b426b069         _base          locks
  a416ead6-a17f-4bb9-9a96-3134b426b069_resize  compute_nodes  lost+found

  
  compute-1:/var/lib/nova/instances$ ls
  a416ead6-a17f-4bb9-9a96-3134b426b069  _base  compute_nodes  locks  lost+found

  compute-1:/var/lib/nova/instances$ ls
  a416ead6-a17f-4bb9-9a96-3134b426b069  _base  compute_nodes  locks  lost+found


  2019-04-15T16:35:24.646749 	clear 	700.010 	Instance tenant2-migration_test-1 owned by tenant2 has been cold-migrated to host compute-2 waiting for confirmation 	tenant=7f1d4223-3341-428a-9188-55614770e676.instance=a416ead6-a17f-4bb9-9a96-3134b426b069 	critical 	
  2019-04-15T16:35:24.482575 	log 	700.168 	Cold-Migrate-Confirm complete for instance tenant2-migration_test-1 enabled on host compute-2 	tenant=7f1d4223-3341-428a-9188-55614770e676.instance=a416ead6-a17f-4bb9-9a96-3134b426b069 	critical 	
  2019-04-15T16:35:16.815223 	log 	700.163 	Cold-Migrate-Confirm issued by tenant2 against instance tenant2-migration_test-1 owned by tenant2 on host compute-2 	tenant=7f1d4223-3341-428a-9188-55614770e676.instance=a416ead6-a17f-4bb9-9a96-3134b426b069 	critical 	
  2019-04-15T16:35:10.030068 	clear 	700.009 	Instance tenant2-migration_test-1 owned by tenant2 is cold migrating from host compute-1 	tenant=7f1d4223-3341-428a-9188-55614770e676.instance=a416ead6-a17f-4bb9-9a96-3134b426b069 	critical 	
  2019-04-15T16:35:09.971414 	set 	700.010 	Instance tenant2-migration_test-1 owned by tenant2 has been cold-migrated to host compute-2 waiting for confirmation 	tenant=7f1d4223-3341-428a-9188-55614770e676.instance=a416ead6-a17f-4bb9-9a96-3134b426b069 	critical 	
  2019-04-15T16:35:09.970212 	log 	700.162 	Cold-Migrate complete for instance tenant2-migration_test-1 now enabled on host compute-2 waiting for confirmation 	tenant=7f1d4223-3341-428a-9188-55614770e676.instance=a416ead6-a17f-4bb9-9a96-3134b426b069 	critical 	
  2019-04-15T16:34:51.637687 	set 	700.009 	Instance tenant2-migration_test-1 owned by tenant2 is cold migrating from host compute-1 	tenant=7f1d4223-3341-428a-9188-55614770e676.instance=a416ead6-a17f-4bb9-9a96-3134b426b069 	critical 	
  2019-04-15T16:34:51.637636 	log 	700.158 	Cold-Migrate inprogress for instance tenant2-migration_test-1 from host compute-1 	tenant=7f1d4223-3341-428a-9188-55614770e676.instance=a416ead6-a17f-4bb9-9a96-3134b426b069 	critical 	
  2019-04-15T16:34:51.478442 	log 	700.157 	Cold-Migrate issued by tenant2 against instance tenant2-migration_test-1 owned by tenant2 from host compute-1 	tenant=7f1d4223-3341-428a-9188-55614770e676.instance=a416ead6-a17f-4bb9-9a96-3134b426b069 	critical 	
  2019-04-15T16:34:20.181155 	log 	700.101 	Instance tenant2-migration_test-1 is enabled on host compute-1 	tenant=7f1d4223-3341-428a-9188-55614770e676.instance=a416ead6-a17f-4bb9-9a96-3134b426b069 	critical

  
  see nova-compute.log (compute-1)
  compute-1 nova-compute log

  [instance: a416ead6-a17f-4bb9-9a96-3134b426b069 claimed and spawned
  here on compute-1]

  {"log":"2019-04-15 16:34:04,617.617 60908 INFO nova.compute.claims [req-f1195bbb-d5b0-4a75-a598-ff287d247643 3fd3229d3e6248cf9b5411b2ecec86e9 7f1d42233341428a918855614770e676 - default default] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] Claim successful on node compute-1\n","stream":"stdout","time":"2019-04-15T16:34:04.617671485Z"}
  {"log":"2019-04-15 16:34:07,836.836 60908 INFO nova.virt.libvirt.driver [req-f1195bbb-d5b0-4a75-a598-ff287d247643 3fd3229d3e6248cf9b5411b2ecec86e9 7f1d42233341428a918855614770e676 - default default] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] Ignoring supplied device name: /dev/vda. Libvirt can't honour user-supplied dev names\n","stream":"stdout","time":"2019-04-15T16:34:07.836900621Z"}
  {"log":"2019-04-15 16:34:08,000.000 60908 INFO nova.virt.block_device [req-f1195bbb-d5b0-4a75-a598-ff287d247643 3fd3229d3e6248cf9b5411b2ecec86e9 7f1d42233341428a918855614770e676 - default default] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] Booting with volume 78db19a5-b699-407e-bbfb-7addeec8abdc at /dev/vda\n","stream":"stdout","time":"2019-04-15T16:34:08.00120626Z"}

  {"log":"2019-04-15 16:34:15,416.416 60908 INFO nova.virt.libvirt.driver [req-f1195bbb-d5b0-4a75-a598-ff287d247643 3fd3229d3e6248cf9b5411b2ecec86e9 7f1d42233341428a918855614770e676 - default default] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] Creating image\n","stream":"stdout","time":"2019-04-15T16:34:15.421820953Z"}
  {"log":"2019-04-15 16:34:18,225.225 60908 INFO nova.compute.manager [-] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] VM Started (Lifecycle Event)\n","stream":"stdout","time":"2019-04-15T16:34:18.226158194Z"}

  {"log":"2019-04-15 16:34:18,256.256 60908 INFO nova.compute.manager
  [req-a1def21d-e38b-4cbe-a527-2182e896aa3e - - - - -] [instance:
  a416ead6-a17f-4bb9-9a96-3134b426b069] VM Paused (Lifecycle
  Event)\n","stream":"stdout","time":"2019-04-15T16:34:18.256867098Z"}

  {"log":"2019-04-15 16:34:18,310.310 60908 INFO nova.compute.manager
  [req-a1def21d-e38b-4cbe-a527-2182e896aa3e - - - - -] [instance:
  a416ead6-a17f-4bb9-9a96-3134b426b069] During sync_power_state the
  instance has a pending task (spawning).
  Skip.\n","stream":"stdout","time":"2019-04-15T16:34:18.31100974Z"}

  {"log":"2019-04-15 16:34:20,092.092 60908 INFO nova.compute.manager
  [req-a1def21d-e38b-4cbe-a527-2182e896aa3e - - - - -] [instance:
  a416ead6-a17f-4bb9-9a96-3134b426b069] VM Resumed (Lifecycle
  Event)\n","stream":"stdout","time":"2019-04-15T16:34:20.093107332Z"}

  {"log":"2019-04-15 16:34:20,095.095 60908 INFO
  nova.virt.libvirt.driver [-] [instance: a416ead6-a17f-
  4bb9-9a96-3134b426b069] Instance spawned
  successfully.\n","stream":"stdout","time":"2019-04-15T16:34:20.095798258Z"}

  {"log":"2019-04-15 16:34:20,095.095 60908 INFO nova.compute.manager
  [req-f1195bbb-d5b0-4a75-a598-ff287d247643
  3fd3229d3e6248cf9b5411b2ecec86e9 7f1d42233341428a918855614770e676 -
  default default] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] Took
  4.68 seconds to spawn the instance on the
  hypervisor.\n","stream":"stdout","time":"2019-04-15T16:34:20.09609328Z"}

  {"log":"2019-04-15 16:34:20,149.149 60908 INFO nova.compute.manager [req-a1def21d-e38b-4cbe-a527-2182e896aa3e - - - - -] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] During sync_power_state the instance has a pending task (spawning). Skip.\n","stream":"stdout","time":"2019-04-15T16:34:20.149969266Z"}
  {"log":"2019-04-15 16:34:20,149.149 60908 INFO nova.compute.manager [req-a1def21d-e38b-4cbe-a527-2182e896aa3e - - - - -] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] VM Resumed (Lifecycle Event)\n","stream":"stdout","time":"2019-04-15T16:34:20.150156576Z"}
  {"log":"2019-04-15 16:34:20,207.207 60908 INFO nova.compute.manager [req-f1195bbb-d5b0-4a75-a598-ff287d247643 3fd3229d3e6248cf9b5411b2ecec86e9 7f1d42233341428a918855614770e676 - default default] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] Took 15.62 seconds to build instance.\n","stream":"stdout","time":"2019-04-15T16:34:20.207528448Z"}

  {"log":"2019-04-15 16:34:59,463.463 60908 INFO
  nova.virt.libvirt.driver [req-b1f1706c-82a8-4367-9459-d67a7eb32f34
  834c06b5424947ae8c6c0882b12909b2 0d8b1b1835d444fd81d66c013cdbef4c -
  default default] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069]
  Instance shutdown successfully after 2
  seconds.\n","stream":"stdout","time":"2019-04-15T16:34:59.463555102Z"}

  {"log":"2019-04-15 16:34:59,466.466 60908 INFO nova.virt.libvirt.driver [-] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] Instance destroyed successfully.\n","stream":"stdout","time":"2019-04-15T16:34:59.466515441Z"}
  {"log":"2019-04-15 16:35:00,660.660 60908 WARNING nova.compute.manager [req-36b50051-d7d9-4182-84ad-c0f42252b627 a3fa585069f54c3899d56994fe5bc701 90af2d0de6d74a59b484ab34698569e7 - 6b9be1b871584ab0b1f767f0dc33e402 6b9be1b871584ab0b1f767f0dc33e402] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] Received unexpected event network-vif-unplugged-740df7da-cbf8-4f04-9ab1-35b297e7fc95 for instance with vm_state active and task_state resize_migrated.\n","stream":"stdout","time":"2019-04-15T16:35:00.662300475Z"}

  {"log":"2019-04-15 16:35:05,938.938 60908 WARNING nova.compute.manager [req-4d3715fe-43b3-4e1e-b99f-3db3b5e60583 a3fa585069f54c3899d56994fe5bc701 90af2d0de6d74a59b484ab34698569e7 - 6b9be1b871584ab0b1f767f0dc33e402 6b9be1b871584ab0b1f767f0dc33e402] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] Received unexpected event network-vif-unplugged-99f054a9-7469-449a-9623-17c7870dfd00 for instance with vm_state active and task_state resize_finish.\n","stream":"stdout","time":"2019-04-15T16:35:05.938732459Z"}
  {"log":"2019-04-15 16:35:05,941.941 60908 INFO nova.compute.resource_tracker [req-bafc620b-ee9c-4f91-a544-39921c43e529 - - - - -] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] Updating resource usage from migration ccda6a14-82e3-4965-a61f-a04e2e6cebf5\n","stream":"stdout","time":"2019-04-15T16:35:05.941818948Z"}

  
  [cold migrated to compute-2 here]
  {"log":"2019-04-15 16:35:10,212.212 60908 WARNING nova.compute.manager [req-ef46a902-0961-4446-bde8-2f235f5da2c8 a3fa585069f54c3899d56994fe5bc701 90af2d0de6d74a59b484ab34698569e7 - 6b9be1b871584ab0b1f767f0dc33e402 6b9be1b871584ab0b1f767f0dc33e402] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] Received unexpected event network-vif-plugged-740df7da-cbf8-4f04-9ab1-35b297e7fc95 for instance with vm_state resized and task_state None.\n","stream":"stdout","time":"2019-04-15T16:35:10.212577086Z"}
  {"log":"2019-04-15 16:35:12,263.263 60908 WARNING nova.compute.manager [req-420edc8f-cb8b-48bb-b036-46fec11bbad2 a3fa585069f54c3899d56994fe5bc701 90af2d0de6d74a59b484ab34698569e7 - 6b9be1b871584ab0b1f767f0dc33e402 6b9be1b871584ab0b1f767f0dc33e402] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] Received unexpected event network-vif-plugged-99f054a9-7469-449a-9623-17c7870dfd00 for instance with vm_state resized and task_state None.\n","stream":"stdout","time":"2019-04-15T16:35:12.263926304Z"}
  {"log":"2019-04-15 16:35:14,228.228 60908 INFO nova.compute.manager [-] [instance: a416ead6-a17f-4bb9-9a96-3134b426b069] VM Stopped (Lifecycle Event)\n","stream":"stdout","time":"2019-04-15T16:35:14.228512171Z"}

  {"log":"2019-04-15 16:35:14,289.289 60908 INFO nova.compute.manager
  [req-84c2832a-2f74-41f0-be83-c54b0afcd1fb - - - - -] [instance:
  a416ead6-a17f-4bb9-9a96-3134b426b069] During the sync_power process
  the instance has moved from host compute-2 to host
  compute-1\n","stream":"stdout","time":"2019-04-15T16:35:14.289417626Z"}

  
  Reproducibility
  ---------------
  yes

  System Configuration
  --------------------
  Multi-node system (remote)

  Branch/Pull Time/Commit
  -----------------------
  BUILD_TYPE="Formal"
  BUILD_ID="20190410T013000Z"

  Timestamp/Logs
  --------------
  see inline logs above

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1824858/+subscriptions