canonical-ubuntu-qa team mailing list archive

[Merge] autopkgtest-cloud:retryResourceInErrorState into autopkgtest-cloud:master

 

Brian Murray has proposed merging autopkgtest-cloud:retryResourceInErrorState into autopkgtest-cloud:master.

Commit message:
retry failures where the test instance is in an error state
    
There was a traceback in novaclient when the instance had a "status" of
"ERROR" due to building of the instance being aborted because there were
"No fixed IP addresses available".


Requested reviews:
  Canonical's Ubuntu QA (canonical-ubuntu-qa)

For more details, see:
https://code.launchpad.net/~ubuntu-release/autopkgtest-cloud/+git/autopkgtest-cloud/+merge/466335

I saw the following traceback in https://objectstorage.prodstack5.canonical.com/swift/v1/AUTH_0f9aae918d5b4744bf7b827671c86842/autopkgtest-noble/noble/amd64/c/csync2/20240522_133502_4cefa@/log.gz

755s DEBUG (shell:822) <Server: adt-noble-amd64-csync2-20240522-122145-juju-7f2275-prod-proposed-migration-environment-3-4eff9f5f-0b21-4e75-b37c-2be19568a929>
755s Traceback (most recent call last):
755s   File "/usr/lib/python3/dist-packages/novaclient/shell.py", line 820, in main
755s     OpenStackComputeShell().main(argv)
755s   File "/usr/lib/python3/dist-packages/novaclient/shell.py", line 742, in main
755s     args.func(self.cs, args)
755s   File "/usr/lib/python3/dist-packages/novaclient/v2/shell.py", line 980, in do_boot
755s     _poll_for_status(cs.servers.get, server.id, 'building', ['active'])
755s   File "/usr/lib/python3/dist-packages/novaclient/v2/shell.py", line 1019, in _poll_for_status
755s     raise exceptions.ResourceInErrorState(obj)
755s novaclient.exceptions.ResourceInErrorState: <Server: adt-noble-amd64-csync2-20240522-122145-juju-7f2275-prod-proposed-migration-environment-3-4eff9f5f-0b21-4e75-b37c-2be19568a929>
755s ERROR (ResourceInErrorState): <Server: adt-noble-amd64-csync2-20240522-122145-juju-7f2275-prod-proposed-migration-environment-3-4eff9f5f-0b21-4e75-b37c-2be19568a929>
755s 
755s Error building server

Here's some of the debug output:

"status": "ERROR"...
"message": "Build of instance 568e898e-d91f-4266-bcae-0a805fd1ff67 aborted: Failed to allocate the network(s) with error No fixed IP addresses available for network: 0d0997a0-13bd-484c-b3b7-ebb3288d6a74, not rescheduling."

We should discuss this failure scenario ("No fixed IP addresses available") with the team managing the cloud, but we should also retry these failures automatically.
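
For reviewers, here is a rough sketch of how the new entry is expected to take effect. It assumes the worker treats each entry of TEMPORARY_TEST_FAIL_STRINGS as a plain substring to look for in the test log; the matching code is not part of this diff, so is_temporary_failure and sample_log below are hypothetical names used only for illustration.

```python
# Illustrative sketch only. TEMPORARY_TEST_FAIL_STRINGS is the list touched by
# the diff (abbreviated here); everything else is a hypothetical stand-in for
# the worker's real matching logic, which is not shown in this proposal.
TEMPORARY_TEST_FAIL_STRINGS = [
    "OSError: [Errno 28] No space left on device",
    "novaclient.exceptions.ResourceInErrorState",  # failure with the VM
]


def is_temporary_failure(log_text: str) -> bool:
    """Return True if the log contains any known transient-failure marker."""
    return any(marker in log_text for marker in TEMPORARY_TEST_FAIL_STRINGS)


# Hypothetical log excerpt shaped like the traceback above.
sample_log = (
    "755s     raise exceptions.ResourceInErrorState(obj)\n"
    "755s novaclient.exceptions.ResourceInErrorState: <Server: example>\n"
)

if is_temporary_failure(sample_log):
    print("temporary failure, test should be retried")
```

Matching on the fully qualified exception name rather than on "Error building server" keeps the check narrow, so unrelated build errors are not retried by accident.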
-- 
Your team Canonical's Ubuntu QA is requested to review the proposed merge of autopkgtest-cloud:retryResourceInErrorState into autopkgtest-cloud:master.
diff --git a/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/worker/worker b/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/worker/worker
index 0173fe5..1efe292 100755
--- a/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/worker/worker
+++ b/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/worker/worker
@@ -119,6 +119,7 @@ TEMPORARY_TEST_FAIL_STRINGS = [
     ": error cleaning up:",
     " has modification time ",  # clock skew, LP: #1880839
     "OSError: [Errno 28] No space left on device",
+    "novaclient.exceptions.ResourceInErrorState",  # failure with the VM
 ]
 
 # If we repeatedly time out when installing, there's probably a problem with