canonical-ubuntu-qa team mailing list archive

[Merge] ~andersson123/autopkgtest-cloud:amend_testbed_failure_handling into autopkgtest-cloud:master

Tim Andersson has proposed merging ~andersson123/autopkgtest-cloud:amend_testbed_failure_handling into autopkgtest-cloud:master.

Requested reviews:
  Canonical's Ubuntu QA (canonical-ubuntu-qa)

For more details, see:
https://code.launchpad.net/~andersson123/autopkgtest-cloud/+git/autopkgtest-cloud/+merge/446642
-- 
Your team Canonical's Ubuntu QA is requested to review the proposed merge of ~andersson123/autopkgtest-cloud:amend_testbed_failure_handling into autopkgtest-cloud:master.
diff --git a/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/worker/worker b/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/worker/worker
index 2af9dc1..997ceaa 100755
--- a/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/worker/worker
+++ b/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/worker/worker
@@ -1032,6 +1032,7 @@ def request(msg):
             running_test = True
             start_time = time.time()
             num_failures = 0
+            num_testbed_failures = 0
             for retry in range(3):
                 retry_start_time = time.time()
                 logging.info("Running %s", " ".join(argv))
@@ -1149,6 +1150,8 @@ def request(msg):
                             " and ".join(fails),
                             num_failures,
                         )
+                    else:
+                        num_testbed_failures += 1
                     logging.warning(
                         "Testbed failure. %sLog follows:", retrying
                     )
@@ -1166,12 +1169,21 @@ def request(msg):
                 else:  # code == 0, no retry needed
                     break
             else:
-                if num_failures >= 3:
-                    logging.warning(
-                        "Three fails in a row - considering this a failure rather than tmpfail"
-                    )
-                    code = 4
+                if (num_failures + num_testbed_failures) >= 3:
+                    if num_failures > num_testbed_failures:
+                        logging.warning(
+                            "Three fails in a row - considering this a failure rather than tmpfail"
+                        )
+                        code = 4
+                    else:
+                        logging.error(
+                            "%i testbed failures out of %i total failures detected, test will be considered as testbed failure",
+                            num_testbed_failures,
+                            (num_failures + num_testbed_failures),
+                        )
+                        code = 16
                 else:
+                    # Should never reach this I guess. Or at least not with testbed failures.
                     # 2022-07-05 what code is passed to submit_metric in this code path?
                     submit_metric(
                         architecture,
@@ -1182,7 +1194,7 @@ def request(msg):
                         release,
                     )
                     logging.error(
-                        "Three tmpfails in a row, aborting worker. Log follows:"
+                        "Unexpected: three tmpfails in a row, aborting worker. Log follows:"
                     )
                     logging.error(log_contents(out_dir))
                     sys.exit(99)
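
The decision made in the for/else branch above, once all three retries are used up, can be read in isolation as the sketch below. This is only an illustration of the proposed logic, not code from the worker: the function name classify_exhausted_retries is hypothetical, and returning None for the "abort worker" path is an assumption made so that the sketch has a testable result (the real code calls submit_metric and then sys.exit(99) there). Exit codes 4 and 16 are taken from the diff, where 4 is treated as a real test failure and 16 as a testbed failure.

```python
from typing import Optional


def classify_exhausted_retries(
    num_failures: int, num_testbed_failures: int
) -> Optional[int]:
    """Pick the exit code to report after every retry has failed.

    Hypothetical helper mirroring the if/else in the diff:
    - 4:    real failures outnumber testbed failures
    - 16:   testbed failures are at least as frequent as real failures
    - None: fewer than three counted failures (assumption: stands in for
            the worker's "unexpected tmpfails, abort" path)
    """
    total = num_failures + num_testbed_failures
    if total >= 3:
        if num_failures > num_testbed_failures:
            return 4
        return 16
    return None


if __name__ == "__main__":
    # Walk through every split of three counted failures.
    for real in range(4):
        testbed = 3 - real
        print(real, testbed, classify_exhausted_retries(real, testbed))
```

With three retries the two counters can never tie at a total of three, so the comparison always has a clear winner: two or more real failures give 4, two or more testbed failures give 16.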