canonical-ubuntu-qa team mailing list archive

[Bug 2058465] [NEW] Consider making the systemd restrictions of the autopkgtest@.service files more liberal

 

Public bug reported:

Currently, our restrictions are as follows:

RestartSec=5min
Restart=on-failure
StartLimitInterval=10m
StartLimitBurst=3

Before this commit in November:

commit 60233d12e61085637a51e48b2fe5bf45f0d21711 (origin/autopkgtest-worker-restart)
Author: Tim Andersson <tim.andersson@xxxxxxxxxxxxx>
Date:   Tue Nov 28 14:13:43 2023 +0000

    fix: cloud-worker: Amend restart limitations for autopkgtest@*.service
    
    We've been having issues with having to manually restart workers when
    they fail. This happens because they previously would hit stringent Restart
    restrictions (StartLimitBurst, StartLimitInterval). It's best for us if
    those restrictions aren't as stringent and allow the autopkgtest@
    services to restart themselves more frequently.

Before that commit, they were:

RestartSec=5min
Restart=on-failure
StartLimitInterval=1h
StartLimitBurst=3

I think we can be a bit more liberal with these restrictions, perhaps
by lowering RestartSec and raising StartLimitBurst. When the infra is
flaky, with some tests passing and others failing, that should help
our throughput. And since there's no documented reason why the worker
services ever had these restrictions, I think we can safely go ahead
and do this.
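
As a rough sketch of what a more liberal setup could look like, here is
a drop-in for the template unit. The file path and the specific values
(RestartSec=1min, StartLimitBurst=10) are assumptions picked only for
illustration; they are not values agreed on in this bug. The sketch uses
StartLimitIntervalSec= in the [Unit] section, which is the current
spelling of the older StartLimitInterval= setting shown above.

```ini
# Hypothetical drop-in, e.g.:
#   /etc/systemd/system/autopkgtest@.service.d/restart.conf
# A drop-in directory for the template unit applies to every
# autopkgtest@<instance>.service.

[Unit]
# Window over which start attempts are counted (unchanged from today).
StartLimitIntervalSec=10min
# Illustrative value only: allow more start attempts per window.
StartLimitBurst=10

[Service]
Restart=on-failure
# Illustrative value only: wait less before each automatic restart.
RestartSec=1min
```

After adding or changing a drop-in like this, `systemctl daemon-reload`
is needed for systemd to pick it up. The actual numbers would need to be
chosen based on how the workers behave when the infra is flaky.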

** Affects: auto-package-testing
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of
Canonical's Ubuntu QA, which is subscribed to Auto Package Testing.
https://bugs.launchpad.net/bugs/2058465

Title:
  Consider making the systemd restrictions of the autopkgtest@.service
  files more liberal

Status in Auto Package Testing:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/auto-package-testing/+bug/2058465/+subscriptions