
canonical-ubuntu-qa team mailing list archive

[Merge] ~andersson123/autopkgtest-cloud:resize-docs into autopkgtest-cloud:master

 

Tim Andersson has proposed merging ~andersson123/autopkgtest-cloud:resize-docs into autopkgtest-cloud:master.

Requested reviews:
  Canonical's Ubuntu QA (canonical-ubuntu-qa)
Related bugs:
  Bug #2061141 in Auto Package Testing: "Running out of `/tmp` space on cloud workers - post-mortem"
  https://bugs.launchpad.net/auto-package-testing/+bug/2061141

For more details, see:
https://code.launchpad.net/~andersson123/autopkgtest-cloud/+git/autopkgtest-cloud/+merge/464343
-- 
Your team Canonical's Ubuntu QA is requested to review the proposed merge of ~andersson123/autopkgtest-cloud:resize-docs into autopkgtest-cloud:master.
diff --git a/docs/administration.rst b/docs/administration.rst
index 3bc80b9..5fbe252 100644
--- a/docs/administration.rst
+++ b/docs/administration.rst
@@ -371,3 +371,66 @@ when you have lots of obsoleted packages, would be like so:
   for pkg in $packages; do for arch in amd64 arm64 s390x ppc64el armhf i386; do ./filter-amqp -v debci-huge-noble-$arch "$pkg\b"; done; done
 
 This way you can remove all the packages in one command on every architecture.
+
+Resizing /tmp partitions
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+When running an instance of autopkgtest-cloud, you may find that the ``/tmp`` partitions of the
+autopkgtest-cloud-worker units can get quite full.
+
+This can happen when you have very long-running tests that produce large log files. Such tests
+can use a disproportionate amount of the disk space on ``/tmp``, which can end up
+introducing a "meta-quota": your cloud resources aren't restricted, but you hit bottlenecks
+because the ``/tmp`` partition runs out of space.
+
+In a situation like this, consider increasing the size of the ``/tmp`` partitions. The steps are
+detailed below.
+
+Before doing any of the steps detailed in this section, make sure no tests are currently running
+on the cloud worker with the partition you want to resize. Run the following on that worker:
+
+.. code-block::
+
+  chmod -x autopkgtest-cloud/worker/worker
+  sudo systemctl stop autopkgtest.target # ensure that you WAIT for all running jobs to finish, i.e. for the stop command to exit
+  while true; do ps aux | grep '[r]unner'; sleep 3; clear; done # stop with Ctrl-C once no runner processes are listed
+
+First check that your OpenStack client and cloud support volume API microversion 3.42, which is
+needed to extend a volume that is in use. The following command should not fail:
+
+.. code-block::
+
+  openstack --os-volume-api-version 3.42 volume list
+
+To resize a volume:
+
+.. code-block::
+
+  # check the storage name
+  juju storage # shows all existing storage volumes (aside from root partitions)
+  # get the 'openstack' name for the volume
+  openstack volume list # the volume name ends with $num from tmp/$num
+  # from the above command, get the id, and set it to a variable: VOLUME_ID
+  openstack --os-volume-api-version 3.42 volume set "${VOLUME_ID?}" --size "${NEW_SIZE?}" # NEW_SIZE is in GB
+  # this begins resizing the volume; while that is happening, consider running this:
+  while true; do openstack volume show "$VOLUME_ID"; sleep 5; clear; done
+  # If the volume in question has been retyped (__DEFAULT__ <-> ceph nvme), reboot its server (not necessary for volumes that haven't been retyped).
+  # To find the server name:
+  juju storage # make note of the juju unit name associated with the storage you've resized
+  # then
+  openstack server list
+  # and get the server name of the server running the unit mentioned in juju storage
+  nova reboot $server_name
+  # where $server_name is the name of the server associated with the volume
+  # then run the following ON THE SERVER the volume is attached to (after the reboot, if one was needed)
+  lsblk # check that the disk size has increased
+  sudo growpart /dev/vdb 1
+  sudo resize2fs /dev/vdb1
+  lsblk # check that the disk size and partition sizes match
+
+It is not yet understood why the reboot is required if the volume has previously
+been retyped. In that case, none of the typical methods for rescanning disks work.
+
+When the volume hasn't been retyped before, the new size is picked up immediately by the
+openstack server. Keep this in mind if you're using the ``__DEFAULT__`` volume type
+(see ``openstack volume show $VOLUME_ID`` to check).
