
ubuntu-docker-images team mailing list archive

Re: Specs for image-builder.py

 

On Wed, Apr 14, 2021 at 06:27:31PM -0700, Bryce Harrington wrote:
On Wed, Apr 14, 2021 at 08:22:37PM -0400, Sergio Durigan Junior wrote:
Hi,

Today on MM Bryce and I discussed the possibility of having a new script
to help us build/rebuild/mass-rebuild OCI images.  The need for this is
becoming more and more apparent with every mass-rebuild we have to do,
and with the prospect of having more images and series supported.  The
Launchpad "OCI recipe" page, while informative, requires a lot of clicks
in order to get the job done.

So anyway, Bryce asked me to come up with a list of requirements and
things that I want this script to do.  Here's what comes to mind.

First, a fictitious usage for the script:


  image-builder.py -- Nice text here

  Usage:

    image-builder.py [-h] [--series MM.YY] [--arch ARCH] [--wait] [--auto-retry-uploads] [-- IMAGE_1 IMAGE_2 ... IMAGE_N ]

  Where:

    --series MM.YY        Build only images from the MM.YY series.
                          Optional, can be passed multiple times.
                          If not provided, all series will be built.
                          Supported series: 20.04, 21.04

    --arch ARCH           Build images only for architecture ARCH.
                          Optional, can be passed multiple times.
                          If not provided, all architectures will be built.
                          Supported architectures: amd64, arm64, ppc64el,
                          s390x

    --wait                Wait until all builds have finished, and print
                          their statuses.  This can take a long time.

    --auto-retry-uploads  Auto-retry any failed uploads to the
                          registries.
                          Optional.  Implies "--wait".

    -- IMAGE_N...         Image(s) to build.
                          Optional.
                          If not provided, all images will be built.
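
For illustration only, the usage text above could map onto Python's
argparse roughly as follows.  This is a sketch, not an agreed design:
the function names build_parser and parse_args are made up here, the
description string is a placeholder, and an empty image list is assumed
to mean "all images", since the full list of images is not given in
this thread.

```python
import argparse

# Values copied from the usage text above.
SUPPORTED_SERIES = ["20.04", "21.04"]
SUPPORTED_ARCHES = ["amd64", "arm64", "ppc64el", "s390x"]


def build_parser():
    """Build a parser mirroring the proposed command line (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="image-builder.py",
        description="Request builds of OCI images.",  # placeholder text
    )
    # action="append" lets the option be passed multiple times.
    parser.add_argument(
        "--series", action="append", choices=SUPPORTED_SERIES, metavar="MM.YY",
        help="Build only images from this series; may be repeated.",
    )
    parser.add_argument(
        "--arch", action="append", choices=SUPPORTED_ARCHES, metavar="ARCH",
        help="Build images only for this architecture; may be repeated.",
    )
    parser.add_argument(
        "--wait", action="store_true",
        help="Wait until all builds have finished and print their statuses.",
    )
    parser.add_argument(
        "--auto-retry-uploads", action="store_true",
        help="Auto-retry failed registry uploads; implies --wait.",
    )
    # Positional image names, normally given after "--".
    parser.add_argument(
        "images", nargs="*", metavar="IMAGE",
        help="Image(s) to build; if omitted, all images are built.",
    )
    return parser


def parse_args(argv=None):
    """Parse argv and apply the defaults described in the usage text."""
    args = build_parser().parse_args(argv)
    # An omitted filter means "everything".
    args.series = args.series or list(SUPPORTED_SERIES)
    args.arch = args.arch or list(SUPPORTED_ARCHES)
    # --auto-retry-uploads implies --wait.
    if args.auto_retry_uploads:
        args.wait = True
    return args
```

With this sketch, something like
"image-builder.py --series 20.04 --arch amd64 -- apache2 mysql" (the
image names are only examples) would restrict the build to one series,
one architecture and two images.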


A few comments:

- I don't know how feasible it is to implement the auto-retry-uploads
  option; not sure whether Launchpad offers such granularity in their
  API.  I also don't know if it makes sense to embed it into this
  script, or create a separate script just for that (which may make more
  sense).  However, given the number and frequency of failed uploads we
  are having, and especially considering the fact that they are
  currently not being reported anywhere in the LP recipe page (one has
  to go to the specific build page in order to check the upload status;
  check LP#1918908), this IMO is a must-have.

Yes, lp oopses and failures are an unfortunately common occurrence, and
so a lot of lp scripts include some sort of retry functionality.  A
common pattern is a 3x retry with progressive delay standoff
(i.e. immediate -> 1 min -> 5 min).  The reason for the immediate retry
is that lp can fail if a request is pulling uncached information and
times out; the 2nd call will benefit from the pre-filled cache and
succeed.  The other wait periods are useful if the problem is just
network glitches or service loads.  If it still fails after 3 tries over
5 minutes, then something bigger may be at issue such as a legit bug or
a service outage.
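
As a rough sketch of that retry pattern in Python: the helper below is
hypothetical (call_with_retries and its parameters are not taken from
any existing lp script), and it reads "3x retry" as one initial call
followed by up to three retries, waiting 0 s, 60 s and 300 s before
each retry.  The operation to retry is passed in as a callable, so
nothing here assumes a particular Launchpad API call.

```python
import time

# Delay before each retry, in seconds: immediate, 1 min, 5 min.
RETRY_DELAYS = (0, 60, 300)


def call_with_retries(operation, delays=RETRY_DELAYS, sleep=time.sleep,
                      retry_on=(Exception,)):
    """Call operation(); on failure, retry once per entry in delays.

    Returns the first successful result.  If the initial call and every
    retry fail, the last exception is re-raised so the caller can treat
    it as a real bug or an outage.
    """
    try:
        return operation()
    except retry_on as exc:
        last_error = exc
    for delay in delays:
        # The first delay is 0: an immediate retry can succeed because
        # the failed request may already have warmed the cache.
        sleep(delay)
        try:
            return operation()
        except retry_on as exc:
            last_error = exc
    raise last_error
```

Passing "sleep" as a parameter is just a convenience so the delays can
be skipped or recorded when trying the helper out.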

- I considered whether to add the "--wait" option or not.  I decided to
  do it, but I understand that it might be a bit tricky to implement.

It might be analogous to (and hopefully simpler than) the ppa wait case
we did before.  In any case, sounds useful.
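
One possible shape for "--wait" is a plain polling loop, sketched below.
Everything specific in it is an assumption: the state names "built" and
"failed" are placeholders for whatever Launchpad actually reports, and
get_states is a caller-supplied function returning a dict of build name
to state, so the sketch does not depend on any real API.

```python
import time

# Hypothetical state names; the real ones depend on what Launchpad reports.
FINAL_STATES = {"built", "failed"}


def wait_for_builds(get_states, poll_interval=60, timeout=3600,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_states() until every build is in a final state.

    get_states is a callable returning {build_name: state}.  Returns the
    last mapping seen, or raises TimeoutError once the deadline passes
    with builds still pending.
    """
    deadline = clock() + timeout
    while True:
        states = get_states()
        if all(state in FINAL_STATES for state in states.values()):
            return states
        if clock() >= deadline:
            pending = sorted(
                name for name, state in states.items()
                if state not in FINAL_STATES
            )
            raise TimeoutError("builds still pending: " + ", ".join(pending))
        sleep(poll_interval)
```

The returned mapping is what the script would print as the final
statuses; the timeout keeps a stuck build from blocking the script
forever.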

- I also thought about having a "--retag" option that would invoke the
  tag-images.sh script automatically after everything is done, but I'm
  not entirely sure this is something fit for this script.  I like
  separating things into logical blocks, so perhaps after this script is
  done we can have a "build-and-retag.{sh,py}" script.

With ppa-dev-tools and other launchpad object scripts, the tool
interfaces generally group into four parts:

a) creation/initialization
b) write operations - requesting builds, modifying params, etc.
c) read-only operations - checking status, parsing build logs, etc.
d) destruction

The operations you've spec'd so far fit into part (b).  I'm guessing (a)
and (d) are going to be out of scope or at least low priority in our
case but perhaps we may eventually want this tool to help with creating
recipes.  You might think about (c) though - when we run into build
issues, are there scriptable things we could do to help make debugging
easier?

Bryce, if you want to talk more about this tomorrow (possibly involving
Athos, since he will be part of the OCI effort very soon), we can then
come up with a nice way to split the work.

Why don't we plan on meeting up on Friday?  That'll give me time on
Thursday to pull up some of the code I was showing you, so we're not
starting entirely from scratch.

I'd love to be part of the discussion!

Also, I'm thinking it may be wise for us to set up a separate git repo
for this python scripting, and let the existing 'util' repo stay more
bash-focused.

It may be too soon for me to go around +1'ing things, but here we go: +1
:)

Bryce

--
Athos Ribeiro

