ubuntu-docker-images team mailing list archive

Re: Specs for image-builder.py

To: Bryce Harrington <bryce.harrington@xxxxxxxxxxxxx>
From: Sergio Durigan Junior <sergio.durigan@xxxxxxxxxxxxx>
Date: Thu, 15 Apr 2021 14:20:21 -0400
Cc: ubuntu-docker-images@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20210415012731.GA3559177@bryceharrington.org> (Bryce Harrington's message of "Wed, 14 Apr 2021 18:27:31 -0700")
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

On Wednesday, April 14 2021, Bryce Harrington wrote:

> On Wed, Apr 14, 2021 at 08:22:37PM -0400, Sergio Durigan Junior wrote:
>> - I don't know how feasible it is to implement the auto-retry-uploads
>>   option; not sure whether Launchpad offers such granularity in their
>>   API.  I also don't know if it makes sense to embed it into this
>>   script, or create a separate script just for that (which may make more
>>   sense).  However, given the number and frequency of failed uploads we
>>   are having, and especially considering the fact that they are
>>   currently not being reported anywhere in the LP recipe page (one has
>>   to go to the specific build page in order to check the upload status;
>>   check LP#1918908), this IMO is a must-have.
>
> Yes, lp oopses and failures are an unfortunately common occurrence, and
> so a lot of lp scripts include some sort of retry functionality.  A
> common pattern is a 3x retry with progressive delay standoff
> (i.e. immediate -> 1 min -> 5 min).  The reason for the immediate retry
> is that lp can fail if a request is pulling uncached information and
> times out; the 2nd call will benefit from the pre-filled cache and
> succeed.  The other wait periods are useful if the problem is just
> network glitches or service loads.  If it still fails after 3 tries over
> 5 minutes, then something bigger may be at issue such as a legit bug or
> a service outage.
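
A minimal sketch of the retry pattern described above, assuming it means one initial call followed by three retries (immediate, then 1 minute, then 5 minutes). The names `call_with_retries` and `operation` are hypothetical; `operation` stands for any zero-argument callable wrapping a single Launchpad request, and nothing here is taken from an existing lp script.

```python
import time

# Waits before each retry, in seconds: immediate, 1 minute, 5 minutes.
RETRY_DELAYS = (0, 60, 300)


def call_with_retries(operation, retry_delays=RETRY_DELAYS, sleep=time.sleep):
    """Call operation() once, then retry after each delay until it succeeds.

    `operation` is a hypothetical zero-argument callable. If the initial
    call and every retry fail, the last exception is re-raised.
    """
    try:
        return operation()
    except Exception as error:  # narrow this to the errors lp really raises
        last_error = error
    for delay in retry_delays:
        sleep(delay)  # a delay of 0 gives the immediate retry
        try:
            return operation()
        except Exception as error:
            last_error = error
    raise last_error
```

The `sleep` parameter exists only so the waits can be replaced in tests; in normal use it can be left at its default.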

OK, good.  I don't know if the API exposes the "estimated time to build
this image" property; if it does, then it should be easy to sleep until
everything's been built.

>> - I considered whether to add the "--wait" option or not.  I decided to
>>   do it, but I understand that it might be a bit tricky to implement.
>
> It might be analogous to (and hopefully simpler than) the ppa wait case
> we did before.  In any case, sounds useful.

Yeah.  Same comment as above, btw.
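
If no build-time estimate turns out to be available, "--wait" could fall back to plain polling. The sketch below assumes a hypothetical `get_states` callable that returns a `{build_name: state}` mapping; the state strings in `PENDING_STATES` are placeholders, not Launchpad's actual values.

```python
import time

# Placeholder state labels; the real Launchpad strings may differ.
PENDING_STATES = {"pending", "building", "uploading"}


def wait_for_builds(get_states, poll_interval=60, timeout=3600,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_states() until no build is in a pending state.

    Returns the final {build_name: state} mapping, or raises TimeoutError
    if builds are still pending once `timeout` seconds have elapsed.
    """
    deadline = clock() + timeout
    while True:
        states = get_states()
        if not any(state in PENDING_STATES for state in states.values()):
            return states
        if clock() >= deadline:
            raise TimeoutError("builds still pending: %r" % (states,))
        sleep(poll_interval)
```

The caller still has to inspect the returned states, since "not pending" covers failed builds as well as successful ones.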

>> - I also thought about having a "--retag" option that would invoke the
>>   tag-images.sh script automatically after everything is done, but I'm
>>   not entirely sure this is something fit for this script.  I like
>>   separating things into logical blocks, so perhaps after this script is
>>   done we can have a "build-and-retag.{sh,py}" script.
>
> With ppa-dev-tools and other launchpad object scripts, the tool
> interfaces generally group into four parts:
>
>  a) creation/initialization
>  b) write operations - requesting builds, modifying params, etc.
>  c) read-only operations - checking status, parsing build logs, etc.
>  d) destruction
>
> The operations you've spec'd so far fit into part (b).  I'm guessing (a)
> and (d) are going to be out of scope or at least low priority in our
> case but perhaps we may eventually want this tool to help with creating
> recipes.  You might think about (c) though - when we run into build
> issues, are there scriptable things we could do to help make debugging
> easier?
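
One way to picture the (b)/(c) split is as subcommands of a single command-line tool. This is only a sketch: the subcommand names `build`, `status` and `log` and the `recipe` argument are hypothetical, and only "--wait" comes from the spec discussed in this thread.

```python
import argparse


def build_parser():
    """Return a parser with one write command and two read-only commands."""
    parser = argparse.ArgumentParser(prog="image-builder.py")
    commands = parser.add_subparsers(dest="command", required=True)

    # (b) write operations
    build = commands.add_parser("build", help="request builds for a recipe")
    build.add_argument("recipe")
    build.add_argument("--wait", action="store_true",
                       help="block until the requested builds have finished")

    # (c) read-only operations
    status = commands.add_parser("status", help="show the state of each build")
    status.add_argument("recipe")
    log = commands.add_parser("log", help="print the build log for a recipe")
    log.add_argument("recipe")

    return parser
```

Creation (a) and destruction (d) would slot in as further subcommands if they ever come into scope.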

Hm, not sure.  I mean, so far these are the build issues we faced:

- The actual image build has failed.  For example, when gosu wasn't
  usable on ppc64el/Focal.  Or if/when the security manifest builder
  script fails to run.  In this case, we can only know what happened by
  looking at the build logs.

- The registry upload has failed.

For the former, I don't know if there's any scripting we can do other
than grab the build log from LP and display it locally.  For the latter,
we're already going to cover this scenario with the auto-retry feature.
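
For the "display it locally" part, a small helper could show just the end of a log, which is usually where the failure is. The sketch below takes the log as raw bytes and assumes it may be gzip-compressed; `tail_build_log` is a hypothetical name, and fetching the bytes from LP is deliberately left out.

```python
import gzip


def tail_build_log(raw, lines=40):
    """Return the last `lines` lines of a build log given as bytes.

    The data is decompressed first if it starts with the gzip magic number.
    """
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    text = raw.decode("utf-8", errors="replace")
    return "\n".join(text.splitlines()[-lines:])
```

The bytes could come from something like `urllib.request.urlopen(url).read()`, with `url` being whatever log location the build object exposes.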

I'll see if I can think more about this later.

>> Bryce, if you want to talk more about this tomorrow (possibly involving
>> Athos, since he will be part of the OCI effort very soon), we can then
>> come up with a nice way to split the work.
>
> Why don't we plan on meeting up on Friday, that'll give me time on
> Thursday to pull up some of the code I was showing you, so we're not
> starting entirely from scratch.

Sure.

> Also, I'm thinking it may be wise for us to set up a separate git repo
> for this python scripting, and let the existing 'util' repo stay more
> bash-focused.

That may be a good idea, but I'm also concerned that we're heading
towards having two different codebases (one in shell and another in
Python) which have some overlapping things (for example, listing images,
authenticating to registries, etc.).  The benefit of putting everything
into one repository is that it's easier to reuse code this way.

-- 
Sergio
GPG key ID: E92F D0B3 6B14 F1F4 D8E0  EB2F 106D A1C8 C3CB BF14
