maas-devel team mailing list archive

Re: Progress report: removing bootresources.yaml

To: Mark Shuttleworth <mark@xxxxxxxxxx>
From: Gavin Panella <gavin.panella@xxxxxxxxxxxxx>
Date: Fri, 23 May 2014 15:36:12 +0100
Cc: Maas Devel <maas-devel@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <537F4733.9020900@ubuntu.com>

On 23 May 2014 14:03, Mark Shuttleworth <mark@xxxxxxxxxx> wrote:
> On 23/05/14 11:59, Gavin Panella wrote:
>> On 23 May 2014 07:59, Mark Shuttleworth <mark@xxxxxxxxxx> wrote:
>>> On 22/05/14 16:16, Gavin Panella wrote:
>>>> On the maas-import-pxe-files side (i.e. the import script), it needs
>>>> that config to drive it. It doesn't know what to do otherwise. That
>>>> config has a nested structure which would be awkward at best to
>>>> translate to the command-line.
>>> It has to decide what to download. If it downloads "everything" then the
>>> "config" can be hardcoded - it's everything that we, Canonical, publish
>>> for this purpose. That way one MAAS region is the same as the next MAAS
>>> region. They have all the bits we publish.
>> Right, this is the outcome, but without an absolutely set-in-stone
>> hard-coded configuration. We generate the config on the region, with
>> everything in, and send it to clusters for processing.
>
> So you need:
>
>  * code to generate the config
>  * code to parse the config
>  * code to handle cases we don't want people to be able to configure
>
> Instead, why not just have this hardcoded in the import script, with
> progress updating? The script knows what images it intends to fetch and
> can update progress on them. This also allows cluster controllers to
> have special images that only they know about, if we ever need that.
>
> Also, why are the clusters doing the fetching? Why not fetch to the
> region controller then distribute to the clusters?

That is almost what happens. The region part is a caching proxy server,
and the cluster pulls through it.

>
> I still think your approach is overly complicated and liable to
> flakiness. Please keep iterating the design before you start cutting
> this code.

The majority of this code was cut well over a month ago.

>
>
>>>> On the region side, MAAS will generate that config with everything in
>>>> it by default, as agreed, and pass it over to the import script to run
>>>> on the cluster. However, we need a way to narrow that selection down
>>>> for development; it takes too long to download the ~400MB per
>>>> arch+subarch per release, especially when trying to iterate. For
>>>> amd64, i386, arm, and power, that's 1.6GB (7.2GB on disk once
>>>> uncompressed) before even considering different subarchitectures.
>>>> That's not cool for someone on DSL, even with caching. The
>>>> certification team also wants to be able to narrow down the selection
>>>> of boot resources.
>>> U... you are adding a lot of complexity - and configurability - for
>>> this? Seems like you could have a much simpler outcome that just focuses
>>> on a single version / architecture then lets the rest go on in the
>>> background.
>> We could change it to do that, but it already works as I've described,
>> and has done since Trusty was first released. It's extra work to do as
>> you describe.
>
> This is a cycle for clearing out past mistakes. Hanging on to code that
> exists, if it's not suitable, is a bad idea.
>
>> I think you're imagining that it's more complex than it is. The import
>> code that works from a provided config is already done and was
>> released with Trusty. I only said that the configuration was not
>> amenable to being expressed well with command-line arguments, not that
>> it was a baroque monstrosity. You have asked repeatedly for two
>> things: 1. Configuration and state in the region (and not as config
>> files). 2. Download all boot images. We are taking what was already
>> there - which was built to satisfy the use-cases as we understood
>> them at the time - and are reforming it to fulfill the points above.
>
> You are generating config to pass to the cluster. What if you get
> version skew between region and cluster?

Diogo is arranging for CI between versions of region and cluster, so we
will detect problems with this before they affect users. I don't foresee
any need to change the form of the config in the near future, though of
course I can't see the future either.

You also said in Austin - and before that - that you want configuration
and state in the region. Is this no longer the case?

>
>>>  * unnecessary code
>> We are not gratuitously adding code. With the code we'll be able to
>> excise, we'll probably end up with no net change to the size of the
>> codebase.
>
> No net win, then. Cut it if we don't need it.
>
>> Okay, perhaps the needs of the certification and hardware enablement
>> teams have been overstated. With the cessation of UI work on this we
>> are now into QA; no additional development is needed.
>
> Hold on. So you're saying we've done a bunch of work and are not going
> to do more? I thought I was being pretty clear that it doesn't sound
> well done to me.

If you want us to redo this work, we can do it, but something else will
need to make room for it.

>
>
>>>> - We already do not recommend that regular users run
>>>> maas-import-pxe-files directly; they should use the UI or the API.
>>>> That there is no good feedback on success/failure/progress for this is
>>>> a problem we're working on this cycle.
>>> Thank you, but hiding the complexity from users is not the same as
>>> eliminating complexity. Think more carefully about this please, you're
>>> just moving the complexity, not addressing it directly.
>> There is complexity in managing configuration, hard-coded or not. Users
>> and developers are going to change this configuration however hard we
>> try to prevent it. This already happened with power types.
>
> Power types are completely different. There is NO WAY to get the
> standard OS's on a machine unless you can talk to its controller.

If we're working with manufacturers to enable Ubuntu and MAAS with their
hardware then there may well be some back and forth as we sort out the
kernel and initrd, for example, so that the OS can get installed.

Preventing the HWE and Certification teams from cleanly using
alternative boot resources as they wish makes it likely that they'll
patch MAAS. Putting roadblocks in their way doesn't make sense.

>
> On the other hand, the config you're describing is to support hackery on
> the image you want. Instead, I'd suggest you:
>
>  * model exactly what we want to offer:
>    * releases and daily images
>    * hwe
>    * architectures
>
>  * also have a "custom image" type that can be uploaded to the region
> and distributed to clusters

This last bit is not part of this work, but we aim to do it this cycle.
The goal of this piece of work is to get bootresources.yaml off the
clusters.

>
> The only optionality we want to offer is:
>
>  * if you want a specific architecture
>  * if you only want a single release of Ubuntu
>  * whether or not you also want daily images for the releases you pick
>
> None of those justifies a configuration; the set of options can be
> perfectly well described on a command line:
>
>    maas import-images trusty --daily --amd64

The maas command talks to the region, via the API, so the region needs
to be able to pass whatever configuration is provided at the
command-line over to the cluster.
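
To make that concrete, here is a rough sketch of how a command line like
the one suggested above might be turned into something the region could
hand to a cluster. It is only an illustration: the "import-images"
command, its flags, and the shape of the resulting dictionary are all
assumptions for the sake of the example, not the actual MAAS CLI or API.

```python
import argparse
import json

# Hypothetical: only the architectures used as examples in this thread.
ARCHES = ("amd64", "i386")


def build_parser():
    """Parser for a hypothetical 'import-images' command."""
    parser = argparse.ArgumentParser(prog="import-images")
    # Optional release name; "*" stands for "all releases" (an assumption).
    parser.add_argument("release", nargs="?", default="*")
    parser.add_argument("--daily", action="store_true")
    for arch in ARCHES:
        # Each --<arch> flag appends its name to the shared "arches" list.
        parser.add_argument(
            "--" + arch, dest="arches", action="append_const", const=arch)
    return parser


def to_selection(argv):
    """Translate command-line arguments into a plain selection dict."""
    args = build_parser().parse_args(argv)
    labels = ["release", "daily"] if args.daily else ["release"]
    return {
        "release": args.release,
        "arches": args.arches or ["*"],
        "labels": labels,
    }


if __name__ == "__main__":
    selection = to_selection(["trusty", "--daily", "--amd64"])
    # Serialised form that a region could pass on to a cluster.
    print(json.dumps(selection, sort_keys=True))
```

Whatever the front end looks like, something of this kind still has to
travel from the region to the cluster.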

The implementation of the importer works from a configuration which
defines:

- A boot resource description stream to work from, as a URL.
- Selections from the stream by release, arch, subarch, and label
  (e.g. "release", "daily").
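
As a rough illustration only (this is not the importer's actual code),
a configuration of that shape could be modelled as below. The class and
field names, the "*" wildcard convention, and the example.com URL are all
made up for the example.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class Selection:
    """One selection from a stream; "*" matches anything (an assumption)."""
    release: str = "*"
    arches: tuple = ("*",)
    subarches: tuple = ("*",)
    labels: tuple = ("*",)

    def matches(self, release, arch, subarch, label):
        # A resource is selected only if every field matches.
        return (
            fnmatch(release, self.release)
            and any(fnmatch(arch, a) for a in self.arches)
            and any(fnmatch(subarch, s) for s in self.subarches)
            and any(fnmatch(label, lbl) for lbl in self.labels)
        )


@dataclass(frozen=True)
class Source:
    """A stream URL plus the selections to import from it."""
    url: str
    # Default: a single match-everything selection, i.e. "download it all".
    selections: tuple = (Selection(),)

    def wants(self, release, arch, subarch, label):
        return any(
            s.matches(release, arch, subarch, label)
            for s in self.selections)


# The default case: everything in the stream.
everything = Source(url="http://example.com/streams/")

# The narrowed case, e.g. for a developer on a slow link.
narrow = Source(
    url="http://example.com/streams/",
    selections=(
        Selection(release="trusty", arches=("amd64",), labels=("release",)),
    ),
)
```

With the default selection the importer would fetch everything; the
narrowed source would accept only trusty/amd64 release images.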

The code that does this was new for Trusty. With this we have a lot of
flexibility, but it's perfectly possible to limit the choices for end
users to those you've specified. This is a new requirement though.

We thought that you wanted everything to be downloaded. To that we added
an escape hatch to narrow the selection. This is for specific users -
HWE, Certification, developers - and not intended for production users.
We did very basic UI mockups for the latter case. In response to your
earlier email we'll leave that out, and have it headless-only. We knew
it didn't need to be all-singing because of the very limited audience.
We also chose to not limit the choices for that audience; they could
drive the importer to do what they needed.

>
>> By putting this configuration in the database we've made it easier for
>> us to manage - via database migrations - and avoid breaking users'
>> systems when upgrading.
>
> That is still code which can break and which I thought we agreed to remove.

I'm not sure how we offer the options you talk about above without some
way to persist them. Importing boot resources is not a one-off job.

References

Progress report: removing bootresources.yaml
From: Graham Binns, 2014-05-21
Re: Progress report: removing bootresources.yaml
From: Mark Shuttleworth, 2014-05-22
Re: Progress report: removing bootresources.yaml
From: Gavin Panella, 2014-05-22
Re: Progress report: removing bootresources.yaml
From: Mark Shuttleworth, 2014-05-22
Re: Progress report: removing bootresources.yaml
From: Gavin Panella, 2014-05-22
Re: Progress report: removing bootresources.yaml
From: Mark Shuttleworth, 2014-05-23
Re: Progress report: removing bootresources.yaml
From: Gavin Panella, 2014-05-23
Re: Progress report: removing bootresources.yaml
From: Mark Shuttleworth, 2014-05-23