ubuntu-phone team mailing list archive

Re: A new Image release Proposal

 

On Fri, Nov 29, 2013 at 2:34 PM, Ursula Junque <ursinha@xxxxxxxxxxx> wrote:
>
>
>
> On Fri, Nov 29, 2013 at 10:01 AM, Alexander Sack <asac@xxxxxxxxxxxxx> wrote:
>>
>> On Fri, Nov 29, 2013 at 12:41 PM, Oliver Grawert <ogra@xxxxxxxxxx> wrote:
>> > hi,
>> > On Fr, 2013-11-29 at 11:32 +0100, Alexander Sack wrote:
>> >> Hi,
>> >>
>> >> it seems you put a few changes up for discussion in one shot.
>> >>
>> >> Let's keep those separate and look at them one by one:
>> >>
>> >> From what I see you basically propose three main things:
>> >>
>> >>  1. let's increase velocity of image production so we get 2-3 images
>> >> produced in devel-proposed per day
>> >>  2. make cron the technology we use to schedule and kick those images
>> >> 2-3 times a day
>> >>  3. increase manual testing done before "releasing" images; create a
>> >> broader touch-release team that will include avengers and manual
>> >> testers and community etc.
>> >>
>> >> Let me look at them one by one and then give a bullet summary of what
>> >> I believe we should indeed tweak for now...
>> >>
>> >> On 1.
>> >> ======
>> >>
>> >> I think 1. is and was the goal. So I think no one disagrees with the
>> >> benefits of having 2-3 checkpoints a day and we should just do it.
>> >> Note: it actually always was that way when I ran the landing team and
>> >> during release time. I believe we still do it, but if we don't we
>> >> should certainly ensure that we get back to do this.
>> > on the majority of days in the past we only had one image build per day
>> > simply because there were too many landings to wait for and in the end we
>> > had huge change sets that burned a lot of manpower when searching where
>> > a regression comes from.
>> >
>>
>> Let's fix that process problem first.
>>
>> All we need to do is to be strict about following the time windows for
>> cutting images regardless of whether the image has a chance to get
>> promoted or not. We haven't spelled things out like this before, so I
>> am pretty confident that this discussion helped get us there.
>>
>>
>> >>
>> >> On 2.
>> >> ======
>> >>
>> >> You are suggesting a technical solution to the problem "how and when
>> >> do we cut images".
>> >>
>> >> I don't see why we would go for cron if we have something that is
>> >> smarter - e.g. our landing process. It would be a big step back to do
>> >> that. Let's be smarter :)...
>> >>
>> >> What we did during the final weeks of release and what we should
>> >> continue to do (until we have trigger based image production) was to
>> >> cut images based on a smart, individual landing plan that doesn't use
>> >> a strict time approach, but rather a hybrid approach that also takes
>> >> landing goals into account.
>> >>
>> >> For instance, every morning, landing team looks at the work to do and
>> >> decides what chunks of work we would like to have in image 1,2,3...
>> >> then they set themselves a hard end time to avoid that we drag on
>> >> without images forever. This worked pretty well.
>> >>
>> >> On top we should ensure that we continue producing images also during
>> >> times where landing team does not operate. That's mostly on weekends,
>> >> but also might be during eur/US nights. For those times we can use
>> >> cron to compensate the lack of available brains :0
>> >>
>> >
>> > we should have a fixed cron schedule even if the landing team is around,
>> > it is a huge pain if the change sets get bigger, how about we have one
>> > or two fixed cron builds per day and still the opportunity to trigger a
>> > third manual build at will. (the testing infrastructure is still highly
>> > unstable and unreliable, tests need to be re-run on nearly every image
>> > build, we have two persons doing this in two time zones and just started
>> > to discuss a cron schedule on IRC that makes sure the images are built
>> > at a time most convenient for them so we can have images ready during
>> > their working hours with enough wiggle room for manually restarting the
>> > individual tests that failed or were flaky)
>>
>> So you don't trust the landing team that they can make and communicate
>> a predictable "time window schedule" for cutting images and follow
>> that schedule? I totally do believe they can and will do it :)
>>
>>
>> With that, I can't really see how you can still be unhappy about what
>> I propose: we get the goodness of both worlds -> guaranteed frequency,
>> predictability, smartness. perfect!
>
>
> Can you show me cases where this gatekeeping prevented problems from
> happening? I can point you to a few occurrences this week where we couldn't

Yes, it was one of the key recipes for getting steadily better images out
during the final month in saucy. Without that, we wouldn't have gotten
there. Before we turned off the cronjob the job either ran too early
or too late. It never ran at the right time :) and it was one of the
main PITAs we encountered each and every day.


> tell where a problem is, firstly because of the amount of changes between
> one image and another and secondly, because it's not deterministic that a
> regression is in the chunk of changes between images. There was one image
> this week that everyone was wondering why it was much worse than the
> previous with the small subset of changes, all unrelated to the problems
> found. There are other things we can (and need to) do to improve the quality
> of a generated image, I really don't believe adding a manual gate in the

I would be interested to hear what we can do to improve our quality
(but please in another thread).

I just can't see how that is not orthogonal to the question at hand.
We can surely do both?

1. do whatever you propose to improve image quality
2. streamline the landing/image production process and move from a
cron based to a modern, trigger based approach


 - Alexander

