ubuntu-phone team mailing list archive

Re: A new Image release Proposal

 

On Fri, Nov 29, 2013 at 4:17 PM, Dave Morley <davmor2@xxxxxxxxxxxxx> wrote:
> On 29/11/13 14:43, Alexander Sack wrote:
>> Hi,
>>
>> On Fri, Nov 29, 2013 at 2:13 PM, Ricardo Salveti de Araujo
>> <ricardo.salveti@xxxxxxxxxxxxx> wrote:
>>> On Fri, Nov 29, 2013 at 10:55 AM, Oliver Grawert <ogra@xxxxxxxxxx> wrote:
>>>> On Fr, 2013-11-29 at 13:38 +0100, Alexander Sack wrote:
>>>>> On Fri, Nov 29, 2013 at 1:25 PM, Oliver Grawert <ogra@xxxxxxxxxx> wrote:
>>>>>> hi,
>>>>>> On Fr, 2013-11-29 at 13:01 +0100, Alexander Sack wrote:
>>>>>>> On Fri, Nov 29, 2013 at 12:41 PM, Oliver Grawert <ogra@xxxxxxxxxx> wrote:
>>>>>>>> hi,
>>>>>>>> On Fr, 2013-11-29 at 11:32 +0100, Alexander Sack wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> it seems you put a few changes up for discussion in one shot.
>>>>>>>>>
>>>>>>>>> Let's keep those separate and look at them one by one:
>>>>>>>>>
>>>>>>>>> From what I see you basically propose three main things:
>>>>>>>>>
>>>>>>>>>  1. let's increase velocity of image production so we get 2-3 images
>>>>>>>>> produced in devel-proposed per day
>>>>>>>>>  2. make cron the technology we use to schedule and kick those images
>>>>>>>>> 2-3 times a day
>>>>>>>>>  3. increase manual testing done before "releasing" images; create a
>>>>>>>>> broader touch-release team that will include avengers and manual
>>>>>>>>> testers and community etc.
>>>>>>>>>
>>>>>>>>> Let me look at them one by one and then give a bullet summary of what
>>>>>>>>> I believe we should indeed tweak for now...
>>>>>>>>>
>>>>>>>>> On 1.
>>>>>>>>> ======
>>>>>>>>>
>>>>>>>>> I think 1. is and was the goal. So I think no one disagrees with the
>>>>>>>>> benefits of having 2-3 checkpoints a day and we should just do it.
>>>>>>>>> Note: it actually always was that way when I ran the landing team and
>>>>>>>>> during release time. I believe we still do it, but if we don't we
>>>>>>>>> should certainly ensure that we get back to do this.
>>>>>>>> on the majority of days in the past we only had one image build per day
>>>>>>>> simply because there were too many landings to wait for and in the end we
>>>>>>>> had huge change sets that burned a lot of manpower when searching where
>>>>>>>> a regression comes from.
>>>>>>>>
>>>>>>>
>>>>>>> Let's fix that process problem first.
>>>>>>>
>>>>>>> All we need to do is to be strict about following the time windows for
>>>>>>> cutting images regardless of whether the image has a chance to get
>>>>>>> promoted or not. We haven't spelled things out like this before, so I
>>>>>>> am pretty confident that this discussion helped getting us there.
>>>>>> why would it matter at all if an image gets promoted ... in my ideal
>>>>>> world we would have builds triggered every time a change set enters from
>>>>>> proposed or at least every 2h ... only a minimal amount of these images
>>>>>> would be promoted at all, but we had a lot to pick the best one from ;)
>>>>>
>>>>> The reason for that is in a CI world we feed that image back so new
>>>>> merge proposals get tested on top of the most recent known good state.
>>>>> And this needs to happen frequently, so folks don't test on outdated
>>>>> stuff etc.
>>>> I don't see how it makes any difference if that latest image was built
>>>> by cron or a human ... or do we have a special "human touch" in manually
>>>> built images that I don't know about ?
>>>> :)
>>>>
>>>> for the last two weeks I mostly behaved like cron wrt building images
>>>> (exactly in preparation for this request since so many people asked me
>>>> why we don't do cronned images, I'm pretty sad nobody of them speaks up
>>>> in this discussion) it doesn't seem to have interfered with anything.
>>>
>>> Exactly, not having it in cron here didn't help us much.
>>>
>>> I just don't get why we're fighting against automation, if we can all
>>> decide to build in a fixed schedule and work to improve our CI to get
>>> the best usage of that we can.
>>
>> it's not about automation or not and as I said before, I am surely not
>> fighting automation. I am simply against going back from a
>> potentially-smart, trigger based approach for one of our CI steps
>> (image production) to a cronjob based approach that is completely
>> arbitrary/decoupled from the rest of our engineering/landing process.
>>
>>>
>>> And the archive is always open, besides the landing team syncing the
>>> landings first in a ppa and then in proposed, so we should always have
>>> an automated job that creates such snapshots.
>>
>> The fact that the archive uploads are in a separate process makes
>> things hairy, yes. However, that doesn't mean that we should add
>> another decoupled process (cronjob based image production) and hope
>> that things will be better....
>>
>>>
>>> We should all work against the schedule, not against the will of the
>>> landing team to trigger a new image (as that's not really a CI), and
>>> please, we're engineers :-)
>>
>> Note that the current proposal is to have a schedule. Not a point
>> schedule through a cronjob, but a smart, time window schedule.
>>
>> How is such an approach still an issue for you?
>>
>>
>>> Ricardo Salveti de Araujo
>>
> Alexander my issue is simple.

Cool. :) Simple should be easy to clarify/resolve... let's see :)

> If images are only released when the landing team is good and
> ready, how does anyone know when that is?

The proposal that evolved in this thread is that we clearly
communicate a usually fixed schedule that doesn't give exact cut off
times, but rather a time window in which the landing team has the
flexibility to cut the image slightly early or late ...

So for instance, you will know that the image will be cut at 1200 UTC
at the earliest and at 1400 UTC at the latest.

>
> If you have 3 releases a day 8 hours apart, the devs know when they need
> to finish for a landing, the landing team will know when to have stuff

This sounds right, but in practice it doesn't really work that way. Our
tools have so many random delays that devs can hardly plan for an
on-the-spot landing for the next image cut anyway. This means the only
way for a dev to ensure his stuff makes the image cut is to upload with
at least a 1h buffer on top...

Now, with the proposed time-window-based approach, he can still do
that, PLUS he can do an on-the-spot landing that he doesn't want to miss
in the next image cut even if he doesn't have the luxury to upload with
such a buffer. All he needs to do is check with the landing team; they
can be smart about the image cut time, consider his urgent fix, and
wait another couple of minutes if it's still within the "scheduled time
window".
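
To make the idea concrete, here is a rough sketch of the cut decision in
Python. It only illustrates the time window logic described above and is
not an existing tool: the names should_cut_image and
urgent_landing_pending are made up for this example, and the 1200-1400
UTC window is just the example window mentioned earlier.

```python
from datetime import datetime, time, timezone

# Example window from this thread; real windows would be communicated
# by the landing team (assumption for illustration only).
WINDOW_OPEN = time(12, 0)   # earliest possible cut, UTC
WINDOW_CLOSE = time(14, 0)  # latest possible cut, UTC


def should_cut_image(now: datetime, urgent_landing_pending: bool) -> bool:
    """Decide whether to cut the image at `now` (timezone-aware datetime).

    Hypothetical helper: before the window opens we never cut, once the
    window has closed we always cut, and inside the window we hold the
    cut only while an urgent landing is still on its way.
    """
    t = now.astimezone(timezone.utc).time()
    if t < WINDOW_OPEN:
        return False  # window not open yet
    if t >= WINDOW_CLOSE:
        return True   # window is over: cut regardless of pending work
    return not urgent_landing_pending  # inside the window: be smart
```

A cron job corresponds to the degenerate case where WINDOW_OPEN equals
WINDOW_CLOSE, so there is never any room to wait for an urgent fix.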

> landed by and users will know when they can upgrade based on the results
> of the automated testing.

Is it still a problem with a scheduled time window? In that case you
won't know that your image will be there exactly at 15:30, but you
will know that it will be there by 15:30 at the latest, even though it
might already be there at 13:30.

>
> If we then do 2 stable releases a week we can get the broader community
> on board for testing the things that need time to brew like does your
> phone work after 10 hours or not.

Please don't use the name "stable" release for the "devel" channel.
Read my initial reply where I explained the channels :)... devel is by
definition not stable. It just passes automation and super shallow
manual smoke testing, and our goal is of course to make automation good
enough to ensure that dogfooders can pick it up.


>
> With 15 images a week (work days only, please god switch it off if there
> is no one around to fix things if it all goes pear shaped) and 2 stable
> it should be really easy to track when things broke.

You can track when things broke if you do cron cutting, yes. But we
also want to cut the best images possible. However, cron very rarely
cuts the best image, as it usually cuts before the important fix
landed and after the new regression sneaked in ... we have been stuck
in that vicious circle for weeks before :).

I am sure you, as a tester who runs -proposed, also want the best
image possible every day? ... and not just something "randomly cut"
that is broken, where after you report the issue you are told that the
fix was just 10 minutes too late for the image cut :), no?


>
> --
> You make it, I'll break it!
>
> I love my job :)
> http://www.ubuntu.com
> http://www.canonical.com
>
>
> --
> Mailing list: https://launchpad.net/~ubuntu-phone
> Post to     : ubuntu-phone@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~ubuntu-phone
> More help   : https://help.launchpad.net/ListHelp
>

