
ubuntu-phone team mailing list archive

Re: A new Image release Proposal

 

Hi,

it seems you put a few changes up for discussion in one shot.

Let's keep those separate and look at them one by one:

From what I see you basically propose three main things:

 1. let's increase the velocity of image production so we get 2-3
images produced in devel-proposed per day
 2. make cron the technology we use to schedule and kick those images
2-3 times a day
 3. increase the manual testing done before "releasing" images, and
create a broader touch-release team that will include avengers, manual
testers, the community etc.

Let me look at them one by one and then give a bullet summary of what
I believe we should indeed tweak for now...

On 1.
======

I think 1. is and was the goal. So I think no one disagrees with the
benefits of having 2-3 checkpoints a day and we should just do it.
Note: it actually always was that way when I ran the landing team and
during release time. I believe we still do it, but if we don't we
should certainly ensure that we get back to doing this.

On 2.
======

You are suggesting a technical solution to the problem "how and when
do we cut images".

I don't see why we would go for cron if we have something that is
smarter - e.g. our landing process. It would be a big step back to do
that. Let's be smarter :)...

What we did during the final weeks of release and what we should
continue to do (until we have trigger based image production) was to
cut images based on a smart, individual landing plan that doesn't use
a strict time approach, but rather a hybrid approach that also takes
landing goals into account.

For instance, every morning, the landing team looks at the work to do
and decides what chunks of work we would like to have in image 1,2,3...
then they set themselves a hard end time so that we don't drag on
without images forever. This worked pretty well.

On top we should ensure that we continue producing images also during
times where the landing team does not operate. That's mostly on
weekends, but also might be during European/US nights. For those times
we can use cron to compensate for the lack of available brains :0


On 3.
======

Your proposal means very different things based on what you call
"image release". So far we have used the word "promotion" to describe
the act of moving a "blessed" image from a -proposed channel to a
non-proposed channel. I am not sure if that's what you call "release"
in your mail, but I assume so...

Let's look at the channels and their purposes again:

 - devel-proposed -> here all images get spit out. they are completely
untested and haven't even run through automation (read: why do you
want to bother big dogfooders and avengers by telling them to test
this stuff)
 - devel -> here we put images that have gone through automation and
that are ready for dogfooders to pick up
 - stable -> here is where our end users are; they get their updates
through it.

Now the consensus on target frequency of those is:

 - devel-proposed == 2-3 times a day (automated testing only)
 - devel == 1+ times a day (dogfooders and avengers testing with goal
to drive us to next stable update)
 - stable == 1-6 monthly (stable users will give even more "testing")

I think that all makes sense, and doesn't really need changing?
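
As an illustration only, the channel flow above could be sketched
roughly like this. The function name, its parameters and the exact gate
for stable are assumptions made for the sketch, not a description of
any existing tooling; in particular it treats "gone through automation"
as "automation passed" and it models stable blockers as a plain list.

```python
# Hypothetical sketch of the promotion gates between the three channels.
CHANNELS = ["devel-proposed", "devel", "stable"]


def next_channel(channel, automation_passed, blockers):
    """Return the channel an image may be promoted to, or None.

    channel           -- channel the image currently sits in
    automation_passed -- assumption: True once automated tests are green
    blockers          -- assumption: issues reported by dogfooders/avengers
    """
    if channel == "devel-proposed":
        # devel only receives images that have gone through automation.
        return "devel" if automation_passed else None
    if channel == "devel":
        # stable is only reached once no known blocker remains.
        return "stable" if automation_passed and not blockers else None
    # stable is the last channel; nothing to promote to.
    return None


print(next_channel("devel-proposed", True, []))
print(next_channel("devel", True, ["hypothetical blocker"]))
```

The first call yields "devel"; the second yields None because a blocker
is still open.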

What needs better organization is the testing of dogfooders and
avengers of "already blessed" devel images. Here your idea about a
touch-release team makes sense. So far we had delegated that to jfunk.
You could help him organize a more effective avengers effort that also
includes the community, so maybe talk to him.


-----

OK, let's summarize what we got so far and let's do the following
tweaks for now...


Summary
=========

 1. we start producing 2 images a day until end of year at a
predictable schedule (didrocks will announce that schedule after
discussing internally)

 2. we don't enable cron during business days. Instead we hook image
kicks up to our landing process so that we get a smart, but predictable
schedule
    - for instance, image builds will always happen in the same
timeframes (e.g. image 1: 1200-1400, image 2: 1800-2000), but will also
be smart about considering the landing payload so we can ensure that
the critical pieces really landed etc.

 3. to keep the image frequency acceptable at all times, we enable
cron builds on weekends and on days when the landing team is not
operational.

 4. ogra and team to help jfunk to organize a more vibrant avengers
community around testing of images after devel promotion; this team
has the goal of identifying issues that would block a stable promotion;
those issues will be fed back to the landing team so they can
prioritize landings with the goal of clearing a new stable promotion.
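
To make items 2 and 3 a bit more concrete, here is a minimal sketch of
the kick decision. It is not how the builds are actually wired up: the
function, its parameters and the idea of polling it periodically are
all assumptions. The two windows are the example times from item 2, and
the 8h cron spacing is borrowed from the quoted proposal below.

```python
from datetime import datetime, time

# Example windows from item 2: (earliest kick, hard end time).
WINDOWS = [(time(12, 0), time(14, 0)), (time(18, 0), time(20, 0))]
# Assumption: on cron days, build every 8h on the hour.
CRON_HOURS = (0, 8, 16)


def should_kick(now, team_operational, payload_landed, already_built):
    """Decide whether an image build should be kicked at `now`.

    team_operational -- False on weekends and other days without landing team
    payload_landed   -- True once the planned landings for this image are in
    already_built    -- set of WINDOWS indices already built today
    """
    if not team_operational:
        # Item 3: plain cron keeps the frequency up without the team.
        return now.minute == 0 and now.hour in CRON_HOURS
    for index, (start, hard_end) in enumerate(WINDOWS):
        if index in already_built:
            continue
        if start <= now.time() < hard_end:
            # Inside the window: wait for the critical pieces to land.
            return payload_landed
        if now.time() >= hard_end:
            # Hard end time reached: build anyway so we don't drag on.
            return True
    return False


# A Monday at 12:30 with the payload still missing: no kick yet.
print(should_kick(datetime(2013, 12, 2, 12, 30), True, False, set()))
```

The design choice here is that the window start gives predictability,
the payload check keeps the build smart, and the hard end time
guarantees an image even when landings slip.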



Thanks!


On Thu, Nov 28, 2013 at 1:13 PM, Oliver Grawert <ogra@xxxxxxxxxx> wrote:
> Hi,
>
> As some might have noticed we recently had a few bad image releases into
> the Trusty channel that contained regressions.
> We also tested out some changes to the build policies of the proposed
> images that I would like to establish further.
>
> Within the last weeks we raised the amount of built images per day to a
> higher frequency (3 at most currently) ... the reason for this:
> currently finding a regression in an image when only building a single
> image per day results in a pretty big change set. Finding the offending
> package that introduced a regression can take hours of weeding through
> package changelogs and is often based on guesswork ... usually we are
> guessing pretty well, but this still burns a lot of developer time we
> should rather spend on fixing bugs and writing tests or features.
>
> with building at least 3 images per day the change sets got a lot
> smaller, finding issues becomes a charm and offending packages are easy
> to identify ...
>
> My proposal here is to re-enable cron driven builds again.
> For a start this has to be every 8h since our testing infrastructure can
> not cope with more frequent builds and over time (as our testing
> infrastructure improves) to get to a frequency of doing a build every
> 4h, this should give us a small enough set of changes to be a lot
> quicker in finding regressions. I would like to switch cron back on by
> end of this week on the above mentioned 8h schedule, it seems all
> involved teams agreed that this is a good plan, please speak up if this
> plan does not suit the way your team works (or speak up in support :) )
>
> The second part of my proposal goes a bit further than just adding a
> cron entry ...
>
> Today the release of an image is handled by the Landing Team, which was
> traditionally only responsible for landing new upstream code inside the
> image ... this task alone already hogs most resources of the team.
> Releasing an image is usually pretty quickly decided as one topic of a
> daily meeting based on a quick glance over the automated tests and by
> asking for feedback from people that have done a short dogfooding
> smoketest ...
>
> I personally think the release process deserves a lot more attention,
> way more input from other teams and a wider dogfooder audience. The
> Landing team should concentrate on the landings of new code, the time
> they need to invest in additional image testing is time they can not
> spend on testing new landings which slows all of us down by creating an
> artificial bottleneck now that all landings need to go through that
> team. They should be able to fully concentrate on this process instead.
>
> What I like to throw in as a discussion point is to form a new
> "touch-release" team that consists of people from QA that bring in the
> buglist of collected bugs and triages and also checks errors.ubuntu.com
> regularly (we might need a "touch" filter there), one representative of
> the avengers team (who dogfood the stable images), someone from the
> ubuntu-cdimage team and a representative of the Landing team to give
> input and get feedback on the recent landings from actual endusers.
>
> This team should be a public focused team holding daily IRC meetings the
> whole community can participate in (as opposed to privately held
> team-only hangouts), I know there are plenty of users of the
> trusty-proposed/devel-proposed images and I think we should really try
> to have them participate in the release process and enable them to give
> us feedback to:
>
> a) regressions they run into that went unnoticed by the developers
> b) new bugs the QA/triage teams have not yet seen
> c) critical blockers we didn't even consider or see yet because they
> only happen after a day or two of constant usage.
>
> With the higher frequency of proposed builds it should then be possible
> to pick the image with the best automated results from a former day (or
> even two days so testers had more time for long term testing) that is
> considered good by all participating parties and their individual
> requirements.
>
> Let's put more focus onto the Image releases and take load off the Landing
> team to speed us all up and have the community more involved to actually
> increase the quality of our images and be truly regression free.
>
> ciao
>         oli

