
ubuntu-phone team mailing list archive

Re: Landing team 10.12.13

 

Hey everyone,


On Mon, Dec 16, 2013 at 1:15 PM, Łukasz 'sil2100' Zemczak <
lukasz.zemczak@xxxxxxxxxxxxx> wrote:

>
> Right, it might sound a bit scary indeed. The main thing that Didier
> wanted to say is that we want all the tests to be reliable.


I totally agree.


> We no longer
> do re-runs in case of tests that are failing (due to flakiness)


...nor should you!


> and are
> no longer allowed to release a component that has an unreliable test.
>

Excellent!


> Tests that are flaky only add to the overall confusion.
>
>
We are in violent agreement here :)


> Of course, as you say, the best way is to fix the test properly!
>

Right on :)


> Sometimes though, as we experienced, it's either not that easy to do or
> there are simply no resources available in a team.


Absolutely - and the QA department probably hasn't been as helpful as we'd
like to be - sorry for that! [1] As I mentioned earlier, hopefully the TnT
team and Dave Morley's daily test work should help you clear up the
remaining issues.


> Integration test
> reliability also needs to be a relatively high priority whenever a
> related issue pops up - but with many other, seemingly more important
> tasks queued, sometimes flaky tests stay around for too long.


The key word there for me is "seemingly". I absolutely understand that you
guys want to ship new features. My point is that while shipping new
features might seem to be more important than fixing your existing test
cases, I think you're mistaken :)

It's a nice idea that we'll get the image to go "green", and *then* we'll
start making sure our tests are reliable. However, I fear that's not how
human beings are wired up inside. Instead, I believe we'll get the image
green, then have this exact same discussion the next time some test case
starts failing.

The general wisdom around automated testing is that your test cases should
be treated with the same level of respect and attention as your production
code. Personally, I don't think we're doing that, and I can point to many
examples where the application has changed (possibly regressed, possibly
not - that's a judgment call) and the test cases haven't been updated.


> And since we
> won't release a component with such anomalies, sometimes temporarily
> skipping the test is the only way to release a component. After all, what
> use is a test that cannot give proper results!
>
>
I totally agree that test results should be meaningful. However, by
skipping the test you're not improving the quality of the application;
you're just reducing the amount of information you have to hand. I think
our test results are already meaningful - they're just sometimes a little
hard to interpret. We need to make autopilot do a better job of providing
you with all the information you need in a test result, and we're working
with the CI team on that already.


> I guess those words were meant to reinforce the high priority of fixing
> integration flakiness.
>
>
The latest image (I'm looking at this:
http://ci.ubuntu.com/smokeng/trusty/touch/maguro/69:20131216.1:20131211.2/5498/)
is looking pretty good. I can't see any application crashes or autopilot
crashes. All the failures I see are either application regressions or test
regressions. If they're the former, they should be fixed as a matter of
urgency; if they're the latter, skipping them is just regressing our test
coverage.


> Good to hear about the TnT team - I guess we'll be poking you guys about
> some of the problems we'll be encountering.
>
>
The team's mandate is to improve the tools the QA department provides to
the rest of the development community. We're focusing on autopilot right
now, so you should start to see some changes there. If you have feature
requests for autopilot (that aren't already tracked in bugs), or if you're
seeing issues in the tool itself (as opposed to a test case), we're the
ones to talk to.


Anyway, I hope that clears up some confusion.


Cheers,

[1] In our defence, the number of developers we're supporting is crazy
high. If you look at the ratio of QA engineers to feature engineers in
other companies that have a strong culture of quality, we do pretty well,
considering how few QA engineers we have. I'm not complaining, nor am I
offering any excuses: I actually think we've done pretty darn well over
the last few years, but it's hard to meet everyone's requests in a timely
manner with so few resources.
-- 
Thomi Richards
thomi.richards@xxxxxxxxxxxxx
