
testrepository-dev team mailing list archive

Re: [Yellow] testrepository and test filter creation

 

On Tue, May 1, 2012 at 4:53 AM, Gary Poster <gary.poster@xxxxxxxxxxxxx> wrote:
> On 04/29/2012 10:39 PM, Robert Collins wrote:
>> So, I think testrepository's internals have gotten a little foggy
>> recently, and I wanted to understand what we're trying to do.
>
> Thank you for thinking about this, Robert.
>
> I do appreciate that you are very busy.  That said, from my perspective,
> your timing, combined with what I understand to be your conclusion to
> discard large chunks of the work done over the past two weeks, is
> unfortunate and counterproductive.

I don't have such a conclusion; if anything I think the recent
landings are great. JML and I created some confusion with the layering
a year or so ago, which I think made this harder, and I'm hoping to
fix it with small tweaks to match the diagram, if the diagram is
correct.

> subunit output.  The buildbot setup is a much nicer tool now practically, in
> particular for diagnosis.

Awesome!

>
>> The
>> attached image (please excuse the grot on it - turns out my camera
>> finds every speck of whiteboard marker ever used on my whiteboard,
>> and puts it into the image - I cleaned it up as much as I had
>> patience to, before compressing it to a usable size) lays this out.
>>
>> In English:
>>   - we want to combine all the worker threads,
>
> (with the worker tags inserted, as they are now in trunk)

yes

>> then split the resulting
>> stream before any filtering takes place. The repository wants a copy,
>> which it will store verbatim. This permits diagnostics etc to be done
>> on nearly-raw data.
>>   - the UI wants the rest of it, but before the UI sees it we want to
>> do some global processing on it and strip out the 'layer X started'
>> tests (probably IFF they were pass / skips).
>
> Stripping out the successful zope:layer-tagged faux tests is fine but
> unnecessary from our buildbot perspective.  subunit-filter does a good job
> of this sort of thing, thanks to jml's branch [1], and knowledge of
> "zope:layer" seems to belong in user configuration/command line space to me.
> This is how we have it configured now.

yes, zope:layer belongs in user configuration - see the branch I
linked which adds a configuration option to control filtering.
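
For reference, the kind of configuration I mean looks roughly like
this (a sketch .testr.conf; the filter_tags option name is what the
branch adds, as I remember it, and the rest is a generic example):

  [DEFAULT]
  test_command=python -m subunit.run discover . $LISTOPT $IDOPTION
  test_id_option=--load-list $IDFILE
  test_list_option=--list
  # Count-filter zope:layer pseudo-tests; option name per the branch
  # under discussion - treat it as illustrative until it lands.
  filter_tags=zope:layer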

> Stripping out the non-successful zope:layer would be a step backwards. We
> should have access to a subunit stream that includes them.

Ok, so that highlights a difference: my diagram would provide
filtered subunit streams that do not include them, based on my
understanding that they would confuse developers, which you seem to
confirm below.
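
To make the split concrete, here is a rough sketch of the fan-out I
have in mind, using subunit and testtools pieces that exist today (the
helper name is made up, and filter_success stands in for the tag-aware
global transform):

  import subunit
  from subunit.test_results import TestResultFilter
  from testtools import MultiTestResult

  def route_stream(raw_stream, repo_file, ui_result):
      # Verbatim copy for the repository: TestProtocolClient
      # re-serialises every event it receives back into subunit.
      repo_copy = subunit.TestProtocolClient(repo_file)
      # Filtered copy for the UI; filter_success=True is a stand-in
      # for the global transform (the real one would be tag-aware).
      ui_copy = TestResultFilter(ui_result, filter_success=True)
      # Parse the combined worker stream once, driving both sinks.
      subunit.ProtocolTestCase(raw_stream).run(
          MultiTestResult(repo_copy, ui_copy))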

>
>> LP doesn't want them
>> counted towards test runs, and other projects may have similar needs.
>
> In buildbot, our counts are done from the filtered subunit stream, and we
> get what we want.
>
> On the command-line, yes, we asked in January for testrepository to give us
> counts that ignore zope:layer tests.  We don't see that much anymore because
> we run all of our parallel tests in buildbot, but I do still feel that it
> will unnecessarily confuse developers to see their test counts vary
> seemingly randomly across runs.
>
> I do think that such a feature should affect counts only: the subunit stream
> should include layer failures, and when they are included, they should not
> affect the count.

--full-results as implemented today pretty-prints all tests in the
stream. If we filter the zope:layer tests out of the counts alone,
then the tests shown on screen and the counts will differ. That won't
matter when you have as many tests as LP has, but it will for smaller
projects.

>> All UIs should have this done to the stream, so it's not a UI-specific
>> thing.
>>   - once the CLI UI has it, it splits it again, to
>>    * count the tests (for the summary)
>>    * time the run (for the summary) - but this perhaps comes from the
>> repository by querying the last run, so arguably ignorable. Note that
>> this should not be 'sum of test times' but 'duration in stream' or
>> something similar (because with two streams we want the wall time, not
>> the sum of time per CPU).
>
> These are important to the developer story, but not to the CI story, at
> least as we have configured it.

That's good to know; it is, though, relevant to a structural
discussion of the code in testrepository, which is what we need to
integrate all this work into.
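
For the record, the distinction between 'sum of test times' and
'duration in stream' is just this (a toy illustration, not testr
code):

  # Two workers running in parallel; each tuple is (start, stop)
  # seconds for one test, as recovered from the stream's time events.
  events = [
      (0.0, 4.0),   # worker 1, test A
      (0.0, 3.0),   # worker 2, test B
      (4.0, 7.0),   # worker 1, test C
  ]

  # Sum of test times: counts each CPU separately.
  cpu_time = sum(stop - start for start, stop in events)

  # Duration in stream: the wall clock the user actually waited.
  wall_time = (max(stop for _, stop in events) -
               min(start for start, _ in events))

  print(cpu_time, wall_time)  # 10.0 7.0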

>>    * and we want to send the stream to stdout as subunit when --subunit is
>> given
>
> Yes.  This is critical to us.
>
>
>>    * or pretty print it (when --full-results is given)
>
> That's not what --full-results means in the trunk-based branches we've been
> using.
>
> In trunk, the default case is pretty printing.  If you ask for "--subunit"
> then that is asking for subunit output (as opposed to pretty printing).

I misread the current code, but only a little: --subunit and
--full-results are orthogonal, so I should have drawn a subunit sink
and a pretty-print sink, with an optional filter that --full-results
disables.

The current code is not counting skips: it filters them out entirely
if --full-results is not given.
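
So the shape I should have drawn is roughly this (a sketch only, not
trunk code; TextTestResult stands in for the pretty printer):

  import sys

  import subunit
  from subunit.test_results import TestResultFilter
  from testtools import TextTestResult

  def make_ui_sink(want_subunit, full_results, out=sys.stdout):
      # --subunit picks the serialiser; pretty printing is the default.
      if want_subunit:
          sink = subunit.TestProtocolClient(out)
      else:
          sink = TextTestResult(out)
      if full_results:
          # --full-results disables the filter: show every test.
          return sink
      # Default: hide successes, whichever sink was chosen.
      return TestResultFilter(sink, filter_success=True)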

> As a user, this seems like the right distinction to me.  testrepository devs
> can decide what they want, and we will eventually adjust if necessary.
>
> In trunk, testrepository only shows failures by default. "--full-results"
> means "show successes too", like "-s"/"--success" in subunit-filter.  AIUI
> it is supposed to affect pretty printing and --subunit output the same way.
>
> I think trunk's arrangement makes more sense, though perhaps
> "--full-results" ought to be normalized to "--success" like subunit-filter
> if they are truly equivalent.

I think trunk is fine here, except for a few minor bugs:
 - skips are not counted towards test counts
 - we don't have a structure to support showing *everything* and not
counting it (sketched below).
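
The missing structure is something like this: forward every event for
display, but keep a count that ignores the filtered tags (a structural
sketch, not a patch; the class name and defaults are made up):

  class CountFilteringResult(object):
      """Show everything, count selectively.

      Forwards every event to the decorated result untouched, but
      only counts tests whose active tags include no filtered tag.
      """

      def __init__(self, decorated,
                   filtered_tags=frozenset(['zope:layer'])):
          self._decorated = decorated
          self._filtered_tags = frozenset(filtered_tags)
          self._current_tags = set()
          self.count = 0

      def tags(self, new_tags, gone_tags):
          # Track the tag scope so layer fixtures are recognisable.
          self._current_tags.update(new_tags)
          self._current_tags.difference_update(gone_tags)
          self._decorated.tags(new_tags, gone_tags)

      def startTest(self, test):
          if not (self._current_tags & self._filtered_tags):
              # Unfiltered tests bump the count - skips included.
              self.count += 1
          self._decorated.startTest(test)

      def __getattr__(self, name):
          # Delegate everything else (addSuccess, addSkip, ...).
          return getattr(self._decorated, name)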

>>    * or filter it and then pretty print it (default) - note that there
>> is a missing box/arrow for this case; I've drawn it in here, you can
>> imagine it.
>>
>> If I've got things wrong, please let me know, otherwise this should
>> provide guidance for where/how to plug things in, and what structure
>> to be aiming for in the code.
>
> I think that if we have a --subunit flag as you've described it that at
> least includes zope:layer tagged output for layer failures and errors then
> we will be able to work with the end result, whatever it is.  We'll leave it
> to testrepository devs to hash out the rest of the particulars.

Paraphrasing this, I think it boils down to 'you are using testr
merely for the parallel dispatch and aggregation' [and perhaps some
data mining in future], so testr's job for you is to get you a single
good subunit stream and get out of your way. That sounds like a great
way to use it, so let's make sure it delivers. A key difference your
description here highlights is that you'd like --subunit to *bypass*
the global transform step: you need to take --subunit as a Y join off
of the repository's stream, not off of the UI's stream. (Phrased like
this it makes sense to me, because you are not interested in testr's
UI.)
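
The Y join itself can be a dumb byte-level tee taken before any
parsing or filtering happens (a sketch; all the names are
illustrative):

  def tee_raw_stream(source, repo_file, subunit_out=None,
                     chunk_size=8192):
      # Copy the raw combined stream into the repository verbatim and,
      # when --subunit was given, to stdout as well - before any of
      # the UI's global filtering happens.
      while True:
          chunk = source.read(chunk_size)
          if not chunk:
              break
          repo_file.write(chunk)
          if subunit_out is not None:
              subunit_out.write(chunk)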

However, I also want to enable the local dev story, which is what got
this bug assigned to me in the first place ;). For that story we don't
have an opinion on the --subunit output; we instead care that the user
not see inflated counts that change from run to run depending on
worker allocation and testr runner quirks.

I'll prep an updated picture shortly.

-Rob

