linaro-project-management team mailing list archive

Re: [READ THIS] Really rethinking the kernel process, was Re: Rethinking kernel-related roadmap process

To: Christian Robottom Reis <kiko@xxxxxxxxxx>
From: Deepak Saxena <dsaxena@xxxxxxxxxx>
Date: Thu, 8 Mar 2012 23:31:31 -0800
Cc: Linaro Tech Leads <techleads@xxxxxxxxxx>, Management Team <mgt@xxxxxxxxxx>, John Stultz <john.stultz@xxxxxxxxxx>, Proj Mgmt Mailing List <linaro-project-management@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <20120308163506.GC2058@async.com.br>

On 8 March 2012 08:35, Christian Robottom Reis <kiko@xxxxxxxxxx> wrote:
> Guys, I'm honestly disappointed in the general tone of "we don't control
> upstream" that is getting repeated in this thread. I've said this a
> number of times, and since it's still going on, I want to make it clear:
> our success, and ultimately my job, depend on YOU finding a way to
> strengthen our participation upstream. Ignoring the fact that members
> expect us to deliver cards in their target quarter is simply not okay.
>
> I find Nico and Amit's exchange on "unresponsive maintainers" really
> enlightening. Nico's stating the obvious to me, yet it's not baked into
> our mindset: WE ARE THE ARM LINUX UPSTREAM. If a maintainer is
> unresponsive or blocks your work, then it's YOUR responsibility to find
> a way to get the patch reviewed and either acked or nacked, and if that
> takes a month to happen, then you're doing a really poor job.

I don't think unresponsive maintainers are the issue that generally gets
in the way of us predicting delivery. I think that is the extreme example
of unpredictability. The general case is one in which we are doing work
that requires input from various other parties, including folks not
involved with Linaro. Even if we are the upstream, we have to work with
those folks and get their input. We do have to put some bounds on it, and
if a given sub-arch maintainer or vendor does not provide input, we just
have to move on and make changes to the subsystem after the fact.

> Unless we take advantage of that unique position Linaro's members
> have put us in, this project will fail and we'll have to go back to playing
> BSP cleanup or whatever boring jobs we had before.

Even if all the SOC vendors become Linaro members and assign
their core developers to us, we don't get to say "you must merge this
as is b/c our members want it by this date" (or at least I won't say that to
any of my engineers). I just want to provide a reality check here
that being the upstream does not mean we get to bypass community
standards. In fact, it is the opposite in that we need to be more
aware of them and almost more selective of what we allow in.

> Yes, Deepak's kernel release alignment suggestion is on the right track;
> however, alone it's not going to make things any better. I've been
> waiting for a strong plan to come out of this thread, but in its
> absence, my solution to this problem runs deeper and comes in steps:
>
>    0. Managing the patches being written and submitted is critical. If
>       an engineer who is supposed to write one hasn't pushed in a week,
>       it should be a red alert. If nobody has replied to the patch in
>       48 hours, it should be a red alert. I have not seen a single
>       effort to track this, even manually, by a PM or tech lead. I
>       don't care if you don't think you have time for this; it's more
>       important than ANYTHING else.

From my experience, 48 hours is not a red alert. Post a patch that
needs review in the middle of a merge window or an -rc with some
major bugs and there is going to be a delay. The above smacks of
micro-management and mistrusting my assignees. What I
think works better and what I've been shifting to is regular conversation
with the engineers, checking in on how things are going, hearing
where there might be points of being stuck or overload and
providing suggestions on what to do next (which may include
pinging an upstream maintainer if I have an established relationship
with them).

>    1. Quarters and kernel cycles are eerily compatible. When we put a
>       card on a roadmap, realize you are saying "I plan to land this
>       feature in kernel release X". When a card comes up, spend an hour
>       looking at historical data and the code involved, and make a
>       realistic guess.
>
>       If the guess puts delivery 6, 9 or 12 months away, then you need
>       to change the card. Maybe it needs scope pruning. Maybe you need
>       3 months to produce a working patch that you'll use as an RFC
>       upstream but which a member can use in his BSP. Maybe you're just
>       not confident enough and need a call or meeting with a maintainer
>       that can advise on the topic. But you should be prepared to work
>       hard to hit the quarter when the card goes up.

I like the idea of having an initial card that is more of a "research phase"
card and then breaking it up into deliverables over the next few releases,
but these may not be something that we can excitingly market to the
TSC and prospective members as they won't deliver a full feature...

Members should not be using RFC patches in products IMNHO. That does
not get us any closer to solving the fragmentation issues. What will happen
is that member A will make changes to work with their HW, member B will
make changes to work with theirs, and we'll be right back where we started
with multiple vendor trees that are completely out of sync with upstream and
with each other.

>    2. Plan to deliver in iterations. Each iteration needs a deadline.
>       Iterations should be small; week-sized for a reasonable chunk of
>       work. You won't publish the deadlines to the TSC or the world,
>       but you will follow it religiously inside your team.
>
>       When you miss an iteration, it needs to be handled specially;
>       come up with a process to handle the exceptions. Upstream
>       unresponsive? Developer distracted with something else? Patch got
>       really hairy? Have it so a PM knows exactly how to handle it when
>       the exception triggers.

I need to think about this more. I'm not sure the PM can do anything to
have any effect on a process that involves people who don't care
about our process.

>    3. You need sheriffs. Platform has the release manager; the TCWG has
>       patch mergers and release managers; the kernel needs people that
>       are in charge of blowing the whistle on stale patches, stuck
>       discussions and missed deadlines.
>
>       Put in place specific roles and empower people to drive your
>       iterations into success. Rotate the roles between PMWG and KWG,
>       if you want to spread out the responsibility. Without these
>       supporting roles tech leads and PMs bottleneck everything.

I like this idea; I need to think about what exactly the implementation
would look like.

>
>    4. When a maintainer asks for a major redesign, sends you off to do
>       cleanup, or NACKs what you are implementing, then in no more than
>       24h you should find a solution to the issue.

A major redesign in 24 hours is unrealistic, though an initial response
should be more than doable. Again, note that the blocking issue is not
generally NACKing from maintainers; it is herding all the people who are
stakeholders to agree on the design. You need an ACK from _every single
stakeholder_ in the case of something that has cross-tree/cross-subarch
users, as much of the work we do has.

>    5. You /must/ budget cleanup time to strengthen our position in the
>       kernel ecosystem. Do code cleanups. Non-critical stuff. Help
>       review irrelevant patches. At least one day a week should be
>       spent writing and committing code upstream that is not for a
>       card, but which makes the kernel better. Each and every kernel
>       engineer should be doing this.

+20

Many of the KWG engineers already do this, either as maintainers of subsystems
or existing platforms.

> Finally, on the point of data and tools:
>
> On Wed, Mar 07, 2012 at 05:29:12PM -0800, John Stultz wrote:
>> That's probably a good indicator. If we have more than one patch being
>> submitted, it's likely complicated enough to warrant some tracking (i.e.,
>> not a simple fix).
>>
>> That said, patches.linaro.org is nice for exactly this sort of tracking.
>> However it's a little too broad in the way it does its tracking. I just
>> wish there was some way to link patches/patch-bundles to blueprints.
>
> There's a lot of shallow thinking going on in this thread, but John is
> actually very insightful where he points out that tracking patchsets is
> important. This is fundamental data; unless we measure it we will never
> be able to control or predict delivery. I've made a mockup of how we
> might track this in a 15-minute spreadsheet:
>
>    https://docs.google.com/spreadsheet/ccc?key=0AnmosMCIa6jxdEF3MVFoTFB0ZnBaRFJuYTdKbFJXTlE
>
> Ilias and Matthias had a call with me this week to discuss, and your
> input would be great.

So I started working on this for the Android lowmem patches and will
work on getting it fully filled out. Overall, I think automation via
patchwork is the way to do this, as going through the mailing lists
after the fact is a royal pain (and at that point I'm having to read
every thread, which I really shouldn't be doing as a TL) and having
engineers manually input all this data is just overhead for them. All
we need is some sort of drop-down box that allows us to associate a
patch series with a blueprint or roadmap card.

I would like a place I can look at to understand the status of patches, but
I don't think we should extrapolate future upstreaming time based on
how a current patch set is doing.

~Deepak

Follow ups

Re: [READ THIS] Really rethinking / need for fake upstream
From: Andy Green, 2012-03-09

References

Rethinking kernel-related roadmap process
From: Deepak Saxena, 2012-03-02
Re: Rethinking kernel-related roadmap process
From: John Stultz, 2012-03-03
Re: Rethinking kernel-related roadmap process
From: Deepak Saxena, 2012-03-08
Re: Rethinking kernel-related roadmap process
From: John Stultz, 2012-03-08
[READ THIS] Really rethinking the kernel process, was Re: Rethinking kernel-related roadmap process
From: Christian Robottom Reis, 2012-03-08