
openstack team mailing list archive

Nova subsystem branches and feature branches

 

Hey,

We discussed this during the "baking area for features" design summit
session. I found that discussion fairly frustrating because there were
so many of us involved and we all were either wanting to discuss
slightly different things or had a slightly different understanding of
what we were discussing. So, here's my attempt to put some more
structure on the discussion.

tl;dr - subsystem branches are managed by trusted domain experts and
feature branches are just temporary rebasing branches on personal github
forks. We've got a tonne of work to do figuring out how this would all
work. We should probably pick a single subsystem and start with that.

...

Firstly, problem definition:

  - Nova is big, complex and has a fairly massive rate of churn. While 
    the nova-core team is big, there isn't enough careful review going 
    on by experts in particular areas and there's a consistently large
    backlog of reviews.

  - Developers working on features are very keen to have their work 
    land somewhere and this leads to half-finished features being 
    merged onto master rather than developers collaborating to get a 
    feature to a level of completeness and polish before merging into 
    master.

Some assumptions about the solution:

  - There should be a small number of domain experts who can approve 
    changes to each of the major subsystems. This will encourage 
    specialization and give clearer lines of responsibility.

  - There should be a small number of project dictators who have final 
    approval on merge proposals, but who are not expected to review 
    every patch in great detail. This is good because we need someone 
    with an overall view of the project who can make sure efforts in 
    the various subsystems are coordinated, without that someone being 
    massively overloaded.

  - New features should be developed on a branch and brought to a level 
    of completeness before being merged into master. This is good 
    because we don't want half-baked stuff in master but also because 
    it encourages developers to break their features into stages where 
    each stage of the work can be brought to completion and merged 
    before moving on to the next stage.

  - In essence, we're assuming some variation of the kernel distributed 
    development model.

    (FWIW, my instinct is to avoid the kernel model on projects. Mostly 
    because it's extremely complex and massive overkill for most 
    projects. Looking at the kernel history with gitk is enough to send 
    anyone screaming for the hills. However, Nova seems to be big 
    enough that we're experiencing the same pressures that drove the 
    kernel to adopt their model)

Ok, what are "subsystem branches" and how would they work?

  - Subsystem branches would have a small number of maintainers who can 
    approve a change. These would be domain experts providing strong 
    oversight over a particular area.

    (In gerrit, this is a branch with a small team or single person who 
    can +1 approve a review)

  - Project dictators don't need to do detailed reviews of merge 
    proposals from subsystem maintainers. The dictator's role is mostly 
    just to sign off on the merge proposal. However, the dictator can 
    comment in the proposal on things which could have been done better 
    and the subsystem maintainer should take note of these comments and 
    perhaps retroactively fix them up. Ultimately, though, the dictator 
    can exercise a veto if the merge proposal is unacceptable or 
    if the subsystem maintainer is consistently making the same 
    mistakes.

  - It would be up to the project dictators to help drive patches 
    through the right subsystem branches - e.g. they might object if 
    one subsystem maintainer merged a patch that inappropriately cut 
    into another subsystem or they might refuse to merge a given patch
    into the main branch unless it went through the appropriate 
    subsystem branch.

    (In gerrit, this would mean a small team or single person who can 
    +1 approve merge proposals on master. They would -1 proposals
    submitted against master which should have been submitted against a 
    subsystem branch.)

  - Subsystem branches might not necessarily be blessed centrally. It 
    might be the case that anyone can create such a branch and, over 
    time, establish trust with the project dictators. Subsystem 
    branches would come and go. This is the mechanism by which 
    subsystem maintainership is transferred between people over time.

    (In gerrit, this means people need to easily be able to create 
    their own branches)

    (What's more difficult to imagine in gerrit is how a new, potential 
    subsystem maintainer comes along, starts hoovering up patches into 
    her branch and submitting them in batches. Where does she hoover 
    them up from and how does she say "I've merged this into my branch, 
    don't merge it via another branch")

  - Bisectability remains important. Subsystem maintainers don't merge 
    broken commits into their subsystem branch and the project 
    dictators can enforce this using their veto. It is not good enough 
    for subsystem maintainers to consistently merge broken commits into 
    their branch, fix it up with a later commit and include both 
    commits in their merge proposals.

    (I don't think we'd use Jenkins to enforce this, but subsystem 
    maintainers might use it as a tool to help them catch issues. So, 
    the full set of gating tests would only gate merges into master but 
    subsystem branches might choose to gate merges into their branch on 
    the unit tests. Subsystem maintainers might also use Smokestack to 
    pre-gate merge proposals to the subsystem branch)

  - Subsystem branches would not rebase unless the project dictator 
    outright rejects a merge request from the subsystem branch (i.e.
    "I'm not merging commit abcdef0! Fix it and rebase!"). This means 
    the subsystem maintainer will need to regularly (probably only when 
    there are conflicts to be dealt with) merge master back into the
    subsystem branch.

  - Plausible subsystem branches are e.g.:

      - OpenStack APIs
      - EC2 API
      - virt
         - libvirt driver
         - xenapi driver
         - vmware driver
      - networking
      - volumes
      - scheduler
      - RPC

    Deciding which areas make sense as a subsystem branch is 
    non-trivial.

    Should there be a "DB" subsystem? Probably not, because that would 
    mean every new feature needs to come through this branch or, 
    alternatively, the DB maintainer would need to accept DB schema 
    additions without the context of how it's being used higher up the 
    stack.

    Ok, so why does it make sense to have an "OpenStack APIs" 
    subsystem? Don't all features affect that branch too? Well, maybe, 
    but the APIs really do need strong oversight. Perhaps we can be 
    confident that we can add e.g. a new scheduler feature through the
    scheduler branch and then later merge any API additions through the 
    APIs branch.
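
To make the routing idea above a bit more concrete, here is a rough
sketch of how a proposal against master could be checked to see which
subsystem branch it should have gone through. Everything in it is an
assumption made for illustration: the path prefixes, the branch names
and the helper names are hypothetical, and the votes just mirror the
"-1 proposals submitted against master" idea rather than any real
gerrit integration.

```python
# Hypothetical mapping from path prefixes to subsystem branches.
# Prefixes and branch names are illustrative assumptions only.
SUBSYSTEM_PREFIXES = {
    "nova/api/openstack/": "openstack-api",
    "nova/api/ec2/": "ec2-api",
    "nova/virt/libvirt/": "virt-libvirt",
    "nova/virt/xenapi/": "virt-xenapi",
    "nova/virt/": "virt",
    "nova/network/": "networking",
    "nova/volume/": "volumes",
    "nova/scheduler/": "scheduler",
    "nova/rpc/": "rpc",
}


def subsystem_for(path):
    """Return the subsystem owning `path`, or None if no prefix matches."""
    matches = [p for p in SUBSYSTEM_PREFIXES if path.startswith(p)]
    if not matches:
        return None
    # Longest prefix wins, so a driver beats the generic virt entry.
    return SUBSYSTEM_PREFIXES[max(matches, key=len)]


def review_against_master(changed_paths):
    """Return (vote, reason) for a proposal submitted directly to master."""
    owners = {subsystem_for(p) for p in changed_paths}
    owners.discard(None)
    if not owners:
        # Nothing here belongs to a subsystem: review on master as usual.
        return (0, "no subsystem owns these files")
    if len(owners) == 1:
        return (-1, "resubmit against the %s branch" % owners.pop())
    return (-1, "touches several subsystems: %s" % ", ".join(sorted(owners)))
```

For example, review_against_master(["nova/scheduler/driver.py"]) returns
(-1, "resubmit against the scheduler branch"). A patch that cuts across
two subsystems is only flagged; deciding what to do with it stays with
the project dictators.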

And how about feature branches?

  - Feature branches are relatively short-lived (i.e. weeks or months
    rather than years) branches for a specific feature. They are a
    mechanism for developers to work on a patch series in the open until
    the feature is complete enough to be merged into a subsystem branch
    or master.

    (I'm not sure gerrit is right for this. Why not just do it in 
    folks' github forks? I think all people are looking for is for 
    people to be more aware of feature branches. How about if you put 
    details of your feature branch in the blueprint for the feature?)

    (If not using gerrit, can developers configure Jenkins to CI their 
    branch? Or is Smokestack the right tool?)

  - Feature branches rebase, do not contain merge commits and each 
    commit on the branch is functional, bisectable and self-contained.

  - When a feature branch is ready to be merged into a subsystem 
    branch, the patch series is submitted for review. The subsystem 
    maintainer will likely require changes to individual patches and 
    these changes would be made on the feature branch and squashed back 
    into the appropriate individual patch.

    (Ideally gerrit's "topic review" feature will get upstream and 
    we'll use that. This would mean that a patch series could be 
    proposed for review as a single logical unit while still keeping 
    individual patches as separate commits)

  - Because feature branches rebase, active day-to-day collaboration
    with others is difficult. You certainly can't have multiple people
    rebasing the same branch; that way lies madness.

    There are ways to have multiple people work actively on the same 
    rebasing branch e.g.

      http://blogs.gnome.org/markmc/2011/02/26/git-rebasing-cont/

    but, ultimately, feature branches are going to be owned by a single 
    person who might incorporate patches from others.

    (Incorporating the work of others, rebasing and squashing means a 
    patch might have multiple contributors but only one author listed in
    git. That makes CLA enforcement impossible, but we should drop the 
    CLA in favour of the kernel-like Signed-off-by: tag. See this 
    discussion: https://lists.launchpad.net/openstack/msg06544.html )

  - One option for longer-lived, active collaboration is for a subsystem
    maintainer to create a feature branch and review the work as it is 
    ongoing. The idea being that the subsystem maintainer commits to 
    not requiring the feature branch to be rebased before it is merged 
    into the subsystem branch.
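
The rules above for feature branches (no merge commits, every commit
functional) are easy to check mechanically before a series is proposed.
A minimal sketch follows, assuming the input is the text printed by
"git rev-list --parents master..my-feature", where each line is a
commit followed by its parents. The function names and the passes_tests
callback are hypothetical; how a commit actually gets tested (unit
tests, Jenkins, Smokestack) is left to the caller.

```python
def parse_rev_list(output):
    """Parse `git rev-list --parents` output into (commit, parents) pairs."""
    commits = []
    for line in output.splitlines():
        fields = line.split()
        if fields:
            # First field is the commit, the rest are its parents.
            commits.append((fields[0], fields[1:]))
    return commits


def merge_commits(commits):
    """Return the commits that have more than one parent."""
    return [sha for sha, parents in commits if len(parents) > 1]


def broken_commits(commits, passes_tests):
    """Return the commits for which the caller-supplied check fails."""
    return [sha for sha, _ in commits if not passes_tests(sha)]


def ready_for_review(commits, passes_tests):
    """True if the series is linear and every commit passes the check."""
    return (not merge_commits(commits)
            and not broken_commits(commits, passes_tests))


# Made-up sample output: c2 has two parents, so it is a merge commit.
sample = "c3 c2\nc2 c1 m9\nc1 c0\n"
series = parse_rev_list(sample)
print(merge_commits(series))
print(ready_for_review(series, lambda sha: True))
```

Testing each commit separately, rather than only the tip of the branch,
is what keeps the series bisectable once it lands.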

Cheers,
Mark.


