← Back to team overview

fuel-dev team mailing list archive

Re: Propose new rules for bringing in external puppet modules

 

OK, we are slightly off topic here and I want to steer it back.

The point isn't that we _will_ track upstream the point is that if we don't
change we _cant_ track upstream and we _cant_ review large commits like
this.

This new processes wont give a damn about where you took from the upstream,
It will leave a marker for others to be able to have a chance in hell of
making a qualified review of the module and where you took it from as well
as separately review you changes.

Now what has occurred with https://review.openstack.org/#/c/80024/ is that
Dimitry has reviewed it under the guise of

1. Create a review request with a verbatim copy of the upstream module and
> no other related modifications. This review should also contain the commit
> hash from the upstream repo in the commit message. The review should be
> looked over for reasons for rejecting the entire module, such as license
> issues. Forbearing such issues it should be accepted with out requiring
> modifications.


Due to this review, Dimitry has found a good reason for "rejecting the
entire module" as the point that module was pulled from has alot of wasted
lines that the newer version replaces. It's now necessary to justify why we
should except this crap over the newer and cleaner version upstream.

Wile we should track and keep close to upstream or use only upstream, I
know we need to take steps. This is why I only propose that we separate out
the upstream module pull from your own development, it gives us:
* improved view of what you want to add
* improved view of what had to be done to "fuel" the module
* a clear scope of review instead of blindly adding modules
* prevents you from being the blame when the code clearly came from upstream
* review for really bad upstream modules (license, code quality, ...)
* a chance (not requirement) to diff our changes from the upstream and
propose changes back upstream

For now we can just stick with improved review process. As we get better we
can start working on upstream's more.

So to re-iterate the rules I will update on our docs / wiki with
https://etherpad.openstack.org/p/adding_fuel_lib_modules


On Fri, Mar 28, 2014 at 12:41 PM, Dmitry Borodaenko <
dborodaenko@xxxxxxxxxxxx> wrote:

> No. The fact that we failed miserably at tracking upstream so far is
> not a valid excuse for refusing to even try:
>
> http://hakanforss.files.wordpress.com/2014/03/are-you-too-busy-to-improve1.png
>
> Vladimir, you optimization goal is flawed: it assumes that staying
> close to upstream is a goal in itself. It is not a cult, there are
> important reasons why we want to track upstream.
>
> Yes, sometimes we find and fix bugs in the upstream code, but it is
> much more common for upstream to find and fix its own bugs. On
> average, we loose code quality over time, and that causes us to lose
> velocity by wasting efforts on fixing bugs that were already addressed
> by upstream, or bugs that were introduced because our code is poorly
> structured and full of ugly hacks that upstream would have done (or
> already did) better.
>
> Staying close to upstream and submitting your fixes to them on a
> regular basis also makes it more likely that upstream developers would
> be able and willing to help you. Right now we can't report bugs
> upstream (because we've diverged too far), we can't submit patches
> (often because we do things in a Fuel specific way that would be
> unacceptable to upstream), so we're always on our own when we have
> problems with any of the Puppet code in fuel-library.
>
> In short, assuming that we can do a better job alone than in
> collaboration with all our upstreams is just arrogant. We're not that
> good and there's not that many of us.
>
> Bogdan, your logstash commit is such an obvious illustration of the
> above that I am very surprised that I have to explain it.
>
> When I said that merging this patch will increase our Puppet code base
> by 130% I expected everybody to understand that it is nothing short of
> a catastrophe:
>
> https://stackoverflow.com/questions/184071/when-if-ever-is-number-of-lines-of-code-a-useful-metric
>
> When I said that it looked auto-generated I thought it would be
> treated as a red flag to be verified, not just an observation that can
> be ignored. You have to have very good reasons to commit
> auto-generated code (instead of just the generator), and
> puppet-logstash didn't have them. Even worse, it didn't even include
> the generator script (although commit history looks as if they kept
> using it after the initial commit).
>
> The fact that current version of puppet-logstash is able to do the
> same task in merely 673 lines of Puppet code instead of 20790 is by
> itself a MASSIVE improvement. If we commit the old version now, and
> give up on tracking upstream, we miss this improvement and are stuck
> maintaining an obsolete 20KLOC version for years. It will not be
> pleasant. If at some point we give up and try to move to a newer
> upstream version, we'll have to deal with the fact that upstream had
> to do a full rewrite of the manifests. And we would have to spend even
> more effort migrating to the new upstream version in the future that
> we would now, because we'll have accumulated more Fuel specific
> changes that will have to be redone essentially from scratch.
>
> I don't see what better example of needing a newer upstream version
> you may need. There may be cases where it makes sense to postpone
> merging latest upstream, but this is certainly not one of them.
> Especially since it's not merged yet.
>
>
> On Fri, Mar 28, 2014 at 5:36 AM, Vladimir Kuklin <vkuklin@xxxxxxxxxxxx>
> wrote:
> > Guys, I suggest a hybrid approach.
> >
> > Let's just specify what do we want. Let me write these requirements down
> in
> > some kind of optimization problem
> >
> > We want to be as close to upstream modules as possible, subject to the
> fact
> > that our code should satisfy all the functional requirements for
> particular
> > features.
> >
> > In this case I would assume we need the following to do:
> >
> > 1) For any request require from the requester to submit the original
> module.
> > Not the current upstream HEAD, but his head where his own development
> > branched from.
> > 2) Then the developer must submit the diverged code. As long as it meets
> > functional requirements, it should be OK to use it in FUEL
> > 3) Set upstream merging task as a "stretch goal" and strive to be as
> close
> > to upstream as possible.
> >
> > There should almost no cult of "upstream" code in our project, I think.
> > Because if there is a code piece that is not working in the upstream and
> we
> > fix it and upstream community does not accept it, then we have nothing
> to do
> > but maintain our own fork. Thus, let's split requests according to this
> > approach and request diverging changes. If they work - let's merge them
> and
> > leave upstream merging as background task.
> >
> >
> > On Fri, Mar 28, 2014 at 12:08 PM, Bogdan Dobrelya <
> bdobrelia@xxxxxxxxxxxx>
> > wrote:
> >>
> >> On 03/19/2014 07:45 PM, Dmitry Borodaenko wrote:
> >> > On Wed, Mar 19, 2014 at 4:56 AM, Bogdan Dobrelya
> >> > <bdobrelia@xxxxxxxxxxxx> wrote:
> >> >> On 03/19/2014 01:31 PM, Dmitry Nikishov wrote:
> >> >>> Actually, zabbix module doesn't come from any upstream repo, it's
> our
> >> >>> own reimplementation of the module by PL team, which was based on
> >> >>> upstream code. In Zabbix thread it was suggested that we split it
> into
> >> >>> multiple commits instead of trying to push the whole thing at once.
> >> >
> >> > It can't be both our own reimplementation and based on upstream code
> >> > at the same time. If it's not rewritten from scratch, you should track
> >> > down the original upstream version and start your commit series from
> >> > that.
> >> >
> >> >>> It's unclear how it will affect internally developed modules like
> >> >>> zabbix
> >> >>> one. Should there be any at all? Or should we make a public repo
> with
> >> >>> that module first and then try to include it into fuel-library?
> >> >
> >> > That's a very good question. Personally, I think that we should prefer
> >> > the second option most of the time: create a public repo for the
> >> > standalone module, and use that as upstream for the copy of that
> >> > module in fuel-library. It will make it more likely that this module
> >> > will attract other engineers who will help us find and fix bugs in
> >> > this code, and eventually even add more new features that we'd be able
> >> > to reuse in Fuel.
> >> >
> >> >> It is also not quite clear how to submit these team-specific changes.
> >> >> 1) E.g. I've submitted the puppet-logstash module from Apollo11 team
> >> >> (Poland) and that module was diverged from the original upstream
> >> >> version
> >> >> by the team. Now we have a two diverged versions - an upstream one
> and
> >> >> a
> >> >> submitted one.
> >> >
> >> > I think that you should present each hand-over as a separate commit:
> >> > {original upstream commit} -> {modifications from Apollo11 team} ->
> >> > {modifications to integrate with Fuel}. If you have the commit history
> >> > from Apollo11 and it's not too long, it would be nice to have all of
> >> > it instead of squashing it all into one commit (although if it's more
> >> > than a dozen commits it might now be worth the trouble of pulling each
> >> > commit through Gerrit).
> >> >
> >> >> 2) Should I submit the diverged commits rebased onto the upstream
> >> >> HEAD/stable version as a separate patchset which depends on the
> >> >> verbatim
> >> >> copy of HEAD/stable patchset? That could be a very bad idea, because
> >> >> rebasing might broke the module completely.
> >> >
> >> > Yes, I also think that would be a bad idea. Rebase is a kind of a
> >> > change, you don't want to combine that and other changes in a single
> >> > commit, or you loose ability to distinguish what were you changes and
> >> > what changed due to rebase.
> >> >
> >> >> 3) Should I do the same as (2) but use the common parent commit as an
> >> >> upstream base for verbatim copy instead? Despite on no more rebasing
> >> >> needed, that is not so good idea as well, because it would also
> >> >> complicate the  upstream sync contribution process, if any planned in
> >> >> the future.
> >> >
> >> > You're not going to be able to kill two rabbits with one stone here.
> >> > Either you significantly diverge from upstream, and keeping up will be
> >> > near impossible, or you keep Fuel specific changes minimal and well
> >> > isolated, and keeping up becomes simple.
> >> >
> >> > a) If you diverge, the best you can do is submit all your
> >> > non-Fuel-specific improvements to upstream (you will obviously need to
> >> > heavily modify them to decouple from Fuel specific code), and then
> >> > periodically (e.g. once per Fuel release) merge upstream changes back
> >> > into Fuel by hand. The further you deviate, the harder this process
> >> > becomes. It becomes even harder if you don't submit anything to
> >> > upstream, because there will be more changes to hand-port later.
> >> >
> >> > b) If you can isolate Fuel specific code, keeping up with upstream
> >> > becomes much easier. Create a fork of upstream repo on Github, create
> >> > a fuel branch in that fork, commit all changes for that module to the
> >> > fuel branch before submitting them to fuel-library. Submit non Fuel
> >> > specific changes to upstream (keep them on your fuel branch until they
> >> > are merged). Pulling a new upstream version into fuel-library becomes
> >> > a rebase of your fuel branch of the forked upstream onto the latest
> >> > upstream release, and then copying the result verbatim into
> >> > fuel-library. If Fuel specific code is well isolated, that rebase will
> >> > be trivial.
> >> >
> >> >> 4) So, looks like the only good option is to accept changes to the
> >> >> puppet modules which are only the sync requests from the upstream
> (see
> >> >> Openstack projects and Oslo) and never change them locally in the
> Fuel?
> >> >> But I'm afraid the Fuel puppet modules are not ready yet for such
> >> >> dramatical changes... Looks like we need a kind of Fuel-oslo ;)
> >> >
> >> > I think we're very far from being able to use this approach.
> >> >
> >>
> >> Dmitry, thank you - that is a good point that makes sense.
> >> But looks like we still have a decision-blocker for introducing any new
> >> modules for puppet in Fuel-library, if one was not synced with the
> >> upstream while in dev / PoC process (i.e. we silently follow the
> >> rejected options (2) and (4) despite on the fact we have near to 90% of
> >> puppet modules in Fuel are far more than 1 year outdated and never
> >> synced with upstream)
> >>
> >> Here is an example of such decision blockers:
> >> https://review.openstack.org/#/c/80025/
> >> https://review.openstack.org/#/c/80024/
> >>
> >> Hence, lets accept we have a blocker condition in introducing puppet
> >> modules for Fuel, lets decide how to deal with it once again and follow
> >> it in reviews as well.
>
> --
> Dmitry Borodaenko
>
> --
> Mailing list: https://launchpad.net/~fuel-dev
> Post to     : fuel-dev@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~fuel-dev
> More help   : https://help.launchpad.net/ListHelp
>



-- 
Andrew
Mirantis
Ceph community

Follow ups

References