← Back to team overview

fuel-dev team mailing list archive

Re: Propose new rules for bringing in external puppet modules

 

On 03/28/2014 11:21 PM, Andrew Woodward wrote:
> OK, we are slightly off topic here and I want to steer it back.
> 
> The point isn't that we _will_ track upstream the point is that if we
> don't change we _cant_ track upstream and we _cant_ review large commits
> like this. 
> 
> This new processes wont give a damn about where you took from the
> upstream, It will leave a marker for others to be able to have a chance
> in hell of making a qualified review of the module and where you took it
> from as well as separately review you changes.
> 
> Now what has occurred with https://review.openstack.org/#/c/80024/ is
> that Dimitry has reviewed it under the guise of 
> 
>     1. Create a review request with a verbatim copy of the upstream
>     module and no other related modifications. This review should also
>     contain the commit hash from the upstream repo in the commit
>     message. The review should be looked over for reasons for rejecting
>     the entire module, such as license issues. Forbearing such issues it
>     should be accepted with out requiring modifications. 
> 
> 
> Due to this review, Dimitry has found a good reason for "rejecting the
> entire module" as the point that module was pulled from has alot of
> wasted lines that the newer version replaces. It's now necessary to
> justify why we should except this crap over the newer and cleaner
> version upstream. 
> 
> Wile we should track and keep close to upstream or use only upstream, I
> know we need to take steps. This is why I only propose that we separate
> out the upstream module pull from your own development, it gives us:
> * improved view of what you want to add
> * improved view of what had to be done to "fuel" the module
> * a clear scope of review instead of blindly adding modules
> * prevents you from being the blame when the code clearly came from upstream
> * review for really bad upstream modules (license, code quality, ...)
> * a chance (not requirement) to diff our changes from the upstream and
> propose changes back upstream
> 
> For now we can just stick with improved review process. As we get better
> we can start working on upstream's more.
> 
> So to re-iterate the rules I will update on our docs / wiki
> with https://etherpad.openstack.org/p/adding_fuel_lib_modules 

Please update the https://wiki.openstack.org/wiki/Fuel for appropriate
contribution workflow. For today it is still not clear how to contribute
to the Fuel projects, e.g. Puppet manifests for Fuel-library.

Here is an another example of stuck for MongoDB support:
1) We have a verbatim copy of upstream for puppetlabs-mongodb
https://review.openstack.org/#/c/85350/
2) We have a feature team changes for Fuel as a separate patch:
https://review.openstack.org/#/c/71901/
3) Our Ceilometer dev team is not aware of this discussion - probably
because of lack of documentation at the wiki pages.


> 
> 
> On Fri, Mar 28, 2014 at 12:41 PM, Dmitry Borodaenko
> <dborodaenko@xxxxxxxxxxxx <mailto:dborodaenko@xxxxxxxxxxxx>> wrote:
> 
>     No. The fact that we failed miserably at tracking upstream so far is
>     not a valid excuse for refusing to even try:
>     http://hakanforss.files.wordpress.com/2014/03/are-you-too-busy-to-improve1.png
> 
>     Vladimir, you optimization goal is flawed: it assumes that staying
>     close to upstream is a goal in itself. It is not a cult, there are
>     important reasons why we want to track upstream.
> 
>     Yes, sometimes we find and fix bugs in the upstream code, but it is
>     much more common for upstream to find and fix its own bugs. On
>     average, we loose code quality over time, and that causes us to lose
>     velocity by wasting efforts on fixing bugs that were already addressed
>     by upstream, or bugs that were introduced because our code is poorly
>     structured and full of ugly hacks that upstream would have done (or
>     already did) better.
> 
>     Staying close to upstream and submitting your fixes to them on a
>     regular basis also makes it more likely that upstream developers would
>     be able and willing to help you. Right now we can't report bugs
>     upstream (because we've diverged too far), we can't submit patches
>     (often because we do things in a Fuel specific way that would be
>     unacceptable to upstream), so we're always on our own when we have
>     problems with any of the Puppet code in fuel-library.
> 
>     In short, assuming that we can do a better job alone than in
>     collaboration with all our upstreams is just arrogant. We're not that
>     good and there's not that many of us.
> 
>     Bogdan, your logstash commit is such an obvious illustration of the
>     above that I am very surprised that I have to explain it.
> 
>     When I said that merging this patch will increase our Puppet code base
>     by 130% I expected everybody to understand that it is nothing short of
>     a catastrophe:
>     https://stackoverflow.com/questions/184071/when-if-ever-is-number-of-lines-of-code-a-useful-metric
> 
>     When I said that it looked auto-generated I thought it would be
>     treated as a red flag to be verified, not just an observation that can
>     be ignored. You have to have very good reasons to commit
>     auto-generated code (instead of just the generator), and
>     puppet-logstash didn't have them. Even worse, it didn't even include
>     the generator script (although commit history looks as if they kept
>     using it after the initial commit).
> 
>     The fact that current version of puppet-logstash is able to do the
>     same task in merely 673 lines of Puppet code instead of 20790 is by
>     itself a MASSIVE improvement. If we commit the old version now, and
>     give up on tracking upstream, we miss this improvement and are stuck
>     maintaining an obsolete 20KLOC version for years. It will not be
>     pleasant. If at some point we give up and try to move to a newer
>     upstream version, we'll have to deal with the fact that upstream had
>     to do a full rewrite of the manifests. And we would have to spend even
>     more effort migrating to the new upstream version in the future that
>     we would now, because we'll have accumulated more Fuel specific
>     changes that will have to be redone essentially from scratch.
> 
>     I don't see what better example of needing a newer upstream version
>     you may need. There may be cases where it makes sense to postpone
>     merging latest upstream, but this is certainly not one of them.
>     Especially since it's not merged yet.
> 
> 
>     On Fri, Mar 28, 2014 at 5:36 AM, Vladimir Kuklin
>     <vkuklin@xxxxxxxxxxxx <mailto:vkuklin@xxxxxxxxxxxx>> wrote:
>     > Guys, I suggest a hybrid approach.
>     >
>     > Let's just specify what do we want. Let me write these
>     requirements down in
>     > some kind of optimization problem
>     >
>     > We want to be as close to upstream modules as possible, subject to
>     the fact
>     > that our code should satisfy all the functional requirements for
>     particular
>     > features.
>     >
>     > In this case I would assume we need the following to do:
>     >
>     > 1) For any request require from the requester to submit the
>     original module.
>     > Not the current upstream HEAD, but his head where his own development
>     > branched from.
>     > 2) Then the developer must submit the diverged code. As long as it
>     meets
>     > functional requirements, it should be OK to use it in FUEL
>     > 3) Set upstream merging task as a "stretch goal" and strive to be
>     as close
>     > to upstream as possible.
>     >
>     > There should almost no cult of "upstream" code in our project, I
>     think.
>     > Because if there is a code piece that is not working in the
>     upstream and we
>     > fix it and upstream community does not accept it, then we have
>     nothing to do
>     > but maintain our own fork. Thus, let's split requests according to
>     this
>     > approach and request diverging changes. If they work - let's merge
>     them and
>     > leave upstream merging as background task.
>     >
>     >
>     > On Fri, Mar 28, 2014 at 12:08 PM, Bogdan Dobrelya
>     <bdobrelia@xxxxxxxxxxxx <mailto:bdobrelia@xxxxxxxxxxxx>>
>     > wrote:
>     >>
>     >> On 03/19/2014 07:45 PM, Dmitry Borodaenko wrote:
>     >> > On Wed, Mar 19, 2014 at 4:56 AM, Bogdan Dobrelya
>     >> > <bdobrelia@xxxxxxxxxxxx <mailto:bdobrelia@xxxxxxxxxxxx>> wrote:
>     >> >> On 03/19/2014 01:31 PM, Dmitry Nikishov wrote:
>     >> >>> Actually, zabbix module doesn't come from any upstream repo,
>     it's our
>     >> >>> own reimplementation of the module by PL team, which was based on
>     >> >>> upstream code. In Zabbix thread it was suggested that we
>     split it into
>     >> >>> multiple commits instead of trying to push the whole thing at
>     once.
>     >> >
>     >> > It can't be both our own reimplementation and based on upstream
>     code
>     >> > at the same time. If it's not rewritten from scratch, you
>     should track
>     >> > down the original upstream version and start your commit series
>     from
>     >> > that.
>     >> >
>     >> >>> It's unclear how it will affect internally developed modules like
>     >> >>> zabbix
>     >> >>> one. Should there be any at all? Or should we make a public
>     repo with
>     >> >>> that module first and then try to include it into fuel-library?
>     >> >
>     >> > That's a very good question. Personally, I think that we should
>     prefer
>     >> > the second option most of the time: create a public repo for the
>     >> > standalone module, and use that as upstream for the copy of that
>     >> > module in fuel-library. It will make it more likely that this
>     module
>     >> > will attract other engineers who will help us find and fix bugs in
>     >> > this code, and eventually even add more new features that we'd
>     be able
>     >> > to reuse in Fuel.
>     >> >
>     >> >> It is also not quite clear how to submit these team-specific
>     changes.
>     >> >> 1) E.g. I've submitted the puppet-logstash module from
>     Apollo11 team
>     >> >> (Poland) and that module was diverged from the original upstream
>     >> >> version
>     >> >> by the team. Now we have a two diverged versions - an upstream
>     one and
>     >> >> a
>     >> >> submitted one.
>     >> >
>     >> > I think that you should present each hand-over as a separate
>     commit:
>     >> > {original upstream commit} -> {modifications from Apollo11 team} ->
>     >> > {modifications to integrate with Fuel}. If you have the commit
>     history
>     >> > from Apollo11 and it's not too long, it would be nice to have
>     all of
>     >> > it instead of squashing it all into one commit (although if
>     it's more
>     >> > than a dozen commits it might now be worth the trouble of
>     pulling each
>     >> > commit through Gerrit).
>     >> >
>     >> >> 2) Should I submit the diverged commits rebased onto the upstream
>     >> >> HEAD/stable version as a separate patchset which depends on the
>     >> >> verbatim
>     >> >> copy of HEAD/stable patchset? That could be a very bad idea,
>     because
>     >> >> rebasing might broke the module completely.
>     >> >
>     >> > Yes, I also think that would be a bad idea. Rebase is a kind of a
>     >> > change, you don't want to combine that and other changes in a
>     single
>     >> > commit, or you loose ability to distinguish what were you
>     changes and
>     >> > what changed due to rebase.
>     >> >
>     >> >> 3) Should I do the same as (2) but use the common parent
>     commit as an
>     >> >> upstream base for verbatim copy instead? Despite on no more
>     rebasing
>     >> >> needed, that is not so good idea as well, because it would also
>     >> >> complicate the  upstream sync contribution process, if any
>     planned in
>     >> >> the future.
>     >> >
>     >> > You're not going to be able to kill two rabbits with one stone
>     here.
>     >> > Either you significantly diverge from upstream, and keeping up
>     will be
>     >> > near impossible, or you keep Fuel specific changes minimal and well
>     >> > isolated, and keeping up becomes simple.
>     >> >
>     >> > a) If you diverge, the best you can do is submit all your
>     >> > non-Fuel-specific improvements to upstream (you will obviously
>     need to
>     >> > heavily modify them to decouple from Fuel specific code), and then
>     >> > periodically (e.g. once per Fuel release) merge upstream
>     changes back
>     >> > into Fuel by hand. The further you deviate, the harder this process
>     >> > becomes. It becomes even harder if you don't submit anything to
>     >> > upstream, because there will be more changes to hand-port later.
>     >> >
>     >> > b) If you can isolate Fuel specific code, keeping up with upstream
>     >> > becomes much easier. Create a fork of upstream repo on Github,
>     create
>     >> > a fuel branch in that fork, commit all changes for that module
>     to the
>     >> > fuel branch before submitting them to fuel-library. Submit non Fuel
>     >> > specific changes to upstream (keep them on your fuel branch
>     until they
>     >> > are merged). Pulling a new upstream version into fuel-library
>     becomes
>     >> > a rebase of your fuel branch of the forked upstream onto the latest
>     >> > upstream release, and then copying the result verbatim into
>     >> > fuel-library. If Fuel specific code is well isolated, that
>     rebase will
>     >> > be trivial.
>     >> >
>     >> >> 4) So, looks like the only good option is to accept changes to the
>     >> >> puppet modules which are only the sync requests from the
>     upstream (see
>     >> >> Openstack projects and Oslo) and never change them locally in
>     the Fuel?
>     >> >> But I'm afraid the Fuel puppet modules are not ready yet for such
>     >> >> dramatical changes... Looks like we need a kind of Fuel-oslo ;)
>     >> >
>     >> > I think we're very far from being able to use this approach.
>     >> >
>     >>
>     >> Dmitry, thank you - that is a good point that makes sense.
>     >> But looks like we still have a decision-blocker for introducing
>     any new
>     >> modules for puppet in Fuel-library, if one was not synced with the
>     >> upstream while in dev / PoC process (i.e. we silently follow the
>     >> rejected options (2) and (4) despite on the fact we have near to
>     90% of
>     >> puppet modules in Fuel are far more than 1 year outdated and never
>     >> synced with upstream)
>     >>
>     >> Here is an example of such decision blockers:
>     >> https://review.openstack.org/#/c/80025/
>     >> https://review.openstack.org/#/c/80024/
>     >>
>     >> Hence, lets accept we have a blocker condition in introducing puppet
>     >> modules for Fuel, lets decide how to deal with it once again and
>     follow
>     >> it in reviews as well.
> 
>     --
>     Dmitry Borodaenko
> 
>     --
>     Mailing list: https://launchpad.net/~fuel-dev
>     Post to     : fuel-dev@xxxxxxxxxxxxxxxxxxx
>     <mailto:fuel-dev@xxxxxxxxxxxxxxxxxxx>
>     Unsubscribe : https://launchpad.net/~fuel-dev
>     More help   : https://help.launchpad.net/ListHelp
> 
> 
> 
> 
> -- 
> Andrew
> Mirantis
> Ceph community
> 
> 
> This body part will be downloaded on demand.
> 


-- 
Best regards,
Bogdan Dobrelya,
Skype #bogdando_at_yahoo.com
Irc #bogdando


Follow ups

References