fuel-dev team mailing list archive

Thread
Date
Re: Propose new rules for bringing in external puppet modules

To: Dmitry Borodaenko <dborodaenko@xxxxxxxxxxxx>
From: Vladimir Kuklin <vkuklin@xxxxxxxxxxxx>
Date: Sat, 29 Mar 2014 01:05:12 +0400
Cc: "fuel-dev@xxxxxxxxxxxxxxxxxxx" <fuel-dev@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <CAM0pNLNUeGSJj1XN25wqC7ECqi5QTVDKObEUgMuWn8heZhUcRw@mail.gmail.com>
Dmitry,

I do completely agree with you that using upstream code is the preferred
way. I was just saying, that our main goal is to deliver the product that
satisfies users' requirements. This means, that we need to find a
compromise between staying close to the upstream and delivering features.
In fact, I've being a strong adherent of using upstream code all the time,
and this is one of our main purposes. This will significantly increase our
engineering velocity. I am completely for that.
All I am saying is that it is possible to pursue both goals in parallel. We
are moving to using granular roles in our upcoming releases, thus allowing
us to isolate bad/inflated code pieces and striping them one by one for
each module. Thus it will allow us to finally come to the
as-close-as-possible-to-upstream code and retain feature delivery in time.
Stopping our ongoing activities and pursuing the goal of upstream code
reusage alone will not make users happy.

My suggestion is that we do concentrate on the upstream, start all the
activites possible (and required) to refactor the code that is not so close
to upstream. Let's do some discussion on logstash module for example and
find a way how to make it closer to the upstream.

Starting by now let's assume that we do not accept modules that are too far
from upstream, require upstream commit to be specified in an initial commit
of the module and commit small diverging pieces one by one.



On Fri, Mar 28, 2014 at 11:41 PM, Dmitry Borodaenko <
dborodaenko@xxxxxxxxxxxx> wrote:

> No. The fact that we failed miserably at tracking upstream so far is
> not a valid excuse for refusing to even try:
>
> http://hakanforss.files.wordpress.com/2014/03/are-you-too-busy-to-improve1.png
>
> Vladimir, you optimization goal is flawed: it assumes that staying
> close to upstream is a goal in itself. It is not a cult, there are
> important reasons why we want to track upstream.
>
> Yes, sometimes we find and fix bugs in the upstream code, but it is
> much more common for upstream to find and fix its own bugs. On
> average, we loose code quality over time, and that causes us to lose
> velocity by wasting efforts on fixing bugs that were already addressed
> by upstream, or bugs that were introduced because our code is poorly
> structured and full of ugly hacks that upstream would have done (or
> already did) better.
>
> Staying close to upstream and submitting your fixes to them on a
> regular basis also makes it more likely that upstream developers would
> be able and willing to help you. Right now we can't report bugs
> upstream (because we've diverged too far), we can't submit patches
> (often because we do things in a Fuel specific way that would be
> unacceptable to upstream), so we're always on our own when we have
> problems with any of the Puppet code in fuel-library.
>
> In short, assuming that we can do a better job alone than in
> collaboration with all our upstreams is just arrogant. We're not that
> good and there's not that many of us.
>
> Bogdan, your logstash commit is such an obvious illustration of the
> above that I am very surprised that I have to explain it.
>
> When I said that merging this patch will increase our Puppet code base
> by 130% I expected everybody to understand that it is nothing short of
> a catastrophe:
>
> https://stackoverflow.com/questions/184071/when-if-ever-is-number-of-lines-of-code-a-useful-metric
>
> When I said that it looked auto-generated I thought it would be
> treated as a red flag to be verified, not just an observation that can
> be ignored. You have to have very good reasons to commit
> auto-generated code (instead of just the generator), and
> puppet-logstash didn't have them. Even worse, it didn't even include
> the generator script (although commit history looks as if they kept
> using it after the initial commit).
>
> The fact that current version of puppet-logstash is able to do the
> same task in merely 673 lines of Puppet code instead of 20790 is by
> itself a MASSIVE improvement. If we commit the old version now, and
> give up on tracking upstream, we miss this improvement and are stuck
> maintaining an obsolete 20KLOC version for years. It will not be
> pleasant. If at some point we give up and try to move to a newer
> upstream version, we'll have to deal with the fact that upstream had
> to do a full rewrite of the manifests. And we would have to spend even
> more effort migrating to the new upstream version in the future that
> we would now, because we'll have accumulated more Fuel specific
> changes that will have to be redone essentially from scratch.
>
> I don't see what better example of needing a newer upstream version
> you may need. There may be cases where it makes sense to postpone
> merging latest upstream, but this is certainly not one of them.
> Especially since it's not merged yet.
>
>
> On Fri, Mar 28, 2014 at 5:36 AM, Vladimir Kuklin <vkuklin@xxxxxxxxxxxx>
> wrote:
> > Guys, I suggest a hybrid approach.
> >
> > Let's just specify what do we want. Let me write these requirements down
> in
> > some kind of optimization problem
> >
> > We want to be as close to upstream modules as possible, subject to the
> fact
> > that our code should satisfy all the functional requirements for
> particular
> > features.
> >
> > In this case I would assume we need the following to do:
> >
> > 1) For any request require from the requester to submit the original
> module.
> > Not the current upstream HEAD, but his head where his own development
> > branched from.
> > 2) Then the developer must submit the diverged code. As long as it meets
> > functional requirements, it should be OK to use it in FUEL
> > 3) Set upstream merging task as a "stretch goal" and strive to be as
> close
> > to upstream as possible.
> >
> > There should almost no cult of "upstream" code in our project, I think.
> > Because if there is a code piece that is not working in the upstream and
> we
> > fix it and upstream community does not accept it, then we have nothing
> to do
> > but maintain our own fork. Thus, let's split requests according to this
> > approach and request diverging changes. If they work - let's merge them
> and
> > leave upstream merging as background task.
> >
> >
> > On Fri, Mar 28, 2014 at 12:08 PM, Bogdan Dobrelya <
> bdobrelia@xxxxxxxxxxxx>
> > wrote:
> >>
> >> On 03/19/2014 07:45 PM, Dmitry Borodaenko wrote:
> >> > On Wed, Mar 19, 2014 at 4:56 AM, Bogdan Dobrelya
> >> > <bdobrelia@xxxxxxxxxxxx> wrote:
> >> >> On 03/19/2014 01:31 PM, Dmitry Nikishov wrote:
> >> >>> Actually, zabbix module doesn't come from any upstream repo, it's
> our
> >> >>> own reimplementation of the module by PL team, which was based on
> >> >>> upstream code. In Zabbix thread it was suggested that we split it
> into
> >> >>> multiple commits instead of trying to push the whole thing at once.
> >> >
> >> > It can't be both our own reimplementation and based on upstream code
> >> > at the same time. If it's not rewritten from scratch, you should track
> >> > down the original upstream version and start your commit series from
> >> > that.
> >> >
> >> >>> It's unclear how it will affect internally developed modules like
> >> >>> zabbix
> >> >>> one. Should there be any at all? Or should we make a public repo
> with
> >> >>> that module first and then try to include it into fuel-library?
> >> >
> >> > That's a very good question. Personally, I think that we should prefer
> >> > the second option most of the time: create a public repo for the
> >> > standalone module, and use that as upstream for the copy of that
> >> > module in fuel-library. It will make it more likely that this module
> >> > will attract other engineers who will help us find and fix bugs in
> >> > this code, and eventually even add more new features that we'd be able
> >> > to reuse in Fuel.
> >> >
> >> >> It is also not quite clear how to submit these team-specific changes.
> >> >> 1) E.g. I've submitted the puppet-logstash module from Apollo11 team
> >> >> (Poland) and that module was diverged from the original upstream
> >> >> version
> >> >> by the team. Now we have a two diverged versions - an upstream one
> and
> >> >> a
> >> >> submitted one.
> >> >
> >> > I think that you should present each hand-over as a separate commit:
> >> > {original upstream commit} -> {modifications from Apollo11 team} ->
> >> > {modifications to integrate with Fuel}. If you have the commit history
> >> > from Apollo11 and it's not too long, it would be nice to have all of
> >> > it instead of squashing it all into one commit (although if it's more
> >> > than a dozen commits it might now be worth the trouble of pulling each
> >> > commit through Gerrit).
> >> >
> >> >> 2) Should I submit the diverged commits rebased onto the upstream
> >> >> HEAD/stable version as a separate patchset which depends on the
> >> >> verbatim
> >> >> copy of HEAD/stable patchset? That could be a very bad idea, because
> >> >> rebasing might broke the module completely.
> >> >
> >> > Yes, I also think that would be a bad idea. Rebase is a kind of a
> >> > change, you don't want to combine that and other changes in a single
> >> > commit, or you loose ability to distinguish what were you changes and
> >> > what changed due to rebase.
> >> >
> >> >> 3) Should I do the same as (2) but use the common parent commit as an
> >> >> upstream base for verbatim copy instead? Despite on no more rebasing
> >> >> needed, that is not so good idea as well, because it would also
> >> >> complicate the  upstream sync contribution process, if any planned in
> >> >> the future.
> >> >
> >> > You're not going to be able to kill two rabbits with one stone here.
> >> > Either you significantly diverge from upstream, and keeping up will be
> >> > near impossible, or you keep Fuel specific changes minimal and well
> >> > isolated, and keeping up becomes simple.
> >> >
> >> > a) If you diverge, the best you can do is submit all your
> >> > non-Fuel-specific improvements to upstream (you will obviously need to
> >> > heavily modify them to decouple from Fuel specific code), and then
> >> > periodically (e.g. once per Fuel release) merge upstream changes back
> >> > into Fuel by hand. The further you deviate, the harder this process
> >> > becomes. It becomes even harder if you don't submit anything to
> >> > upstream, because there will be more changes to hand-port later.
> >> >
> >> > b) If you can isolate Fuel specific code, keeping up with upstream
> >> > becomes much easier. Create a fork of upstream repo on Github, create
> >> > a fuel branch in that fork, commit all changes for that module to the
> >> > fuel branch before submitting them to fuel-library. Submit non Fuel
> >> > specific changes to upstream (keep them on your fuel branch until they
> >> > are merged). Pulling a new upstream version into fuel-library becomes
> >> > a rebase of your fuel branch of the forked upstream onto the latest
> >> > upstream release, and then copying the result verbatim into
> >> > fuel-library. If Fuel specific code is well isolated, that rebase will
> >> > be trivial.
> >> >
> >> >> 4) So, looks like the only good option is to accept changes to the
> >> >> puppet modules which are only the sync requests from the upstream
> (see
> >> >> Openstack projects and Oslo) and never change them locally in the
> Fuel?
> >> >> But I'm afraid the Fuel puppet modules are not ready yet for such
> >> >> dramatical changes... Looks like we need a kind of Fuel-oslo ;)
> >> >
> >> > I think we're very far from being able to use this approach.
> >> >
> >>
> >> Dmitry, thank you - that is a good point that makes sense.
> >> But looks like we still have a decision-blocker for introducing any new
> >> modules for puppet in Fuel-library, if one was not synced with the
> >> upstream while in dev / PoC process (i.e. we silently follow the
> >> rejected options (2) and (4) despite on the fact we have near to 90% of
> >> puppet modules in Fuel are far more than 1 year outdated and never
> >> synced with upstream)
> >>
> >> Here is an example of such decision blockers:
> >> https://review.openstack.org/#/c/80025/
> >> https://review.openstack.org/#/c/80024/
> >>
> >> Hence, lets accept we have a blocker condition in introducing puppet
> >> modules for Fuel, lets decide how to deal with it once again and follow
> >> it in reviews as well.
>
> --
> Dmitry Borodaenko
>



-- 
Yours Faithfully,
Vladimir Kuklin,
Fuel Library Tech Lead,
Mirantis, Inc.
+7 (495) 640-49-04
+7 (926) 702-39-68
Skype kuklinvv
45bk3, Vorontsovskaya Str.
Moscow, Russia,
www.mirantis.com <http://www.mirantis.ru/>
www.mirantis.ru
vkuklin@xxxxxxxxxxxx
References

Propose new rules for bringing in external puppet modules
From: Andrew Woodward, 2014-03-18
Re: Propose new rules for bringing in external puppet modules
From: Dmitry Borodaenko, 2014-03-19
Re: Propose new rules for bringing in external puppet modules
From: Dmitry Nikishov, 2014-03-19
Re: Propose new rules for bringing in external puppet modules
From: Bogdan Dobrelya, 2014-03-19
Re: Propose new rules for bringing in external puppet modules
From: Dmitry Borodaenko, 2014-03-19
Re: Propose new rules for bringing in external puppet modules
From: Bogdan Dobrelya, 2014-03-28
Re: Propose new rules for bringing in external puppet modules
From: Vladimir Kuklin, 2014-03-28
Re: Propose new rules for bringing in external puppet modules
From: Dmitry Borodaenko, 2014-03-28