openstack team mailing list archive

Re: Remove Zones code - FFE

 

Thierry et al.,
Responses inline.

On 02/02/2012 06:03 AM, Thierry Carrez wrote:
Chris Behrens wrote:

Well, I can actually say with confidence that the replacement would be stable by the Essex release.  In fact, I expect the first commit to be a completely working solution that solves a number of problems with the original implementation.  I don't think there's any issue getting something committed by the 15th if there's not much bickering on the review.  The code is dead simple (currently a 500 line diff) and requires almost no modification of nova core.  The only modifications to nova core are:

Specify a different compute API class to use
Modify rpc code to allow some kwargs to Connection __init__ so you can specify a specific rabbit server to use to send a message to a zone
Add 2 new rpc methods:   cast_to_zone and call_to_zone (which use the above change)
Add a few zone_api.update_instance() calls in some places in compute so that we can push instance updates to the top-level zone.
Migration for zones DB table to add rpc credential information.
There's 1 thing that would be lost in what I'd propose:  zone scheduling would initially be random zone selection.
"Random" zone selection in the scheduler is no concern for us. In fact, we prefer it, since our strategy is more aligned with the "spread first" scheduling method.
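To illustrate what random zone selection means here, a minimal sketch follows. It is not taken from the proposed diff: pick_zone and the zone names are hypothetical, and the real scheduler code may look quite different.

```python
import random


def pick_zone(zones, rng=random):
    """Pick one child zone uniformly at random.

    Hypothetical helper for illustration only; not nova code.
    """
    zones = list(zones)
    if not zones:
        raise ValueError("no zones available")
    # No weighting or filtering: every zone is equally likely.
    return rng.choice(zones)


if __name__ == "__main__":
    # Zone names are made up for the example.
    print(pick_zone(["zone-a", "zone-b", "zone-c"]))
```

With uniform selection, load tends to spread across zones over many requests, which is why it sits reasonably close to a "spread first" strategy.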

Besides that, the rest of the code is standalone.  There's absolutely no concern that it'll make non-zones less stable.  The few zone_api.update_instance() type calls would be no-ops when zones is turned off.
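One way the "no-op when zones is turned off" behaviour could be arranged is sketched below. This is an assumption about the shape of the code, not the actual patch: NoopZoneAPI, RecordingZoneAPI and get_zone_api are hypothetical names, and the recording class only stands in for whatever would really send the update over rpc.

```python
class NoopZoneAPI:
    """Used when zones are disabled: accepts the call and does nothing."""

    def update_instance(self, instance_id, **updates):
        return None


class RecordingZoneAPI:
    """Stand-in for a zones-enabled API (hypothetical).

    A real implementation would push the update to the top-level zone;
    here the update is only recorded so the sketch stays self-contained.
    """

    def __init__(self):
        self.sent = []

    def update_instance(self, instance_id, **updates):
        self.sent.append((instance_id, updates))


def get_zone_api(zones_enabled):
    """Choose the implementation once, so callers never check the flag."""
    return RecordingZoneAPI() if zones_enabled else NoopZoneAPI()


if __name__ == "__main__":
    api = get_zone_api(zones_enabled=False)
    # Compute code can call this unconditionally; with zones off it is a no-op.
    api.update_instance("instance-1", vm_state="active")
```

The point of the design is that the compute code path is identical with zones on or off, so the non-zones case is not exposed to the new logic.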

It still introduces a bit of disturbance in the Force, if only by loss
of focus on bugfixing. From where I'm standing it boils down to how
broken the current zone code is... and how much better the replacement
code would be.

I'll admit I have trouble properly evaluating how functional zone code
is with my couple laptops, so I'll trust you guys (Chris, Vish,
Alejandro) on that: if the new code is significantly more functional
than the old one, and the whole thing does not really touch non-zone
code, I'd say go for it... by the 15th at the latest !

So, if Chris is confident he can merge the new zone code by 2/15, we'll be pleased to test it, in keeping with our philosophy of adopting new features early :D As we said before, today we cover our multizone needs across our datacenters with an in-house custom scheduling API. We want to deprecate it ASAP as we move to Essex, making direct calls to the parent zone (nova API) so that this is done natively.

For us, multizone is a MUST for taking Essex into production. So, Chris, if you can make it by 2/15, we will be testing all this new code one minute after the commit :D so we can give you as much feedback as you need almost immediately.

Vish, if multizone had actually worked since E-2, we were thinking about going into prod with E-3 and putting all our efforts there while we wait for 2012-1. This is not what we would prefer, but we didn't find any feature in Diablo more attractive than what Cactus offers, and the multizone code in Diablo was even more broken than in Essex. We have to strike a balance between what we need and what OpenStack actually gives us (in that case, we have to develop it ourselves without digging into the nova-core source code).

So it's definitely a +1 for merging the code by 2/15.

Best.
Alejandro.


