
openstack team mailing list archive

Re: Multi-Cluster/Zone - Devil in the Details ...

 

On Wed, Feb 16, 2011 at 04:33:06PM -0500, Jay Pipes wrote:
> > While I would agree with this most of the time, there are some cases
> > where "optimizing later" just doesn't work. Or, "optimizing" means
> > rewriting everything you've done and replacing it with something
> > that will scale seamlessly. I've done this a fair share myself over
> > the years, and many of us have probably done it or seen it happen
> > elsewhere. I was hoping to use past experiences and foresight to
> > prevent a similar outcome with Nova.
> >
> > I'm not confident the current Nova database and messaging foundations
> > will hold up (even with optimizations) for the scale, security, and
> > user experience we are targeting. Spending more time on reworking those
> > foundations before diving right into implementing the distributed
> > aspects (multi-zone) is what I was trying to do and advocate. I
> > recognize I'm in the minority with this opinion and it doesn't seem
> > to be the route we'll take, so I hope to be proven wrong. :)
> 
> And I also raised concerns about performance of having Nova not
> understand the relationships of multi-tenancy. ;)

Yeah, understood, and I was thinking of that when writing my last
email. I understand sometimes we just need to defer to the more
popular opinion, no matter how strongly we feel. :)

> Unfortunately, I am still unclear as to what precisely you are
> proposing that Sandy change before going any further in his work.
> Could you be specific on what steps you feel Sandy et al should take
> that would eliminate your worry about scalability? What specifically
> are the foundations that you want to rework? And are these realistic
> to get done in Cactus?

The list of things was discussed at the design summit (which I know
not everyone was part of) and is outlined in the 'Design' section of:

http://wiki.openstack.org/DistributedScheduler

I mostly finished the first bullet point in Bexar and was working
on the second, but with the addition of features and DB schema changes
it was becoming a never-ending battle. Focusing on features is a bit
premature for the project; we should be focusing on foundation and
scalability, which I know Cactus is doing to some degree.

I still consider the multi-zone and distributed scheduler features a
bit premature. I believe we should first focus on full request
marshaling (not relying on the DB to pass details), not writing to the
DB from workers or API code, and creating a data aggregation channel/API
to use once we get to the zone work. The zone and distributed scheduler
issues we're discussing today would be pretty different (and I believe
easier) if this foundation work were in place.
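
To make the three pieces concrete, here is a rough sketch of the idea,
not actual Nova code. Every name in it (marshal_request, worker_handle,
aggregate, the message fields) is made up for illustration, and an
in-process queue stands in for a real messaging layer. The API side
marshals the full request into the message, the worker acts only on
that message and never touches the DB, and state changes flow back over
an aggregation channel to one place that would persist them.

```python
import json
import queue


def marshal_request(action, details):
    """Serialize the full request so the worker needs no DB lookup.

    Hypothetical helper: the message carries every detail the worker
    needs, instead of just an ID that must be resolved from the DB.
    """
    return json.dumps({"action": action, "details": details}, sort_keys=True)


def worker_handle(message, updates):
    """Hypothetical worker: uses only the message contents.

    Rather than writing to the DB, it publishes a state change on the
    aggregation channel (here a plain in-process queue).
    """
    request = json.loads(message)
    details = request["details"]
    updates.put({"instance": details["name"], "state": "running"})


def aggregate(updates):
    """Hypothetical aggregator: drains the channel into one state view.

    In a real deployment this would be the single component that
    persists state; here it just builds a dict.
    """
    state = {}
    while not updates.empty():
        event = updates.get()
        state[event["instance"]] = event["state"]
    return state


updates = queue.Queue()
msg = marshal_request(
    "run_instance", {"name": "vm-1", "memory_mb": 512, "vcpus": 1}
)
worker_handle(msg, updates)
result = aggregate(updates)
print(result)
```

With this split, only the aggregation side needs to know about the DB
schema, so schema changes would not ripple into the workers or the API
code.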

This may mean the zone/scheduler work is not finished for Cactus,
but I believe in the long run the investment will be worth it. As I
stated in my last email though, I'll defer to the larger group since
this doesn't seem to be the preferred route. :)

-Eric


