nova team mailing list archive

Re: ORM Refactor

To: Justin Santa Barbara <justin@xxxxxxxxxxxx>
From: Jesse Andrews <anotherjesse@xxxxxxxxx>
Date: Fri, 10 Sep 2010 10:57:45 -0700
Cc: nova@xxxxxxxxxxxxxxxxxxx
In-reply-to: <AANLkTi=6kKRSaCegP68QV=-MzA4HeyayNnEt6gqbrgB+@mail.gmail.com>
To be clear, we aren't giving up on redis or nosql.

Both the intermediate layer for the data api & the choice of not using
the sqlalchemy models outside of data api help contain the "battle"
between sql/nosql to the data access layer.

The objects passed back from the data api are treated as
dictionaries, not sqlalchemy objects.
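
A minimal sketch of that containment, assuming hypothetical names
(DataAPI, DictBackend, RowBackend, volume_create, volume_get) that are
not taken from the Nova tree. Callers only ever see plain dictionaries,
whichever kind of store sits behind the data api:

```python
# Hypothetical sketch: these classes and method names are illustrative
# assumptions, not the actual Nova data api.


class DictBackend:
    """In-memory stand-in for a key-value store such as Redis."""

    def __init__(self):
        self._volumes = {}

    def volume_create(self, values):
        volume = dict(values)
        self._volumes[volume["id"]] = volume
        return dict(volume)  # hand back a copy, never internal state

    def volume_get(self, volume_id):
        return dict(self._volumes[volume_id])


class RowBackend:
    """Stand-in for an ORM-backed store; rows are objects internally."""

    class _Row:
        def __init__(self, **columns):
            self.__dict__.update(columns)

    def __init__(self):
        self._rows = []

    def volume_create(self, values):
        row = self._Row(**values)
        self._rows.append(row)
        return dict(vars(row))  # convert the row object to a dict

    def volume_get(self, volume_id):
        for row in self._rows:
            if row.id == volume_id:
                return dict(vars(row))
        raise KeyError(volume_id)


class DataAPI:
    """Intermediate layer: the only thing the rest of the code imports."""

    def __init__(self, backend):
        self._backend = backend

    def volume_create(self, values):
        return self._backend.volume_create(values)

    def volume_get(self, volume_id):
        return self._backend.volume_get(volume_id)


if __name__ == "__main__":
    for backend in (DictBackend(), RowBackend()):
        api = DataAPI(backend)
        api.volume_create({"id": "vol-1", "size": 10})
        print(type(backend).__name__, api.volume_get("vol-1"))
```

Because the row objects never leave RowBackend, swapping one backend for
the other would not require touching any caller.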

Jesse

On Fri, Sep 10, 2010 at 10:51 AM, Justin Santa Barbara
<justin@xxxxxxxxxxxx> wrote:
> I did some early work on abstracting out the data store.  There
> were several problems with the Redis implementation:
>
> * It seemed clear to me that we were effectively re-implementing a
>   relational database on top of Redis.  For example, there were secondary
>   indexes that needed to be maintained by hand.
> * Several operations that needed to be done atomically were not being done
>   atomically, so the code was not technically correct and data integrity
>   was suspect.  (I suppose this is a sub-point of the first point)
> * The Redis code was very vulnerable to the 1+N select problem - when
>   selecting a group of objects, we would do one select to get the list of
>   IDs, and then a further select to get each object by ID.
> * The schema-less nature made me very uncomfortable; I felt that as the
>   project grew this would become unsustainable and a huge source of bugs,
>   particularly in version migrations.
> * It seemed that reporting against Redis would be difficult.  Some
>   unfortunate developer would therefore have to code up reports against
>   Redis, instead of just being able to run SQL queries or point something
>   like Excel or Crystal Reports at it
>   (http://blog.koehntopp.de/uploads/mapreduce.png)
>
> It seems to me that the only user to have yet deployed Redis in production
> (NASA) has decided it's unsuitable; that technically Redis is
> not-fit-for-(our)-purpose for the reasons above; that private (enterprise)
> clouds will prefer using traditional databases with which they are
> comfortable.  So it seems the only potential use case for Redis is public
> clouds (Rackspace), for reasons of scalability.
>
> My real hope was that we would be able to have both Redis and SQL
> implementations, and we'd show that not only did Redis have all these
> problems, but we didn't get anything in return: it would be both slower
> (because of 1+N) and less scalable (because of the need to keep all the keys
> in memory); we'd then deprecate Redis.  However, we need to stay focused on
> Nova and not proving a SQL/NoSQL point - if we know what the outcome will
> be, let's just go with the right choice and not expend effort on what is
> likely to be a technical dead-end.  If someone wants to write a Redis
> back-end so that it can be benchmarked and deprecated, that's great;
> otherwise I think we should merge the patch and forget about NoSQL.
>
> If we let Redis get into V1, then we're stuck supporting it, and we'll have
> to solve all the above problems.  I would prefer that development effort be
> focused on building IaaS, not a relational DB on top of a key-value store.
>
> Justin
>
>
>
> On Fri, Sep 10, 2010 at 10:11 AM, Rick Clark <rick@xxxxxxxxxxxxx> wrote:
>>
>> Thanks, Jay.
>>
>> This covers my feelings pretty much as well.  I am concerned as well
>> that it is a 180 degree turn 3 weeks before feature freeze. I like the
>> abstraction, but I would like us to keep the support for redis.  I think
>> SQL is critical for the enterprise and private clouds, but at
>> Rackspace's scale, especially with regards to globalization, I think we
>> are going to need some kind of keystore.
>>
>> My feeling is that we put this in Austin +1 and add support for other
>> datastores.  That will also give us time to write up a blueprint and
>> have an in depth discussion about it at the summit in November.
>>
>> I have added this to the agenda for the next release meeting on Sept 14.
>>
>> Rick
>>
>> On 09/10/2010 11:56 AM, Jay Pipes wrote:
>> > Hi Vish,
>> >
>> > Such a large patch has taken me quite some time to digest.  There is a
>> > larger discussion on large patches without any specifications, but
>> > I'll leave that for a later time! :)
>> >
>> > I am torn on this one, mostly because I spent a bunch of time
>> > attempting to do the datastore refactoring myself (as did Justin Santa
>> > Barbara), and thus I know the dragons that live in this layer of the
>> > code :)
>> >
>> > One of the things that both Justin and I had tried was to keep an
>> > abstraction layer that would allow both NoSQL as well as SQL data
>> > stores to be used.  Unfortunately, it seems that this patch removes
>> > the ability to use ReDIS, among other NoSQL stores.  I think this is a
>> > mistake, and although I like much of the code in this patch, I was
>> > hoping that SQLAlchemy could be hidden behind an abstraction layer
>> > that would play nicely with the non-relational data stores.
>> >
>> > As this patch stands, we take a 180 degree turn away from NoSQL data
>> > stores and back into the relatively comfortable norms of the SQL
>> > databases.  While there's nothing particularly wrong with SQL
>> > databases (as you know, I'm a fan of many of them ;) ), I think that
>> > keeping non-relational data store capabilities is pretty critical.
>> >
>> > I had an email discussion with SQLAlchemy's Michael Bayer about
>> > SQLAlchemy's future with NoSQL data stores.  Although there is an
>> > issue in the SQLAlchemy trac system about this (see here:
>> > http://www.sqlalchemy.org/trac/ticket/1518), this module is unlikely
>> > to see the light of day in the next year or two.
>> >
>> > So...what to do?  There are at least four options I can see:
>> >
>> > 1) Go forward with this patch and add NoSQL stores back at some later
>> > time by ourselves
>> > 2) Go forward with this patch and wait until SQLAlchemy properly
>> > supports key value stores
>> > 3) Delay this patch until after the Austin release and have a larger
>> > discussion about it here and at the next summit
>> > 4) Go back to the drawing board and try again with a less ambitious
>> > set of patches that incrementally changes the way the data stores
>> > work.
>> >
>> > I'm personally on the fence.  I'd prefer to at least delay the patch
>> > until after Austin, but I understand there are now at least 4 branches
>> > that depend on this one, which makes things, well, a bit difficult.
>> >
>> > -jay
>> >
>> > On Tue, Aug 31, 2010 at 8:46 PM, Vishvananda Ishaya
>> > <vishvananda@xxxxxxxxx> wrote:
>> >> I've proposed a merge of the orm refactor branch that a large part of
>> >> the nasa/anso team has been working on.  I'm hoping everyone can pick
>> >> it apart and we end up with a really clean system that everyone likes.
>> >> I've copied the description of the change and issues below.  If the
>> >> mailing list debates get too complicated, we should just organize a
>> >> time to discuss it in IRC.
>> >>
>> >> Proposing merge to get feedback on orm refactoring. I am very
>> >> interested in feedback to all of these changes.
>> >>
>> >> This is a huge set of changes that touches almost all of the files.
>> >> I'm sure I have broken quite a bit, but better to take the plunge now
>> >> than to postpone this until later. The idea is to allow for pluggable
>> >> backends throughout the code.
>> >>
>> >> Brief Overview
>> >> For compute/volume/network, there are multiple classes:
>> >> service - responsible for rpc
>> >>   this currently uses the existing cast and call in rpc.py and a little
>> >>   bit of magic to call public methods on the manager class.
>> >>   each service also reports its state into the database every 10
>> >>   seconds
>> >> manager - responsible for managing respective object classes
>> >>   all the business logic for the classes goes here
>> >> db (db_driver) - responsible for abstracting database access
>> >> driver (domain_driver) - responsible for executing actual shell
>> >>   commands and implementation
>> >>
>> >> Compute hasn't been fully cleaned up, but to get an idea of how it
>> >> works, take a look at volume and network
>> >>
>> >> Known issues/Things to be done:
>> >>
>> >> * nova-api accesses db objects directly
>> >>   It seems cleaner to have only the managers dealing with their
>> >>   respective objects. This would mean code for 'run_instances' would
>> >>   move into the manager class and it would do the initial setup and
>> >>   cast out to the remote service
>> >>
>> >> * db code uses flat methods to define its interface
>> >>   In my mind this is a little prettier as an abstract base class, but
>> >>   driver loading code can load a module or a class. It works, so I'm
>> >>   not sure it needs to be changed but feel free to debate it.
>> >>
>> >> * Service classes have no code in them
>> >>   Not sure if this is a problem for people, but the magic of calling
>> >>   the manager's methods is done in the base class. We could remove the
>> >>   magic from the base class and explicitly wrap methods that we want to
>> >>   make available via rpc if this seems nasty.
>> >>
>> >> * AuthManager Projects/Users/Roles are not integrated into this system.
>> >>   In order for everything to live happily in the backend, we need some
>> >>   type of adaptor for LDAP
>> >>
>> >> * Context is not passed properly across rabbit
>> >>   Context should probably be changed to a simple dictionary so that it
>> >>   can be passed properly through the queue
>> >>
>> >> * No authorization checks on access to objects
>> >>   We need to decide on which layer auth checks should happen.
>> >>
>> >> * Some of the methods in ComputeManager need to be moved into other
>> >>   layers/managers
>> >> * Compute driver layer should be abstracted more cleanly
>> >> * Flat networking is untested and may need to be reworked
>> >> * Some of the api commands are not working yet
>> >> * Nova Swift Authentication needs to be refactored (Todd is working on
>> >>   this)
>> >>
>> >> _______________________________________________
>> >> Mailing list: https://launchpad.net/~nova
>> >> Post to     : nova@xxxxxxxxxxxxxxxxxxx
>> >> Unsubscribe : https://launchpad.net/~nova
>> >> More help   : https://help.launchpad.net/ListHelp
>> >>
>> >>
>> >
>>
>>
>>
>>
>
>
>
>
Follow ups

Re: ORM Refactor
From: Eric Day, 2010-09-10
References

ORM Refactor
From: Vishvananda Ishaya, 2010-09-01
Re: ORM Refactor
From: Jay Pipes, 2010-09-10
Re: ORM Refactor
From: Rick Clark, 2010-09-10
Re: ORM Refactor
From: Justin Santa Barbara, 2010-09-10