openstack team mailing list archive
Message #05790
Re: Database stuff
On Tue, Nov 29, 2011 at 3:43 PM, Soren Hansen <soren@xxxxxxxxxxx> wrote:
> 2011/11/29 Jay Pipes <jaypipes@xxxxxxxxx>:
>>> Besides, we don't really use transactions. I could easily read the
>>> same data from two separate nodes, make different (irreconcilable)
>>> changes on both nodes, and write them back, and the last one to write
>>> simply wins.
>> Sure, but using a KV store doesn't solve this problem...
>
> I'm not suggesting it will. My point is simply that using a KV store
> wouldn't lose us anything in that respect.
I see your point. But then again, it comes down to whether we care
about referential integrity or transactional safety. If we don't, then
we're just building a distributed system that has unreliable
persistent storage built into it, and that, IMHO, is a bigger problem
than the as-yet-unproven assertions around scalability of a relational
database in a distributed system. (more below)
>> As soon as someone can demonstrate the performance, scalability, and
>> robustness advantages of rewriting the data layer to use a
>> non-relational data store, I'm all ears. Until that point, I remain
>> unconvinced that the relational database is the source of major
>> bottlenecks.
>
> I understand that MySQL (and the other backends supported by
> SQLAlchemy, too) scales very well. Vertically. I doubt they'll be
> bottlenecks. Heck, they're even well-understood enough that people
> have built very decent HA setups using them. I just don't think
> they're a particularly good fit for a distributed system. You can have
> a highly available datastore all you want, but I'd sleep better
> knowing that our data is stored in a distributed system that is
> designed to handle network partitions well.
I guess I don't understand this. How do you sleep at night TODAY
knowing that the data Nova stores in its persistent storage is wide
open to referential integrity problems and transactional state
inconsistencies? What's the point of having a data store that
"understands network partitions" if we don't care enough to protect
the integrity of the data we're putting in the data store in the first
place? :(
-jay
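
To make the lost-update scenario quoted at the top of this message concrete, here is a minimal sketch using Python's standard sqlite3 module. The instances table, its state and version columns, and the helper names are all hypothetical and are not Nova's actual schema or code. It first shows a blind write where the last writer simply wins, then one possible guard: a version column checked inside the UPDATE. That guard is an assumption about how such a conflict could be detected, not something proposed in the thread.

```python
import sqlite3

# Hypothetical single-table schema, for illustration only (not Nova's schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instances (id INTEGER PRIMARY KEY, state TEXT, version INTEGER)"
)
conn.execute("INSERT INTO instances VALUES (1, 'building', 0)")
conn.commit()


def read_instance(conn, instance_id):
    """Return (state, version) for one row."""
    return conn.execute(
        "SELECT state, version FROM instances WHERE id = ?", (instance_id,)
    ).fetchone()


def blind_write(conn, instance_id, new_state):
    """Write without checking what was read: the last writer wins."""
    conn.execute(
        "UPDATE instances SET state = ? WHERE id = ?", (new_state, instance_id)
    )
    conn.commit()


def guarded_write(conn, instance_id, new_state, seen_version):
    """Write only if the row still has the version we read.

    Returns True if the update was applied, False if another writer got
    there first and the caller needs to re-read and retry.
    """
    cur = conn.execute(
        "UPDATE instances SET state = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_state, instance_id, seen_version),
    )
    conn.commit()
    return cur.rowcount == 1


# Two "nodes" read the same row, then both write blindly.
read_instance(conn, 1)
read_instance(conn, 1)
blind_write(conn, 1, "active")   # node A
blind_write(conn, 1, "error")    # node B silently overwrites node A
state_after_blind = read_instance(conn, 1)[0]

# Same race, but each write carries the version it read.
_, version_a = read_instance(conn, 1)
_, version_b = read_instance(conn, 1)
a_ok = guarded_write(conn, 1, "active", version_a)   # applied
b_ok = guarded_write(conn, 1, "deleted", version_b)  # rejected: stale version

print(state_after_blind, a_ok, b_ok, read_instance(conn, 1))
```

With the guard in place, the second writer gets False back and has to re-read and retry instead of silently overwriting the first change.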