
nova-orchestration team mailing list archive

Re: Row-level db locking vs. ZooKeeper ...

 

I was thinking more about the number of nodes, but this is interesting too.

On Tue, Nov 29, 2011 at 5:42 AM, Yun Mao <yunmao@xxxxxxxxx> wrote:
> Everything is TCP for ZooKeeper network communication. I don't have a
> good answer for the scalability question. It depends on the workload,
> e.g. read heavy or write heavy, throughput, object size, etc. Here are
> some performance numbers:
> https://ramcloud.stanford.edu/wiki/display/ramcloud/ZooKeeper+Performance
>
> Yun
>
> On Thu, Nov 24, 2011 at 12:09 AM, Andrew Beekhof <andrew@xxxxxxxxxxx> wrote:
>> Out of curiosity, what does zookeeper use for inter-node communications?
>> How big can it scale?
>>
>> On Wed, Nov 9, 2011 at 1:35 AM, Yun Mao <yunmao@xxxxxxxxx> wrote:
>>> Row-level db locking is an internal mechanism that a database uses to
>>> implement ACID transactions. It's not supposed to be exposed to users
>>> and is implementation specific. For example, in MySQL you can only
>>> have it with InnoDB. With MyISAM you can have table-level locking only.
>>> There is no SQL standard that I'm aware of for you to explicitly grab
>>> a lock on a row, but you can use a statement like "SELECT xxx FOR
>>> UPDATE" to block other transactions. I think Nova uses this trick to
>>> allocate fixed IP addresses from the pool.
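
The select-then-claim pattern described above can be sketched in Python. This is only an illustration under stated assumptions, not Nova's actual code: the `fixed_ips` table, its columns, and the helper names are hypothetical. It uses the standard-library `sqlite3` module so it runs anywhere; SQLite has no row-level locks and no FOR UPDATE, so the sketch swaps in `BEGIN IMMEDIATE`, which takes a database-wide write lock instead.

```python
import sqlite3


def make_pool(addresses):
    """Build a hypothetical in-memory address pool for the sketch."""
    # isolation_level=None means autocommit, so the explicit BEGIN/COMMIT
    # statements below are the only transaction control in play.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE fixed_ips ("
        "address TEXT PRIMARY KEY, allocated INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO fixed_ips (address) VALUES (?)",
        [(a,) for a in addresses],
    )
    return conn


def allocate_fixed_ip(conn):
    """Claim one free address, or return None if the pool is empty."""
    # BEGIN IMMEDIATE takes the write lock before the SELECT, so no other
    # writer can claim the same row between the SELECT and the UPDATE.
    # On MySQL/InnoDB the SELECT would end with FOR UPDATE instead and
    # lock only the selected row.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT address FROM fixed_ips WHERE allocated = 0 "
            "ORDER BY address LIMIT 1"
        ).fetchone()
        if row is None:
            conn.execute("ROLLBACK")
            return None
        conn.execute(
            "UPDATE fixed_ips SET allocated = 1 WHERE address = ?", row
        )
        conn.execute("COMMIT")
        return row[0]
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Calling `allocate_fixed_ip` repeatedly on a pool hands out each address once and then returns None.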
>>>
>>> ZooKeeper is different from a database. One typical usage is to use it
>>> as a lock service only. That is, the state does not live in ZooKeeper,
>>> but elsewhere. So you can grab the lock, do whatever processing and
>>> release the lock later without a race condition. For instance you can
>>> spawn a VM, take a snapshot, etc. Another usage is that if the state
>>> is relatively small (e.g. within 1MB), you can put it inside a
>>> ZooKeeper node. ZooKeeper supports atomic node updates in the spirit
>>> of compare and swap.
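
To make the compare-and-swap idea concrete, here is a minimal in-memory sketch in Python. It does not talk to ZooKeeper: `VersionedNode`, `BadVersion`, and `increment` are hypothetical stand-ins that mimic how a conditional write is rejected when the caller's expected version is stale, which is the behaviour a versioned ZooKeeper node update relies on.

```python
import threading


class BadVersion(Exception):
    """Raised when the caller's expected version is out of date."""


class VersionedNode:
    """In-memory stand-in for a single node holding data and a version."""

    def __init__(self, data=b""):
        self._lock = threading.Lock()
        self._data = data
        self._version = 0

    def get(self):
        with self._lock:
            return self._data, self._version

    def set(self, data, expected_version):
        # The write succeeds only if nobody else has updated the node
        # since the caller read it; otherwise the caller must retry.
        with self._lock:
            if expected_version != self._version:
                raise BadVersion(
                    f"expected {expected_version}, found {self._version}"
                )
            self._data = data
            self._version += 1
            return self._version


def increment(node):
    """Read-modify-write loop: retry until the conditional write wins."""
    while True:
        data, version = node.get()
        new_value = int(data or b"0") + 1
        try:
            node.set(str(new_value).encode(), version)
            return new_value
        except BadVersion:
            continue  # someone else updated first; re-read and try again
```

Several threads can call `increment` on the same node without losing updates, because a write based on a stale read is refused rather than silently overwriting newer data.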
>>>
>>> ZooKeeper also differs from a DB from an HA perspective. It is designed
>>> to be distributed and fault tolerant. With 2f+1 nodes running, you can
>>> tolerate f node failures. Databases like MySQL are designed for a
>>> single node. You can use DRBD to replicate to another node, but once
>>> there is a network disconnection it gets tricky to recover.
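
The 2f+1 rule above is plain majority arithmetic, sketched here in Python. The function names are hypothetical, and the sketch assumes the ensemble only needs a strict majority of nodes to stay available.

```python
def quorum_size(nodes):
    """Smallest strict majority of an ensemble with `nodes` members."""
    if nodes < 1:
        raise ValueError("an ensemble needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """How many members can fail while a majority is still running."""
    return nodes - quorum_size(nodes)


# For ensembles of 3, 4 and 5 nodes this gives 1, 1 and 2 tolerated
# failures: going from 3 to 4 nodes buys no extra fault tolerance.
for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

This is why odd ensemble sizes are the usual choice: an even-sized ensemble tolerates no more failures than the odd size just below it.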
>>>
>>> Thanks,
>>> Yun
>>>
>>>
>>> On Tue, Nov 8, 2011 at 8:56 AM, Sandy Walsh <sandy.walsh@xxxxxxxxxxxxx> wrote:
>>>> Hi Yun,
>>>>
>>>> Thanks again for the pointer to ZooKeeper. I think it's a very interesting project and looks like a great way to solve the worker sync problem we face with Orchestration.
>>>>
>>>> One thing that wasn't clear to me immediately was the benefit of using something like ZooKeeper vs doing row-level db locking (where the row is the transaction state)?
>>>>
>>>> Perhaps you could elaborate on that a little?
>>>>
>>>> -S
>>>>
>>>
>>

