
mimblewimble team mailing list archive

Re: introduction


Anecdotally, over the past year I've experienced performance issues running
both Ripple and Parity nodes. These issues were generally related to disk
I/O and more specifically pointed toward RocksDB being the culprit.

I solved the Ripple I/O issue by switching my nodes to Ripple's
implementation-specific "NuDB".

The Parity issue got a lot better when they upgraded the version of RocksDB
they were using, though I've heard there is also an initiative to write a
Parity-specific DB.

On Fri, Mar 9, 2018 at 1:29 PM, Ignotus Peverell <
igno.peverell@xxxxxxxxxxxxxx> wrote:

> I'm not sure why, but RocksDb seems really unpopular and lmdb very popular
> these days. Honestly, I didn't put that much thought into RocksDb
> originally. When I started on grin, I looked at the code of other Rust
> blockchain implementations. Parity was the more advanced one (on Ethereum)
> and they were using RocksDb, so I figured it would work out okay and the
> bindings would at least be decent. One often overlooked aspect of a
> database is the quality of the bindings in your PL, because poorly written
> bindings can make all the database guarantees go away. And I was a lot more
> worried about the cryptography and the size of range proofs back then.
>
> I know the opinions of the lmdb author and others regarding atomicity in
> storage layers and frankly, I think they're a little too storage-focused
> (I've known some Oracle DBAs with similar positions). In my experience,
> from an application standpoint, putting too much trust in storage
> guarantees is a bad idea. Everything fails eventually, and when it does,
> storage people are pretty quick to put the blame on disks (gotta do RAID
> 60), networks, language bindings, or you. Btw, I'm guilty as well: I have
> implemented some simple storage layers in the past.
>
> Truth is, it's actually rather easy to write a resilient blockchain node
> on top of a not-so-resilient storage (note: I'm talking about a node here,
> not wallets). The data is immutable and can be replayed at will. You messed
> up on the last block? Fine, restart from the one before it and just make
> sure it's all idempotent. If you're dealing with balances it's a little
> more complicated, but a node doesn't have to. And with careful design, you
> can make a lot of things idempotent. It's also practically impossible for
> grin to rely on an atomic storage because we have separate state (Merkle
> Mountain Ranges) that is specifically designed to be easy to store in a
> flat file, while being very unwieldy and slow to store in a k/v db. It's
> append-only for the most part, so dealing with failure is also very easy
> (note: that does not preclude bugs, but those get fixed). And when you
> squint right, the whole blockchain storage is append-only. From a storage
> standpoint, it's hard to find a more fault-tolerant use case.
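
[Editor's note: the rewind-and-replay idea above can be sketched in a few
lines of Rust. The `Block` and `ChainState` types here are illustrative
stand-ins, not grin's actual data structures; the point is only that when
block application is idempotent, crash recovery is just "pop the suspect
block and replay".]

```rust
// Hypothetical sketch of idempotent block application over naive storage.
// `Block` and `ChainState` are made-up names, not grin's real types.

#[derive(Clone)]
struct Block {
    height: u64,
    data: Vec<u8>,
}

struct ChainState {
    // Append-only log of applied blocks; a flat file in a real node.
    applied: Vec<Block>,
}

impl ChainState {
    fn new() -> Self {
        ChainState { applied: Vec::new() }
    }

    fn tip_height(&self) -> Option<u64> {
        self.applied.last().map(|b| b.height)
    }

    // Idempotent: re-applying an already-applied block is a no-op,
    // so after a crash we can simply rewind one block and replay.
    fn apply(&mut self, block: &Block) {
        match self.tip_height() {
            Some(h) if block.height <= h => (), // already applied, skip
            _ => self.applied.push(block.clone()),
        }
    }

    // Crash recovery: drop the last (possibly half-written) block,
    // then replay it from the immutable chain data.
    fn rewind_one(&mut self) {
        self.applied.pop();
    }
}

fn main() {
    let mut state = ChainState::new();
    let b1 = Block { height: 1, data: vec![1] };
    let b2 = Block { height: 2, data: vec![2] };
    state.apply(&b1);
    state.apply(&b2);
    state.apply(&b2); // replaying is harmless
    assert_eq!(state.tip_height(), Some(2));
    state.rewind_one(); // pretend the last write was suspect
    state.apply(&b2); // replay it
    assert_eq!(state.applied.len(), 2);
}
```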
>
> So anyway, I'm definitely not married to RocksDb, but I don't think it
> matters enormously either. My biggest beef with it at this point is that
> it's a pain to build and has probably 10x the number of features we need.
> But swapping it out is just 200 LOC [1]. So maybe it's worth doing it just
> for this reason.
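
[Editor's note: the "swapping it out is just 200 LOC" point rests on hiding
the backend behind a small key/value abstraction. A minimal sketch of that
idea, assuming a made-up `KvStore` trait rather than grin's actual store
interface; a real node would implement the trait over rocksdb or lmdb
instead of the in-memory map used here.]

```rust
// Hypothetical sketch: keep the node behind a tiny k/v trait so the
// backend (RocksDB, LMDB, ...) can be swapped without touching callers.
// `KvStore` and `MemStore` are illustrative names, not grin's API.

use std::collections::BTreeMap;

trait KvStore {
    fn put(&mut self, key: &[u8], value: &[u8]);
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn delete(&mut self, key: &[u8]);
}

// Trivial in-memory backend; a real node would wrap rocksdb or lmdb here.
struct MemStore {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl KvStore for MemStore {
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.map.insert(key.to_vec(), value.to_vec());
    }
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }
    fn delete(&mut self, key: &[u8]) {
        self.map.remove(key);
    }
}

fn main() {
    // Callers only ever see the trait object, so swapping the backend
    // means changing this one constructor line.
    let mut db: Box<dyn KvStore> = Box::new(MemStore { map: BTreeMap::new() });
    db.put(b"tip", b"block-2");
    assert_eq!(db.get(b"tip"), Some(b"block-2".to_vec()));
    db.delete(b"tip");
    assert!(db.get(b"tip").is_none());
}
```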
>
> Now I'm going to link to this email on the 10 other places where I've been
> asked about this :-)
>
> - Igno
>
> [1] https://github.com/mimblewimble/grin/blob/master/store/src/lib.rs
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>
> On 8 March 2018 10:44 PM, Luke Kenneth Casson Leighton <lkcl@xxxxxxxx>
> wrote:
>
> > On Thu, Mar 8, 2018 at 8:03 PM, Ignotus Peverell
> > <igno.peverell@xxxxxxxxxxxxxx> wrote:
> >
> > > > > There is a denial-of-service option when a user downloads the
> > > > > chain, the peer can give gigabytes of data and list the wrong
> > > > > unspent outputs. The user will see that the result do not add up
> > > > > to 0, but cannot tell where the problem is.
> > > >
> > > > which to be honest I do not quite understand. The user normally
> > > > downloads the chain by requesting blocks from peers, starting with
> > > > just the headers which can be checked for proof-of-work.
> > >
> > > The paper here refers to the MimbleWimble-style fast sync (IBD),
> >
> > hiya igno,
> >
> > lots of techie TLAs here that clearly tell me you're on the case and
> > know what you're doing. it'll take me a while to catch up / get to
> > the point where i could usefully contribute, i must apologise.
> >
> > in the meantime (switching tracks), one way i can definitely
> > contribute to the underlying reliability is to ask why rocksdb has
> > been chosen?
> >
> > https://www.reddit.com/r/Monero/comments/4rdnrg/lmdb_vs_rocksdb/
> > https://github.com/AltSysrq/lmdb-zero
> >
> > rocksdb is based on leveldb, which was designed to hammer both the
> > CPU and the storage, on the assumption by google engineers that
> > everyone will be using leveldb in google data centres, with google's
> > money, and with google's resources, i.e. CPU is cheap and there will
> > be nothing else going on. they also didn't do their homework in many
> > other ways, resulting in an unstable pile of poo. and rocksdb is
> > based on that.
> >
> > many people carrying out benchmark tests forget to switch off the
> > compression, or they forget to compress the key and/or the value being
> > stored when comparing against lmdb, or bdb, and so on.
> >
> > so. why was rocksdb chosen?
> >
> > l.
>
>
>
> --
> Mailing list: https://launchpad.net/~mimblewimble
> Post to     : mimblewimble@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~mimblewimble
> More help   : https://help.launchpad.net/ListHelp
>
