
launchpad-dev team mailing list archive

Re: riptano 0-60

 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/16/2010 7:57 AM, Robert Collins wrote:
...

>> That relates to a specific use-case I have in mind: translations sharing
>> that we do.  With our current model, updating a single translation in
>> one place updates it for a dozen or so "contexts" (i.e. in both Ubuntu
>> Lucid and Ubuntu Maverick).  It means we'd have to do a dozen updates to
>> replicate the functionality with a fully denormalized model, and if
>> updates are slower (they basically include a read, right?) then we'd hit
>> a lot of trouble.
> 
> updates are writes - they don't (by default at least) need to read the
> old data at all. And if the db servers were to read the old data, that
> would be localised per-node holding the result, so - let's say we had 6
> nodes (which is what a loose discussion with mdennis suggested we'd
> need), then a write of a row would:
>  - on 3 machines add a row to the memtable
>  - on the coordinator, wait for 2 machines to ack that they had done the write
>  - return
> Writing two rows would be the same as one row, twice - but the three
> machines could be different: not necessarily the other three, but
> whichever three the hash of each row key selects.
> 
> If we in the appserver needed to read-then-write, that would be a
> little different - but it's also a bit of an anti-pattern in Cassandra,
> apparently.
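
To make that write path concrete, here is a toy sketch of it (my own
model, not Cassandra code). The node names, the use of MD5 to place a
row key, and the "next three nodes" replica choice are assumptions for
illustration; a real coordinator also returns as soon as 2 acks arrive
rather than visiting every replica in a loop.

```python
import hashlib

NODES = ["node%d" % i for i in range(6)]  # assumed 6-node cluster
REPLICATION_FACTOR = 3                    # each row lives on 3 machines
WRITE_QUORUM = 2                          # coordinator waits for 2 acks


def replicas_for(row_key):
    """Pick 3 nodes from the hash of the row key (hypothetical placement)."""
    digest = hashlib.md5(row_key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(REPLICATION_FACTOR)]


def write(memtables, row_key, value):
    """Add the row to each replica's memtable without reading the old value."""
    acks = 0
    for node in replicas_for(row_key):
        memtables.setdefault(node, {})[row_key] = value
        acks += 1
    # Simplification: we count all acks, then check the quorum was met.
    return acks >= WRITE_QUORUM
```

Two different row keys will usually land on different sets of three
nodes, which is the "three-per-row-key hash" point above.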

Just to mention, Cassandra has no equivalent of the "UPDATE foo=bar WHERE "
that an RDBMS offers. So it seems likely to me that you'll need to do
read-then-write for anything that you want to update.
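
A rough sketch of what I mean, using a plain dict as a stand-in for a
column family (the store layout and both function names are made up
for illustration, not any Cassandra client API): a write by key needs
no read, but anything shaped like a WHERE clause has to find the
matching rows first.

```python
def blind_write(store, row_key, column, value):
    """Write by key: no need to look at the previous value."""
    store.setdefault(row_key, {})[column] = value


def update_where(store, column, old_value, new_value):
    """Emulate a conditional update: read to find the rows, then write each."""
    matching = [key for key, columns in store.items()
                if columns.get(column) == old_value]
    for key in matching:
        blind_write(store, key, column, new_value)
    return matching
```

The first function is a single write; the second is a scan plus one
write per matching row, and that scan is the cost I'm worried about.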


> Paraphrasing, is it:
> from collections import defaultdict
> result = defaultdict(set)
> for language in all_languages:
>     for product in products:
>         result[language].add(product.translations[language][english_string])
> ?
> 
> I can imagine storing that normalised and ready to use all the time :)
> 
> -Rob
> 

^- You're talking about the collating here, not the updating, right? (I
ask because that will set how we design the tables, and thus would
affect how we need to update them.)

I think the Cassandra way might be to just index all values of
"english_string" across all the different rows/CFs (column families) in
which it might exist. But you'd still have to read that index to figure
out which entries need to be changed.
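
Something like this, as a minimal sketch with dicts standing in for
the column families (the names "rows" and "index" and both functions
are hypothetical): the index is maintained on every store, and an
update reads it to learn which denormalized copies to rewrite.

```python
from collections import defaultdict


def store_translation(rows, index, context, english_string, translated):
    """Write one denormalized copy and record its context in the index."""
    rows[(context, english_string)] = translated
    index[english_string].add(context)


def update_everywhere(rows, index, english_string, translated):
    """Read the index, then rewrite every copy it points at."""
    contexts = sorted(index.get(english_string, ()))
    for context in contexts:
        rows[(context, english_string)] = translated
    return contexts
```

Here "rows" starts as {} and "index" as defaultdict(set). An update is
still one index read plus one write per context, so a dozen contexts
means a dozen writes.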

Either that, or when you store the translation, you use indirection.
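
By indirection I mean roughly the following (again a sketch with
dicts; "shared", "pointers" and the translation id are names I made up
for illustration): each context stores only a pointer to a shared
translation, so changing the translation is a single write, at the
cost of a second lookup on every read.

```python
def store_shared(shared, pointers, contexts, english_string,
                 translation_id, translated):
    """Store the text once and point every context at it."""
    shared[translation_id] = translated
    for context in contexts:
        pointers[(context, english_string)] = translation_id


def lookup(shared, pointers, context, english_string):
    """Two reads: first the pointer, then the shared translation."""
    return shared[pointers[(context, english_string)]]


def update_shared(shared, translation_id, translated):
    """One write updates the translation for every context that shares it."""
    shared[translation_id] = translated
```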

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkzioQYACgkQJdeBCYSNAAPlMACdG3jYTmIhMNKKPjFjTEhOxSkR
h5YAn2AhsO9m0HwrEyJ7EpvWwps25XC0
=DBeV
-----END PGP SIGNATURE-----


