maas-devel team mailing list archive

Re: Re-architecting without cobbler

 

On Tue, May 8, 2012 at 6:44 PM, Julian Edwards
<julian.edwards@xxxxxxxxxxxxx> wrote:
> On Saturday 05 May 2012 12:21:24 Clint Byrum wrote:
>> Excerpts from Julian Edwards's message of Thu May 03 23:42:12 -0700 2012:
>> > On Friday 04 May 2012 18:12:44 Robert Collins wrote:
>> > > I would discourage, strongly discourage, any direct DB access from
>> > > pserv: our experience with LP with such access has been universally
>> > > bad. Let the appserver drive the DB exclusively, and offer appropriate
>> > > APIs for getting stuff from/to it. I think we glossed over this on
>> > > IRC; celery talking to postgresql might mean this needs some extra
>> > > glue for celery, or something.
>> >
>> > For read access only, can you elaborate why this is bad?
>>
>> I've not been involved with Launchpad, but I have done a few multi-tiered
>> architectures.
>>
>> There are a few reasons:
>>
>> * The database used is an implementation detail. Putting a lightweight
>> layer of indirection between the DB and the other pieces of the app means
>> being able to swap out the DB for the cases that matter. With the hyper
>> scale requirement, this is likely to happen as it becomes clear which
>> tables just cannot be served through a purely relational model. APIs
>> map intentions rather than implementations.
>>
>> * APIs can be used as layers of control. The postgres and mysql
>> protocols both make proxying a real chore, and so, it's hard to control
>> the number of threads. pgbouncer seems pretty good, but it then requires
>> a dedicated proxy just for pgsql, which ties you further into pg. An API
>> call, however, can be extended to provide needed metrics, and then be
>> an intelligent choke point or pressure-release for a limited resource
>> like the database.
>>
>> * Intelligence in the pipeline. This makes it easier to cache
>> intelligently, easier to route/shard/etc. The layer of indirection used
>> to be just in code, but you really need it in the network separation
>> so that the pieces can be scaled individually and whole parts can be
>> refactored without touching every place that might access that place.
>
> Thanks Clint, that's well elaborated.
>
> For the record, I was playing Devil's Advocate to some extent since we'd be
> insulated through Django's ORM, but the points are well understood.
>
>> Put more succinctly, API changes are easier than schema changes.
>
> I'd argue the opposite if you're using lazr.restful :)

Hah :P

In addition to Clint's excellent points (all of which I agree with),
I'd also add two more points:

* pserv, being Twisted, will have a hate-hate relationship with ORM
state. Just keeping it from doing silly things like holding a
transaction open for days will be an exercise in great care and
diligence.

* all the protections we (eventually) put in place around the DB (such
as timeouts and worker concurrency limits) will have to be replicated
for pserv, and as it has a different programming model, that means
double work. In LP we haven't done this yet, and we have had the
failure modes (like a script that goes nutty keeping backups from
running, or a concurrent script causing unanticipated load) at one
time or another. MAAS, being deployed on customer sites, outside of
our ops team's reach, has to insulate itself from these sorts of
things.
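
To make the second point concrete, here is a rough sketch of the kind of
protection I mean, written once in the appserver's API layer instead of
twice. It is illustrative only: DatabaseGate, max_concurrent and
wait_timeout are made-up names, not anything in MAAS or LP, and query_fn
stands in for whatever actually talks to the DB.

```python
import threading


class DatabaseGate:
    """Hypothetical choke point that every DB-touching API call passes through."""

    def __init__(self, max_concurrent=4, wait_timeout=5.0):
        # Caps how many callers may be inside the DB at the same time.
        self._slots = threading.BoundedSemaphore(max_concurrent)
        # How long a caller may wait for a free slot before being refused.
        self._wait_timeout = wait_timeout
        self._lock = threading.Lock()
        self.rejected = 0  # simple metric: calls refused under load

    def call(self, query_fn, *args):
        # Refuse instead of queueing forever when every slot is busy.
        if not self._slots.acquire(timeout=self._wait_timeout):
            with self._lock:
                self.rejected += 1
            raise RuntimeError("database busy, try again later")
        try:
            # Assumption: query_fn opens and closes its own short transaction.
            return query_fn(*args)
        finally:
            self._slots.release()


gate = DatabaseGate(max_concurrent=2, wait_timeout=1.0)
result = gate.call(lambda node_id: {"id": node_id, "status": "ready"}, 7)
```

This only bounds concurrency and waiting time; a real deployment would
presumably also want a server-side limit on how long a single query may
run. The point is that pserv would call the API and inherit the limit,
rather than needing its own copy for a different programming model.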

-Rob

