
launchpad-dev team mailing list archive

Re: Generic Collection?

 

Hi Martin,

On Mon, Aug 2, 2010 at 7:33 AM, Martin Pool <mbp@xxxxxxxxxxxxx> wrote:
> On 2 August 2010 11:55, Jeroen Vermeulen <jtv@xxxxxxxxxxxxx> wrote:
>>> I think that it would be valuable to put some performance things to
>>> watch out for on that page too.
>>
>> I'll add a note.
>>
>> Thanks for doing this.  I was still kind of waiting for other solutions to
>> show up that might supplant mine (for instance Storm apparently has
>> something built in now) but I guess such a thin and convenient interface
>> never hurts anyway.  It gives us some freedom to mess with implementations,
>> too.
>
> I talked a bit to Jamu at the Rally about collections, and I recall
> there being some good reasons why we should use the builtin Storm ones
> rather than the Launchpad collection.  Unfortunately I left my memory
> of the exact differences on the other side of the world but maybe he
> can repeat it?

Actually, Storm has no builtin support for collections.  We use the
pattern in Landscape, with our own base class [1], but we implement
it differently than Launchpad does.

In Launchpad the collection pattern is implemented as an object that
stores a list of Storm expressions built up as filtering methods are
called.  These expressions are then used as the where clause when
collection.find() is called.  This approach is simple to implement
and works well until you need to optimize a query, or generate a
specific query shape when a particular set of criteria is given.  At
that point, you have a list of expression objects that are hard to
introspect and hard to restructure.  Jeroen suggested storing extra
data, when particular filtering methods are called, that can be used
to help generate the final query.  I haven't actually tried to do
this, but my impression is that it'll still be hard.
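
A minimal sketch of the expression-accumulating approach described
above.  It does not use Storm: the Expr class is a hypothetical
stand-in for an opaque ORM expression, and ExpressionCollection, the
table name and with_status are illustrative names, not Launchpad's
actual code.

```python
class Expr:
    """Hypothetical stand-in for an opaque ORM expression object."""

    def __init__(self, sql, params):
        self.sql = sql
        self.params = tuple(params)


class ExpressionCollection:
    """Accumulates ready-made expressions as filtering methods are called."""

    def __init__(self, table, conditions=()):
        self.table = table
        self.conditions = tuple(conditions)

    def _refine(self, expr):
        # Each filter returns a new collection with one more expression.
        return ExpressionCollection(self.table, self.conditions + (expr,))

    def with_computer_ids(self, ids):
        ids = list(ids)
        marks = ", ".join("?" for _ in ids)
        return self._refine(Expr("computer_id IN (%s)" % marks, ids))

    def with_status(self, status):
        return self._refine(Expr("status = ?", [status]))

    def find(self):
        # All find() can do is AND the stored expressions together; the
        # values that produced them are buried inside each expression.
        sql = "SELECT * FROM %s" % self.table
        if self.conditions:
            sql += " WHERE " + " AND ".join(e.sql for e in self.conditions)
        params = [p for e in self.conditions for p in e.params]
        return sql, params


collection = (ExpressionCollection("activity")
              .with_computer_ids([1, 2])
              .with_status("failed"))
sql, params = collection.find()
```

By the time find() runs, the collection only holds finished
expressions, so choosing a differently shaped query would mean picking
those expressions apart again.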

The way we implement the pattern in Landscape is, I think, a bit
simpler.  We don't end up with a very general solution like you have
with the implementation in Launchpad, but we don't have that many
objects we want collections for, so it's not really a problem.
Anyway, the pattern we use is to store the primitive values needed
to construct the query.  When collection.with_computer_ids(ids) is
invoked we store 'ids', instead of creating a Computer.id.is_in(ids)
expression and storing that.  Each collection must implement a
_get_result method that uses the primitive data that has been
gathered to generate a query.  The upshot of this approach is that
we have all the data that will be used in the query at the time we
generate it, and so it's fairly easy to decide what kind of query to
generate.  We have a case [2] where we generate one of three quite
differently spelt queries, depending on the particular filtering
that has been requested, which we have to do for performance
reasons.  Another nice aspect of storing just the raw data is that
it's easy to serialize, to a session for instance, and the stored
values can then be used to recreate the collection in its
pre-serialization state.
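
A minimal sketch of the primitive-storing approach described above,
again without Storm.  Only with_computer_ids and _get_result are names
taken from this message; ActivityCollection, with_status, to_state,
from_state and the generated SQL are assumptions made for
illustration, not Landscape's actual code.

```python
import json


class ActivityCollection:
    """Stores raw filter values; the query is built only in _get_result."""

    def __init__(self, computer_ids=None, status=None):
        self.computer_ids = computer_ids
        self.status = status

    def with_computer_ids(self, ids):
        # Keep the plain ids rather than a prebuilt expression.
        return ActivityCollection(sorted(ids), self.status)

    def with_status(self, status):
        return ActivityCollection(self.computer_ids, status)

    def _get_result(self):
        # All criteria are visible here, so the query shape can depend
        # on them: no match, single-id equality, or an IN list.
        clauses, params = [], []
        ids = self.computer_ids
        if ids is not None:
            if not ids:
                clauses.append("1 = 0")
            elif len(ids) == 1:
                clauses.append("computer_id = ?")
            else:
                marks = ", ".join("?" for _ in ids)
                clauses.append("computer_id IN (%s)" % marks)
            params.extend(ids)
        if self.status is not None:
            clauses.append("status = ?")
            params.append(self.status)
        sql = "SELECT * FROM activity"
        if clauses:
            sql += " WHERE " + " AND ".join(clauses)
        return sql, params

    def to_state(self):
        # The state is plain data, so it serializes trivially.
        return {"computer_ids": self.computer_ids, "status": self.status}

    @classmethod
    def from_state(cls, state):
        return cls(state["computer_ids"], state["status"])


single = ActivityCollection().with_computer_ids([7]).with_status("failed")
many = ActivityCollection().with_computer_ids([3, 1, 2])
restored = ActivityCollection.from_state(
    json.loads(json.dumps(single.to_state())))
```

Here the restored collection generates the same query as the one that
was serialized, and the single-id and multi-id cases produce
differently spelt queries from the same stored data.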

It's interesting to see how the pattern can be implemented in
different ways.  I don't think there's a black and white answer for
which is best, but the method we've used has been working for us so
far.

Thanks,
J.

[1] https://bazaar.launchpad.net/~landscape/landscape/trunk/annotate/head%3A/canonical/collection/collection.py
[2] https://bazaar.launchpad.net/~landscape/landscape/trunk/annotate/head%3A/canonical/landscape/model/activity/collection.py


