launchpad-dev team mailing list archive

Re: Some launchpad data model thoughts

To: launchpad-dev@xxxxxxxxxxxxxxxxxxx
From: Ian Booth <ian.m.booth@xxxxxxxxx>
Date: Fri, 27 Aug 2010 15:17:42 +1000
In-reply-to: <AANLkTi=CdS2ES4D+HCJSeL7JEY1xTdFoayOaYVmBSdf=@mail.gmail.com>
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.8) Gecko/20100821 Thunderbird/3.1.2

Hi Robert

On 27/08/10 05:39, Robert Collins wrote:
> Hi Ian, yes this is a problem I see too. I am totally for separating
> data mapping and domain logic. I don't remember which threads I've
> mentioned this in, but the first and second Performance Tuesday
> threads probably have some ramblings.
>

Thanks, I'll have a look. I'm still finding my way around all the mailing
lists and copious volumes of information to digest :-)

> I don't think the unit tests are a key aspect of it - performance of
> those tests is very important to us for agility and dev throughput,
> but there are more profound and significant problems that are caused

Fully agree about the bigger picture architectural issues. I mentioned
unit tests to also highlight another more immediate consequence of the
current implementation, especially the development velocity aspects as
you mention.

> by the current structure. And, optimising the unit tests should come
> *after* optimising production : if we, for instance, made all our
> pages twice as fast, we can reasonably expect the test suite to get
> faster (because the test suite is exercising code paths that are now
> twice as fast). But the reverse isn't true: making the test suite
> faster is much less likely to help production performance [unless one
> makes the test suite faster by making production code paths faster].
>

I think we are saying the same thing. I wasn't advocating optimising the
tests themselves, but the actual production code base as you say. Fix
the implementation and a nice side effect will be much faster unit
tests, but that's not the primary reason for doing it as such.

> A primary problem I see is that we have no high level language [other
> than object traversal] to describe how much of the object graph we
> wish to retrieve. E.g. 'Milestone X, all specs & bugs, the product and
> project group, all people associated with X, or in key roles on any
> related item, branding for any teams found'.
>

In my experience, you do need a use case driven approach to querying the
domain model if you want to avoid performance bottlenecks and allow
easier identification and tuning of performance "hotspots". I'm not very
familiar with Storm (yet!), but certainly in the Java world products
like Hibernate provided a few approaches to the problem. You could use a
high level object based query language (HQL) to pull in specific parts
of the domain model; you could use a criteria API; or you could control
things like strategies for collection loading and what type of fetching
strategy to use (join vs subselect etc) at the O/R mapping metadata level.
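
To make the join versus per-object fetching distinction concrete, here is
a minimal sketch in plain Python using only the standard sqlite3 module.
It uses neither Storm nor Hibernate. The milestone/bug schema, the
CountingConnection wrapper and both function names are invented for
illustration and are not Launchpad's. It only shows how the choice of
fetch strategy changes the number of queries issued for the same result.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database (hypothetical schema, not Launchpad's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE milestone (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE bug (
            id INTEGER PRIMARY KEY, milestone_id INTEGER, title TEXT);
        INSERT INTO milestone VALUES (1, '1.0'), (2, '1.1');
        INSERT INTO bug VALUES
            (1, 1, 'crash on start'), (2, 1, 'typo'), (3, 2, 'slow page');
    """)
    return conn


class CountingConnection:
    """Illustrative wrapper that counts how many statements are executed."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def execute(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params)


def bugs_by_milestone_lazy(db):
    """Per-object loading: one query for the parents, one more per parent."""
    result = {}
    parents = db.execute("SELECT id, name FROM milestone ORDER BY id").fetchall()
    for milestone_id, name in parents:
        rows = db.execute(
            "SELECT title FROM bug WHERE milestone_id = ? ORDER BY id",
            (milestone_id,),
        ).fetchall()
        result[name] = [title for (title,) in rows]
    return result


def bugs_by_milestone_join(db):
    """Join fetching: the same slice of the graph in a single query."""
    result = {}
    rows = db.execute(
        "SELECT m.name, b.title FROM milestone m "
        "LEFT JOIN bug b ON b.milestone_id = m.id "
        "ORDER BY m.id, b.id"
    ).fetchall()
    for name, title in rows:
        bucket = result.setdefault(name, [])
        if title is not None:  # milestones without bugs yield a NULL title
            bucket.append(title)
    return result
```

Both functions return the same dictionary, but the first issues one query
per milestone on top of the initial one, which is the pattern that tends
to show up as a hotspot once collections get large.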

<snip>

>
> I see this as a limitation of Storm, our ORM : other ORMs (both in
> and out of the python world) have an explicit, decoupled, mapping

Indeed. I think therein lies a large part of the current problem.

> In terms of what we should do here; I was imagining a mapping *layer*,
> not necessarily DAO classes, - things like GenericCollection, and
> BugTaskSet are, to me, mappers, and would fit in that layer. I'd love
> to have the layer be Storm owned and driven. Being able to express -
> without SQL - the cut points in object graph traversal and have a
> mapping layer return a resultset for the resulting graph would be
> -awesome-.
>

I absolutely agree a decoupled O/R mapping layer would be preferred. I
see DAO classes sitting above that layer forming a convenient facade to
serve up domain objects. In production, DAOs use the features of the O/R
mapping infrastructure to do "the right thing". In testing, they can
easily be stubbed out as required. A factory implementation could then
be used to transform persistent domain objects into business domain
objects for use in the service layer.
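
As a rough sketch of the DAO-as-facade idea, assuming nothing about
Launchpad's real classes: the Bug dataclass, the BugDAO protocol, the
InMemoryBugDAO stub and milestone_summary below are all hypothetical
names. The point is only that code written against the DAO interface can
be exercised in a unit test without a database behind it.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class Bug:
    """Hypothetical domain object; not Launchpad's Bug class."""
    id: int
    title: str
    milestone: str


class BugDAO(Protocol):
    """The facade the service layer depends on."""

    def find_by_milestone(self, milestone: str) -> List[Bug]:
        ...


class InMemoryBugDAO:
    """Test stub: satisfies BugDAO without touching a database."""

    def __init__(self, bugs):
        self._bugs = list(bugs)

    def find_by_milestone(self, milestone: str) -> List[Bug]:
        return [bug for bug in self._bugs if bug.milestone == milestone]


def milestone_summary(dao: BugDAO, milestone: str) -> str:
    """Service-layer logic that only knows about the DAO interface."""
    bugs = dao.find_by_milestone(milestone)
    return f"{milestone}: {len(bugs)} bug(s)"
```

A production implementation of the same interface would delegate to the
O/R mapping layer; the service code above would not need to change.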

The other thing to consider is that depending on the use case, quite
often an object graph is not required to be loaded at all. Rather
projections (at the SQL level) and DTOs can be utilised to load snippets
of information required to fill out a view or other use case just
requiring primarily read-only access to some data. This approach, amongst
other benefits, also reduces the coupling between the domain model and
other parts of the system, notably the presentation layer. This often
has discernible benefits for testing.
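
A minimal sketch of the projection-plus-DTO approach, again in plain
Python with the standard sqlite3 module rather than Storm. The schema,
the BugListingRow DTO and the bug_listing function are assumptions made
up for this example. Only the three columns a listing view needs are
selected, and the view receives simple immutable rows instead of domain
objects.

```python
import sqlite3
from typing import List, NamedTuple


class BugListingRow(NamedTuple):
    """Read-only DTO carrying just what a hypothetical listing view shows."""
    bug_id: int
    title: str
    assignee_name: str


def make_demo_db():
    """Build a tiny in-memory database (hypothetical schema, not Launchpad's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person (id INTEGER PRIMARY KEY, display_name TEXT);
        CREATE TABLE bug (
            id INTEGER PRIMARY KEY, title TEXT, description TEXT,
            milestone_id INTEGER, assignee_id INTEGER);
        INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob');
        INSERT INTO bug VALUES
            (1, 'crash on start', 'long text...', 1, 1),
            (2, 'typo', 'long text...', 1, 2),
            (3, 'slow page', 'long text...', 2, 1);
    """)
    return conn


def bug_listing(conn, milestone_id: int) -> List[BugListingRow]:
    """Project only the listing columns; no object graph is loaded."""
    rows = conn.execute(
        "SELECT b.id, b.title, p.display_name "
        "FROM bug b JOIN person p ON p.id = b.assignee_id "
        "WHERE b.milestone_id = ? ORDER BY b.id",
        (milestone_id,),
    )
    return [BugListingRow(*row) for row in rows]
```

Because the presentation code depends only on BugListingRow, it can be
tested by constructing rows by hand, with no domain model involved.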

>
> As a strategic concern, coding for Launchpad is full of hidden
> complexity. I'd like us to incrementally remove and improve on that.
> Less global state, more things-on-contexts (e.g. Request.annotations),
> and clearer code in the mapping/domain area would be awesome too.
>

Yeah, and it would make it easier for Launchpad newbies like me to get
up to speed faster :-) Complexity is evil.

Ian

References

Some launchpad data model thoughts
From: Ian Booth, 2010-08-25
Re: Some launchpad data model thoughts
From: Guilherme Salgado, 2010-08-26
Re: Some launchpad data model thoughts
From: Jonathan Lange, 2010-08-26
Re: Some launchpad data model thoughts
From: Guilherme Salgado, 2010-08-26
Re: Some launchpad data model thoughts
From: Robert Collins, 2010-08-26