launchpad-dev team mailing list archive
Message #07055
Re: Block the use of non-cached References on model objects
On 2011-05-08 04:10, Robert Collins wrote:
> If we generate an OOPS it means that scenario wasn't tested. That's
> suboptimal at best.
Wouldn't that nudge us back towards integration-level unit testing
though? I imagine that before we could open a can of worms like this,
we'd need comprehensive run-time support for tracking and gathering
object-graph requirements. Low-level functions would have to be able to
hand their object-graph requirements up the call chain, so that we don't
end up hard-coding them all over the higher layers of the chain.
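To make that concrete, here is a minimal sketch of what "handing requirements up the call chain" might look like: low-level functions declare which references they traverse, and callers aggregate those declarations instead of hard-coding eager-loading hints at every layer. All names here are illustrative, not Launchpad or Storm API.

```python
def needs(*refs):
    """Decorator recording the object-graph references a function touches."""
    def decorate(func):
        func.required_refs = set(refs)
        return func
    return decorate

# Hypothetical low-level formatter: it declares, rather than hides,
# which references it will resolve.
@needs("bugtask.bug", "bugtask.assignee")
def format_bugtask(bugtask):
    return "%s: %s" % (bugtask["bug"], bugtask["assignee"])

def collect_requirements(*funcs):
    """Merge the declared requirements of every function in a call chain,
    so a higher layer can eager-load exactly that set in one pass."""
    required = set()
    for func in funcs:
        required |= getattr(func, "required_refs", set())
    return required
```

A higher layer could then call `collect_requirements(format_bugtask, ...)` once and batch-load the union, without knowing the internals of each formatter.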
> In a year of close attention now, I've seen one case where eager
> loading was a pessimisation - with commit() being slowed down by Storm
> pathology when tens of thousands of objects are live - it has O(live)
> overhead rather than O(changed). This is a bug in Storm, though: the
> eager loading would still have made the script faster (and more
> consistent) were it not for this defect.
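The O(live) versus O(changed) distinction is worth spelling out. This toy sketch (not Storm internals) contrasts a store whose commit must scan every live object with one that tracks dirty objects as they change, so commit touches only what actually changed:

```python
class ScanningStore:
    """Commit cost grows with the number of live objects: O(live)."""
    def __init__(self):
        self.live = []

    def add(self, obj):
        self.live.append(obj)

    def commit(self):
        # Must inspect every live object to find the dirty ones.
        return [o for o in self.live if o.get("dirty")]

class TrackingStore:
    """Commit cost grows only with the number of changed objects: O(changed)."""
    def __init__(self):
        self.live = []
        self.dirty = []

    def add(self, obj):
        self.live.append(obj)

    def mark_dirty(self, obj):
        # Record the change at the moment it happens.
        obj["dirty"] = True
        self.dirty.append(obj)

    def commit(self):
        # Only the recorded changes are visited; the dirty list is reset.
        changed, self.dirty = self.dirty, []
        return changed
```

With tens of thousands of live objects and a handful of changes, the scanning variant does tens of thousands of checks per commit while the tracking variant does a handful.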
But have you been eager-loading based on some notion of where it made
sense to do so, or have you been doing it arbitrarily for all kinds of
references? Any common sense you may have exercised would have
introduced an optimistic selection bias(*).
For example, foo.distribution will trigger few queries right now because
distributions are few, and hot in our caches. But we'd have to preempt
demand-loading of any foo.distribution references that might possibly
come into play, even though some number of them won't. Resolving
unneeded references can add up, so we'd need to have some idea of the
costs. And of course there's a similar question of unneeded database loads.
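The trade-off can be sketched with a toy query counter (names hypothetical, no ORM involved): demand-loading pays one query per cache miss and nothing for hot references, while eager loading pays up front for every reference, used or not.

```python
class QueryCounter:
    """Simulated object cache that counts database round trips."""
    def __init__(self):
        self.queries = 0
        self.cache = {}

    def load(self, ref):
        if ref not in self.cache:
            self.queries += 1  # simulated database round trip
            self.cache[ref] = "row-for-%s" % ref
        return self.cache[ref]

def demand_load(counter, refs, used):
    # Only the references actually touched are resolved.
    return [counter.load(r) for r in refs if r in used]

def eager_load(counter, refs):
    # Everything is resolved up front, needed or not.
    return [counter.load(r) for r in refs]
```

With a hot cache (the `foo.distribution` case), repeated demand loads cost zero extra queries, which is exactly the scenario where preemptive loading of every possible reference buys nothing.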
My vague and unsubstantiated concerns are that: (1) Eager-loading the
colder portions of a reference graph may sometimes be a net loss in the
grand scheme of things (considering cold-load speed, hot-load speed,
oopses, engineering time etc.). And that (2) Brittleness in requiring
explicit exceptions to break out of eager-loading may provide false
justification for accepting those losses.
I'm sure wholesale eager loading is faster than no eager loading, but do
we know how much bad we'd be accepting with the good? This is where
policy/mechanism separation comes in. Wouldn't we end up with the
hoop-jumping required for optimization, and the pain from the oopses we
introduced, driving our priorities when they should be driven by our
actual performance goals? AFAICS non-intrusive profiling ahead of
optimization would show us the same data and support the same changes,
but without these problems.
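The kind of non-intrusive profiling I mean could be as simple as counting queries around a block of code, so eager-loading work gets prioritized by measured cost rather than by where oopses happen to fire. This is a stand-in sketch; `query_log` and `run_query` are placeholders for whatever statement tracer the real stack provides.

```python
from contextlib import contextmanager

query_log = []

def run_query(sql):
    """Placeholder for a real database call; the tracer records each statement."""
    query_log.append(sql)
    return "result"

@contextmanager
def count_queries():
    """Count the queries issued inside the with-block, without touching the
    code under measurement."""
    stats = {}
    start = len(query_log)
    yield stats
    stats["queries"] = len(query_log) - start
```

Wrapping a page render or script run in `count_queries()` would surface the worst query counts directly, and those numbers, not oops reports, could drive which references get eager-loaded first.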
Jeroen
(*) This is why I generally advise against exercising common sense.