← Back to team overview

launchpad-dev team mailing list archive

result sets and iteration... unsafe idioms - seeking thoughts on how we can avoid them

 

So, I've been poking at the bug attachment timeout, and as I started
to learn the code I found this:

(Whomever wrote this, please don't feel picked-on, this is simply an
*example* of a repeated bug I've seen in the last few weeks).

Anyhow, this code:
---------------
    @property
    def regular_attachments(self):
        """The list of bug attachments that are not patches."""
        return [
            {
                'attachment': attachment,
                'file': ProxiedLibraryFileAlias(
                    attachment.libraryfile, attachment),
                }
            for attachment in self.context.attachments
            if attachment.type != BugAttachmentType.PATCH]

    @property
    def patches(self):
        """The list of bug attachments that are patches."""
        return [
            {
                'attachment': attachment,
                'file': ProxiedLibraryFileAlias(
                    attachment.libraryfile, attachment),
                }
            for attachment in self.context.attachments
            if attachment.type == BugAttachmentType.PATCH]
---------------------
is unsafe (performance wise), and its not *obvious* that its unsafe.

Its unsafe because its calling __iter__ on self.context.attachments,
which is a ResultSet; so a page template that shows
patches:
...
attachments:
...

Will cause two complete iterations of the resultset - and that means
two queries to the database, and two updates of all the objects found
by the result set.
Even where we have constrained object counts, this is still repetitive
work (and I suspect there is a 'if view/THING in the tal too, so we'll
also completely iterate these things 4 times.

If we had a separate mapping layer, this would be working totally from
within python, so even while a little inefficient, it would be
tolerable. As we don't though, its probably causing timeouts on bugs
with lots of attachments. One thing that makes this tricky is that by
returning a resultset attachments allows itself to be filtered later
on - and so it can be tricky to change the definition.

I'd love it if we can come up with some way, some convention that will
make it crystal clear when something is:
 - cheap
 - not cheap

if anyone has any ideas, please jump up and make some noise :)

-Rob



Follow ups