← Back to team overview

launchpad-reviewers team mailing list archive

Re: [Merge] ~ilkeremrekoc/launchpad:chunk-librarian-gc into launchpad:master

 

Added some comments. There is other place were we should make this change. So far, it looks good to me! :)

Diff comments:

> diff --git a/lib/lp/services/librarianserver/librariangc.py b/lib/lp/services/librarianserver/librariangc.py
> index bf2c4cf..2966295 100644
> --- a/lib/lp/services/librarianserver/librariangc.py
> +++ b/lib/lp/services/librarianserver/librariangc.py
> @@ -696,24 +697,45 @@ def delete_unwanted_disk_files(con):
>  
>      swift_enabled = getFeatureFlag("librarian.swift.enabled") or False
>  
> -    cur = con.cursor(name="librariangc_disk_lfcs")
> +    cur = con.cursor()  # nameless cursor for multiple executions
>  
>      # Calculate all stored LibraryFileContent ids that we want to keep.
>      # Results are ordered so we don't have to suck them all in at once.
> -    cur.execute(
> +    def get_next_wanted_content_id_generator():
>          """
> -        SELECT id FROM LibraryFileContent ORDER BY id

There is another place where we used the same query, I think we can make the same change there :)

> +        Generator that yields IDs from LibraryFileContent in chunks using
> +        keyset pagination. Fetches rows in batches of DATABASE_CHUNK_SIZE for
> +        efficiency and stops when done.
> +
> +        yields: int: Next ID from the table.
>          """
> -    )
> -    content_id_iter = iter(cur)
> +
> +        last_id = 0
> +        while True:
> +            cur.execute(
> +                """
> +                SELECT id
> +                FROM LibraryFileContent
> +                WHERE id > %s
> +                ORDER BY id
> +                LIMIT %s
> +                """,
> +                (last_id, DATABASE_CHUNK_SIZE),
> +            )
> +
> +            count = 0
> +            for row in cur:  # stream rows
> +                yield row[0]
> +                last_id = row[0]
> +                count += 1
> +
> +            if count == 0:
> +                break  # no more rows
> +
> +    content_id_iter = get_next_wanted_content_id_generator()
>  
>      def get_next_wanted_content_id():

do we need this oneliner function?

> -        try:
> -            result = next(content_id_iter)
> -        except StopIteration:
> -            return None
> -        else:
> -            return result[0]
> +        return next(content_id_iter, None)
>  
>      removed_count = 0
>      content_id = next_wanted_content_id = -1


-- 
https://code.launchpad.net/~ilkeremrekoc/launchpad/+git/launchpad/+merge/491294
Your team Launchpad code reviewers is requested to review the proposed merge of ~ilkeremrekoc/launchpad:chunk-librarian-gc into launchpad:master.



Follow ups

References