pyexiv2-developers team mailing list archive
Message #00078
Re: Pickling and multiprocessing
On 2011-01-21, Damon Lynch <damonlynch@xxxxxxxxx> wrote:
> On Fri, Jan 21, 2011 at 3:35 AM, Olivier Tilloy <olivier@xxxxxxxxxx> wrote:
>
> > Out of curiosity, what’s your use case for pickling image metadata?
> > Ultimately, pickling is no more than serializing data (on disk or in
> > memory), and this data is already in the image itself and can be
> > "reconstructed" from just the file name. Wouldn’t that work for you?
>
> For instance, the problem of copying photos from memory cards onto the
> hard drive and renaming them. For each memory card, a process copies the
> photos and reads the metadata. So if you have two memory cards, that's
> two processes running in parallel. Both processes then send a message to
> a daemon renaming process, whose only task is to rename photos using the
> exif information, sequence numbers, and whatever else is needed. The
> daemon process itself could load the metadata but it's a relatively slow
> operation, and thus better to do in parallel.
I suppose that the daemon in charge of renaming photos only needs a
rather small subset of the EXIF/IPTC/XMP metadata to rename a photo.
How about pickling and passing only this subset, e.g. as a dictionary
(should be easy since tags can be pickled)?
You’ll potentially save a lot of bandwidth in your inter-process
communication.
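A minimal sketch of that idea, under a few assumptions: the metadata
object behaves like a mapping whose items expose a `.value` attribute
and which raises KeyError for absent tags, and the tag keys listed in
RENAME_KEYS are only a guess at what a renaming process might need.
The names extract_subset, RENAME_KEYS and StandInTag are hypothetical
and exist only to keep the example runnable without an image file.

```python
import datetime
import pickle

# Hypothetical selection of tags; adjust to whatever the renamer needs.
RENAME_KEYS = (
    "Exif.Photo.DateTimeOriginal",
    "Exif.Image.Model",
)


def extract_subset(metadata, keys=RENAME_KEYS):
    """Copy the values of the selected tags into a plain dict.

    Assumes metadata[key] returns an object with a .value attribute
    and raises KeyError when the tag is absent.
    """
    subset = {}
    for key in keys:
        try:
            subset[key] = metadata[key].value
        except KeyError:
            subset[key] = None  # tag not present in this image
    return subset


class StandInTag:
    """Stand-in for a metadata tag, used only to make the sketch runnable."""

    def __init__(self, value):
        self.value = value


# Stand-in for metadata already read from an image by a copying process.
fake_metadata = {
    "Exif.Photo.DateTimeOriginal": StandInTag(
        datetime.datetime(2011, 1, 21, 3, 35)
    ),
}

subset = extract_subset(fake_metadata)

# A multiprocessing queue or pipe pickles whatever is sent through it,
# so a round trip through pickle shows what the renamer would receive.
payload = pickle.dumps(subset)
restored = pickle.loads(payload)
```

Since the dictionary holds only plain values such as datetime objects
and strings, it can be put on a multiprocessing queue as-is, and the
renaming process never has to open the image file.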
> My preliminary testing using multiprocessing, queues and pipes indicates
> that in the case of copying and renaming photos in parallel, the
> scanning phase (determining what photos are at a location and loading
> them into a TreeView to show the user) takes only 6% of the time it
> takes to do the same thing with threads and locks. Clearly the
> performance gains can be enormous.
Impressive gain indeed, way to go!