
pyexiv2-developers team mailing list archive

Re: Introduction to my aims... and query on a glitch...

 

Hi Philip,

On 2010-09-20, Philip Graham <philip.graham567@xxxxxxxxx> wrote:
> After the successful install, I have now tested reading and writing most
> of the tags that I would find most useful, using pyexiv2. All is well,
> except for some of those pesky MS-originated tags, which I still use to
> retain backwards compatibility until XP disappears altogether! GNU-Linux
> rules - OK?
> 
> I am having difficulties with writing these following tags in the
> required 'byte' format in the "XP" sub-group:
>     Exif.Image.XPTitle
>     Exif.Image.XPComment
>     Exif.Image.XPAuthor
>     Exif.Image.XPKeywords
>     Exif.Image.XPSubject

According to exiv2’s documentation, those tags are encoded in UCS-2.
Using Python’s UTF-16 codec should do the trick (UCS-2 being an older,
deprecated standard that was replaced by UTF-16).
So assuming "title" contains an ASCII string, you’d do something like this:

 m['Exif.Image.XPTitle'] = pyexiv2.string_to_undefined(title.encode("utf-16"))

> I have used the command pyexiv2.utils.string_to_undefined(sequence) in
> my test script to convert the ascii string into the byte format, but
> after writing to the image file with metadata.write(), when examining
> the ensuing metadata through i.e. "eog" image viewer
> properties>metadata>details, the XP metadata displays in Asian
> characters! This display phenomena is the same irrespective of which
> image viewer is used. I also attempted this through the cli using only
> exiv2 on a fresh image and with the same strange result. Perhaps it is
> an exiv2 issue? Can anyone else reproduce these errors?

I can reproduce them, and as explained above, the missing bit was
encoding the string in UTF-16 first. See this example:

exiv2 -M"set Exif.Image.XPTitle 255 254 102 0 111 0 111 0 " image.jpg
# "255 254 102 0 111 0 111 0 " == byte sequence for "foo" in UTF-16
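
For what it's worth, here is a minimal sketch of how that byte sequence
can be derived in plain Python, without pyexiv2. The helper names
xp_bytes and as_exiv2_value are made up for this illustration, and I am
assuming little-endian byte order with a leading byte order mark, which
is what the "255 254" prefix above corresponds to:

```python
import codecs


def xp_bytes(text):
    # Hypothetical helper: BOM (255 254) followed by the text in
    # UTF-16 little-endian, matching the exiv2 example above.
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def as_exiv2_value(data):
    # Hypothetical helper: render the bytes as space-separated decimal
    # integers, the form used in the exiv2 "set" command.
    return " ".join(str(b) for b in bytearray(data))


print(as_exiv2_value(xp_bytes("foo")))
# prints: 255 254 102 0 111 0 111 0
```

Spelling out the byte order keeps the result the same on every machine,
whereas the plain "utf-16" codec uses the platform's native byte order.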

> Entering the same sequences interactively with the python interpreter
> gives the same result - converting with
> pyexiv2.utils.string_to_undefined(sequence) seems to work ok and the
> result prints out on the cli in oct() byte order as expected, same with
> reconverting it the other way with
> pyexiv2.utils.undefined_to_string(undefined) - on the cli the ascii
> chr() prints out OK. But, as soon as the metadata.write() is issued, the
> result in the image file is the strange Asian characters in this "XP"
> sub-group (only).
> 
> Some further information. I noticed when I extracted these tags in
> pyexiv2 from a test image containing the complete tags (which had been
> written successfully and correctly by another program) and displaying
> correctly in image viewers, the extracted result printed in 'byte'
> format, but each byte integer was separated by a zero integer thus: " 65
> 0 97 0 124 0 90 ... " and so on. When tested on the cli with interactive
> python doing the conversions, but before writing the metadata, the
> result was the expected " 65 97 124 90 ..." and so on. Using both forms
> still results in the erroneous display when the metadata is written.
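
The zero after each byte is what two-byte little-endian encoding of
ASCII characters looks like. As a rough sketch (parse_byte_list is a
made-up helper, and the four-character value is only an illustrative
completion of the truncated sequence you quoted), plain Python shows
both forms:

```python
def parse_byte_list(value):
    # Hypothetical helper: turn "65 0 97 0" into a bytes object.
    return bytes(bytearray(int(tok) for tok in value.split()))


# Illustrative value only: the sequence quoted above was truncated.
raw = parse_byte_list("65 0 97 0 124 0 90 0")

# Each character occupies two bytes, low byte first.
text = raw.decode("utf-16-le")
print(text)
# prints: Aa|Z

# The same text encoded as ASCII has no interleaved zeros.
print(list(bytearray(text.encode("ascii"))))
# prints: [65, 97, 124, 90]
```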

Encoding in UTF-16 should work, and indeed it does when using exiv2 on
the CLI. However, for some reason, writing any tag in the
Exif.Image.XP* family from pyexiv2 seems to fail silently: existing
tags are not overwritten, and new tags end up written with an empty
value. I will file a bug report and investigate this issue.

Cheers,

Olivier


