There's some strange behaviour with respect to unicode characters in IPTC/XMP fields.
Windows XP, PM version 4.5.3 beta 1015
XMP/IPTC preferences:
- Order: "Sidecar", "XMP", "IPTC"
- Both XMP and IPTC are always added
- Default encoding: "Windows Latin1+Euro", not written as Unicode
First of all, the CodedCharacterSet IPTC field always gets the "ESC%G" value indicating Unicode encoding, even though I have not specified this. This behaviour is new with the 4.5.3 beta I'm now using. I'm quite sure this wasn't the case with 4.5.2 (and certainly not with 4.5.1), so this is a newly introduced bug.
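To show why a wrong CodedCharacterSet matters, here is a minimal Python sketch. It is not PM's code. The function names are made up for illustration, and it assumes that "Windows Latin1+Euro" corresponds to Python's cp1252 codec and that a missing marker means that default encoding is in use. It checks whether the raw bytes of a caption actually decode under the encoding the marker announces.

```python
# Illustrative helpers only; these names are hypothetical and not part of PM
# or of any IPTC library.

ESC = b"\x1b"
UTF8_MARKER = ESC + b"%G"  # the "ESC%G" value described above


def declares_utf8(coded_character_set: bytes) -> bool:
    """Return True if the raw CodedCharacterSet value announces UTF-8."""
    return coded_character_set == UTF8_MARKER


def caption_matches_declaration(coded_character_set: bytes, caption: bytes) -> bool:
    """Return True if the caption bytes decode under the declared encoding.

    Assumption: without the UTF-8 marker the caption is in cp1252, taken here
    as the equivalent of the "Windows Latin1+Euro" preference.
    """
    encoding = "utf-8" if declares_utf8(coded_character_set) else "cp1252"
    try:
        caption.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# A Latin1-encoded caption that is labelled as UTF-8 is inconsistent:
latin1_caption = "é".encode("cp1252")
print(caption_matches_declaration(UTF8_MARKER, latin1_caption))  # False
print(caption_matches_declaration(b"", latin1_caption))          # True
```

If the marker is always written as "ESC%G" while the text is still written in the default encoding, any reader that trusts the marker will misinterpret non-ASCII captions.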
The second problem is more complex, but basically it comes down to this: the IPTC information "forgets" that it has special characters... Let me try to describe this in a reproducible way.
1. In the IPTC dialogue, enter some special Unicode characters like ĀāĂăĠġĦħĨĩ in the Caption field.
2. Save (OK on the dialogue)
3. Open the image information again and voila, your characters have magically changed into AaAaGgHhIi.
Ah, remember I hadn't specified Unicode encoding, so this may have been the cause (but then again I have XMP embedded as well, which takes precedence and is always in Unicode).
4. Re-enter the characters and tick the "Write as unicode" box.
Nope, that does not work either.
5. Try again, now both specifying "Unicode" as the encoding and ticking the "Write as unicode" box.
Nope, that does not work either.
Note: depending on whether or not you first re-enter the unicode characters and then change the encoding to unicode, you get a warning dialogue stating that the IPTC info should be re-read (this means you have to re-enter the special characters). This however has no influence on the result...
Right, time to try changing the default encoding and see what happens then.
6. While you can't change the default encoding to "Unicode", you can specify to write IPTC in Unicode by default so that's what I chose.
7. Open image info again and try to enter the unicode characters again.
This does NOT work either.
Some more findings:
- some unicode characters do seem to stick; for instance ð works (I guess this has to do with them being present in the IPTC font encoding).
- during my testing I sometimes got a message saying that the font encoding does not support all characters and proposing to write Unicode instead (IIRC I got this as an option). I can't reproduce this message anymore, however, and I'm quite sure it didn't help either.
- I also know that at some point I was able to get some of the special characters to stick in some of the fields (headline and location, if I recall correctly). I can't reproduce this either, though, and it seemed erratic as well: the characters would stick in one field but not in another, then I tried again and suddenly they stuck in another field as well. It looked as if PM checks whether or not a field has changed before it writes the value to the file. Changing an A back into an Ā does not seem to trigger this, but adding characters did. But as I said, I can't reproduce this anymore...
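The first finding above (ð sticks, the other characters don't) is consistent with a simple encodability check. Here's a small Python sketch, again assuming that "Windows Latin1+Euro" is the same as Python's cp1252 codec. The function name is made up for illustration, and this only shows which characters the code page can hold, not what PM does internally.

```python
def split_by_encodability(text: str, encoding: str = "cp1252") -> tuple[str, str]:
    """Split text into the characters the encoding can and cannot represent."""
    kept, lost = [], []
    for ch in text:
        try:
            ch.encode(encoding)
            kept.append(ch)
        except UnicodeEncodeError:
            lost.append(ch)
    return "".join(kept), "".join(lost)


# The test characters from step 1, plus the ð that did stick.
kept, lost = split_by_encodability("ĀāĂăĠġĦħĨĩð")
print("representable:", kept)      # ð
print("not representable:", lost)  # ĀāĂăĠġĦħĨĩ
```

So the characters that get flattened to AaAaGgHhIi are exactly the ones that do not exist in the default code page, which suggests the caption is still being pushed through that code page even when Unicode is requested.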
Am I missing something, or is something really broken? (The CodedCharacterSet IPTC field sticking to "ESC%G" definitely is a bug)
Hope my story makes sense and allows you to reproduce (and fix) things.