Author Topic: Character Set Global to UTF-8

Offline info@ts-foto.de

  • Newcomer
  • *
  • Posts: 18
Character Set Global to UTF-8
« on: November 25, 2014, 10:12:25 AM »
System: Windows 7 64-bit, PM 5.0 build 15800
Character set: In the “I” dialog, at the bottom right, there are two options under IPTC Encoding: “In:” (for example, Latin 1 Western European) and a “Write as Unicode” checkbox. For everything to work, I have to set the “In:” option to Unicode and also tick the “Write as Unicode” checkbox. Then characters such as “ü” and “Ä” are displayed correctly. In the preferences, under [IPTC/XMP], there is no “Unicode” choice for the Default IPTC Encoding; you can only tick the checkbox, and that alone is not enough to make this work correctly. Have a look at the screenshots and check the field “Prüfungsbezeichnung”: the problem is visible in the character “ü”.

[attachment deleted by admin]
« Last Edit: November 25, 2014, 10:14:01 AM by info@ts-foto.de »

Offline Kirk Baker

  • Senior Software Engineer
  • Camera Bits Staff
  • Superhero Member
  • *****
  • Posts: 24731
    • Camera Bits, Inc.
Re: Character Set Global to UTF-8
« Reply #1 on: November 25, 2014, 11:14:25 AM »
If the input IPTC data is encoded as UTF-8 Unicode, PM will automatically interpret the data as Unicode.

Where does this metadata, which is not marked as UTF-8 but actually contains UTF-8 characters, come from?

-Kirk

Offline info@ts-foto.de

« Reply #2 on: November 25, 2014, 09:22:13 PM »
This data was written on my Windows 7 64-bit PC on location, with the same settings and the same version of PM.
« Last Edit: November 25, 2014, 10:36:52 PM by info@ts-foto.de »

Offline Kirk Baker

« Reply #3 on: November 25, 2014, 11:02:59 PM »
Can I also see a screenshot of your IPTC/XMP settings?

Thanks,

-Kirk

Offline info@ts-foto.de

« Reply #4 on: November 26, 2014, 02:17:41 AM »
This is my PC in the office. I'm sure that I have the same settings on location. The next time I'm over there is next Saturday. I think that an option to set [Default IPTC Encoding] to UTF-8 would solve this problem.
At first I thought this was a problem between Mac and PC. Three of our photographers have been working with PM on Mac for a few years, and I have always had problems with their characters. Now, while testing it on the PC, it seems to be a different problem.



[attachment deleted by admin]

Offline info@ts-foto.de

« Reply #5 on: November 30, 2014, 07:08:27 AM »
Hello, I have the same settings on the other PC; I just forgot to take a screenshot. I will next be there in two weeks. Today I wrote IPTC data with the settings above on this office PC. Then I copied the files to the RAID and checked the data. Have a look and be surprised: I don't understand this, I just copied the files to the RAID. Everything was done with PM v5 b15800 (e6ecc1c).

[attachment deleted by admin]

Offline Kirk Baker

« Reply #6 on: November 30, 2014, 07:13:52 AM »
Can you post the sample image (the horse and the rider) so I can look at its metadata internally?

Thanks,

-Kirk

Offline dennis

  • President
  • Camera Bits Staff
  • Sr. Member
  • *****
  • Posts: 462
    • Camera Bits, Inc.
Re: Character Set Global to UTF-8
« Reply #7 on: December 03, 2014, 11:28:33 AM »
A few things:

1) You should probably be reading XMP first and IPTC second.  XMP contains all of the IPTC IIM metadata and is Unicode by default (IIM stands for Information Interchange Model and is a mixed binary and text block).  IPTC IIM doesn't contain several newer fields that are XMP only (e.g. Event).  Unless you are using other software that is ignorant of XMP and doesn't update it then you shouldn't be reading IPTC IIM first.  It is OK to write IPTC IIM in addition to XMP (and PM will keep these in sync), but if you mark this as Unicode IPTC, then not all software can detect this for IPTC IIM (older versions of Photoshop for example).

2) The sample file you sent does NOT have Unicode IPTC.  It is missing the record 1:90 field for "coded character set".  If this field is missing or doesn't have <esc>%G as the value, then the data isn't marked as Unicode.  Typically this record 1:90 field will be missing (in which case the character set is undefined) or it has <esc>%G.  Other values are highly unlikely (it is possible to specify a particular character set besides Unicode but that is not very useful and PM won't recognize anything but Unicode for this 1:90 field).

I suspect your other computer does NOT have the checkbox set for writing IPTC IIM as Unicode (I believe the default is to have this option off due to the history of other software not being very compatible with Unicode in the IPTC IIM).  I tried this on both Mac and Windows and it works fine here.  PM does indeed write the 1:90 field and puts <esc>%G there to identify Unicode.  I then manually deleted all XMP from the JPG file to force PM to load IPTC IIM and all the special characters come through just fine.

HTH...

--dennis
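
For readers who want to check a file themselves, here is a minimal Python sketch of the 1:90 test described in the post above. It assumes the raw IPTC IIM block has already been extracted from the image (pulling it out of a JPEG is not shown), and it assumes the standard IIM dataset layout: a 0x1C tag marker, one byte for the record number, one byte for the dataset number, and a two-byte big-endian length. The function names are hypothetical, and this is not Photo Mechanic's own code.

```python
import struct

UTF8_MARKER = b"\x1b%G"  # <esc>%G as raw bytes


def iter_datasets(iim: bytes):
    """Yield (record, dataset, value) tuples from a raw IIM byte block."""
    pos = 0
    # Each dataset starts with 0x1C followed by record, dataset, and length.
    while pos + 5 <= len(iim) and iim[pos] == 0x1C:
        record, dataset, length = struct.unpack_from(">BBH", iim, pos + 1)
        if length & 0x8000:
            # Extended-length datasets are deliberately left out of this sketch.
            raise ValueError("extended-length dataset not handled")
        start = pos + 5
        yield record, dataset, iim[start:start + length]
        pos = start + length


def is_marked_utf8(iim: bytes) -> bool:
    """True only if record 1:90 is present and holds <esc>%G."""
    for record, dataset, value in iter_datasets(iim):
        if (record, dataset) == (1, 90):
            return value == UTF8_MARKER
    return False  # 1:90 missing: the character set is undefined


def make_dataset(record: int, dataset: int, value: bytes) -> bytes:
    """Build one dataset, only so the sketch can be tried without a file."""
    return struct.pack(">BBBH", 0x1C, record, dataset, len(value)) + value
```

A block that contains only a caption (2:120) is reported as unmarked even when its bytes are UTF-8; prepending `make_dataset(1, 90, UTF8_MARKER)` makes `is_marked_utf8` return True.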

Offline info@ts-foto.de

« Reply #8 on: December 04, 2014, 01:47:18 AM »
Hello, what is this: <esc>%G

Offline Kirk Baker

« Reply #9 on: December 04, 2014, 06:43:53 AM »
Hello, what is this: <esc>%G

It is the byte sequence that indicates that the IPTC binary metadata is encoded as UTF-8.

-Kirk
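
To make that byte sequence concrete, here is a small Python sketch. <esc>%G is three bytes: 0x1B (escape), 0x25 (%), 0x47 (G). The `decode_iptc` helper and its Latin-1 fallback are assumptions for illustration only, not PM's actual logic; they show why UTF-8 text without the marker can turn "ü" into two wrong characters.

```python
ESC_PERCENT_G = bytes([0x1B, 0x25, 0x47])  # same as b"\x1b%G"


def decode_iptc(raw: bytes, coded_character_set=None, fallback="latin-1"):
    """Decode an IPTC text value (hypothetical helper).

    If the coded character set value is <esc>%G, treat the text as UTF-8.
    Otherwise fall back to an assumed legacy encoding (Latin-1 here).
    """
    if coded_character_set == ESC_PERCENT_G:
        return raw.decode("utf-8")
    return raw.decode(fallback)


# "ü" is stored as the two bytes 0xC3 0xBC in UTF-8.
raw = "Prüfungsbezeichnung".encode("utf-8")

marked = decode_iptc(raw, ESC_PERCENT_G)  # 'Prüfungsbezeichnung'
unmarked = decode_iptc(raw)               # 'PrÃ¼fungsbezeichnung'
```

The stored bytes are identical in both calls; only the presence of the marker changes how they are read.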