Topic: Checking character coding

Offline vAfotoriporter

  • Uber Member
  • ******
  • Posts: 1030
    • View Profile
    • Attila Volgyi photojournalist
Checking character coding
« on: January 30, 2012, 03:13:08 AM »
I have a customer who requires captions with accented characters, and the text encoding they use is almost certainly not the same as the one I use (they use Windows, I use a Mac). How can I check the text encoding of the captions, so that I can set my IPTC stationery to write the required encoding and maximize compatibility?
Working on Mac, OSX, iOS and with some Canons.
Always shooting RAW.

http://www.volgyiattila.com

Offline Kirk Baker

  • Senior Software Engineer
  • Camera Bits Staff
  • Superhero Member
  • *****
  • Posts: 21909
    • View Profile
    • Camera Bits, Inc.
Re: Checking character coding
« Reply #1 on: January 30, 2012, 06:18:41 AM »
Szurkebarat,

Use UTF-8 Unicode.

Otherwise, what encoding do they require?

-Kirk

Offline Frantisek Vlcek

  • Sr. Member
  • ****
  • Posts: 449
    • View Profile
Re: Checking character coding
« Reply #2 on: January 31, 2012, 07:49:32 AM »
Hi Szurkebarat, please check with your customer what their app reads, i.e. the old Legacy IPTC or IPTC/XML. If XML, there is a good chance it will work with UTF-8 encoding. However, Legacy IPTC doesn't specify encodings and allows only the ANSI standard. Therefore most national newspapers and software vendors that needed to put their languages' characters in IPTC, way back before IPTC/XML was introduced, simply used their own national and OS encoding.

This can be quite a pitfall: because Legacy IPTC does NOT record which encoding is used, some apps expect ANSI, some expect the same encoding as the OS they run on, and some are completely custom...

I had the same problem with one client whose legacy system doesn't accept anything other than Windows Central European and accepts only Legacy IPTC. Fortunately, in Photo Mechanic it's easy to set the encoding even for such situations; you just might have to try a few different options before you arrive at the right one. (I assume customers with legacy systems usually don't know anything about the technicalities, which were implemented by their long-gone IT dept 10 years ago or outsourced entirely, "and it just worked" because their photographers were forced to use their own tools.)
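
To illustrate the pitfall described above, here is a minimal Python sketch of what happens when caption bytes carry no encoding label and the reader guesses. It assumes the writer used Windows Central European (Python codec `cp1250`); the list of encodings the reader might assume is only a plausible set of candidates, not something confirmed in this thread, and `misread` is a hypothetical helper name.

```python
# Sample text taken from the thread. The codec names below are Python's
# names for candidate encodings; they are assumptions for illustration.
SAMPLE = "éáűőúöüóí"


def misread(text, written_as, read_as):
    """Encode text with one encoding, then decode the bytes with another."""
    raw = text.encode(written_as)
    # errors="replace" substitutes U+FFFD for bytes the reader cannot decode.
    return raw.decode(read_as, errors="replace")


if __name__ == "__main__":
    for read_as in ("cp1250", "cp1252", "mac_roman", "utf-8"):
        shown = misread(SAMPLE, "cp1250", read_as)
        print(f"written as cp1250, read as {read_as}: {shown}")
```

Only the matching reader gets the original text back. A reader assuming Windows Western (`cp1252`) gets most letters right but turns ű and ő into û and õ, which is a typical symptom with Hungarian captions.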

Offline Kirk Baker

Re: Checking character coding
« Reply #3 on: January 31, 2012, 08:41:10 AM »
Frantisek,

Actually, legacy IPTC can indicate that it is using UTF-8 fairly straightforwardly.  It is other character sets that cannot quite be indicated (the method for other encodings is so arcane that nobody does it!)

-Kirk
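
The post above does not name the mechanism it refers to. In the IPTC IIM specification, the standard way to declare UTF-8 is the Coded Character Set dataset (record 1, dataset 90) holding the ISO 2022 escape sequence ESC % G. The Python sketch below builds that dataset and does a rough search for it in a raw IIM block; the function names are hypothetical, and the substring search is a simplification rather than a real IIM parser.

```python
import struct

# IIM dataset layout: tag marker 0x1C, record number, dataset number,
# a 2-byte big-endian length, then the data itself.
TAG_MARKER = 0x1C
UTF8_ESCAPE = b"\x1b%G"  # ESC % G, the ISO 2022 sequence selecting UTF-8


def coded_character_set_utf8():
    """Build the 1:90 (Coded Character Set) dataset that announces UTF-8."""
    header = struct.pack(">BBBH", TAG_MARKER, 1, 90, len(UTF8_ESCAPE))
    return header + UTF8_ESCAPE


def declares_utf8(iim_bytes):
    """Rough check: does a raw IIM block contain the UTF-8 1:90 dataset?

    This is a plain substring search, not a parser, so treat a positive
    result as a strong hint rather than proof.
    """
    return coded_character_set_utf8() in iim_bytes


if __name__ == "__main__":
    print(coded_character_set_utf8().hex(" "))
```

If that dataset is absent, a reader has no reliable in-file hint and falls back on whatever encoding it assumes.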

Offline vAfotoriporter

Re: Checking character coding
« Reply #4 on: January 31, 2012, 10:21:46 AM »
They didn't specify any encoding (I guess they don't even know what it is), but I have had issues with them in the past.

They require me to have the characters éáűőúöüóí in the captions, and in the past these didn't display correctly. I use PM on OS X to fill in captions.

They use Windows and Photoshop with Canto Cumulus.

Offline Kirk Baker

Re: Checking character coding
« Reply #5 on: January 31, 2012, 10:41:13 AM »
Szurkebarat,

I think you'll just have to experiment: for each encoding you want to try, write the name of the encoding followed by the accented characters above in the Caption field, then save the image out. Send the images to your customer and have them open the images in Photoshop to see which one looks correct. Then you should be able to use that encoding in PM and be ready to continue with real images.

HTH,

-Kirk
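
The experiment suggested in the last reply can be previewed outside Photo Mechanic. The Python sketch below builds one labelled test caption per candidate encoding and reports which encodings cannot represent the sample at all. The candidate list and the name `build_captions` are assumptions for illustration; in practice the encoding is chosen in PM itself, and this only shows what bytes each choice would produce.

```python
SAMPLE = "éáűőúöüóí"

# Python codec names for a few plausible candidates (an assumed list).
CANDIDATES = ["utf-8", "cp1250", "iso8859_2", "mac_latin2", "cp1252", "mac_roman"]


def build_captions(sample=SAMPLE, encodings=CANDIDATES):
    """Map each encoding name to its caption bytes.

    The value is None when the encoding has no code for some character
    in the sample, so that candidate can be ruled out up front.
    """
    captions = {}
    for name in encodings:
        caption = f"{name}: {sample}"
        try:
            captions[name] = caption.encode(name)
        except UnicodeEncodeError:
            captions[name] = None
    return captions


if __name__ == "__main__":
    for name, raw in build_captions().items():
        status = "cannot represent the sample" if raw is None else raw.hex(" ")
        print(f"{name:>10}: {status}")
```

The Western encodings (`cp1252`, `mac_roman`) drop out immediately because they have no ő or ű, which narrows the real-world test to Unicode and the Central European encodings.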