Author Topic: What IPTC encoding  (Read 12732 times)

Offline vAfotoriporter

  • Uber Member
  • ******
  • Posts: 1046
    • View Profile
    • Attila Volgyi photojournalist
What IPTC encoding
« on: May 10, 2011, 08:16:23 AM »
I have read many topics with the same title, but I couldn't find my answer.
A customer is requesting photos with the IPTC data encoded as Microsoft Central Europe, so that the accented characters they need are encoded correctly. None of my other clients specify any encoding, nor do they need any accented characters in the captions.
Until this specific request I have been using Mac Central European - or maybe Mac Roman?

Is it correct to assume that as long as I don't use any accented characters, the choice of encoding won't change anything, and that it only matters once accented and special characters come into play?
Working on Mac, OSX, iOS and with some Canons.
Always shooting RAW.

http://www.volgyiattila.hu

Offline Kirk Baker

  • Senior Software Engineer
  • Camera Bits Staff
  • Superhero Member
  • *****
  • Posts: 25503
    • View Profile
    • Camera Bits, Inc.
Re: What IPTC encoding
« Reply #1 on: May 10, 2011, 08:59:45 AM »
IPTC really only specifies 7-bit ASCII. Any accented characters you use have values in the 8-bit range, which is unspecified.  There is a way to declare the character encoding in IPTC, but it is quite arcane, and PM only supports 'unspecified' and UTF-8.  We let you override the 'unspecified' encoding by choosing from a list of supported 8-bit encodings.
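
For reference, the arcane mechanism is the IIM envelope dataset 1:90 (Coded Character Set), which carries an ISO 2022 escape sequence; the sequence ESC % G announces UTF-8. Below is a minimal Python sketch of how that dataset is laid out in bytes. It assumes the standard (non-extended) IIM dataset header, and the helper name `iim_dataset` and the sample caption are made up for illustration.

```python
import struct

def iim_dataset(record: int, dataset: int, data: bytes) -> bytes:
    """Build one IIM dataset: tag marker 0x1C, record number, dataset
    number, 2-byte big-endian length, then the data (standard header only)."""
    if len(data) > 0x7FFF:
        raise ValueError("extended datasets are not handled in this sketch")
    return struct.pack(">BBBH", 0x1C, record, dataset, len(data)) + data

# ISO 2022 escape sequence ESC % G, which selects UTF-8.
UTF8_MARKER = b"\x1b%G"

# 1:90 declares the encoding; 2:120 is the Caption/Abstract field.
coded_character_set = iim_dataset(1, 90, UTF8_MARKER)
caption = iim_dataset(2, 120, "Győr, Magyarország".encode("utf-8"))

print(coded_character_set.hex(" "))  # 1c 01 5a 00 03 1b 25 47
```

A reader that does not understand dataset 1:90 will ignore it and fall back on its own guess, which is why the declaration alone does not guarantee correct display.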

It is true that in general all systems and software will interpret non-accented characters equally.  The exception would be some system that uses something other than ASCII for 7-bit characters but I know of no current systems that do that.
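
A quick way to check this for a given caption is to encode it in each candidate encoding and compare the bytes. The Python sketch below does that with the codec names Python uses for Mac Roman, Mac Central European and Microsoft Central Europe; the function name and the sample captions are only illustrations.

```python
ENCODINGS = ["ascii", "mac_roman", "mac_latin2", "cp1250", "utf-8"]

def encodings_agree(text: str, encodings=ENCODINGS) -> bool:
    """True if every listed encoding turns `text` into the same bytes."""
    encoded = set()
    for name in encodings:
        try:
            encoded.add(text.encode(name))
        except UnicodeEncodeError:
            return False  # at least one encoding cannot represent the text
    return len(encoded) == 1

print(encodings_agree("Budapest, Hungary"))   # True: plain ASCII caption
print(encodings_agree("Győr, Magyarország"))  # False: accented caption

# The same accented letter becomes different bytes in different encodings.
for name in ("mac_roman", "mac_latin2", "cp1250", "utf-8"):
    print(name, "á".encode(name).hex())
```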

So yes, you can assume that as long as you don't enter accented characters, all platforms will be able to view your captions as you intended.  If your customer could use Unicode (UTF-8) instead or, better yet, XMP for metadata, then everything should work correctly even if you enter accented characters in a variety of languages.

-Kirk

Offline fabianlujan

  • Full Member
  • ***
  • Posts: 107
  • sports photographer
    • View Profile
Re: What IPTC encoding
« Reply #2 on: June 06, 2014, 06:17:58 AM »
Kirk,
I need to use many characters that appear as "?" when code replacement is done.
What is wrong?

á é í ó ú

are the common ones, but I also need Eastern European and some other accents.

Thanks

Offline Kirk Baker

Re: What IPTC encoding
« Reply #3 on: June 06, 2014, 08:23:59 AM »
Code Replacement files must always be written with the Unicode UTF-8 encoding.  Use a text editor that lets you save your text as UTF-8.
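
If you would rather generate the file with a script than depend on an editor setting, here is a minimal Python sketch. It assumes a tab-separated layout with one code and its replacement per line; the file name and the sample entries are made up. The last line shows one way a "?" can arise: an encoding that lacks a character substitutes "?" for it.

```python
import tempfile
from pathlib import Path

# Hypothetical entries: code -> replacement text with accented characters.
entries = {
    "jr9": "José María Giménez",
    "pc1": "Petr Čech",
    "zg7": "Zoltán Gera",
}

# The file name and location are assumptions for this sketch.
path = Path(tempfile.gettempdir()) / "code_replacements.txt"

# Name the encoding explicitly; the platform default is often a legacy
# 8-bit code page rather than UTF-8.
with path.open("w", encoding="utf-8", newline="\n") as f:
    for code, replacement in entries.items():
        f.write(f"{code}\t{replacement}\n")

# Reading the file back as UTF-8 round-trips every accent.
print(path.read_text(encoding="utf-8"))

# Latin-1 has no "Č", so asking it to replace unknowns yields "?".
print("Čech".encode("latin-1", errors="replace"))  # b'?ech'
```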

-Kirk

Offline fabianlujan

Re: What IPTC encoding
« Reply #4 on: June 06, 2014, 10:26:12 AM »
Is Notepad able to do it?

Offline Kirk Baker

Re: What IPTC encoding
« Reply #5 on: June 06, 2014, 10:27:39 AM »
Notepad++ definitely can.  I am not sure about the original Notepad though.

-Kirk

Offline Sven

  • Uber Member
  • ******
  • Posts: 1059
    • View Profile
Re: What IPTC encoding
« Reply #6 on: June 07, 2014, 12:38:58 AM »
Wordpad is able to save as UTF-8
sports (swim, bike, run in any combination or alone), animals, sometimes aeroplanes
sony alpha 1, 9, 7III and a bunch of lenses (24-600mm)

Offline FVlcek

  • Sr. Member
  • ****
  • Posts: 467
    • View Profile
Re: What IPTC encoding
« Reply #7 on: June 07, 2014, 03:19:18 AM »
Notepad in recent versions of Windows should be able to save UTF-8 files as well, IIRC. It should be available as an option in the Encoding drop-down menu of the Save As dialog.

With regard to encodings and accented characters: plain old IPTC was specified only for 7-bit ASCII, as Kirk noted, which means only "normal" unaccented characters. You can of course write other encodings into it, but plain old IPTC has no widely supported way to record which encoding was used. Therefore, using plain old IPTC with any language other than English is not recommended. Each piece of software may expect a specific encoding in plain old IPTC and can hiccup on characters in another encoding (in the best case, it just replaces them with ?s). Only IPTC/XMP reliably allows the encoding to be specified, or Unicode to be used. Because of this IPTC limitation, most European newspapers and agencies developed their own workarounds before the advent of XMP. That can still cause a mess, because the publishing business has a lot of legacy backend software that still expects IPTC in some local encoding. Hermes, one of the most widely used newspaper publishing systems, accepts XMP only in its more recent versions, which not all newspapers have deployed yet because of upgrade and IT costs.
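
To make the ambiguity concrete: the same caption bytes decode to different text depending on which encoding the reader assumes, and nothing in the bytes says which one is right. The Python sketch below shows that, plus one common heuristic (not something any particular product is known to do): try strict UTF-8 first, since accented legacy 8-bit text is rarely valid UTF-8, then fall back to the encoding agreed with the client. The function name and the `cp1250` default are assumptions.

```python
def decode_caption(raw: bytes, fallback: str = "cp1250") -> str:
    """Decode legacy IPTC caption bytes: strict UTF-8 first, then fallback."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes the fallback encoding leaves undefined become U+FFFD.
        return raw.decode(fallback, errors="replace")

# A caption written by software that uses Microsoft Central Europe.
raw = "Győri ETO - Újpest".encode("cp1250")

print(raw.decode("cp1250"))     # what the writer meant
print(raw.decode("mac_roman"))  # same bytes, wrong assumption: garbled

print(decode_caption(raw))                     # falls back to cp1250
print(decode_caption("Győr".encode("utf-8")))  # valid UTF-8 is kept as is
```

The heuristic only separates UTF-8 from "something else"; it cannot tell two 8-bit encodings apart, so the fallback still has to be agreed with the recipient.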

Therefore, if your client needs specific encoding, you should contact him/her to find out whether they accept XMP, and if not, nudge them towards it, and if it's not feasible for them to upgrade their software, find out which encoding they need the captions in.

Yes, it can be a mess. In an ideal world, everybody would be using XMP with Unicode...

Fortunately, Photo Mechanic lets you select any encoding even for plain old IPTC, which has literally saved my a** many times.

Offline fabianlujan

Re: What IPTC encoding
« Reply #8 on: June 09, 2014, 12:54:57 PM »
Great info! Thanks!!
I've got it working now using Notepad++.
Thanks!