Author Topic: Specific Unicode Characters  (Read 3476 times)

Offline IanGoldstein

« on: October 19, 2007, 03:17:04 PM »
Would it be possible to specify a particular Unicode character (or a character from another character set) rather than attempt to enter the literal character?

I am using PM under Win XP, so to enter the copyright symbol (ISO-8859-1 character 0169) I can hold down the ALT key and enter "0169" on the numeric keypad. However, that requires a lookup of the code for a specific character, and might not work well on systems using other default character sets.

I thought I would be able to use Code Replacements to simplify the use of some common characters. I set up "c" as a code for "©", "tm" as a code for "™", etc. Unfortunately, this does not work well. When I tested it, I found that entering "\c\" in an IPTC field resulted in a "?" character. As I should have expected, Code Replacements do not seem to handle such characters. The IPTC field, however, does handle them if they are entered directly.

Perhaps you could add a special variable such as "char" which, when used as {char:169}, would insert character 169. A similar "unicode" variable would be a good idea; the same copyright symbol could then be inserted with {unicode:00A9} (assuming you wanted it in 4-digit hex).
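
To make the suggestion concrete, here is a rough Python sketch of how such variables might be expanded. It is purely illustrative and is not how PM works: the function name expand_char_tokens is made up, and it assumes {char:N} takes a decimal code point (which matches ISO-8859-1 for values up to 255) while {unicode:XXXX} takes a hexadecimal one.

```python
import re

# Hypothetical token syntax taken from the suggestion above:
#   {char:169}      -> decimal code point
#   {unicode:00A9}  -> hexadecimal code point
_TOKEN = re.compile(r"\{(char|unicode):([0-9A-Fa-f]+)\}")


def expand_char_tokens(text: str) -> str:
    """Replace {char:N} and {unicode:XXXX} tokens with the character they name."""

    def _substitute(match: re.Match) -> str:
        kind, digits = match.group(1), match.group(2)
        base = 10 if kind == "char" else 16
        try:
            return chr(int(digits, base))
        except (ValueError, OverflowError):
            # Not a valid number in that base, or outside the Unicode range:
            # leave the token untouched rather than guess.
            return match.group(0)

    return _TOKEN.sub(_substitute, text)


if __name__ == "__main__":
    print(expand_char_tokens("{char:169} 2007 Ian, Widget{unicode:2122}"))
```

Leaving an invalid token such as {char:A9} unchanged is a design choice in this sketch; a real implementation might prefer to report an error instead.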

-Ian

Offline Kirk Baker

  • Senior Software Engineer
  • Camera Bits Staff
Re: Specific Unicode Characters
« Reply #1 on: October 19, 2007, 04:39:11 PM »
Ian,

I suppose we could do something like that, but in the meantime, make sure that your Code Replacement text files are UTF-8 and they should work fine.
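
If an existing file was saved in the Windows default encoding, something along these lines could re-save it as UTF-8. This is only a sketch under assumptions: it guesses that the original encoding is Windows-1252 (cp1252), the function names are made up for illustration, and the file names in the usage comment are placeholders.

```python
from pathlib import Path


def to_utf8_bytes(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Return the same text encoded as UTF-8.

    Input that already decodes as UTF-8 (with or without a BOM) is kept as
    UTF-8 text; anything else is assumed to be in source_encoding.
    """
    try:
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Raises UnicodeDecodeError again if the bytes are not valid in
        # source_encoding either, which is safer than writing garbage.
        text = raw.decode(source_encoding)
    return text.encode("utf-8")


def convert_file(src: str, dst: str, source_encoding: str = "cp1252") -> None:
    """Read src, re-encode its contents as UTF-8, and write the result to dst."""
    Path(dst).write_bytes(to_utf8_bytes(Path(src).read_bytes(), source_encoding))


# Example with placeholder file names:
# convert_file("replacements.txt", "replacements-utf8.txt")
```

Writing to a separate output file keeps the original intact in case the encoding guess is wrong. Many text editors can also do the same conversion through a "Save As" dialog with an encoding option.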

-Kirk