Author Topic: Accented letters find and replace ?  (Read 3607 times)

Offline whistlerdan

  • Newcomer
  • *
  • Posts: 36
    • View Profile
Accented letters find and replace ?
« on: October 05, 2011, 04:09:00 PM »
This is sort of a two-part question...

I noticed that when I put an accented letter in the IPTC and go to save it, I get a warning to save as Unicode instead or risk losing some data.  But whether I choose to take it up on the offer of Unicode or continue as normal, when I upload my files to Photoshelter the accented letters all come out with a ? following them (e.g. Mu?cke or Se?bastien).

What is the best practice here?  To not use accented letters at all?

The second problem is that I tried to do a find and replace on some of the names with accented letters in them.  For example, one guy is called Marc Gené.  Doing a find of Gené and replacing it with Gene does not work; it just does nothing.  OK, I thought, maybe you have to be a bit cleverer, so I tried first replacing Gené with XYZ, then finding XYZ and replacing it with Gene.  Unfortunately, when you find and replace Gené with XYZ, it replaces all the Genés with XYź.  It doesn't matter what you try to replace it with, it always throws a random accent in there!

Any ideas?

Offline Kirk Baker

  • Senior Software Engineer
  • Camera Bits Staff
  • Superhero Member
  • *****
  • Posts: 24731
    • View Profile
    • Camera Bits, Inc.
Re: Accented letters find and replace ?
« Reply #1 on: October 05, 2011, 05:24:28 PM »
IPTC is generally interpreted in the default character set on a given computer system.  That's why we offer to encode the IPTC data as UTF-8 Unicode.  XMP has always been UTF-8 Unicode and doesn't suffer this problem.  I don't know what metadata format Photoshelter prefers, but you could set PM to only write out XMP to your files and then Photoshelter would have to use XMP data since there would not be any alternate IPTC data available.
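
As a rough illustration of the character-set issue described above, here is a minimal Python sketch. It is not Photo Mechanic or Photoshelter code. Latin-1 is only an assumed stand-in for "the default character set" (on a Mac it would more likely be MacRoman), and the idea that the uploaded text contains a separate combining accent is an assumption made for the example, not something confirmed for Photoshelter.

```python
# Case 1: UTF-8 bytes for a precomposed "é" (U+00E9) are read back
# by software that assumes Latin-1 (assumed default character set).
utf8_bytes = "Gen\u00e9".encode("utf-8")
misread = utf8_bytes.decode("latin-1")
print(misread)  # GenÃ©

# Case 2: text stored as "u" + U+0308 COMBINING DIAERESIS is forced
# into Latin-1, which has no combining marks. With errors="replace",
# the unencodable mark becomes "?" right after the base letter.
decomposed = "Mu\u0308cke"
lossy = decomposed.encode("latin-1", errors="replace")
print(lossy)  # b'Mu?cke'
```

The second case gives the same shape as the "Mu?cke" symptom: the base letter survives and a ? follows it.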

As for the Find and Replace issue, I'll have to look into it.  There may be a bug in the code.  Unfortunately I don't have much spare time right now since we're readying our demo for PhotoPlus in about three weeks.

I'll see what I can do.

-Kirk

Offline whistlerdan

Re: Accented letters find and replace ?
« Reply #2 on: October 06, 2011, 12:11:42 PM »
Ok thanks for the info Kirk.

When changing the reading and writing settings in the preferences/IPTC menu, I also ran into the same problem I had yesterday: I would make a change, and then pressing OK to confirm it crashed PM.

After force quitting it and restarting, the changes had been confirmed.

Offline Kirk Baker

Re: Accented letters find and replace ?
« Reply #3 on: October 06, 2011, 12:34:41 PM »
Please file a crash report when the Crash Reporter comes up.

-Kirk

Offline whistlerdan

Re: Accented letters find and replace ?
« Reply #4 on: October 06, 2011, 12:42:28 PM »
A further issue regarding the accented letters has just caused a problem...

As find and replace did not work, I needed to manually search for every name with an accent in it and then make the changes.

Performing a search for Sebastien, for example, does find all the JPEGs with Sébastien in them.  What I have only just discovered, though, is that it DOES NOT find RAW files with Sébastien in them.

Even an exact search for Sébastien does not find RAW files that contain Sébastien in the caption! Argh! There seems to be no way to search for words with accented letters if they are in RAW files.

Regarding the crash report, I would file one, but it doesn't come up.  PM just locks up with the spinning ball and the preferences pane stays on top of all my other windows.  I tried leaving it for 20 minutes and it was still there, same as last time, so I just used the Force Quit option in OS X.  That doesn't give you a crash report.
« Last Edit: October 06, 2011, 12:44:06 PM by whistlerdan »

Offline Kirk Baker

Re: Accented letters find and replace ?
« Reply #5 on: October 06, 2011, 01:20:38 PM »
Could you send me some sample files that I can use to investigate this issue?  Please click on my name to the left of this message, then click on the 'personal message' link.  I will respond with upload instructions.

Thanks,

-Kirk

Offline Kirk Baker

Re: Accented letters find and replace ?
« Reply #6 on: October 10, 2011, 05:13:47 PM »
Dan,

I investigated the sample files.  Thanks for providing them.  The Sébastien text can be found with Find, but you must copy the exact text from the Caption field of your photo.  If you enter Sébastien by typing:

S, Option-e, e, bastien

...it won't work.  The data in XMP is not entered in the same way.  Let me explain: if I look at the data that makes up the characters in the Caption field and the characters you likely typed, I see:

Caption: Se(0xcc 0x81)bastien
Find: S(0xc3 0xa9)bastien

The Caption version is: S, e + Combining Acute Accent, bastien
The Find version is: S, Latin Small E with Acute character, bastien

The find / replace functionality looks for strings of bytes that match.  It is fairly simplistic and cannot handle the case you've provided.  It would need to be upgraded to a fully Unicode-aware search that would normalize the data to be searched and the search string itself.
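
To make the mismatch concrete, here is a minimal Python sketch. It is not Photo Mechanic's actual code, and the helper name `normalized_find` is hypothetical. It shows that the two spellings encode to different UTF-8 bytes, so a byte-wise search fails, and that normalizing both the searched text and the search string to the same Unicode form (NFC is assumed here; NFD would work equally well as long as both sides use it) makes them match.

```python
import unicodedata

# Caption as stored in XMP: "e" followed by U+0301 COMBINING ACUTE ACCENT
caption = "Se\u0301bastien"
# Search text as typed with Option-e, e: precomposed U+00E9
query = "S\u00e9bastien"

print(caption.encode("utf-8"))  # b'Se\xcc\x81bastien'
print(query.encode("utf-8"))    # b'S\xc3\xa9bastien'

# A byte-wise search does not see these as the same text.
byte_match = query.encode("utf-8") in caption.encode("utf-8")
print(byte_match)  # False


def normalized_find(haystack: str, needle: str, form: str = "NFC") -> int:
    """Return the index of needle in haystack after normalizing both
    to the same Unicode normalization form, or -1 if not found.
    The index refers to the normalized haystack."""
    h = unicodedata.normalize(form, haystack)
    n = unicodedata.normalize(form, needle)
    return h.find(n)


print(normalized_find(caption, query))  # 0
```

The same helper would find a typed "Gené" inside a caption that stores the name with a combining accent.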

Regards,

-Kirk

Offline whistlerdan

Re: Accented letters find and replace ?
« Reply #7 on: October 10, 2011, 05:28:22 PM »
OK, I think I somewhat understand.

Given that, though, what I can't quite get is why it doesn't work if I just highlight Sébastien within a description and copy and paste that exact thing into search.  It doesn't find the very same photo from which I just copied the text.  That's how I was trying to do it, which could be wrong.  But it seemed like a surefire way to make sure I was searching for the exact same thing?


Offline Kirk Baker

Re: Accented letters find and replace ?
« Reply #8 on: October 10, 2011, 07:47:58 PM »
Dan,

If you turn off Case Insensitive searching, it should work with your copy/paste.  It works for me.

-Kirk