Topic: Keywords containing umlauts can't be searched in Structured Keywords dialog

Offline mhobi

  • Newcomer
  • *
  • Posts: 39
Hello

Keywords containing umlauts can't be searched in the Structured Keywords dialog, e.g. 'Zürich'.
If I enter a search string like 'Zür' and click 'Find', nothing happens.

Regards, Michael
« Last Edit: March 29, 2020, 05:13:29 AM by mhobi »

Offline Kirk Baker

  • Senior Software Engineer
  • Camera Bits Staff
  • Superhero Member
  • *****
  • Posts: 25503
Michael,

Can you export and post your Structured Keywords file, please?  Use the 'Attachments and other options' link when you're composing your reply to this message and there you'll be able to upload your file.

Thanks,

-Kirk

Offline mhobi

Hi Kirk

Attached is the exported keyword file.
I originally exported it from Aperture and then imported it into Photo Mechanic when I migrated away from Aperture.

Thanks and regards
Michael
« Last Edit: March 31, 2020, 07:45:23 AM by mhobi »

Offline mhobi

Hi Kirk

In the meantime I examined the file with a hex editor and saw the following:
'ü' is encoded as '75 CC 88' hex, which seems to mean 'u' followed by 'COMBINING DIAERESIS' (U+0308).
If I replace the byte sequence '75 CC 88' with 'C3 BC' (the precomposed 'ü', U+00FC), the search seems to work.
I am no Unicode specialist, but it seems I have to correct my keywords file, right?  ;)
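
For reference, the two byte sequences above can be reproduced with a few lines of Python. This is only an illustrative sketch using the standard `unicodedata` module; it says nothing about how Photo Mechanic handles text internally. In Unicode terms, the '75 CC 88' form is the decomposed one (as produced by NFD normalization) and 'C3 BC' is the precomposed one (NFC).

```python
import unicodedata

# The two UTF-8 byte sequences seen in the hex editor.
decomposed = b"\x75\xcc\x88".decode("utf-8")   # 'u' + U+0308 COMBINING DIAERESIS
precomposed = b"\xc3\xbc".decode("utf-8")      # U+00FC, a single code point

# Both display as 'ü', yet they are different code point sequences,
# so a plain comparison treats them as different strings.
same_without_normalizing = decomposed == precomposed

# NFC composes base letter + combining mark; NFD splits them apart again.
composed_form = unicodedata.normalize("NFC", decomposed)
decomposed_form = unicodedata.normalize("NFD", precomposed)

print(len(decomposed), len(precomposed), same_without_normalizing)
print(composed_form.encode("utf-8").hex(" "))
```

The manual hex replacement described above has the same effect as the NFC step here, applied to one character.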

Thanks and regards
Michael

Offline Kirk Baker

Michael,

Ideally, you shouldn't have to do anything special.  PM should normalize the Unicode text that you enter into the Find field and normalize the Unicode text in your Structured Keywords data such that matches can occur.
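
As a rough illustration of that idea (not Photo Mechanic's actual code), a search that normalizes both the query and the stored keywords to the same form finds the match regardless of how the umlaut is encoded. The function name `find_keywords` and the sample list are hypothetical, and NFC is just one possible choice of normalization form.

```python
import unicodedata

def nfc(text):
    """Return text in Unicode Normalization Form C (precomposed)."""
    return unicodedata.normalize("NFC", text)

def find_keywords(query, keywords):
    """Hypothetical helper: substring search after normalizing both sides."""
    needle = nfc(query)
    return [kw for kw in keywords if needle in nfc(kw)]

# Stored keyword uses the decomposed form ('u' + U+0308), as in the exported file.
stored = ["Zu\u0308rich", "Bern"]
# Text typed into a Find field typically arrives precomposed (U+00FC).
typed = "Z\u00fcr"

naive = [kw for kw in stored if typed in kw]   # compares raw code points
matches = find_keywords(typed, stored)         # compares normalized text

print(naive)
print(matches)
```

The naive comparison misses the keyword because the code points differ, which matches the "nothing happens" symptom reported at the top of the thread.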

I did not see your file. Did you forget to attach it earlier?

Thanks,

-Kirk

Offline mhobi

Hi Kirk

Sorry, I removed the file because I corrected my keyword file manually and it works now. I replaced the bytes '75 CC 88' hex ('u' followed by 'COMBINING DIAERESIS') with 'C3 BC' hex ('ü').
To demonstrate the issue, I created a new file containing a single keyword, 'Zürich', with the old encoding the file had ('ü' encoded as '75 CC 88' hex), and attached it.
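
For anyone who wants to reproduce this without a hex editor, the following Python sketch writes a one-keyword file in the old (decomposed) encoding and then converts it to the precomposed form. It assumes the keyword file is plain UTF-8 text, which is what the hex dump suggests; the file names and the helper `convert_file_to_nfc` are made up for the example.

```python
import tempfile
import unicodedata
from pathlib import Path

def convert_file_to_nfc(src, dst):
    """Hypothetical helper: rewrite a UTF-8 text file in NFC form."""
    text = Path(src).read_bytes().decode("utf-8")
    Path(dst).write_bytes(unicodedata.normalize("NFC", text).encode("utf-8"))

# Work in a temporary directory so no real keyword file is touched.
workdir = Path(tempfile.mkdtemp())
old_file = workdir / "keywords_decomposed.txt"   # assumed name
new_file = workdir / "keywords_nfc.txt"          # assumed name

# 'Zürich' with 'ü' stored as 75 CC 88 ('u' + COMBINING DIAERESIS).
old_file.write_bytes(b"Zu\xcc\x88rich\n")

convert_file_to_nfc(old_file, new_file)
converted = new_file.read_bytes()
print(converted.hex(" "))
```

Converting a copy rather than the original keeps the old file available for comparison.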

Thanks for all your help and best regards
Michael