Author Topic: Search do not recognise the danish/nordic character å

Offline michaelbothager

  • Member
  • **
  • Posts: 51
    • View Profile
Search do not recognise the danish/nordic character å
« on: April 26, 2022, 08:42:45 AM »
When I perform a search like ‘keyw båd’, the search results include images that do not contain the search term but whose keywords include ‘bad’.

If I then use Find (Cmd+F) in the same contact sheet and search for ‘båd’, the correct images are selected. So localised characters are honoured in some parts of Photo Mechanic Plus, but not everywhere.

Translation: ‘båd’ is ‘boat’, and ‘bad’ can be ‘bath’, or the substring ‘bad’ can be part of ‘badebro’, meaning ‘jetty’.
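
One possible explanation for the difference, not confirmed anywhere in this thread, is that Search compares text with diacritics folded away while Find compares the characters as stored. The Python sketch below illustrates that idea only; `fold_diacritics`, `folded_match`, and `exact_match` are hypothetical helpers and say nothing about how Photo Mechanic Plus is actually implemented.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Decompose the text and drop combining marks, so 'å' becomes 'a'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def folded_match(term: str, keyword: str) -> bool:
    """Accent-insensitive substring match (hypothetical 'Search' behaviour)."""
    return fold_diacritics(term).casefold() in fold_diacritics(keyword).casefold()


def exact_match(term: str, keyword: str) -> bool:
    """Accent-sensitive substring match (hypothetical 'Find' behaviour)."""
    return unicodedata.normalize("NFC", term) in unicodedata.normalize("NFC", keyword)


if __name__ == "__main__":
    for keyword in ("b\u00e5d", "badebro"):  # 'båd' and 'badebro'
        print(keyword,
              folded_match("b\u00e5d", keyword),
              exact_match("b\u00e5d", keyword))
```

Under these assumptions, the folded comparison matches both ‘båd’ and ‘badebro’, while the exact comparison matches only ‘båd’, which is the pattern described above.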

/Michael

Offline Kirk Baker

  • Senior Software Engineer
  • Camera Bits Staff
  • Superhero Member
  • *****
  • Posts: 25020
    • View Profile
    • Camera Bits, Inc.
Re: Search do not recognise the danish/nordic character å
« Reply #1 on: April 26, 2022, 09:14:39 AM »
Michael,

Can you give me a small JPEG with those characters in the keyword field, please?  Use the 'Attachments and other options' link when you're composing your reply to this message and there you'll be able to upload your JPEG file.

Thanks,

-Kirk

Offline michaelbothager

Re: Search do not recognise the danish/nordic character å
« Reply #2 on: April 26, 2022, 09:53:19 AM »
Kirk,


Sure, I've attached two images: A.jpg contains 'båd' and B.jpg contains 'badebro', plus other Nordic/Danish letters.

/Michael

Offline Kirk Baker

Re: Search do not recognise the danish/nordic character å
« Reply #3 on: April 26, 2022, 09:54:15 AM »
Thank you.

Offline Kirk Baker

Re: Search do not recognise the danish/nordic character å
« Reply #4 on: April 26, 2022, 10:09:35 AM »
Michael,


I created a catalog and added your two images to it.  I opened up one of the images and found the "båd" in it and copied it to the clipboard.  I pasted it into the Search field and the images were both displayed as results.  I then used the Find panel and pasted "båd" into it and searched only the Keywords field and A.jpg was selected and B.jpg was not selected, as expected.

I don't have 'å' as a direct character on my keyboard, but typing 'Option-a' produces it.  If I enter 'båd' manually (not copy and paste) and search or find, it works fine for me.  I wonder whether typing 'å' on your keyboard produces a different Unicode sequence from the one stored in the file, so that the two do not match.

Photo Mechanic Plus does something called Unicode Normalization to try and eliminate these concerns, but there are several methods, each with their own benefits and drawbacks.
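
To illustrate the kind of mismatch described above: 'å' can be stored either as the single code point U+00E5 or as 'a' followed by U+030A (combining ring above). The sketch below uses Python's standard `unicodedata` module to show that the two spellings differ as raw strings but compare equal once both are normalized to the same form. Which form Photo Mechanic Plus uses is not stated in this thread, and `equal_under` is a hypothetical helper.

```python
import unicodedata

composed = "b\u00e5d"      # 'å' as one code point, U+00E5
decomposed = "ba\u030ad"   # 'a' followed by U+030A COMBINING RING ABOVE


def equal_under(form: str, a: str, b: str) -> bool:
    """Compare two strings after normalizing both to the same form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    # The raw strings look identical on screen but are not equal.
    print(composed == decomposed)  # False
    # Each of the four standard normalization forms makes them equal.
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        print(form, equal_under(form, composed, decomposed))  # True for each
    # NFC yields the 3-code-point spelling, NFD the 4-code-point one.
    print(len(unicodedata.normalize("NFC", decomposed)))  # 3
    print(len(unicodedata.normalize("NFD", composed)))    # 4
```

The forms differ in what else they change: NFKC and NFKD also fold compatibility characters, which can be helpful for searching but loses distinctions that NFC and NFD preserve.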

-Kirk

Offline michaelbothager

Re: Search do not recognise the danish/nordic character å
« Reply #5 on: April 26, 2022, 10:56:05 AM »
Kirk,

  • I'm using a standard Danish Apple keyboard.
  • The Danish keyboard has the letter 'Å' just to the right of 'P'.
  • Entering 'Option-a' produces 'ª'.
  • System Preferences, Keyboard, Input Sources lists only 'Danish' as a keyboard.
  • System Preferences, Language & Region, General lists English (UK) as primary and Danish as the only other locale (English because of Photoshop...).

How can I help regarding Unicode sequences? I know diddly-squat about that ;o)
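
One way to check which sequence a keyboard produces, sketched here in Python as an assumption rather than anything Camera Bits suggested, is to paste the typed text into a small script that lists each code point. `describe` is a hypothetical helper built on the standard `unicodedata` module.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' entry per code point in the text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


if __name__ == "__main__":
    # Replace the literal below with text typed or pasted from the keyboard.
    for line in describe("b\u00e5d"):
        print(line)
    # U+0062 LATIN SMALL LETTER B
    # U+00E5 LATIN SMALL LETTER A WITH RING ABOVE
    # U+0064 LATIN SMALL LETTER D
```

If the typed 'å' instead shows up as two entries, U+0061 followed by U+030A COMBINING RING ABOVE, the keyboard is producing the decomposed spelling.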

/Michael

Offline Kirk Baker

Re: Search do not recognise the danish/nordic character å
« Reply #6 on: April 26, 2022, 11:45:52 AM »
Michael,


I'll have to think on it awhile and add some logging code and then get you an updated build to try.

-Kirk

Offline Kirk Baker

Re: Search do not recognise the danish/nordic character å
« Reply #7 on: April 27, 2022, 10:03:22 AM »
Michael,

I changed my keyboard to Danish.  I can see that the letter next to 'P' is å.  I typed 'båd' into the search field and both images appeared.

So I'm not sure what to think now.

-Kirk