Author Topic: Closed: Same files twice in catalog

Offline Hayo Baan

  • Uber Member
  • ******
  • Posts: 2552
  • Professional Photographer & Software Developer
    • View Profile
    • Hayo Baan - Photography
Closed: Same files twice in catalog
« on: October 31, 2021, 06:01:41 AM »
In my catalog I have images appearing twice. The duplicated ones appear to be those with special characters in their path (e.g. ô, é). I noticed this because the count from syncing the catalog and the count shown when browsing the catalog didn't match. I'm sure this has to do with Unicode normalization, where accented characters can be written using different code point sequences (and therefore different UTF-8 byte sequences), and this might have changed over time (the catalog has existed since almost the beginning of PM Plus).
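
To illustrate the normalization point, here is a minimal Python sketch. It only shows how one visible name can have two different code point sequences; it makes no claim about how the catalog stores or compares paths internally. The sample name `café` and the helper `same_name` are made up for the example.

```python
import unicodedata

# "é" as one precomposed code point (NFC form) versus
# "e" followed by a combining acute accent (NFD form).
composed = "caf\u00e9"
decomposed = "cafe\u0301"

print(composed == decomposed)          # False: different code point sequences
print(len(composed), len(decomposed))  # 4 5
print(composed.encode("utf-8"))        # b'caf\xc3\xa9'
print(decomposed.encode("utf-8"))      # b'cafe\xcc\x81'


def same_name(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(same_name(composed, decomposed))  # True
```

Any index keyed on the raw string (or its bytes) would treat these two spellings as separate entries unless it normalizes first, which would be consistent with the same file showing up twice.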

Now to find a way to get rid of the duplicates (they are all over the place); syncing didn't work (not even a full sync)…
« Last Edit: December 04, 2021, 03:18:53 AM by Hayo Baan »
Hayo Baan - Photography
Web: www.hayobaan.nl

Offline Hayo Baan

Re: Open: Same files twice in catalog
« Reply #1 on: October 31, 2021, 06:41:26 AM »
Update1: as an experiment I removed the “first”, “second” and both copies of some duplicated items from the catalog and then did a full sync. Lo and behold: the duplicate items returned for each of these trials :(

Update2: as a further experiment, I scanned a folder with duplicates in it to a new catalog. At first scan, images weren't duplicated and when I then did a full sync, they weren't either. So the issue must be related to an “old” scan to catalog.

For the record: I think I switched the file system on which the duplicated files reside from Mac OS Extended (HFS+) to APFS, which might have introduced a filename encoding change. I'll see if I can verify that.
Update: no, even on a very old backup the accents in the file names show up the same way. Hmm, it is of course possible that an OS update changed the normalization, so we still can't rule out the encoding…
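
One way such a check could be done is to classify each file name by the normalization form it is in. The sketch below is only an illustration: `normalization_form` is a hypothetical helper, the sample names are invented, and it says nothing about what the catalog itself recorded.

```python
import unicodedata


def normalization_form(name: str) -> str:
    """Classify a name as 'ASCII', 'NFC', 'NFD', 'unaffected' or 'mixed'."""
    if name.isascii():
        return "ASCII"  # identical in every normalization form
    is_nfc = unicodedata.normalize("NFC", name) == name
    is_nfd = unicodedata.normalize("NFD", name) == name
    if is_nfc and is_nfd:
        return "unaffected"  # non-ASCII, but has no composed/decomposed variants
    if is_nfc:
        return "NFC"
    if is_nfd:
        return "NFD"
    return "mixed"  # contains both composed and decomposed sequences


# Invented sample names: plain ASCII, precomposed ô, and o + combining circumflex.
for sample in ["IMG_0001.jpg", "h\u00f4tel.jpg", "ho\u0302tel.jpg"]:
    print(ascii(sample), normalization_form(sample))
```

To look at real files, the same function could be applied to the names returned by `os.listdir()` for the live folder and for the backup. The operating system may already normalize names before returning them, so the result is indicative rather than proof of what is stored on disk.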
« Last Edit: October 31, 2021, 07:15:38 AM by Hayo Baan »

Offline Hayo Baan

Re: Open: Same files twice in catalog
« Reply #2 on: November 11, 2021, 12:52:40 PM »
Kirk et al., any ideas on how I can fix this (short of rebuilding the whole catalog)?

Offline Kirk Baker

  • Senior Software Engineer
  • Camera Bits Staff
  • Superhero Member
  • *****
  • Posts: 25020
    • View Profile
    • Camera Bits, Inc.
Re: Open: Same files twice in catalog
« Reply #3 on: November 11, 2021, 02:21:20 PM »
Hayo,

Kirk et al., any ideas on how I can fix this (short of rebuilding the whole catalog)?

I wonder whether forcing a "Reintegrate Forgotten Catalog" maintenance operation would correct the problem. Give that a try.

-Kirk

Offline Kirk Baker

Re: Open: Same files twice in catalog
« Reply #4 on: November 11, 2021, 03:18:35 PM »
Hayo,

Alternatively, you could remove everything and then do a Full Sync and it should add them all back in.  If it still gets duplicates then I want to get to the bottom of that, ASAP.

-Kirk

Offline Hayo Baan

Re: Open: Same files twice in catalog
« Reply #5 on: November 13, 2021, 03:44:36 AM »
I wonder whether forcing a "Reintegrate Forgotten Catalog" maintenance operation would correct the problem. Give that a try.

Sadly, that didn't work.

Alternatively, you could remove everything and then do a Full Sync and it should add them all back in.  If it still gets duplicates then I want to get to the bottom of that, ASAP.

Interesting. I tried this on one of the folders (plus subfolders) with duplicates. I selected all images, executed remove from catalog, and then, surprisingly, was presented with just one version of each duplicate still there (all other items were gone, though). So they are simply very stubborn :o
(guess I hadn't noticed this the first time I tried to remove both versions of a duplicate)

Another try: I noticed that the folders with duplicates showed up twice in the browse view (each time with the correct number of images, but viewing the folder gave the duplicates again). So I then tried removing a duplicate folder from there. One of the offending ones did indeed get removed, but the second one didn't. And when I then did a sync, the duplicates popped up again, but the folder then did not duplicate itself.

As a further attempt at solving this, I removed one of the proxy files (there were two identifiable ones) for a duplicate image and removed the image from the catalog again. Sadly that didn't help either (one version was not removed and the image is still duplicated after a sync). Now there is only one proxy file left, though. But now I again have the directory in the browser twice (with different counts) :o

I've attached an image of the browser with two problematic folders.

If you can think of anything more I can try to help you identify and solve this issue, please let me know. Otherwise I see no other solution than to rebuild the catalog from scratch.

Offline Kirk Baker

Re: Open: Same files twice in catalog
« Reply #6 on: November 15, 2021, 09:31:27 AM »
Hayo,

If you can think of anything more I can try to help you identify and solve this issue, please let me know. Otherwise I see no other solution than to rebuild the catalog from scratch.

If you create a new catalog and add just that folder and then view it, are the images doubled?  What about after making some additions/changes to that folder and using Sync?  Any doubles then?

-Kirk

Offline Hayo Baan

Re: Open: Same files twice in catalog
« Reply #7 on: November 15, 2021, 11:44:11 AM »
If you create a new catalog and add just that folder and then view it, are the images doubled?  What about after making some additions/changes to that folder and using Sync?  Any doubles then?

That was actually one of the things I had already tried. With a new catalog, there are no duplicates…

Offline Hayo Baan

Re: Open: Same files twice in catalog
« Reply #8 on: November 28, 2021, 02:58:14 AM »
Hi Kirk, is there anything you want me to try to help you find and solve the issue? Otherwise I will simply have to rebuild the catalog.

Offline Kirk Baker

Re: Open: Same files twice in catalog
« Reply #9 on: November 29, 2021, 09:22:33 AM »
Hayo,

Hi Kirk, is there anything you want me to try to help you find and solve the issue? Otherwise I will simply have to rebuild the catalog.

I think there is no other solution.  I'm sorry for the inconvenience.

-Kirk

Offline Hayo Baan

Re: Open: Same files twice in catalog
« Reply #10 on: November 29, 2021, 11:00:51 AM »
No worries, I'll rebuild the catalog.

Offline Hayo Baan

Re: Closed: Same files twice in catalog
« Reply #11 on: December 04, 2021, 03:18:34 AM »
Just to come back to this issue: before rebuilding the whole catalog, I tried one more thing. Instead of trying to remove the duplicated pictures or the folders containing them, I removed the parent directory of the folder with the duplicates. And lo and behold, after syncing the catalog, no more duplicates! So after working out which folders had duplicates, removing their parent directories throughout the catalog, and re-syncing the whole thing, I have a complete catalog again without having to rebuild it all ;D
(I still had to rebuild ~30% of the catalog (one of the offending folders alone was in a directory with almost 25% of all the images), but still way better than having to rebuild the whole thing ;)).
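
The step of working out which parent directories to remove could be sketched as follows. This is a hypothetical helper, not a Photo Mechanic feature: it assumes the accented characters sit in folder names, and that the directory to remove is the deepest ancestor whose path is unaffected by normalization. The sample paths are invented.

```python
import unicodedata
from pathlib import PurePosixPath


def is_normalization_sensitive(name: str) -> bool:
    """True if the name is spelled differently in NFC and NFD."""
    return unicodedata.normalize("NFC", name) != unicodedata.normalize("NFD", name)


def safe_ancestor(path: str) -> str:
    """Return the longest leading part of `path` with no sensitive component."""
    kept = []
    for part in PurePosixPath(path).parts:
        if is_normalization_sensitive(part):
            break
        kept.append(part)
    return str(PurePosixPath(*kept)) if kept else ""


def folders_to_remove(duplicated_paths):
    """Collect the distinct ancestors to remove from the catalog and re-sync."""
    return sorted({safe_ancestor(p) for p in duplicated_paths})


# Invented examples: the same folder name in composed and decomposed spelling.
duplicates = [
    "/Photos/2015/Ch\u00e2teau/IMG_0001.jpg",
    "/Photos/2015/Cha\u0302teau/IMG_0001.jpg",
    "/Photos/2016/Z\u00fcrich/IMG_0042.jpg",
]
print(folders_to_remove(duplicates))  # ['/Photos/2015', '/Photos/2016']
```

Both spellings of the first folder map to the same ancestor, so each affected branch is listed only once.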

@Kirk, maybe this sheds some light on the cause?

Offline Kirk Baker

Re: Closed: Same files twice in catalog
« Reply #12 on: December 04, 2021, 11:32:57 AM »
Hayo,

I suppose at that point in the folder hierarchy none of the letters in the path components were represented differently by the Unicode normalization?  What did that parent path look like?

-Kirk

Offline Hayo Baan

Re: Closed: Same files twice in catalog
« Reply #13 on: December 04, 2021, 02:19:34 PM »
Hi Kirk, the parent paths had no special characters in them. They consisted only of ASCII characters.

Offline Kirk Baker

Re: Closed: Same files twice in catalog
« Reply #14 on: December 04, 2021, 08:58:59 PM »
Hayo,

Hi Kirk, the parent paths had no special characters in them. They consisted only of ASCII characters.

That makes sense, and it explains why removing that folder corrected the problem.

-Kirk