Author Topic: Moved catalogue, but original location in 'Documents' is still being written to.  (Read 11729 times)

Offline Bill Kelly

  • Software Developer
  • Full Member
  • ***
  • Posts: 131
    • Camera Bits, Inc.
is there a way to get the thumbnail generation running faster?

Once the Metadata Gathering phase is completed, Preview Generation will be allowed to run at maximum speed. At that point it will use up to 10 CPU cores simultaneously, provided it can read the image data fast enough from disk.


Offline JSW

  • Newcomer
  • *
  • Posts: 19
It's running around 44% CPU usage now the metadata gathering is complete, which is much better.

For future optimisation, might it be worth utilising one core to gather metadata and the remaining ones to generate thumbnails? I.e. if there's a constant image queue which builds up, it could be worth putting more cores on it to get the job completed faster overall. PM has always been about speed, which is why I've been a customer for many years.

Really appreciate your detailed technical replies and the hard work you've been putting into this. I've been waiting a long time for PMP to have one place to organise all my photos. I have many Lightroom catalogs for different work types, so having everything in one place is just great.

Offline Kirk Baker

  • Senior Software Engineer
  • Camera Bits Staff
  • Superhero Member
  • *****
  • Posts: 25020
    • Camera Bits, Inc.
The processes that gather the metadata and generate the previews are I/O limited, not CPU limited. Meaning, they spend most of their time waiting for the data they requested to arrive, and only then do they burn CPU cycles on it.

-Kirk

Offline Bill Kelly

To add to this, the reason we staggered the Metadata Gathering and Preview Generation tasks, rather than allowing them to run in parallel, was I/O contention. Running those tasks in parallel seemed to make things slower overall. (Adding metadata records to a catalog does a ton of disk I/O due to the extensive database indexing involved.)

But in general - yes: It's hard for PM to keep all the CPU cores busy. We're usually starving for data from disk. (That said, we still have a few Catalog-related optimizations on the TO-DO list.)
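
For readers who want to picture the staggering described above, here is a minimal Python sketch. It is not Photo Mechanic's implementation: `gather_metadata`, `generate_preview` and `build_catalog` are hypothetical stand-ins that do no real image work, and the only figure taken from this thread is the cap of 10 preview workers.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Upper bound quoted earlier in the thread ("up to 10 CPU cores").
MAX_PREVIEW_WORKERS = 10


def gather_metadata(path):
    # Hypothetical stand-in: a real catalog would read the file's metadata
    # and update its database indexes here (the disk-heavy part).
    return {"path": path, "name": os.path.basename(path)}


def generate_preview(record):
    # Hypothetical stand-in: a real implementation would decode the image
    # and write a thumbnail; here we only derive a preview name.
    return record["path"] + ".preview"


def build_catalog(paths):
    # Phase 1: metadata only, one file at a time, so the catalog writes
    # do not compete with preview reads for the same disk.
    records = [gather_metadata(p) for p in paths]

    # Phase 2: previews in parallel, capped by the worker limit and by
    # the number of cores the machine reports.
    workers = max(1, min(MAX_PREVIEW_WORKERS, os.cpu_count() or 1))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        previews = list(pool.map(generate_preview, records))
    return records, previews
```

The point of the two phases is that the second one only starts once the first has finished. If the disk is the bottleneck, adding more workers to either phase will not help much.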


Offline JSW

OK, makes sense. Always interested in performance tradeoffs.

I have a further question, I'm afraid. After everything was green and complete, I just wanted to make sure everything had settled down. I had added a few photos from my phone to the computer and wanted to see how long it took to scan and pick them up. I clicked 'Scan to Catalog, Start'. PMP has been running for over 4 hours so far and still has 265450 items to scan, i.e. it is re-scanning *everything*. It's also up to 17109 batches of image previews to generate. Why?

I would have expected one batch for the folder with the dozen or so photos I added since the last scan. Why does it seem to be doing everything again? Surely it checks the last modified timestamp on images to see if it needs to re-scan the metadata or regenerate the thumbnails? Is this expected, or is something else wrong with my database after the sync repair?
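
To make the timestamp check asked about above concrete, here is a generic Python sketch. It says nothing about how Photo Mechanic Plus stores or compares anything: `files_needing_rescan` and the `catalog` dictionary (path mapped to the modification time recorded at the last scan) are hypothetical names for illustration only.

```python
import os


def files_needing_rescan(root, catalog):
    """Return paths under root that are new or changed since the last scan.

    catalog is a hypothetical dict: path -> modification time (float)
    recorded when the file was last cataloged.
    """
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.stat(path).st_mtime
            # Unknown path (new file) or a different timestamp (modified file).
            if catalog.get(path) != mtime:
                stale.append(path)
    return stale
```

With a check like this, adding a dozen photos would mean a dozen entries to process, although the scan still has to walk every folder to find them.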

Offline Kirk Baker

Scan to Catalog is best used when building a catalog for the first time.  After that, use Catalog Sync, in Quick sync mode.  It will pick up the new files.

-Kirk

Offline JSW

Ah, got you. Hadn't picked that up reading the documentation. I took Catalog Sync to mean 'syncing *between* catalogs', not that it would check the files and update the catalog. Misleading name.

Still, it would be nice if Scan to Catalog checked whether it was worth reimporting metadata and regenerating the thumbnails. I don't know if I'll remember to use Catalog Sync next time I add photos, and it's a heck of a time hit to have it totally rescan everything.

Just spotted a minor issue of cut off text at the bottom of the Catalog Sync dialog bar. (The 'Clear this log when starting a new Sync' text).

Another minor UI issue: most other 'Catalog' menu items have the same name on both the menu and the window. However:
'Sync Catalogs...' gives you 'Catalog Sync'.
'Example Searches...' gives you 'Search Examples'.
'Manage Catalogs...' gives you 'Catalog Management'.

It would be nice to have the window titles consistently named like the rest of the menu items.

Offline Kirk Baker

Quote from: JSW
Ah, got you. Hadn't picked that up reading the documentation. I took Catalog Sync to mean 'syncing *between* catalogs', not that it would check the files and update the catalog. Misleading name.

What name do you suggest?

Quote from: JSW
Still, it would be nice if Scan to Catalog checked whether it was worth reimporting metadata and regenerating the thumbnails. I don't know if I'll remember to use Catalog Sync next time I add photos, and it's a heck of a time hit to have it totally rescan everything.

I suppose it could warn you that your destination catalog already has items in it and that Catalog Sync would be more appropriate.

Quote from: JSW
Just spotted a minor issue of cut off text at the bottom of the Catalog Sync dialog bar. (The 'Clear this log when starting a new Sync' text).

The content is taller than the window.  You can scroll it vertically.  Or resize the window taller.

-Kirk

Offline JSW

Quote from: JSW
Ah, got you. Hadn't picked that up reading the documentation. I took Catalog Sync to mean 'syncing *between* catalogs', not that it would check the files and update the catalog. Misleading name.

Quote from: Kirk Baker
What name do you suggest?

Personally, I'd get rid of the 'Catalog Sync' window completely. If you look at the dialogues in my attachment below, they are almost identical. We don't need two windows that do pretty much the same task. You could easily fit all that functionality in one window.

We have 3 levels of scan:
1) 'Scan to Catalog' option (scan everything, then catalog and generate images regardless).
2) 'Full Sync' (Catalog new metadata, identify missing files, scan new files to catalog).
3) Quick Sync (Scan only for new files, adding them to catalog).
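
As a rough illustration of the three levels listed above, here is a small Python sketch. The mode names come from this thread, but the action names and the `modes_that` helper are hypothetical, and the mapping reflects only the descriptions in this post, not the product's actual behaviour.

```python
from enum import Enum


class ScanMode(Enum):
    SCAN_TO_CATALOG = "Scan to Catalog"
    FULL_SYNC = "Full Sync"
    QUICK_SYNC = "Quick Sync"


# Hypothetical action names, one set per mode, following the list above.
ACTIONS = {
    ScanMode.SCAN_TO_CATALOG: {
        "add_new_files",
        "rescan_all_metadata",
        "regenerate_all_previews",
    },
    ScanMode.FULL_SYNC: {
        "add_new_files",
        "update_changed_metadata",
        "flag_missing_files",
    },
    ScanMode.QUICK_SYNC: {"add_new_files"},
}


def modes_that(action):
    # Names of the modes that perform the given action, in menu order.
    return [mode.value for mode in ScanMode if action in ACTIONS[mode]]
```

Laid out like this, all three modes pick up new files and differ only in how much extra work they do, which is the argument for showing them as one set of choices.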

I'd redesign the 'Sync Mode' dropdown as a set of radio buttons, so the 3 modes were visible at all times and the user could clearly see they are choosing between 3 scan settings.

If you want to keep them separate, I'd rename 'Catalogs to Sync' to 'Update catalogs'. It says clearly what it does.

You'd then have 'Scan to Catalog' (implicit in that name is that you are creating the catalog or completely recreating any metadata & previews) and 'Update catalogs' (with two update options, ideally as radio buttons). Much clearer.

Quote from: JSW
Still, it would be nice if Scan to Catalog checked whether it was worth reimporting metadata and regenerating the thumbnails. I don't know if I'll remember to use Catalog Sync next time I add photos, and it's a heck of a time hit to have it totally rescan everything.

Quote from: Kirk Baker
I suppose it could warn you that your destination catalog already has items in it and that Catalog Sync would be more appropriate.

If you're not going to simplify the menus by collapsing the 2 functions into one, then yes, that would be useful.


Quote from: JSW
Just spotted a minor issue of cut off text at the bottom of the Catalog Sync dialog bar. (The 'Clear this log when starting a new Sync' text).

Quote from: Kirk Baker
The content is taller than the window.  You can scroll it vertically.  Or resize the window taller.

That's odd, I can't get to that window again. I.e. the 'Sync Catalogs' window that I'm seeing now doesn't look like the one I screenshotted a few moments ago: I'm seeing my file paths on screen here, but they weren't in the screenshot.

I was just observing that it would only take shrinking the Sync status window a tiny bit to comfortably fit everything onscreen the first time the user encounters the window, i.e. a bit tidier and slicker. I've spent too many years working as a software tester, I guess. ;-)