Author Topic: Remove duplicates  (Read 31214 times)

Offline vassing

  • Newcomer
  • *
  • Posts: 36
  • Photojournalist
    • View Profile
    • Morten Vassing│Photojournalist
Remove duplicates
« on: January 15, 2007, 12:56:38 PM »
I have cleaned house and saved all my digital files from various disks to one hard drive (plus backup), but now the question of duplicate removal arises. When I rename the files with Ctrl+M using the {datesort}--Subject--{timesortlong} variables in Photo Mechanic, all duplicates get a capital letter (A, B, C, D, etc.) at the end of the file name, as I have chosen in Preferences. Then I go to the appropriate folder in Windows XP, select all with Ctrl+A, manually deselect every file name without a capital letter at the end, and delete the rest. Works like a charm, but it is darn time- and labor-intensive. Surely there must be another way  :o.

Best,

Morten Vassing
Morten Vassing│Photojournalist - COPENHAGEN│Denmark - Nikon D3│MacBook Pro │ Mac OS X (10.8)│PM5

Offline Kirk Baker

  • Senior Software Engineer
  • Camera Bits Staff
  • Superhero Member
  • *****
  • Posts: 25020
    • View Profile
    • Camera Bits, Inc.
Re: Remove duplicates
« Reply #1 on: January 15, 2007, 06:07:22 PM »
Morten,

If you use that renaming string, there is no guarantee that the files with a letter suffix are actually duplicates.  If you shot a burst of images, then {timesortlong} can give several different exposures the exact same time.  On the 1D/1D Mark II/1D Mark II-N you could have up to eight images with the same timestamp.

You may find that using something like {frame} in your renaming string will eliminate the collisions.

HTH,

-Kirk
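
For anyone who wants to check whether the lettered files really are duplicates before deleting them, here is a minimal sketch in Python. It is not a Photo Mechanic feature; the function names are made up for illustration. It assumes that "duplicate" means byte-for-byte identical, so copies whose metadata was edited separately will not match.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical_files(folder: Path) -> list[list[Path]]:
    """Group files under `folder` whose contents are byte-for-byte identical."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Only groups with more than one member are true duplicates.
    return [group for group in by_digest.values() if len(group) > 1]
```

Calling find_identical_files(Path("D:/Archive")) (a hypothetical folder) only reports the groups; nothing is deleted, so you can review the list first.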

Offline vassing

Re: Remove duplicates
« Reply #2 on: January 16, 2007, 06:44:35 AM »
Kirk,

Does the same apply to NEFs from Nikon models D2H and D2xs?

Best,

Morten Vassing

Offline Kirk Baker

Re: Remove duplicates
« Reply #3 on: January 16, 2007, 07:08:33 AM »
Morten,

As long as you Ingest the images while they are still named XYZ_1234.ext, where XYZ is the first three characters (DSC, IMG, _MG, your initials, etc.) and 'ext' is a file type that we support, the frame number will be tucked away in the Photo Mechanic image preferences and will always be available for renaming purposes.  Moreover, some cameras write the frame number inside the file, and we can get it from there even if the photo has never been ingested via Photo Mechanic.  And some cameras support the {shutter} variable, which is generally a unique number as well.

My advice would be to experiment with some photos.  You can always put the variables into your Info Text (Edit->Set Info Text...) and then turn on the Tooltips from the Image menu.  Then when you hover over images you can see the values of those variables.  If they look like they have valid values, then go ahead and use them in your renaming.

HTH,

-Kirk

Offline thermarest

  • Newcomer
  • *
  • Posts: 13
    • View Profile
Re: Remove duplicates
« Reply #4 on: February 06, 2007, 05:38:35 AM »
Wow...I've been using PM for a while and did not know about the frame variable. I had actually given up using PM for renaming because I wanted to retain the frame number. I've been using iView, which has a search and replace function in renaming, and just replacing the IMG. But that's a pain since I have to catalog images with iView first. Great!

Offline vAfotoriporter

  • Uber Member
  • ******
  • Posts: 1046
    • View Profile
    • Attila Volgyi photojournalist
Re: Remove duplicates
« Reply #5 on: April 16, 2007, 03:44:49 PM »
My 1D mkII sometimes gives me repeated file numbers even when set to continuous numbering. I think it's because I reuse a card that was not formatted after download. This usually happens after the counter rolls over from 9999 to 0001. Usually I simply sort the files by time and rename them using sequential numbers from 0001, but that isn't the best solution; I would prefer to keep the original frame numbers, but those conflict.

My filenames are {year2}{month0}{day0}{frame4}a.ext, where a marks the 1D mkII body and ext is of course the extension. Any ideas for a solution?
Working on Mac, OSX, iOS and with some Canons.
Always shooting RAW.

http://www.volgyiattila.hu

Offline Kirk Baker

Re: Remove duplicates
« Reply #6 on: April 16, 2007, 03:50:52 PM »
We always recommend reformatting the card in the camera each time you have finished Ingesting the files from the card.

That should solve the problem you described.

-Kirk

Offline vassing

Re: Remove duplicates
« Reply #7 on: December 13, 2007, 04:01:09 AM »
Kirk,

I set my Nikon D3 to 11 fps with a Nikkor DX-format lens mounted and shot several hundred files, with appropriate pauses for the buffer to catch up. I can now attest that even at 11 frames per second the {timesortlong} variable names each file with a unique time stamp. A one-time rename therefore gives each file a unique identifier for good, which makes life so much easier for me and my clients when I have to go back into the archive and find a specific file.

BTW, for anyone else interested in a foolproof DAM naming scheme, I suggest something like this, which Rob Galbraith recommended to me during one of his inspiring digital workshops on workflow for the working photographer:

VAS-{datesort}-Subject-{timesortlong}

which will result in a unique file name like VAS-20071212-Train-10562386, where VAS is your short personal identifier, followed by year month day, Subject, and hour minute second split second.
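
To make the layout of that name concrete, here is a rough Python sketch that builds the same string from a capture time. It only imitates the result; it is not how Photo Mechanic does it. The function name is hypothetical, and it assumes the last two digits of {timesortlong} are hundredths of a second, as the example name suggests.

```python
from datetime import datetime


def archive_name(initials: str, subject: str, shot: datetime) -> str:
    """Build a name shaped like VAS-{datesort}-Subject-{timesortlong}."""
    datesort = shot.strftime("%Y%m%d")       # year month day
    hundredths = shot.microsecond // 10_000  # assumed: two sub-second digits
    timesortlong = f"{shot:%H%M%S}{hundredths:02d}"
    return f"{initials}-{datesort}-{subject}-{timesortlong}"


# A shot taken on 12 December 2007 at 10:56:23.86
print(archive_name("VAS", "Train", datetime(2007, 12, 12, 10, 56, 23, 860000)))
# prints VAS-20071212-Train-10562386
```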

Obviously, I don't know if this will work with Canon cameras, as Kirk pointed out previously that it might not.

Happy shooting, everyone.

And happy holidays to each and every one of you Photo Mechanic die hard fans...

Cheers,

Morten Vassing

Offline FVlcek

  • Sr. Member
  • ****
  • Posts: 467
    • View Profile
Re: Remove duplicates
« Reply #8 on: December 14, 2007, 02:33:37 AM »
There is also the {subsecond} variable, which is honoured at least on Nikon cameras. It records 1/10ths of a second, but that's not good enough for the D3's 11 fps :)

Frantisek

Offline Hayo Baan

  • Uber Member
  • ******
  • Posts: 2552
  • Professional Photographer & Software Developer
    • View Profile
    • Hayo Baan - Photography
Re: Remove duplicates
« Reply #9 on: December 14, 2007, 03:22:19 PM »
Frantisek, actually the subsecond is 1/100th of a second.  Enough even for 11 fps :)
Hayo Baan - Photography
Web: www.hayobaan.nl
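
The arithmetic behind the 1/10th versus 1/100th point can be checked with a small Python sketch. It assumes evenly spaced frames and that the camera truncates the time rather than rounding it; both are simplifications, and the function name is made up for illustration.

```python
def distinct_stamps(fps: int, ticks_per_second: int, frames: int) -> int:
    """Count distinct time stamps for `frames` evenly spaced exposures."""
    # Frame k is taken at k/fps seconds; integer maths avoids float rounding.
    stamps = {(k * ticks_per_second) // fps for k in range(frames)}
    return len(stamps)


# One second of shooting at 11 fps (11 frames)
print(distinct_stamps(11, 100, 11))  # hundredths: prints 11, every frame unique
print(distinct_stamps(11, 10, 11))   # tenths: prints 10, two frames collide
```

At 11 fps the frames are about 0.09 s apart, which is more than one hundredth but less than one tenth of a second, hence the collision at the coarser resolution.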

Offline janeenadamsmartin

  • Newcomer
  • *
  • Posts: 38
    • View Profile
Re: Remove duplicates
« Reply #10 on: January 04, 2008, 04:09:17 PM »
Do the "frame" and "subsecond" variables work for all cameras?  I have some shots with an SD300 and a Nikon D70 where neither appears.  I can understand that maybe "subsecond" wouldn't because the camera can't shoot that fast, but should the "frame" variable work for all cameras?

jan

Offline Kirk Baker

Re: Remove duplicates
« Reply #11 on: January 04, 2008, 06:42:59 PM »
Jan,

Some cameras do not create a "frame" EXIF tag.  In that case Photo Mechanic infers the frame number from the filename: straight from the camera, it is the last four digits of the filename.  If the file has been renamed before Photo Mechanic has done anything with it, then the frame number can no longer be inferred.
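
As an illustration only, the kind of inference described here could look like the following Python sketch. This is not Photo Mechanic's actual code; the function name and the exact pattern (three characters, an underscore, four digits, as in XYZ_1234.ext) are assumptions.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed camera-style name: three characters, an underscore, four digits,
# e.g. DSC_1234.NEF, IMG_0042.CR2 or _MG_0042.CR2.
CAMERA_STEM = re.compile(r".{3}_(\d{4})")


def infer_frame_number(filename: str) -> Optional[int]:
    """Return the frame number if the name still looks camera-generated."""
    match = CAMERA_STEM.fullmatch(Path(filename).stem)
    return int(match.group(1)) if match else None


print(infer_frame_number("DSC_1234.NEF"))                     # prints 1234
print(infer_frame_number("VAS-20071212-Train-10562386.NEF"))  # prints None
```

The second call shows why a file renamed before ingest loses its frame number: nothing in the new name marks which digits, if any, were the frame.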

-Kirk


Offline janeenadamsmartin

Re: Remove duplicates
« Reply #12 on: January 04, 2008, 10:44:00 PM »
Thanks, Kirk.

It looks like user error.  I can find the frame number on shots taken after March 2005, which is about the time I got Photo Mechanic.  I must have been fooling around and renaming things prior to ingesting them and then decided on a method in March.

Thanks for your always prompt responses,
jan

Offline JAS Photo

  • Sr. Member
  • ****
  • Posts: 394
    • View Profile
    • JAS Photo LLC
Re: Remove duplicates
« Reply #13 on: January 05, 2008, 09:08:33 AM »
I shoot RAW + JPG and usually print from the JPG if it's good enough. I also save my PS work as JPGs to upload to the printer. When I'm finished with a job I would like to delete all the original JPGs but keep the fixed ones.
Does anyone have any idea how to delete the original JPGs (I'll keep the RAWs) but leave the worked-on JPGs?
I'm working in Windows XP.
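
One possible approach outside Photo Mechanic, sketched in Python: treat a JPG as a camera original when a RAW file with exactly the same base name sits in the same folder. This only works if the worked-on JPGs are saved under a different name (for example with "-fixed" added), which is an assumption; the extension list and function name are also just examples.

```python
from pathlib import Path

RAW_EXTENSIONS = {".nef", ".cr2", ".crw", ".dng"}  # assumed list, adjust to taste
JPEG_EXTENSIONS = {".jpg", ".jpeg"}


def camera_jpegs(folder: Path) -> list[Path]:
    """List JPGs that share a base name with a RAW file in the same folder."""
    files = [p for p in folder.iterdir() if p.is_file()]
    raw_stems = {p.stem.lower() for p in files
                 if p.suffix.lower() in RAW_EXTENSIONS}
    # Dry run: the matching JPGs are only returned, never deleted here.
    return sorted(p for p in files
                  if p.suffix.lower() in JPEG_EXTENSIONS
                  and p.stem.lower() in raw_stems)
```

Review the returned list before deleting anything; a worked-on JPG saved over the camera JPG under the same name would be listed too.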

Also, I've found the long date/time stamp too long for clients. Now I use the frame number in front, plus my ID and the date, so the client just needs to give me the first 4 digits. The name becomes XXXX-JAS20080105. If/when I start to shoot with 2 bodies more, I may switch to an 8-digit time using hours (HH), minutes (MM), and seconds including hundredths (SSSS), followed by -JAS20080105.
Joe Sorrentino
JAS Photo LLC
http://www.JASPhoto.com

Offline Pavel

  • Member
  • **
  • Posts: 89
    • View Profile
Re: Remove duplicates
« Reply #14 on: January 05, 2008, 11:06:11 AM »
I use this as a naming scheme:
{year2}{month0}{day0}-{hour24}{minute}{second}
and have been very happy with it because it lets me identify the event very easily by finding the {year2}{month0} folder and then the individual shot by {hour24}{minute}{second}. The day is there for sorting, and the A, B, C, etc. suffixes have worked great for me. I also like it because it is short enough to display under the thumbnails and easy enough to memorize at a glance.
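
For illustration, here is how a scheme like this behaves when two shots fall in the same second, as a Python sketch. The suffix rule (first file keeps the bare name, later ones get A, B, C) is an assumption based on what was described earlier in the thread, not Photo Mechanic's documented behaviour, and the function name is made up.

```python
from datetime import datetime
from string import ascii_uppercase


def assign_names(shots: list[datetime]) -> list[str]:
    """Name shots as YYMMDD-HHMMSS, adding A, B, C... when a name repeats."""
    used: dict[str, int] = {}
    names = []
    for shot in shots:
        base = shot.strftime("%y%m%d-%H%M%S")
        seen = used.get(base, 0)
        # The first file keeps the bare name; this sketch supports up to
        # 26 further files per second (A to Z).
        suffix = "" if seen == 0 else ascii_uppercase[seen - 1]
        used[base] = seen + 1
        names.append(base + suffix)
    return names


burst = [datetime(2008, 1, 5, 11, 6, 11),
         datetime(2008, 1, 5, 11, 6, 11),
         datetime(2008, 1, 5, 11, 6, 12)]
print(assign_names(burst))
# prints ['080105-110611', '080105-110611A', '080105-110612']
```

The two files sharing 110611 are different exposures from one burst, which is why a letter suffix alone does not prove a file is a duplicate.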

I too, however, have been wondering how to remove duplicates in a safe manner and have left it be for two years for fear of deleting the only originals, so I am very happy to hear about this {frame} variable, which I had not known how to use.  Thank you Kirk.

The idea of the subsecond variable is interesting as well.  Any thoughts from anyone on which of these two ideas may be better, and whether there may even be a third way?  I don't like the idea of using anything that makes the name too long, but I have several rename strings saved under different names and have no objection to renaming a folder for duplicate deletion and then renaming it back for storage.  It only takes a few seconds after all.  What a great program.  I can honestly say that it is the most valuable photographic program on my machine.