Topic: Perl Image::ExifTool finds error in IPTC Tag from PhotoMechanic

Offline gregkeene

  • Newcomer
  • Posts: 9
In addition to using Photo Mechanic as part of my workflow, I have some custom software written in Perl which loads IPTC tags and writes out some additional tags. The module I'm using is Image::ExifTool by Phil Harvey (http://www.sno.phy.queensu.ca/~phil/exiftool/).

What I found is that when I try to write the file, I get this error:
Error: [minor] 2064 unreferenced bytes at end of file not copied

I contacted Phil and he looked at the file and stated that '... the software is adding a non-standard trailer containing what looks to be an empty IPTC structure'. 

If I use my same custom software on native 30D .CR2 images, I don't get the error. It's only after Photo Mechanic edits the IPTC tags. I'd be interested in your thoughts and happy to help pursue the issue.

I'm using 4.4.3 on Mac OS X (10.4.6 - Intel).

Thanks,

Greg

Offline Kirk Baker

  • Senior Software Engineer
  • Camera Bits Staff
  • Posts: 25020
Re: Perl Image::ExifTool finds error in IPTC Tag from PhotoMechanic
« Reply #1 on: May 25, 2006, 12:37:32 PM »
While the data you are referring to is formatted like IPTC data, it actually contains Photo Mechanic's image preferences.  The last 12 bytes of the file are a backward link to the image preferences, which are stored in a fixed-size 2048-byte block. (2048 + 12 = 2060, which accounts for all but 4 of the 2064 bytes called out in the warning.)
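For anyone who wants to look at that trailer, here is a minimal Python sketch of splitting it off. It assumes the 2048-byte block sits immediately before the 12-byte link at the very end of the file, which this thread does not confirm, and it does not check that a Photo Mechanic trailer is actually present. The function and constant names are hypothetical.

```python
# Sizes quoted in the post above; the layout (block directly before link) is an assumption.
PREFS_BLOCK_SIZE = 2048
LINK_SIZE = 12


def split_trailer(data: bytes):
    """Split a file into (image, prefs_block, link), assuming the trailer is at the end."""
    trailer_size = PREFS_BLOCK_SIZE + LINK_SIZE
    if len(data) < trailer_size:
        raise ValueError("file is too short to hold the assumed trailer")
    link = data[-LINK_SIZE:]                 # last 12 bytes: the backward link
    prefs = data[-trailer_size:-LINK_SIZE]   # the 2048 bytes before the link
    image = data[:-trailer_size]             # everything else
    return image, prefs, link
```

Dropping `prefs` and `link` and writing back only `image` would correspond to choosing not to preserve the preferences.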

You can choose to not preserve that data, but if you do decide to discard it, then you will lose the Color Class, Tag, Rotation, Crop, and in some cases the frame number.  But no IPTC caption data will be lost.

What data are you adding to the CR2 files?  Be very careful.  We only decided to implement IPTC/XMP captioning in CR2 files (and other RAW files) when we were able to come up with a way to completely revert any changes that we had added.  Only then did we feel safe in making the feature available in Photo Mechanic.

HTH,

-Kirk

Offline gregkeene

« Reply #2 on: May 25, 2006, 12:44:41 PM »
Kirk:

Thanks for the info. I'm assuming the EXIF rotation will stay, just not yours, correct?

FYI, the data I'm trying to add are values for keywords and captions. Because those values are being pulled from a database, I can't use Photo Mechanic to do it. My workflow is to set the Object Name to the SKU# using Photo Mechanic, then all the keywords and caption are added using my software (again, thousands of photos and SKUs). So, I'm not adding any new fields, just values to existing fields.

Phil may download your software to see if there are changes to make to support the values.

Greg

Offline Kirk Baker

« Reply #3 on: May 25, 2006, 12:51:19 PM »
Quote from: gregkeene
Thanks for the info. I'm assuming the EXIF rotation will stay, just not yours, correct?

That's correct.

Quote from: gregkeene
FYI, the data I'm trying to add are values for keywords and captions. Because those values are being pulled from a database, I can't use Photo Mechanic to do it. My workflow is to set the Object Name to the SKU# using Photo Mechanic, then all the keywords and caption are added using my software (again, thousands of photos and SKUs). So, I'm not adding any new fields, just values to existing fields.

But when you make the block of IPTC data bigger, are you relocating the TIFF table and putting your expanded IPTC data in a new location?  Because if you don't, you risk damaging the CR2 file in such a way that some programs, including Canon's, may not be able to parse the files any longer.

Quote from: gregkeene
Phil may download your software to see if there are changes to make to support the values.

I'd be happy to give him a license if he wants one.

-Kirk

« Last Edit: May 25, 2006, 01:10:13 PM by Kirk Baker »

Offline gregkeene

« Reply #4 on: May 25, 2006, 12:58:51 PM »
I've told Phil about your generous offer.

Hopefully he's correctly relocating the TIFF table to keep it compatible. I pointed him to this thread so he'll hopefully respond.

Also, if you are ever looking for beta testers, etc., I have experience there and would be happy to help do that in the future. I can test on both Windows and Mac (Intel only).

Greg

Offline Phil Harvey

  • Newcomer
  • Posts: 12
« Reply #5 on: May 25, 2006, 06:41:30 PM »
Thanks Kirk.  I had figured out the meaning of the last 12 bytes of the trailer (length word plus magic number), but I didn't know what information you were storing there.

I have just downloaded the PhotoMechanic trial version and tried it out, and discovered where the conflict is arising.

The interesting thing is that ExifTool will tolerate extra unreferenced data at the end of a CR2 image because this is how the Canon utilities (such as DPP) store editing information.  So the trailer itself isn't the problem.  (I see that PhotoMechanic checks specifically for the Canon trailer and inserts its trailer before that if it exists.  Interesting.  How do you deal with other trailers which may be added by utilities similar to PhotoMechanic?)

The problem is that if IPTC information is edited with PhotoMechanic, a copy of IFD0 is made and the offset at the start of the file (which is normally 0x10) is modified.  Unfortunately, ExifTool (and maybe other utilities) uses the first 12 bytes of the image as a "magic number" to recognize CR2 images.  These bytes should be either 49 49 2a 00 10 00 00 00 43 52 02 00 (little endian) or 4d 4d 00 2a 00 00 00 10 43 52 02 00 (big endian).  Otherwise the image isn't recognized as a valid CR2.  Perhaps I could relax my magic number test, but I worry that other utilities may be doing the same thing.
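To make the difference concrete, here is a small Python sketch of a strict 12-byte test next to one possible relaxed test. The byte values are the ones quoted above; the function names are hypothetical, and the relaxed version is only an illustration of the idea (accept any IFD0 offset, still require the TIFF header and the "CR\x02\x00" identifier), not a description of how ExifTool itself does it.

```python
# The two 12-byte signatures quoted in the post above.
CR2_MAGIC_LE = bytes.fromhex("49492a001000000043520200")
CR2_MAGIC_BE = bytes.fromhex("4d4d002a0000001043520200")


def is_cr2_strict(header: bytes) -> bool:
    """Strict test: all 12 bytes must match, including the 0x10 IFD0 offset."""
    return header[:12] in (CR2_MAGIC_LE, CR2_MAGIC_BE)


def is_cr2_relaxed(header: bytes) -> bool:
    """Relaxed test (assumed design): ignore the IFD0 offset in bytes 4-7."""
    if len(header) < 12:
        return False
    tiff_ok = header[:4] in (b"II*\x00", b"MM\x00*")
    return tiff_ok and header[8:12] == b"CR\x02\x00"
```

A header whose offset has been changed from 0x10 to some other value fails the strict test but passes the relaxed one.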

Also, the old (now unreferenced) IFD0 information which PhotoMechanic attempts to preserve is lost when the image is subsequently edited by ExifTool (and any utility which actually modifies the IFD structure).  So the attempt to preserve this information fails in this case.

As well, since the image is no longer recognized as a CR2 image, the rest of the CR2 identifier 43 52 02 00 ("CR\x02\x00") is also lost.

Quote from: Kirk Baker
But when you make the block of IPTC data bigger, are you relocating the TIFF table and putting your expanded IPTC data in a new location?  Because if you don't, you risk damaging the CR2 file in such a way that some programs, including Canon's, may not be able to parse the files any longer.

This is a valid point.  Any tool which modifies a CR2 image without understanding the structure of the proprietary information runs a risk of damaging the file as you mention.  ExifTool rewrites the image without relocating the main IFD, but significant effort has been put into understanding all of the proprietary information and rewriting it properly.  It has been tested fairly extensively with various utilities (ALL of Canon's utilities, Photoshop, and many other third-party utilities), all without any problems, with one exception:  The built-in Apple OS X raw utilities lose the white balance information when an image is edited with ExifTool.  This bug has been reported, and is more than likely a result of an overly simplistic reading algorithm used by Apple (it is likely that they read the WB information from a fixed offset in the file -- eek!).  In fact, the Apple routines are so fragile they have also been broken in the past by camera firmware updates.  Hopefully Apple will fix this.  But the good news is that your conservative approach of preserving the original IFD structure is fully compatible with Apple's brain-dead algorithm... (am I being too harsh toward Apple? can you tell I'm bitter?...)

- Phil Harvey
« Last Edit: May 26, 2006, 05:57:30 AM by boardhead »

Offline Phil Harvey

« Reply #6 on: May 26, 2006, 07:18:57 AM »
OK.  I have implemented a solution to provide compatibility between ExifTool and PhotoMechanic-edited CR2 images:

ExifTool 6.21 (just released from http://owl.phy.queensu.ca/~phil/exiftool/) relaxes the CR2 magic-number test to allow recognition of CR2 images which have been edited by PhotoMechanic.

Offline dennis

  • President
  • Camera Bits Staff
  • Posts: 469
« Reply #7 on: May 26, 2006, 10:15:44 AM »
Quote from: Phil Harvey
The problem is that if IPTC information is edited with PhotoMechanic, a copy of IFD0 is made and the offset at the start of the file (which is normally 0x10) is modified.  Unfortunately, ExifTool (and maybe other utilities) uses the first 12 bytes of the image as a "magic number" to recognize CR2 images.  These bytes should be either 49 49 2a 00 10 00 00 00 43 52 02 00 (little endian) or 4d 4d 00 2a 00 00 00 10 43 52 02 00 (big endian).  Otherwise the image isn't recognized as a valid CR2.

Hi Phil,

There are only two choices to add info to a TIFF-based RAW file such as the Canon CR2 and Nikon NEF.  One is to rewrite the whole file, which is what ExifTool and some camera manufacturers' software like Nikon Capture do.  The other option is to relocate the main IFD, which is what Photo Mechanic does to preserve the location of all RAW information.  Photo Mechanic does this in such a way that it can undo the edits made to the file in case bonehead software fails to parse the TIFF-based format with a relocated IFD.  However, we make no guarantees about being able to undo the edits we made if other software (such as ExifTool) rewrites the file.
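As a rough illustration of the second option, the Python sketch below copies IFD0 to the end of a TIFF-style buffer and patches the offset in bytes 4-7 of the header, leaving every original byte after the header where it was. It is a simplified sketch under stated assumptions, not Photo Mechanic's actual implementation: the function names are hypothetical, it copies the IFD unchanged rather than adding an IPTC entry, and it ignores trailers and any vendor-specific structure.

```python
import struct


def read_ifd0(data: bytes):
    """Return (endian, ifd0_offset, raw_ifd0_bytes) for a TIFF-style buffer."""
    if data[:2] == b"II":
        endian = "<"   # little endian
    elif data[:2] == b"MM":
        endian = ">"   # big endian
    else:
        raise ValueError("not a TIFF-based file")
    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("bad TIFF magic")
    (count,) = struct.unpack(endian + "H", data[offset:offset + 2])
    size = 2 + 12 * count + 4  # entry count + 12-byte entries + next-IFD pointer
    return endian, offset, data[offset:offset + size]


def relocate_ifd0(data: bytes) -> bytes:
    """Append a copy of IFD0 at the end of the file and point the header at it."""
    endian, _, ifd = read_ifd0(data)
    out = bytearray(data)
    if len(out) % 2:
        out.append(0)  # keep the new IFD on an even (word) boundary
    new_offset = len(out)
    out += ifd  # entries hold absolute offsets, so a verbatim copy stays valid
    out[4:8] = struct.pack(endian + "I", new_offset)
    return bytes(out)
```

Because the original IFD0 and all data it points to stay in place, writing the old offset back into bytes 4-7 and truncating to the original length would reverse this sketch. How Photo Mechanic records its own undo information is not described in this thread.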

And yes, unfortunately Apple is a prime offender regarding bad parsing of RAW files and we have posted bug reports about this too, which Apple did fix with CR2 files (but they haven't fixed ORF or 1D TIFFs and will need to be careful if they decide to support other TIFF-based RAW formats).  Apple isn't the only vendor whose software uses incompetent parsing (Nikon's component given to Adobe to handle the D2X encrypted white balance is another, which Thomas worked around, I believe), but Apple's parsing is particularly bad for whatever reason.  For example, just add one single byte to the END of an E-300 ORF and Apple's OS, including Preview and Aperture, will produce a garbage image.  This doesn't affect a C-8080 ORF, but relocating the IFD does break Apple's OS for ORFs.

This is the problem with TIFF-based files: they don't lend themselves to editing.  I complained to Thomas Knoll about this with DNG and he is the one who suggested relocating the IFD which I was aware of but didn't want to do without a mechanism to undo (which I subsequently invented due to the requests of many users to be able to embed IPTC/XMP into RAW files).  And now we discover that Microsoft's new Windows Media Photo format (WDP) is also based on TIFF!  When will these people learn?  Perhaps they figure it is best to stick with this well-known formatting schema despite its inadequacies, but there are other container formats that would have made a better choice.  I suppose we can all blame Kodak for this since they started using a TIFF-based RAW format back with the DCS 4xx cameras and even gave the files a TIF extension  ::), causing many users to have a heart attack when seeing a tiny grayscale thumbnail when opening the RAW TIF files directly into Photoshop from their $26K camera.  But at least Kodak left enough room in the file so that it was possible to insert TIFF tags without having to relocate everything.  I hate to see what other hordes of bonehead software will break when parsing a modified WDP (hopefully Microsoft will address this issue of updating WDP files before it reaches 1.0).

My concern about rewriting the entire RAW file as ExifTool does is that when new models come out, unless ExifTool is careful to only update based on known models, it can potentially corrupt the RAW photos, and I don't think there is a way to undo this (or is this planned for ExifTool?).  It's one thing for the camera manufacturer to rewrite the file since they own the proprietary format, but when modifying proprietary formats one should tread very carefully.

Camera manufacturers that continue to create proprietary RAW files demonstrate their lack of ingenuity or care for end users, not sure which (or both).  And I don't think DNG is the answer either since it allows proprietary data inside of its openly defined format (which is controlled by Adobe).

PS - I thought a CR2 file was recognized as a Canon RAW file by its .CR2 filename extension?  ???  But thanks for making the fix in the new ExifTool to handle the patched IFD offset.  ExifTool is quite amazing at digging out info from the various formats.

Dennis

Offline gregkeene

« Reply #8 on: May 26, 2006, 10:23:38 AM »
As a user of both tools, thanks for working together to bridge the gaps left by the manufacturers. I greatly appreciate it.

Best regards,

Greg

Offline Phil Harvey

« Reply #9 on: May 26, 2006, 11:05:37 AM »
Hi Dennis,

Quote from: dennis
This is the problem with TIFF-based files: they don't lend themselves to editing.  I complained to Thomas Knoll about this with DNG and he is the one who suggested relocating the IFD which I was aware of but didn't want to do without a mechanism to undo (which I subsequently invented due to the requests of many users to be able to embed IPTC/XMP into RAW files). [...]

I agree completely.  The TIFF format is very poor.  I have had arguments with DNG supporters about this, but it's a losing battle.  (See my rant about the TIFF format on http://owl.phy.queensu.ca/~phil/exiftool/canon_raw.html.)  I have even gone so far as to start work on defining a file format which could potentially solve these problems (see the MIE format supported by ExifTool), even though it is unlikely that anyone will ever adopt it.

Quote from: dennis
My concern about rewriting the entire RAW file as ExifTool does is that when new models come out, unless ExifTool is careful to only update based on known models, it can potentially corrupt the RAW photos, and I don't think there is a way to undo this (or is this planned for ExifTool?).

You are correct, and there is no undo planned for ExifTool.  But there is only one way to ensure safety of your images: back up the originals!  Of course I do everything I can to ensure that images aren't corrupted, but the possibility always exists whenever a file is modified, even for minor edits.
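For a script that edits thousands of files, the backup step can be automated. The Python sketch below copies a file next to itself before any modification and refuses to overwrite an earlier backup. The function name and the "_original" suffix are arbitrary choices for this sketch.

```python
import shutil
from pathlib import Path


def backup_original(path, suffix="_original"):
    """Copy `path` to `path + suffix` and return the backup path.

    Raises FileExistsError rather than overwrite an existing backup,
    so the first (untouched) copy is never lost.
    """
    src = Path(path)
    dst = src.with_name(src.name + suffix)
    if dst.exists():
        raise FileExistsError(f"backup already exists: {dst}")
    shutil.copy2(src, dst)  # copy2 also preserves timestamps where possible
    return dst
```

Call it once per image before writing any tags, and restore from the backup if the edited file turns out to be unreadable.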

Quote from: dennis
PS - I thought a CR2 file was recognized as a Canon RAW file by its .CR2 filename extension?

The extension is an unreliable way to determine the file type.  I have frequently seen files with the wrong extension, or files with extensions missing.  ExifTool can still handle these.  Also, ExifTool is designed to be able to parse images in memory, or from an open stream, and in both cases there is no associated file name, and hence no extension.
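Here is a short Python sketch of content-based detection on an open stream with no file name involved. It only knows three cases and uses a relaxed CR2 check (TIFF header plus the "CR\x02\x00" identifier at bytes 8-11), which is an assumption for illustration and not ExifTool's actual logic. The function name is hypothetical, and the stream must be seekable.

```python
import io


def sniff_image_type(stream) -> str:
    """Guess the file type from the first 12 bytes of a seekable binary stream."""
    start = stream.tell()
    head = stream.read(12)
    stream.seek(start)  # leave the stream where we found it
    if head[:3] == b"\xff\xd8\xff":
        return "JPEG"
    if head[:4] in (b"II*\x00", b"MM\x00*"):
        if head[8:12] == b"CR\x02\x00":
            return "CR2"
        return "TIFF"
    return "unknown"
```

The same function works on an in-memory image, for example `sniff_image_type(io.BytesIO(image_bytes))`.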

I still have some work to do to test ExifTool on other types of PhotoMechanic-edited RAW images.  So far I have just taken a look at CR2.  You may be getting a note from me to take you up on Kirk's license offer if I am not able to complete all my testing before my demo license expires.

I appreciate how helpful the people at Camera Bits have been.

- Phil