Perl Image::ExifTool finds error in IPTC Tag from PhotoMechanic

Phil Harvey:
Thanks Kirk.  I had figured out the meaning of the last 12 bytes of the trailer (length word plus magic number), but I didn't know what information you were storing there.

I have just downloaded the PhotoMechanic trial version and tried it out, and discovered where the conflict is arising.

The interesting thing is that ExifTool will tolerate extra unreferenced data at the end of a CR2 image because this is how the Canon utilities (such as DPP) store editing information.  So the trailer itself isn't the problem.  (I see that PhotoMechanic checks specifically for the Canon trailer and inserts its trailer before that if it exists.  Interesting.  How do you deal with other trailers which may be added by utilities similar to PhotoMechanic?)

The problem is that if IPTC information is edited with PhotoMechanic, a copy of IFD0 is made and the offset at the start of the file (which is normally 0x10) is modified.  Unfortunately, ExifTool (and maybe other utilities) use the first 12 bytes of the image as a "magic number" to recognize CR2 images.  These bytes should be either 49 49 2a 00 10 00 00 00 43 52 02 00 (little endian) or 4d 4d 00 2a 00 00 00 10 43 52 02 00 (big endian).  Otherwise the image isn't recognized as a valid CR2.  Perhaps I could relax my magic number test, but I worry that other utilities may be doing the same thing.
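
For illustration, a minimal Python sketch of the strict 12-byte test described above. This is not ExifTool's actual Perl code; the names `is_cr2_strict`, `CR2_MAGIC_LE` and `CR2_MAGIC_BE` are hypothetical, and the relocated offset value is made up for the example. It shows why a header whose IFD0 offset has been changed from 0x10 fails the comparison even though the "CR\x02\x00" identifier is still present.

```python
import struct

# The two 12-byte signatures quoted in the post above.
CR2_MAGIC_LE = bytes.fromhex("49 49 2a 00 10 00 00 00 43 52 02 00")
CR2_MAGIC_BE = bytes.fromhex("4d 4d 00 2a 00 00 00 10 43 52 02 00")


def is_cr2_strict(header: bytes) -> bool:
    """Accept only an exact match of the first 12 bytes (hypothetical helper)."""
    return header[:12] in (CR2_MAGIC_LE, CR2_MAGIC_BE)


# A little-endian header whose IFD0 offset was patched to point elsewhere.
# The offset 0x00A1B2C4 is an arbitrary value chosen for this example.
relocated_header = b"II*\x00" + struct.pack("<I", 0x00A1B2C4) + b"CR\x02\x00"

print(is_cr2_strict(CR2_MAGIC_LE))      # unmodified header
print(is_cr2_strict(relocated_header))  # header with relocated IFD0
```

Because the offset bytes sit in the middle of the compared range, any relocation of IFD0 changes the "magic number" as far as this kind of test is concerned.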

Also, the old (now unreferenced) IFD0 information which PhotoMechanic attempts to preserve is lost when the image is subsequently edited by ExifTool (and any utility which actually modifies the IFD structure).  So the attempt to preserve this information fails in this case.

As well, since the image is no longer recognized as a CR2 image, the rest of the CR2 identifier, 43 52 02 00 ("CR\x02\x00"), is also lost when ExifTool rewrites the file.


--- Quote from: Kirk Baker on May 25, 2006, 12:51:19 PM ---But when you make the block of IPTC data bigger are you relocating the tiff table and putting your expanded IPTC data in a new location?  Because if you don't you risk damaging the CR2 file in such a way that some programs, including Canon's may not be able to parse the files any longer.

--- End quote ---

This is a valid point.  Any tool which modifies a CR2 image without understanding the structure of the proprietary information runs a risk of damaging the file as you mention.  ExifTool rewrites the image without relocating the main IFD, but significant effort has been put into understanding all of the proprietary information and rewriting it properly.  It has been tested fairly extensively with various utilities (ALL of Canon's utilities, Photoshop, and many other third-party utilities), all without any problems, with one exception:  The built-in Apple OS X raw utilities lose the white balance information when an image is edited with ExifTool.  This bug has been reported, and is more than likely a result of an overly simplistic reading algorithm used by Apple (it is likely that they read the WB information from a fixed offset in the file -- eek!).  In fact, the Apple routines are so fragile they have also been broken in the past by camera firmware updates.  Hopefully Apple will fix this.  But the good news is that your conservative approach of preserving the original IFD structure is fully compatible with Apple's brain-dead algorithm... (am I being too harsh toward Apple? can you tell I'm bitter?...)

- Phil Harvey

Phil Harvey:
OK.  I have implemented a solution to provide compatibility between ExifTool and PhotoMechanic-edited CR2 images:

ExifTool 6.21 (just released from http://owl.phy.queensu.ca/~phil/exiftool/) relaxes the CR2 magic-number test to allow recognition of CR2 images which have been edited by PhotoMechanic.
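
The thread does not say exactly how the test was relaxed, so the following Python sketch is only one plausible form, not ExifTool's code. It assumes the check keeps the TIFF byte-order mark, the value 42, and the "CR\x02\x00" identifier at offset 8, and simply stops comparing the IFD0 offset against 0x10. The function name `is_cr2_relaxed` is hypothetical.

```python
import struct


def is_cr2_relaxed(header: bytes) -> bool:
    """Recognize a CR2 header without requiring the IFD0 offset to be 0x10.

    Assumed relaxation for illustration only.
    """
    if len(header) < 12:
        return False
    if header[:2] == b"II":
        byte_order = "<"   # little endian
    elif header[:2] == b"MM":
        byte_order = ">"   # big endian
    else:
        return False
    # Bytes 2-3 hold the TIFF identifier 42; bytes 4-7 (the IFD0 offset)
    # are deliberately not checked here.
    (tiff_id,) = struct.unpack(byte_order + "H", header[2:4])
    return tiff_id == 42 and header[8:12] == b"CR\x02\x00"


original = bytes.fromhex("49 49 2a 00 10 00 00 00 43 52 02 00")
# Same header with an arbitrary, made-up IFD0 offset.
relocated = b"II*\x00" + struct.pack("<I", 0x00A1B2C4) + b"CR\x02\x00"
plain_tiff = b"II*\x00" + struct.pack("<I", 8) + b"\x00\x00\x00\x00"

for name, hdr in (("original", original), ("relocated", relocated), ("plain TIFF", plain_tiff)):
    print(name, is_cr2_relaxed(hdr))
```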

dennis:

--- Quote from: boardhead on May 25, 2006, 06:41:30 PM ---The problem is that if IPTC information is edited with PhotoMechanic, a copy of IFD0 is made and the offset at the start of the file (which is normally 0x10) is modified.  Unfortunately, ExifTool (and maybe other utilities) use the first 12 bytes of the image as a "magic number" to recognize CR2 images.  These bytes should be either 49 49 2a 00 10 00 00 00 43 52 02 00 (little endian) or 4d 4d 00 2a 00 00 00 10 43 52 02 00 (big endian).  Otherwise the image isn't recognized as a valid CR2.
--- End quote ---

Hi Phil,

There are only two choices to add info to a TIFF-based RAW file such as the Canon CR2 and Nikon NEF.  One is to rewrite the whole file, which is what ExifTool and some camera manufacturers' software like Nikon Capture do.  The other option is to relocate the main IFD, which is what Photo Mechanic does to preserve the location of all RAW information.  Photo Mechanic does this in such a way that it can undo the edits made to the file in case bonehead software fails to parse the TIFF-based format with a relocated IFD.  However, we make no guarantees about being able to undo the edits we made if other software (such as ExifTool) rewrites the file.
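
To make the second option concrete, here is a rough Python sketch of relocating IFD0 in a generic TIFF-style file: a copy of the directory is appended to the end of the file and the 4-byte offset in the header is patched to point at the copy. This is not Photo Mechanic's implementation. The function name `relocate_ifd0` is hypothetical, and the sketch leaves out everything that makes the real edit useful (adding the IPTC entry to the copied directory, writing an undo trailer, handling an existing Canon trailer). It only shows that nothing before the end of the original file has to move except the header offset.

```python
import struct


def relocate_ifd0(data: bytes) -> bytes:
    """Append a copy of IFD0 and point the TIFF header at it (illustrative only)."""
    if data[:2] == b"II":
        bo = "<"
    elif data[:2] == b"MM":
        bo = ">"
    else:
        raise ValueError("not a TIFF-based file")
    tiff_id, old_offset = struct.unpack(bo + "HI", data[2:8])
    if tiff_id != 42:
        raise ValueError("bad TIFF identifier")

    # An IFD is: 2-byte entry count, 12 bytes per entry, 4-byte next-IFD offset.
    (entry_count,) = struct.unpack_from(bo + "H", data, old_offset)
    ifd_length = 2 + 12 * entry_count + 4
    ifd_copy = data[old_offset:old_offset + ifd_length]

    out = bytearray(data)
    if len(out) % 2:
        out.append(0)          # keep the new IFD on an even (word) boundary
    new_offset = len(out)
    out += ifd_copy            # entries hold absolute offsets, so the copy stays valid

    # Patch only the header; the old IFD bytes are left in place, unreferenced.
    struct.pack_into(bo + "I", out, 4, new_offset)
    return bytes(out)
```

Since the old directory is still intact at its original position, restoring the original header offset (and truncating the file) would undo the change, as long as no other software has rewritten the file in the meantime.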

And yes, unfortunately Apple is a prime offender regarding bad parsing of RAW files and we have posted bug reports about this too, which Apple did fix with CR2 files (but they haven't fixed ORF or 1D TIFFs and will need to be careful if they decide to support other TIFF-based RAW formats).  Apple's isn't the only software that uses incompetent parsing (Nikon's component given to Adobe to handle the D2X encrypted white balance is another, which Thomas worked around I believe), but Apple's parsing is particularly bad for whatever reason.  For example, just add one single byte to the END of an E-300 ORF and Apple's OS, including Preview and Aperture, will produce a garbage image.  This doesn't affect a C-8080 ORF, but relocating the IFD does break Apple's OS for ORFs.

This is the problem with TIFF-based files: they don't lend themselves to editing.  I complained to Thomas Knoll about this with DNG and he is the one who suggested relocating the IFD which I was aware of but didn't want to do without a mechanism to undo (which I subsequently invented due to the requests of many users to be able to embed IPTC/XMP into RAW files).  And now we discover that Microsoft's new Windows Media Photo format (WDP) is also based on TIFF!  When will these people learn?  Perhaps they figure it is best to stick with this well-known formatting schema despite its inadequacies, but there are other container formats that would have made a better choice.  I suppose we can all blame Kodak for this since they started using a TIFF-based RAW format back with the DCS 4xx cameras and even gave the files a TIF extension  ::), causing many users to have a heart attack when seeing a tiny grayscale thumbnail when opening the RAW TIF files directly into Photoshop from their $26K camera.  But at least Kodak left enough room in the file so that it was possible to insert TIFF tags without having to relocate everything.  I hate to see what other hordes of bonehead software will break when parsing a modified WDP (hopefully Microsoft will address this issue of updating WDP files before it reaches 1.0).

My concern about rewriting the entire RAW file as ExifTool does is that when new models come out, unless ExifTool is careful to only update based on known models, it can potentially corrupt the RAW photos and I don't think there is a way to undo this (or is this planned for ExifTool?).  It's one thing for the camera manufacturer to rewrite the file since they own the proprietary format, but when modifying proprietary formats one should tread very carefully.

Camera manufacturers that continue to create proprietary RAW files demonstrate their lack of ingenuity or care for end users, not sure which (or both).  And I don't think DNG is the answer either since it allows proprietary data inside of its openly defined format (which is controlled by Adobe).

PS - I thought a CR2 file was recognized as a Canon RAW file by its .CR2 filename extension?  ???  But thanks for making the fix in the new ExifTool to handle the patched IFD offset.  ExifTool is quite amazing at digging out info from the various formats.

Dennis

gregkeene:
As a user of both tools, thanks for working together to bridge the gaps left by the manufacturers. I greatly appreciate it.

Best regards,

Greg

Phil Harvey:
Hi Dennis,


--- Quote from: dennis on May 26, 2006, 10:15:44 AM ---This is the problem with TIFF-based files: they don't lend themselves to editing.  I complained to Thomas Knoll about this with DNG and he is the one who suggested relocating the IFD which I was aware of but didn't want to do without a mechanism to undo (which I subsequently invented due to the requests of many users to be able to embed IPTC/XMP into RAW files). [...]

--- End quote ---

I agree completely.  The TIFF format is very poor.  I have had arguments with DNG supporters about this, but it's a losing battle.  (See my rant about the TIFF format on http://owl.phy.queensu.ca/~phil/exiftool/canon_raw.html.)  I have even gone so far as to start work on defining a file format which could potentially solve these problems (see the MIE format supported by ExifTool), even though it is unlikely that anyone will ever adopt it.


--- Quote from: dennis on May 26, 2006, 10:15:44 AM ---My concern about rewriting the entire RAW file as ExifTool does is that when new models come out, unless ExifTool is careful to only update based on known models, it can potentially corrupt the RAW photos and I don't think there is a way to undo this (or is this planned for ExifTool?).

--- End quote ---

You are correct, and there is no undo planned for ExifTool.  But there is only one way to ensure safety of your images: back up the originals!  Of course I do everything I can to ensure that images aren't corrupted, but the possibility always exists whenever a file is modified, even for minor edits.


--- Quote from: dennis on May 26, 2006, 10:15:44 AM ---PS - I thought a CR2 file was recognized as a Canon RAW file by its .CR2 filename extension?

--- End quote ---

The extension is an unreliable way to determine the file type.  I have frequently seen files with the wrong extension, or files with extensions missing.  ExifTool can still handle these.  Also, ExifTool is designed to be able to parse images in memory, or from an open stream, and in both cases there is no associated file name, and hence no extension.
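
As a small illustration of content-based detection on a stream that has no file name, here is a Python sketch. It is not how ExifTool is implemented; the function name `sniff_type` and the short list of recognized types are assumptions made for the example, and it assumes the stream is seekable.

```python
import io


def sniff_type(stream) -> str:
    """Guess the file type from the leading bytes of a seekable binary stream."""
    start = stream.tell()
    header = stream.read(12)
    stream.seek(start)                      # leave the stream where we found it

    if header[:4] in (b"II*\x00", b"MM\x00*"):
        # TIFF-based; the CR2 identifier sits at offset 8.
        if header[8:12] == b"CR\x02\x00":
            return "CR2"
        return "TIFF"
    if header[:3] == b"\xff\xd8\xff":
        return "JPEG"
    return "unknown"


# An in-memory image has no name and therefore no extension to consult.
in_memory = io.BytesIO(bytes.fromhex("49 49 2a 00 10 00 00 00 43 52 02 00") + b"\x00" * 32)
print(sniff_type(in_memory))
```

The same function works on a file opened in binary mode, whatever its extension says.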

I still have some work to do to test ExifTool on other types of PhotoMechanic-edited RAW images.  So far I have just taken a look at CR2.  You may be getting a note from me to take you up on Kirk's license offer if I am not able to complete all my testing before my demo license expires.

I appreciate how helpful the people at Camera Bits have been.

- Phil
