I had the privilege of chatting over e-mail with Hans Fremuth from Metability Software about image metadata. He was gracious enough not only to answer my questions but also to provide a detailed history of the image metadata standards available today (EXIF, IPTC, XMP). His long experience with file metadata in general, and image metadata in particular, makes him a valuable resource for any serious photographer. His understanding of how modern image metadata standards developed also lends clarity to his vision for managing unstructured data. This article is the first part of the interview.
Who is Hans Fremuth? And what is Metability Software?
My personal journey on the way to ultimate metadata management is published on my blog KnowYourFiles.com and on Twitter as @KnowYourFiles. More about “FileMind”, the first product from Metability Software.
Metability Software is a self-funded startup that focuses solely on file metadata in content-bearing files. Many editing applications have support for some basic metadata editing, but it’s really not very sophisticated. We believe metadata will be the backbone of managing files in the future – and this means it’s time for some better tools to do this. We want to be the ‘Swiss Army Knife’ for all things about file metadata. It’s a long way to go, especially since generally accepted standards are missing. But we are just starting anyway.
Why is creating metadata important?
Question: On your blog you compare files without metadata to cans without labels. You not only talk about image files, or video files, but all files on one’s computer or network. Why is creating metadata for files important?
The simple answer: because it’s about time! Think of it this way: we have been pushing files around based on their file names since the fall of the Soviet Empire. That’s some time ago, really. What was the biggest innovation since then? We can give our files names that are longer than 8 characters, wow!
I am sorry, but files have not arrived in the 21st century yet. Google, blogs, YouTube, Twitter – much has happened. But nothing happened to the way we shuffle our files. We still do it the way we always did.
Files will not go away, either. Not everything will end up ‘in the big cloud’. I would bet that 9 out of 10 readers here now have more files on their local and network drives than they did 5 years ago.
Sticking on some metadata makes files richer and much more valuable. It gives you the power to find the right file – not just a pile of files. The key is to hide all the complex technology and semantics ‘under the hood’ – users shouldn’t be bothered with it at all. That is where most current attempts at a semantic desktop solution fall short: they are great for rocket scientists, and strictly academic.
How is the XMP metadata standard different?
Question: Photographers have become used to multiple metadata standards: EXIF, IPTC and XMP. I understand that EXIF metadata is produced by the digital camera while IPTC metadata is supposed to be edited by the author of the picture. When it comes to XMP, however, I don’t really know how it relates to EXIF and IPTC. Is XMP a superset of EXIF and IPTC? On your blog you refer to IPTC as legacy metadata when compared with XMP. Why is that?
To put it simply: XMP is the ‘New Beetle’, IPTC the ‘olde Bug’.
And here is the long answer: IPTC/IIM has been around for almost 20 years, and it is universally adopted by the entire media industry (photographers, media companies, stock agencies etc.). This is for a good reason: it solved a big problem at the dawn of digital photography.
Remember the old fancy-edged postcard-sized pictures in black and white, showing a family reunion or a sports event? Guess what – they already had metadata on the back: year and season, photographer, and some handwritten notes with names and the location of the picture. In the professional world, newspapers and magazines did this the same way: they simply tacked a sheet of paper to the photograph, containing all the needed information about the shot.
Now fast forward to the digital age: within a period of less than 20 years, hundreds of thousands of newspapers and magazines went electronic. Large rooms lined with filing cabinets got shrunk to just a few hard disks. No more photo paper, no more extra sheet of paper – this process simply wouldn’t work any longer.
It was the IPTC (a nonprofit organization founded by news companies, based in London) that formulated the idea to stuff photo details right into the picture file itself. The whitepaper “Anatomy of a Wire Story”, written in 1989, laid the foundation, and in 1993 the “Information Interchange Model” (IIM) was born.
The IIM quickly became known as the “IPTC” (its correct name is either IIM or IPTC/IIM). It specified fields with descriptive metadata that could be added to TIFF, JPEG and Photoshop files. It was a hit, and a large number of photo applications (Adobe Photoshop, ThumbsPlus, iView etc.) as well as enterprise systems quickly learned how to embed and read IPTC data in files.
At around the same time, the manufacturers of digital imaging equipment ran into a similar dilemma: how can picture-taking conditions be documented? As a result, the EXIF standard appeared in 1995 and evolved to its current version 2.2 in the year 2002. EXIF information is now automatically injected into every picture a digital camera takes.
So EXIF and IPTC/IIM go hand in hand; they don’t really overlap. EXIF contains physical information recorded at the time a picture was taken, such as date, ISO sensitivity, f-stop, exposure time, GPS location etc. IPTC, on the other hand, contains descriptive information such as copyright, caption, creator etc.
Now, getting back to the comparison of old Volkswagen Beetle vs. the ‘New Beetle’:
IPTC as well as EXIF are encoded in “binary blocks” that are not very flexible. The defined fields are limited in size and have problems with foreign characters. Neither standard is well suited to being extended with additional fields. It’s “old school data”, and no surprise: both EXIF and IIM were conceived at a time when HTML or XML were nowhere to be found in a paper dictionary. People wore big glasses and colorful ties, cars had no power windows (nope, especially not the old Beetle) or remote locks. The Google founders were still in high school!
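To make the “binary block” point a little more concrete, here is a rough Python sketch of how a single IIM field might be packed. It is an illustration under stated assumptions, not a reference implementation: the function name `encode_dataset` is made up for this example, the byte limits in `MAX_BYTES` are quoted from memory of the IIM specification and should be checked against it, and Latin-1 is used only as a stand-in for the single-byte encodings that applications commonly wrote.

```python
import struct

# Assumed maximum sizes (in bytes) for a few IIM record 2 datasets.
# Quoted from memory of the IIM specification; verify before relying on them.
MAX_BYTES = {
    (2, 25): 64,     # Keywords
    (2, 80): 32,     # By-line (creator)
    (2, 120): 2000,  # Caption/Abstract
}


def encode_dataset(record: int, dataset: int, text: str,
                   encoding: str = "latin-1") -> bytes:
    """Pack one dataset: marker 0x1C, record, dataset, 2-byte length, data.

    Hypothetical helper for illustration only.
    """
    # Characters outside the chosen encoding raise UnicodeEncodeError here.
    data = text.encode(encoding)
    limit = MAX_BYTES.get((record, dataset))
    if limit is not None and len(data) > limit:
        raise ValueError(f"{len(data)} bytes exceeds the {limit}-byte limit")
    # Big-endian header: three single bytes followed by an unsigned 16-bit length.
    return struct.pack(">BBBH", 0x1C, record, dataset, len(data)) + data


if __name__ == "__main__":
    print(encode_dataset(2, 80, "Hans Fremuth").hex(" "))
```

In this sketch a creator name longer than 32 bytes is rejected outright, and a caption such as "東京" fails to encode at all under the single-byte assumption. Those are the two limitations described above: fixed field sizes and trouble with foreign characters.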