Shallow Thoughts : tags : tagging

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Thu, 08 Jan 2015

Accessing image metadata: storing tags inside the image file

A recent Slashdot discussion on image tagging and organization a while back got me thinking about putting image tags inside each image, in its metadata.

Currently, I use my MetaPho image tagger to update a file named Tags in the same directory as the images I'm tagging. Then I have a script called fotogr that searches for combinations of tags in these Tags files.

That works fine. But I have occasionally wondered if I should also be saving tags inside the images themselves, in case I ever want compatibility with other programs. I decided I should at least figure out how that would work, in case I want to add it to MetaPho.

I thought it would be simple -- add some sort of key in the images's EXIF tags. But no -- EXIF has no provision for tags or keywords. But JPEG (and some other formats) supports lots of tags besides EXIF. Was it one of the XMP tags?

Web searching only increased my confusion; it seems that there is no standard for this, but there have been lots of pseudo-standards over the years. It's not clear what tag most programs read, but my impression is that the most common is the "Keywords" IPTC tag.

Okay. So how would I read or change that from a Python program?

Lots of Python libraries can read EXIF tags, including Python's own PIL library -- I even wrote a few years ago about reading EXIF from PIL. But writing it is another story.

Nearly everybody points to pyexiv2, a fairly mature library that even has a well-written pyexiv2 tutorial. Great! The only problem with it is that the pyexiv2 front page has a big red Deprecation warning saying that it's being replaced by GExiv2. With a link that goes to a nonexistent page; and Debian doesn't seem to have a package for GExiv2, nor could I find a tutorial on it anywhere.

Sigh. I have to say that pyexiv2 sounds like a much better bet for now even if it is supposedly deprecated.

Following the tutorial, I was able to whip up a little proof of concept that can look for an IPTC Keywords tag in an existing image, print out its value, add new tags to it and write it back to the file.

import sys
import pyexiv2

if len(sys.argv) < 2:
    print "Usage:", sys.argv[0], "imagename.jpg [tag ...]"
    sys.exit(1)

metadata = pyexiv2.ImageMetadata(sys.argv[1])
metadata.read()

newkeywords = sys.argv[2:]

keyword_tag = 'Iptc.Application2.Keywords'
if keyword_tag in metadata.iptc_keys:
    tag = metadata[keyword_tag]
    oldkeywords = tag.value
    print "Existing keywords:", oldkeywords
    if not newkeywords:
        sys.exit(0)
    for newkey in newkeywords:
        oldkeywords.append(newkey)
    tag.value = oldkeywords
else:
    print "No IPTC keywords set yet"
    if not newkeywords:
        sys.exit(0)
    metadata[keyword_tag] = pyexiv2.IptcTag(keyword_tag, newkeywords)

tag = metadata[keyword_tag]
print "New keywords:", tag.value

metadata.write()

Does that mean I'm immediately adding it to MetaPho? No. To be honest, I'm not sure I care very much, since I don't have any other software that uses that IPTC field and no other MetaPho user has ever asked for it. But it's nice to know that if I ever have a reason to add it, I can.

Tags: , , ,
[ 10:28 Jan 08, 2015    More photo | permalink to this entry | ]

Thu, 21 Feb 2013

New project: Metapho image tagger

I'm excited about my new project: MetaPho, an image tagger.

It arose out of a discussion on the LinuxChix Techtalk list: photo collection management software. John Sturdy was looking for an efficient way of viewing and tagging large collections of photos. Like me, he likes fast, lightweight, keyboard-driven programs. And like me, he didn't want a database-driven system that ties you forever to one image cataloging program. I put my image tags in plaintext files, named Keywords, so that I can easily write scripts to search or modify them, or user grep, and I can even make quick changes with a text editor.

I shared some tips on how I use my Pho image viewer for tagging images, and it sounded close to what he was looking for. But as we discussed ideas about image tagging, we realized that there were things he wanted to do that pho doesn't do well, things not offered by any other image tagger we've been able to find. While discussing how we might add new tagging functionality to pho, I increasingly had the feeling that I was trying to fit off-road tires onto a Miata -- or insert your own favorite metaphor for "making something do something it wasn't designed to do."

Pho is a great image viewer, but the more I patched it to handle tagging, the uglier and more complicated the code got, and it also got more complex to use.

[metapho screenshot] And really, everything we needed for tagging could be easily done in a Python-GTK application. (Pho is written in C because it does a lot of complicated focus management to deal with how window managers handle window moving and resizing. A tagger wouldn't need any of that.)

I whipped up a demo image viewer in a few hours and showed it to John. We continued the discussion, I made a GitHub repo, and over the next week or so the code grew into an efficient and already surprisingly usable image tagger.

We have big plans for it, like tags organized into categories so we can have lots of tags without cluttering the interface too much. But really, even as it is, it's better than anything I've used before. I've been scanning in lots of photos from old family albums (like this one of my mother and grandmother, and me at 9 months) and it's been great to be able to add and review tags easily.

If you want to check out MetaPho, or contribute to it (either code or user interface design), it lives in my MetaPho repository on GitHub. And I wrote up a quick man page in markdown format: metapho.1.md.

Feedback and contributors welcome!

Tags: , , , , ,
[ 19:31 Feb 21, 2013    More programming | permalink to this entry | ]