The Descriptive Camera works a lot like a regular camera: point it at a subject and press the shutter button to capture the scene. However, instead of producing an image, this prototype outputs a text description of the scene. Modern digital cameras capture gobs of parsable metadata about photos, such as the camera's settings, the location of the photo, and the date and time, but they don't output any information about the content of the photo. The Descriptive Camera outputs only the metadata about the content.
As we amass an incredible number of photos, it becomes increasingly difficult to manage our collections. Imagine if descriptive metadata about each photo could be appended to the image on the fly: information about who is in each photo, what they're doing, and their environment could become incredibly useful for searching, filtering, and cross-referencing our photo collections. Of course, we don't yet have the technology that makes this a practical proposition, but the Descriptive Camera explores these possibilities.
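To make the search-and-filter idea concrete, here is a minimal Python sketch of what querying a collection by descriptive metadata might look like. It is only an illustration, not how the Descriptive Camera itself works: the `Photo` record, its field names, the `search` helper, and the sample file names, dates, and descriptions are all hypothetical and invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Photo:
    filename: str      # hypothetical file name
    taken_on: str      # capture date, the kind of metadata cameras already record
    description: str   # hypothetical free-text description of the scene


def search(photos, *terms):
    """Return photos whose description contains every term (case-insensitive)."""
    wanted = [t.lower() for t in terms]
    return [p for p in photos if all(t in p.description.lower() for t in wanted)]


# Invented sample data, for illustration only.
photos = [
    Photo("0001.jpg", "2012-04-01", "Two people eating lunch on a park bench."),
    Photo("0002.jpg", "2012-04-02", "A cluttered desk with a laptop and a coffee mug."),
    Photo("0003.jpg", "2012-04-02", "A dog sleeping on a bench in the sun."),
]

bench_photos = search(photos, "bench")
print([p.filename for p in bench_photos])
```

Plain substring matching is the simplest possible choice here. A real collection would likely want tokenization or a proper search index, but even this much is something the settings, location, and date metadata alone cannot support.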
In our class, Computational Cameras, we've been discussing the "parsability" of data. While there are many ways to parse the data in an image, I wanted to take a different approach, one in which a text description of the image's content is the primary focus of the camera's output. The camera's output is to its image as a MIDI file of a piano concerto is to its recording. Both can be manipulated, but in different ways.