Kioku: A Semantic Indexing and Exploration Interface for Digital Images
A collaboration with Akira Shibata and Andrew Childs
Information analytics gives us the tools to see inside large datasets, revealing patterns, trends, and hidden insight. Shouldn’t we have the same tools to explore the most important data in our lives—our personal data?
Watch the Thesis Week 2012 presentation:
Taking personal photos has never been easier, and because of that our personal photo albums continue to grow at a staggering rate. Kioku offers a new way of exploring our memories using powerful analytical tools. Kioku allows users to see their photos in ways they never have before, giving them the ability to explore their personal photo collections in many dimensions simultaneously. Have you ever wondered what every birthday from the past five years looked like sorted by your best friends, number of “likes”, and color? Now you can find out. Life isn’t a timeline. Let your memories come alive.
Kioku uses a data analysis tool known as Principal Component Analysis (PCA) as its core method of dimensionality reduction. PCA can reveal what differentiates images most from each other, depending on the selected criteria. Kioku makes use of time and location data from an image’s Exif data, social data from Flickr, various color, brightness, and contrast data, as well as face data.
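To make the idea concrete, here is a minimal sketch of projecting a few image attributes onto their first two principal components so that each photo gets a position on a 2D plane. It is not Kioku's actual code: the function name `project_to_2d`, the three attribute columns, and the sample values are all made up for illustration, and scaling each column to unit variance is my own assumption about how to mix attributes with very different units.

```python
import numpy as np


def project_to_2d(features):
    """Project rows of `features` (one row per image) onto the first two
    principal components. Returns (coords, explained_variance_fraction)."""
    X = np.asarray(features, dtype=float)
    X = X - X.mean(axis=0)          # center every attribute column
    std = X.std(axis=0)
    std[std == 0] = 1.0             # leave constant columns untouched
    X = X / std                     # assumption: give attributes equal weight
    # Rows of Vt are the principal directions, strongest first.
    _, S, Vt = np.linalg.svd(X, full_matrices=False)
    coords = X @ Vt[:2].T
    explained = (S ** 2) / np.sum(S ** 2)
    return coords, explained[:2]


# Hypothetical attributes per image: [unix_time, mean_brightness, face_count]
features = [
    [1.30e9, 0.42, 2],
    [1.30e9, 0.45, 3],
    [1.33e9, 0.80, 0],
    [1.33e9, 0.78, 1],
    [1.36e9, 0.20, 5],
]
coords, explained = project_to_2d(features)
```

Each row of `coords` is an (x, y) position for one photo, and `explained` says how much of the overall variation those two axes capture. Changing which attribute columns go in is what changes the layout.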
Kioku is a tool that goes well beyond standard photo collection management solutions like Picasa, iPhoto, or Lightroom. Linear or sequential approaches to exploring larger digital image collections become increasingly difficult to use as the collections continue to grow. Americans took an estimated 53 billion digital photos in 2006, or roughly 177 per person. In 2011 those numbers jumped to 80 billion and 255 respectively. By 2015 Americans are expected to take 105 billion digital photos, or 322 per person. The advent of social media and collectively sourced catalogs of photos also contributes greatly to the amount of relevant photo content available to users. Together these factors are driving rapid growth in the size of personal photo albums.
With sequential and grid-based approaches to exploring image collections, relationships other than immediate sequence and event or “album” filtering are impossible to see. Kioku can display images sequentially like any other photo application, but it can also show you relationships and trends in your photos that you could never see before.
Kioku’s backend system is written entirely in Python. The front-end interface makes use of the Google Maps API in an unintended way, as a method of exploring images on a 2D plane.
SymbiosisO: Voxel
An Interactive Textile Installation
Somehow, concurrent with my thesis work, I was able to collaborate with Kärt Ojavee and Eszter Ozsvald on an interactive textile installation at the Frank Gehry-designed Tribeca Issey Miyake space.
Image: coolhunting.com
The installation grew out of previous work by Kärt and Eszter that focused on interactive textiles in the form of thermochromically coated felt. Using embedded nichrome heating wires powered by an Arduino microcontroller, patterns could be made to slowly emerge from the felt’s surface. These previous projects were usually touch activated.
Image: coolhunting.com
Given the opportunity to work with Issey Miyake’s flagship space, we began thinking about how we could use the thermochromic technique in a way that wouldn’t be overpowered by Gehry’s titanium ribbons that intersect the space. The objects that Kärt and Eszter had worked on were very personal, slow, and subdued experiences. The installation at Issey needed to be faster and bolder. We decided that a modular approach would give us the most flexibility, allowing us to configure and reconfigure the installation as needed on site. This also felt at home with some of Issey’s brands like their 132 5 line and Bao Bao. The color, a vibrant cobalt blue, began to emerge while thinking about how the space would feel in the Spring (we started this project in Sept. 2011). After seeing some images of Issey’s Spring / Summer collection we knew that something blue was probably our direction. I love Japanese textiles and started thinking about the traditional tie-dyeing technique called “shibori”, which typically has white resist lines with a deep indigo blue dye. It felt like the repetitive / modular patterns in shibori came alive with totally new technology.
While Kärt and Eszter had used hexagonal patterns before as well as Voronoi patterns, we began experimenting with geometric patterns that could be overlaid and integrated with the hexagons to create a new layer of illusion. The rotated cube-like pattern made a lot of sense. After failing to come up with a fitting name we stumbled upon a Wikipedia entry on volumetric pixels, also known as voxels.
The main installation consisted of 68 active “voxels” that users could interact with from any mobile device by creating patterns in a web app and submitting them. The patterns quickly emerged from the blue hexagons as a single image or an animated sequence.
We spent a great deal of time finding the right material combination that would allow images to appear and disappear fast enough for the system to function interactively. When working with thermochromic ink in this way it’s often easy to introduce enough heat to make a color change, but getting rid of the heat, effectively resetting the state, becomes a new problem. We eventually found that a coated silk outer layer with a poly-felt core provided the best transition time, between 8 and 16 seconds. That is not the 100 frames per second that many gamers enjoy, but it is a major speedup from the original natural felt systems.
Photo: Gion
The installation enjoyed a well attended private opening sponsored by Surface magazine.
Special thanks to: Robert Samsel von Leszczynski for making this project happen, Tom Igoe, Yoni Ben Simhon, Johnny Lu, and Stepan Boltalin for making things work with minutes to spare
Some additional images documenting our process and physical fabrication:
A System for Non-linear Exploration of Visual Information
Abstract:
The amount of personal data in the form of social media, email, music, video, and personal photos continues to grow at a rapid pace. While some advances have been made in the more conventional methods of indexing and searching large catalogues of images, few approaches address the need for a system that enables users to explore their personal libraries in a non-linear and dynamic way. In this paper we propose an interactive system that enables users to explore their personal photographs by dynamically organizing them based on multiple image attributes such as, but not limited to: time, location, face recognition, color, as well as social attributes. A prototype system and interface were developed. The underlying processing of image data employed a suite of algorithmic methods including hierarchical clustering, face detection and recognition, histogram analysis, and others…
This project began to take shape as I considered the possibility of physically reconstructing my memory of an architectural space… specifically the house I grew up in. The circumstances surrounding this house are unusual in that during my time in undergraduate study my parents were forced to sell the house unexpectedly. It was strange to never see the place that I expected to return to. The house was sold and its new owners completely renovated it to the point that it was no longer recognizable. I decided to rebuild the house from memory as best I could. Here are my attempts in 3D:
I had planned to export the 3D model to an AutoCAD DXF file and use an intriguing little program written by Thomas Haenselmann called DXF2papercraft, which does what it sounds like… transforming 3D models into flat paper cut-outs with glue tabs. Sadly my models were not agreeing with the software, so I built by hand, freestyle. Seen here in the studio is my finished model, just about to be completely crushed.
The concept for the book was to create a paper model that was not intended to be collapsible and then simply force it to collapse by closing a very sturdy book on it. The model (or memory) would then be “reconstructed” to the best of my ability using pop up derived mechanisms and techniques.
The result was not exactly what I had expected but I am still very pleased with the final object.
Face Recognition, XML, initial clusters… Closing in on a working backend system
We, being myself and Akira Shibata, accomplished a lot over the break. The current face recognition system, based on Philipp Wagner’s Eigen Face implementation, is working and provides a good starting point that can be improved upon later. For now it provides us with fit values that we can use to cluster faces.
Above is an Eigen Face reconstruction using a combination of the roughly 500 faces we extracted. For fun we decided to run our PCA (Principal Component Analysis) test with the test image included in the training set (below). The result is clear, with the image rapidly converging. Beside it is a colorized version of the grayscale Eigen Faces, a useful way of showing variation that the human eye has a hard time discerning in gray. There is also a short animation combining the progression of all Eigen Faces.
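The convergence we saw has a simple explanation, which the sketch below tries to show. When the test image is part of the training set, it lies inside the space spanned by the principal components, so adding components drives the reconstruction error toward zero. This is not our actual test: the random 64-value "faces" stand in for real flattened face images, and `reconstruction_errors` is a name invented for this example.

```python
import numpy as np


def reconstruction_errors(train, sample):
    """Reconstruct `sample` from the first k principal components of `train`
    for k = 1..n and return the error (Euclidean norm) at each step."""
    mean = train.mean(axis=0)
    centered = train - mean
    # Rows of Vt form an orthonormal basis, strongest component first.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    errors = []
    for k in range(1, Vt.shape[0] + 1):
        basis = Vt[:k]
        weights = basis @ (sample - mean)      # coordinates in the PCA basis
        approx = mean + basis.T @ weights      # back to pixel space
        errors.append(float(np.linalg.norm(sample - approx)))
    return errors


rng = np.random.default_rng(0)
faces = rng.random((20, 64))          # stand-in data: 20 "faces", 64 pixels each
errors = reconstruction_errors(faces, faces[3])   # sample is in the training set
```

For an image that is not in the training set the error would still shrink as components are added, but it would generally level off above zero.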
XML
Next we began anticipating the need to make data available to a front-end interactive system. We researched a number of methods, including existing XML schemas, but in the end nothing existed that fit our needs, so we created our own that our Python code can generate:
Finally we began what we had really been waiting for… clustering. We are still trying to find our way around Matplotlib, but we were able to make an initial time-based cluster and a combination plot with a dendrogram (seen above). Because we tend to think of time in a linear fashion, the plotted representation may seem counterintuitive at first, but this cluster is our first attempt at very rudimentary event detection. We are essentially trying to find a likeness among the photos based entirely on the EXIF time data, represented in UNIX time. The dendrogram method arranges the clusters so that things are in a rough sequence from left to right, but it also avoids intersecting itself, so there is some shuffling. Even so, we can interpret the dendrogram as illustrating a series of larger events containing various levels of sub-events. The dark blue represents more recent events, with the lighter colors receding into the past. One can imagine how this method of event detection could become more interesting with histograms and faces combined, but already it shows a different and possibly interesting way of interpreting one’s personal data.
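The events-and-sub-events reading of the dendrogram can be sketched without any plotting library. For one-dimensional data such as timestamps, cutting a single-linkage hierarchy at a distance threshold is the same as starting a new group wherever the gap between consecutive photos exceeds that threshold. The sketch below uses that shortcut. It is an illustration, not the clustering we ran: the linkage method, the function names, the gap thresholds, and the sample timestamps are all assumptions.

```python
def split_events(timestamps, max_gap):
    """Group UNIX timestamps into events. A new event starts whenever the gap
    to the previous photo is larger than max_gap seconds."""
    events = []
    for t in sorted(timestamps):
        if events and t - events[-1][-1] <= max_gap:
            events[-1].append(t)
        else:
            events.append([t])
    return events


def nested_events(timestamps, coarse_gap, fine_gap):
    """Two levels of the hierarchy: events, each split into sub-events."""
    return [split_events(event, fine_gap)
            for event in split_events(timestamps, coarse_gap)]


HOUR = 3600
DAY = 24 * HOUR

# Made-up photo times: one party with two bursts, then another a year later.
photos = [0, 600, 1200, 5 * HOUR, 5 * HOUR + 300, 400 * DAY, 400 * DAY + 900]

events = split_events(photos, max_gap=DAY)
nested = nested_events(photos, coarse_gap=DAY, fine_gap=HOUR)
```

Here `events` holds two groups (five photos and two photos), and `nested` splits the first of them into two sub-events. Lowering the threshold corresponds to cutting the dendrogram closer to its leaves.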
Histograms and Frequency …ever closer to face clustering, but not just yet.
Using the birthday data set, here is a plot of face frequency over 5 years. It is not especially meaningful as it is, but it gets us ready for face comparison and clustering.
Playing with the im.histogram() method in PIL (Python Imaging Library) to cluster images based on their RGB histograms.
from PIL import Image
import numpy as np

def pca(X):
    # Principal Component Analysis
    # input: X, matrix with training data as flattened arrays in rows
    # return: projection matrix (with important dimensions first),
    # variance and mean

    # get dimensions
    num_data, dim = X.shape

    # center data (on a copy, so the caller's array is not modified)
    mean_X = X.mean(axis=0)
    X = X - mean_X

    if dim > 100:
        print('PCA - compact trick used')
        M = np.dot(X, X.T)  # covariance matrix
        e, EV = np.linalg.eigh(M)  # eigenvalues and eigenvectors
        tmp = np.dot(X.T, EV).T  # this is the compact trick
        V = tmp[::-1]  # reverse since last eigenvectors are the ones we want
        # clip round-off negatives, then reverse (eigenvalues are in increasing order)
        S = np.sqrt(np.maximum(e, 0))[::-1]
    else:
        print('PCA - SVD used')
        U, S, V = np.linalg.svd(X)
        V = V[:num_data]  # only makes sense to return the first num_data

    # return the projection matrix, the variance and the mean
    return V, S, mean_X
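Before any PCA, the histograms themselves already give a usable notion of similarity between two images. The sketch below compares two histograms with histogram intersection. The choice of that measure is my assumption here, not necessarily what we end up clustering with, and the four-bin lists are toy data. A real input would be the flat list of per-band counts that im.histogram() returns for an image.

```python
def normalize(hist):
    """Scale counts so they sum to 1, making images of different sizes comparable."""
    total = float(sum(hist))
    return [count / total for count in hist] if total else list(hist)


def histogram_intersection(hist_a, hist_b):
    """Similarity in [0, 1]: 1.0 for identical distributions, 0.0 for disjoint ones."""
    a, b = normalize(hist_a), normalize(hist_b)
    return sum(min(x, y) for x, y in zip(a, b))


def histogram_distance(hist_a, hist_b):
    """Turn the similarity into a distance that a clustering step can use."""
    return 1.0 - histogram_intersection(hist_a, hist_b)


# Toy 4-bin histograms standing in for the output of im.histogram()
reddish = [0, 0, 1, 3]
also_reddish = [0, 0, 2, 6]   # same distribution, twice as many pixels
bluish = [3, 1, 0, 0]

same = histogram_distance(reddish, also_reddish)
different = histogram_distance(reddish, bluish)
```

Normalizing first means a small phone photo and a large camera photo of the same scene come out close together, which matters for a mixed-device set like ours.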
I’ve been thinking about a project since the beginning of this class. In essence it is an anti-pop-up book, or perhaps a destructive / reconstructive pop-up book. The idea is fairly simple… a sturdy book with thick, heavy board-like pages, perhaps even metal hinges for a binding. Then… on the pages delicate models are assembled, perhaps architecture or models of other objects; in this rendering, a paper model of the Reichstag in Berlin. These models are not meant to be collapsible. After the model is complete the page is shut with force and the model is compressed. Upon opening the page the model is somewhat discernible but will require additional work to reconstruct it towards its original likeness. I like the theory of reconstructive memory… I felt that this was an engaging metaphor for it.
The reconstruction of the model will presumably employ many of the pop up techniques learned in this class.
A Mechanical Analysis of Jan Pienkowski’s Haunted House (Dutton 1979)
This book was among my favorite picture books as a young child. I have vivid memories of turning the loaded pages of pop ups and pulling the interactive tabs.
Though I hadn’t seen the book since I was at least six or seven years old, everything came back to me immediately… the stylistically ’70s bubble lettering on its cover, the alligator in the bathtub. It was a real joy to revisit this experience. After going through its pages, though, I was surprised by how simple it is mechanically. No floating layers… almost entirely V-folds. Simple, but well implemented.
A few of Pienkowski’s haunted scenes:
The bathroom, with multiple pull tabs and an alien’s helical, spring-like antennae.
Years later these pages came to mind and I wondered if Bobby Henderson’s “Flying Spaghetti Monster”, of Kansas Board of Education and Intelligent Design fame, had possibly been inspired here.
Extracting EXIF data from images for clustering
I was able to use the Python Imaging Library (PIL) to extract time and GPS location data from a test image set. The test set is comprised of roughly 500 images from various devices, including iPhones and point-and-shoot cameras. The images are from birthday parties over a 5 year period.
These are simple plots of EXIF time and location. The frequency of images can clearly be seen, with spikes each year. The two points in the GPS plot are Tokyo and New York: 4 of the 5 birthdays were in New York and 1 was in Tokyo.
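Once the raw EXIF values are read, two small conversions make them plottable: the EXIF date string becomes a UNIX timestamp, and the GPS degrees/minutes/seconds triple becomes a signed decimal coordinate. The sketch below covers only those conversions. The function names are invented for this example, the timestamp is treated as UTC because EXIF date strings carry no time zone (an assumption), and the GPS components are assumed to arrive as plain numbers.

```python
import calendar
import time


def exif_time_to_unix(exif_datetime):
    """Convert an EXIF date string ('YYYY:MM:DD HH:MM:SS') to UNIX seconds.
    Assumption: the camera clock is treated as UTC."""
    parsed = time.strptime(exif_datetime, "%Y:%m:%d %H:%M:%S")
    return calendar.timegm(parsed)


def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere letter
    ('N', 'S', 'E', 'W') to signed decimal degrees."""
    degrees, minutes, seconds = (float(value) for value in dms)
    decimal = degrees + minutes / 60.0 + seconds / 3600.0
    return -decimal if ref in ("S", "W") else decimal


# Illustrative values, not taken from the test set
taken_at = exif_time_to_unix("2000:01:01 00:00:00")
latitude = dms_to_decimal((40, 42, 46.0), "N")
longitude = dms_to_decimal((74, 0, 21.6), "W")
```

With every photo reduced to a timestamp and a latitude/longitude pair, both the yearly spikes and the two-city GPS plot are ordinary scatter plots.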
Research on face recognition has yielded some promising leads, especially a 2011 paper:
This paper describes a method of clustering faces using “rank-order distance” in which a clustering can be optimized by understanding a face’s neighbors:
“The Rank-Order distance is motivated by an observation that faces of the same person usually share their top neighbors. Specifically, for each face, we generate a ranking order list by sorting all other faces in the dataset by absolute distance (e. g., L1 or L2 distance between extracted face recognition features).”
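The ranking step in that quote is easy to try on toy data. The sketch below builds each face's order list by L1 distance and counts how many top neighbors two faces share. It is only the motivating observation, not the paper's full Rank-Order distance or its clustering procedure. The two-value feature vectors, the function names, and the choice of k are all made up for illustration.

```python
def l1_distance(a, b):
    """Absolute (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def order_list(index, features):
    """Indices of all other faces, sorted by distance from faces[index]."""
    others = [j for j in range(len(features)) if j != index]
    return sorted(others, key=lambda j: l1_distance(features[index], features[j]))


def shared_top_neighbors(a, b, features, k):
    """How many of the k nearest neighbors of face a are also among
    the k nearest neighbors of face b."""
    top_a = set(order_list(a, features)[:k])
    top_b = set(order_list(b, features)[:k])
    return len(top_a & top_b)


# Toy features: faces 0-2 belong to one person, faces 3-5 to another.
faces = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
]

same_person = shared_top_neighbors(0, 1, faces, k=2)
different_people = shared_top_neighbors(0, 3, faces, k=2)
```

Faces 0 and 1 share a top neighbor while faces 0 and 3 share none, which is the kind of signal the paper builds its distance on.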