I’m a believer in the strategy: when faced with obstacles, switch to something else… More often than not a seemingly unrelated perspective offers an unseen inroad. I had been struggling to understand how an unsupervised or semi-supervised method of face detection would function in the context of personal photo collections… So I instead focused on how the varying relevance of multiple images might be displayed in a way that is concise and intuitive. In essence… how to organize the eventual clusters of images so that they easily convey rank of relevance and can be reconfigured dynamically. Below are some sketches of different approaches using scale and proximity as parameters for relevance.
A stylized rendering of how simple clusters might appear.
By chance, after thinking about the visual and UI side of things, I continued my search for unsupervised approaches to face and object recognition and came across this paper:
They describe an unsupervised approach that can self-organize faces from multiple image sequences using a relatively simple and computationally lightweight method: Minimal Spanning Tree formation, wherein the distances between faces in two image sequences are obtained through dissimilarity matrices. In the equation below they essentially subtract the pixels of one normalized, fitted, grayscale face image from another to find the dissimilarity.
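To make the idea concrete, here is a minimal sketch of that kind of pixel-wise dissimilarity in Python. This is my own stand-in, not the paper’s exact formula: I assume the faces are already detected, aligned, and normalized to the same size, and I use the mean absolute pixel difference as the dissimilarity score.

```python
import numpy as np

def face_dissimilarity(a, b):
    """Pixel-wise dissimilarity between two aligned, normalized
    grayscale face images of the same shape. Lower = more similar.
    (A stand-in for the paper's measure: mean absolute difference.)"""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean(np.abs(a - b)))

# Toy example with two hypothetical 4x4 "face" patches:
f1 = np.zeros((4, 4))
f2 = np.full((4, 4), 10.0)
print(face_dissimilarity(f1, f1))  # 0.0 (identical images)
print(face_dissimilarity(f1, f2))  # 10.0
```

Filling a matrix of these scores for every pair of faces gives the dissimilarity matrix that the spanning-tree step works from.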
Previously I was trying to understand the problem of face recognition through the more conventional approach of Principal Components and Eigenfaces or Fisherfaces. The problem is that, unlike Facebook, we don’t have a nicely tagged set of training data to start with, and we are specifically trying to reduce the need for the user to provide one. The unsupervised chaining method above is encouraging, with accuracy around 88%. It offers a self-organizing way to build clusters that could later be combined with more conventional methods for new images coming into the system.
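As a rough sketch of how clusters could self-organize from a dissimilarity matrix, the snippet below builds a minimal spanning tree with SciPy and then cuts its longest edges so each remaining connected component becomes one face cluster. The matrix values and the cut threshold here are invented for illustration; the paper derives its own criteria.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components

# Hypothetical dissimilarity matrix for 6 face images: two tight
# groups (indices 0-2 and 3-5) separated by large distances.
D = np.array([
    [0, 1, 1, 9, 9, 9],
    [1, 0, 1, 9, 9, 9],
    [1, 1, 0, 9, 9, 9],
    [9, 9, 9, 0, 1, 1],
    [9, 9, 9, 1, 0, 1],
    [9, 9, 9, 1, 1, 0],
], dtype=float)

# Build the MST over the dissimilarity graph, then drop edges above
# a cut threshold so the tree falls apart into clusters.
mst = minimum_spanning_tree(D).toarray()
threshold = 5.0  # assumed value, chosen by eye for this toy matrix
mst[mst > threshold] = 0
n_clusters, labels = connected_components(mst, directed=False)
print(n_clusters)  # 2
print(labels)      # faces 0-2 share one label, faces 3-5 another
```

The appeal of this formulation is that adding a new face only means computing one more row of dissimilarities and re-running the (cheap) tree construction, which fits the idea of combining it with conventional recognizers for incoming images.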