November 10, 2005

Clay Shirky: Tagging

To be honest, the last few weeks' speakers had been a bit of a letdown after the great start we had (I'm thinking of Steven Johnson and Ze Frank, especially). But Clay brought it all back. I really enjoyed hearing what he had to say. Some of the other speakers (Lili Chang and Curtis Wong, especially) had been a little short on content, I thought. Clay's talk was very information-dense, and I had trouble keeping up with my notes. Excellent.

He spoke about classification systems, starting with their pre-digital history, and about how we have applied old, analog classifications to the web (usually with detrimental results). This is something I obliquely covered in my mid-term paper for ICM, and it's something I've thought a lot about. The folder hierarchy that we use to organize our files on a computer bears almost no resemblance to the physical reality of the way the computer writes to the hard drive. In a lot of ways it is a useful mental model--or, it was a useful mental model when we weren't technologically able to store and retrieve files the way the brain does, by associating them with other things and by searching. But now that we have search programs for our computers (like Google Desktop), this is no longer necessary. In just a couple of years the old problem of losing your file by not paying attention to "where" you saved it should be history. You saved it on your computer, dummy. You shouldn't need to know more than that to be able to get to it again. The problem of folders/hierarchies is even more ridiculous with applications, of which there are relatively few. There's zero reason that you should ever have to navigate to an "Applications" folder to open a specific application. All you should have to do is say, "hey, open Firefox." Quicksilver lets you do this on the Mac.

Clay talked about a lot more than just classification. He talked about the Dewey Decimal System, which is an ontological*, hierarchical categorization determined by humans. The Dewey Decimal System has evolved over the years as the world has changed. For instance, the Soviet Union section is now the "Former Soviet Union" section. This change was made basically because the libraries didn't want to add categories and incur all the costs involved in reshelving.

There's one crisis in particular that analog, hierarchical categorization systems face: What if something falls into two categories? Clay used the example of a book about Art and Creativity. Is it about Art, or about Creativity? Where does it go? How do we label it?

In my mid-term paper for ICM I also talked a little about pluralism, the idea that something can be part of two (or more) things equally, and we don't necessarily have to decide which one of those it is to be able to deal with it. I hadn't really thought of it as a specifically digital idea, but it really is. Tags let us get around having to file a website under a single category. A website can be tagged both art *and* creativity. Problem solved! I believe the reason Yahoo! bought Flickr earlier this year was not for the photos and community, but for the tagging, to use for their search results.
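
To make the art-and-creativity point concrete, here is a minimal sketch in Python of what a tag lookup might look like. The item names and the `build_tag_index` helper are made up for illustration; this is not how Flickr or any real site stores its tags.

```python
from collections import defaultdict

def build_tag_index(items):
    """Map each tag to the set of item names that carry it."""
    index = defaultdict(set)
    for name, tags in items.items():
        for tag in tags:
            index[tag].add(name)
    return index

# Hypothetical data: the book carries both tags, so nobody has to
# decide which single shelf it belongs on.
items = {
    "Art and Creativity": {"art", "creativity"},
    "A History of Painting": {"art"},
}
index = build_tag_index(items)

print(sorted(index["art"]))
print(sorted(index["creativity"]))
```

The book shows up under both tags, which is exactly what a single physical shelf can't do.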

Clay mentioned a number of problems with tagging, one of which is the thesaurus problem, where similar spellings or slightly different names (nyc, new_york, newyork, etc.) really mean the same thing but will be considered different by Flickr.
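
One way to chip away at the spelling half of this problem is to normalize tags before comparing them. What follows is only a sketch under my own assumptions: the `ALIASES` table and the `normalize_tag` function are hypothetical, and I have no idea whether Flickr does anything like this.

```python
import re

# Hypothetical alias table. A real one would have to be curated by
# hand or learned from how people actually tag.
ALIASES = {
    "nyc": "newyork",
    "newyorkcity": "newyork",
}

def normalize_tag(tag):
    """Lowercase a tag, strip punctuation and spaces, then apply aliases."""
    key = re.sub(r"[^a-z0-9]", "", tag.lower())
    return ALIASES.get(key, key)

variants = ["nyc", "new_york", "newyork", "New York"]
normalized = {normalize_tag(t) for t in variants}
print(normalized)
```

Stripping punctuation handles the new_york/newyork cases for free, but nyc only matches because somebody put it in the alias table, so the true synonym problem doesn't go away.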

One thing that I was thinking about, especially concerning Flickr, is the usefulness of metadata for tags (i.e., tags for tags). What if people were allowed to "rank" their tags, marking some as primary and others as secondary? Maybe you could put a star or two next to a tag that you thought was more important than the others in describing your picture. And I think you could use a metatag:tag format as well. For instance, a picture could be labeled "location:nyc building city urban" etc. This has the potential of just compounding the problems of tags (it greatly enlarges the tagspace), and it also could potentially inherit the ontological-categorization problems. But part of the problem with categorization systems like the Dewey Decimal System is that they try to label books using labels that aren't very good for books (and they only use one label). If your metatags were more like "location" "date" "person" etc., there might be some use to them. I'm not sure if it would be better to allow users to make up their own metatags on the fly or to define a narrow set of metatags that could be used. The consistency of the data increases when the metatagspace is narrowly defined, but you run the risk of missing important things.

And, if you think about it, all the tags on Flickr right now have metadata applied to them implicitly. As Flickr interprets the tags, they all mean what the picture is "about."
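
Here is a rough sketch of how the metatag:tag format might be parsed, with plain tags falling back to an implicit "about" metatag. The `parse_tags` function and the "about" default are my own invention for illustration, not anything Flickr supports.

```python
def parse_tags(tag_string):
    """Split a space-separated tag string into (metatag, tag) pairs.

    Tokens written as "metatag:tag" keep their metatag; plain tokens
    get the implicit metatag "about".
    """
    parsed = []
    for token in tag_string.split():
        meta, sep, value = token.partition(":")
        if sep and meta and value:
            parsed.append((meta, value))
        else:
            parsed.append(("about", token))
    return parsed

pairs = parse_tags("location:nyc building city urban")
for meta, tag in pairs:
    print(meta, tag)
```

This version lets users make up metatags on the fly. Restricting them to a narrow set would mean checking each metatag against an allowed list and deciding what to do with the ones that don't match.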

* generally, it's a good bet that I'll enjoy a talk if the speaker uses a word like "ontological."

Posted November 10, 2005 11:59 PM. Categories: Week 10