Mapping Week 2: Methods and Mapping
In this week's selection from Visual Explanations, Tufte concerns himself with graphing data, specifically the data from John Snow's investigation of the Broad Street cholera outbreak of 1854. His main point is that what he calls aggregation, both spatial and temporal, can "mask relevant detail and generate misleading signals" (p. 36), which can in turn lead to an incorrect interpretation of the data. Tufte draws a distinction between "method" and "reality": the former is the bias introduced by aggregation techniques, and the latter is the "true story of the data." He goes on to note:
A further difficulty arises, a result of fast computing. It is easy now to sort through thousands of plausible varieties of graphical and statistical aggregations—and then to select for publication only those findings strongly favorable to the point of view being advocated. (p. 37)
What interests me here is that Tufte seems to take the accuracy of the data for granted, as if the collection of data were free from political and rhetorical considerations. But the process of collecting data is itself a kind of mapping: you have to decide which chunks of reality are relevant, how to formalize those chunks, and how to digitize them. So data visualization is, in a sense, a map of a map, doubly subject to the problems of subjectivity and arbitrariness that Tufte mentions.
So the question is this: can data visualizers take the data as basic? Or does the process (or potential) of data visualization itself have an effect on how data is collected? (The analogy here is with the observer's paradox, or even the uncertainty principle: observing the world to collect data also changes the world that you're observing.) Do researchers (unconsciously?) practice data collection techniques that create data that is more easily visualized? Conversely, do data visualizers seek out data that is easy to visualize? Or, taking a step back: do we organize our world in a way that encourages certain methods of data collection and data visualization?
An even better question: How do these questions bear on Tufte's assertion that "the reason we seek causal explanations is in order to intervene, to govern the cause so as to govern the effect" (p. 28)?