Zoe Fraade-Blanar

Current: A News Project

A data visualization tool to help journalists monitor rising topics of interest and keep tabs on the competition's news coverage.

http://www.anewsproject.com

Classes
Crafting with Data: Revelations, Illusions, Truth and the Future, Mashups: Remixing the Web


The inspiration for Current came from a conversation with the New York Times newsroom. When I asked what magical tool they wished they had access to, they said they wanted a way to ensure that they'd never miss another important news item. Too often, they realize the importance of a topic, and the public's interest in it, only hours or days after the competition has already scooped them.

Current aims to take some of the guesswork out of the ebb and flow of public interest by monitoring search requests, Twitter posts and blog posts. It compiles these areas of interest into general topics using language processing and displays them through various views in a stream graph.

Current's strength comes from its ability to measure public areas of interest in real time and to closely monitor the competition. By cross-referencing the topic areas with coverage at the top 20 newspapers in the country, journalists will never again be able to say they didn't realize the importance of a news item until it was too late.

Data can be very overwhelming and intimidating, especially to people who are more used to dealing with words and imagery. But just as it is human nature to be more willing to listen to a well-dressed speaker, I posit that a truly beautiful visualization will allow even journalists to access the information it contains.

Background
The concept for Current as a "dashboard" where journalists can keep track of topics and the competition's reaction was first put forth by Clay Shirky, and was expanded more fully in a paper on the future of journalism created for Future of Infrastructure. Although news as a centralized industry may not be around for much longer, the journalist as an observer of breaking events surely will.

Stream graphs are relatively new, being a play on more traditional stacked graphs. I have seen versions used for visualizing music listening habits and box office receipts over time.
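
The relationship between a stacked graph and a stream graph can be sketched in a few lines. This is only an illustration in Python, not the Processing code behind Current: the baseline Current actually uses is not stated here, so the sketch assumes the common symmetric baseline that centers the stack around the horizontal axis. The function names and the sample counts are hypothetical.

```python
def stack_layers(series, baseline):
    """Stack each layer on top of the previous one, starting at the baseline.

    Returns one (bottom_edge, top_edge) pair of lists per layer.
    """
    layers = []
    bottom = list(baseline)
    for values in series:
        top = [b + v for b, v in zip(bottom, values)]
        layers.append((bottom, top))
        bottom = top
    return layers


def symmetric_baseline(series):
    """Assumed baseline: minus half the column total at every time step."""
    totals = [sum(column) for column in zip(*series)]
    return [-0.5 * total for total in totals]


# Hypothetical hourly attention counts for three topics.
series = [
    [1, 2, 4, 2],
    [3, 3, 1, 1],
    [2, 1, 1, 3],
]

stacked = stack_layers(series, [0.0] * 4)                  # traditional stacked graph
stream = stack_layers(series, symmetric_baseline(series))  # stream graph
```

The only difference between the two calls is the baseline: a stacked graph sits on a flat zero line, while the stream graph shifts the whole stack so that it flows around the center.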

Audience
The target audience for Current is working reporters and other students of human nature.

User Scenario
A journalist might open Current at the beginning of her shift to see what the public has been interested in during the last 24 hours. She would want to make sure her newspaper covered any topics that gained a large share of people's attention. She would also want to double-check any topics that all the other newspapers have chosen to cover, to make sure that she hadn't overlooked anything important. Topics she wasn't sure about she could monitor from hour to hour to see whether they faded away or grew in importance.

Implementation
Current is built in PHP and Processing. Three scripts harvest the information, process the language to group the raw data into topics, cross-reference the topics with news sources, and store the results in a database. Another script feeds information from the database into a Processing application. Processing handles the visualization and the user interaction involved in selecting individual topics, choosing news sources, and switching between views.
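
As a rough illustration of the grouping and cross-referencing steps, here is a minimal sketch in Python rather than the project's PHP. It assumes the simplest possible language processing, counting shared keywords after dropping a few stopwords, which is far cruder than whatever Current actually does. Every name, the stopword list, and the sample data are hypothetical, and the database step is left out.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "for", "is", "at"}


def keywords(text):
    """Lowercase a snippet and return its non-stopword terms."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def rising_topics(snippets, top_n=3):
    """Count how many snippets mention each keyword; return the most common."""
    counts = Counter()
    for snippet in snippets:
        counts.update(set(keywords(snippet)))  # count each snippet once per keyword
    return counts.most_common(top_n)


def coverage(topic, headlines_by_paper):
    """Return the sorted names of papers with a headline mentioning the topic."""
    return sorted(
        paper
        for paper, headlines in headlines_by_paper.items()
        if any(topic in keywords(headline) for headline in headlines)
    )


# Hypothetical harvested snippets (searches, tweets, blog titles).
snippets = [
    "Volcano erupts in Iceland",
    "Iceland volcano grounds flights",
    "Flights cancelled across Europe",
    "volcano ash cloud",
]
# Hypothetical headlines from two newspapers.
headlines_by_paper = {
    "Paper A": ["Volcano ash closes airports"],
    "Paper B": ["Local election results"],
}

top_topic, mentions = rising_topics(snippets, top_n=1)[0]
covered_by = coverage(top_topic, headlines_by_paper)
```

In this toy data the top topic is mentioned by three snippets but covered by only one of the two papers, which is the kind of gap the visualization is meant to surface.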

Conclusion
I have discovered that I need a new computer. This type of visualization simply requires more power and a better video card than the Toughbook I use can provide. It's an important lesson going into a thesis that will in all likelihood use similar resources. I also discovered that Google is not infallible. I had set great store by their top-40 rising topics tool, which suffered outages throughout the project. Since then I've learned to cache data like mad to guard against these issues.
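
One way to picture that caching habit is a small wrapper that serves the last good copy whenever the upstream source fails. This is a hedged sketch in Python, not the project's PHP; the function name, the JSON cache file, and the 15-minute freshness window are all assumptions made for illustration.

```python
import json
import time
from pathlib import Path


def cached_fetch(fetch, cache_path, max_age_seconds=900):
    """Return fresh data when possible, falling back to the cached copy.

    `fetch` is any zero-argument callable that returns JSON-serializable data.
    """
    path = Path(cache_path)

    # Serve the cache directly while it is still fresh.
    if path.exists() and time.time() - path.stat().st_mtime < max_age_seconds:
        return json.loads(path.read_text())

    try:
        data = fetch()
    except Exception:
        # Upstream outage: reuse the stale copy if one exists.
        if path.exists():
            return json.loads(path.read_text())
        raise

    path.write_text(json.dumps(data))
    return data
```

With this in place, an outage at the data source only means the graph shows slightly older topics instead of nothing at all.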