November 07, 2005

Text Processing / Networking in Java

For this week's assignment I worked on two main ideas, but unfortunately I could only get the first one to work. It is a text visualizer that shows the relative frequency of each alphabetic character, as well as the relative frequency of words of each length. When run from Processing it can load new URLs, but this doesn't seem to work from inside a browser.
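For anyone curious about the counting behind the visualizer, here is a rough sketch in plain Java. It is not the applet's actual source: the class name `TextStats` and both method names are made up for illustration, and it assumes that only the letters a-z count as characters and that a word is any unbroken run of those letters.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, not the code behind the applet.
public class TextStats {

    // Relative frequency of each letter a-z, ignoring case and non-letters.
    static Map<Character, Double> letterFrequencies(String text) {
        int[] counts = new int[26];
        int total = 0;
        for (char c : text.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;
                total++;
            }
        }
        Map<Character, Double> freq = new TreeMap<>();
        for (int i = 0; i < 26; i++) {
            if (counts[i] > 0) {
                freq.put((char) ('a' + i), counts[i] / (double) total);
            }
        }
        return freq;
    }

    // Relative frequency of each word length (assumption: a word is a run of letters).
    static Map<Integer, Double> wordLengthFrequencies(String text) {
        Map<Integer, Integer> counts = new TreeMap<>();
        int total = 0;
        for (String word : text.split("[^A-Za-z]+")) {
            if (word.isEmpty()) {
                continue; // split() yields an empty first token when text starts with a non-letter
            }
            counts.merge(word.length(), 1, Integer::sum);
            total++;
        }
        Map<Integer, Double> freq = new TreeMap<>();
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            freq.put(e.getKey(), e.getValue() / (double) total);
        }
        return freq;
    }

    public static void main(String[] args) {
        String sample = "The cat sat on the mat";
        System.out.println(letterFrequencies(sample));
        System.out.println(wordLengthFrequencies(sample));
    }
}
```

Each map sums to 1.0, so the values can be used directly as bar heights or scaled to whatever the drawing area needs.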

Link to the applet.

textVisualizer.jpg

The other project I worked on was part crawler, part data miner. It crawled through all of the archived movie reviews on Metacritic's Archives and extracted movie data and reviewer data. The idea was to combine the data so that you could rate a few movies, and the program would then find reviewers who rated those movies similarly to you and recommend a movie based on those reviewers' reviews.
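The matching idea could look something like the sketch below. This is only one possible approach, not code from the project: the class `ReviewerMatch`, its methods, the reviewer and movie names, and the scores are all invented, and it assumes a 0-100 score scale. It measures how far a reviewer is from you by the mean absolute difference in score over the movies you both rated, picks the single closest reviewer, and suggests that reviewer's highest-scored movie that you haven't rated.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the reviewer-matching idea; all names and data are made up.
public class ReviewerMatch {

    // Mean absolute score difference over movies both parties rated.
    // Returns Double.MAX_VALUE when there is no overlap at all.
    static double distance(Map<String, Integer> user, Map<String, Integer> reviewer) {
        int shared = 0;
        double sum = 0;
        for (Map.Entry<String, Integer> e : user.entrySet()) {
            Integer other = reviewer.get(e.getKey());
            if (other != null) {
                sum += Math.abs(e.getValue() - other);
                shared++;
            }
        }
        return shared == 0 ? Double.MAX_VALUE : sum / shared;
    }

    // Finds the closest reviewer and returns their best-scored movie
    // that the user has not rated, or null if nothing qualifies.
    static String recommend(Map<String, Integer> user,
                            Map<String, Map<String, Integer>> reviewers) {
        String closest = null;
        double best = Double.MAX_VALUE;
        for (Map.Entry<String, Map<String, Integer>> e : reviewers.entrySet()) {
            double d = distance(user, e.getValue());
            if (d < best) {
                best = d;
                closest = e.getKey();
            }
        }
        if (closest == null) {
            return null; // no reviewer shares a movie with the user
        }
        String pick = null;
        int top = Integer.MIN_VALUE;
        for (Map.Entry<String, Integer> e : reviewers.get(closest).entrySet()) {
            if (!user.containsKey(e.getKey()) && e.getValue() > top) {
                top = e.getValue();
                pick = e.getKey();
            }
        }
        return pick;
    }

    public static void main(String[] args) {
        Map<String, Integer> user = new HashMap<>();
        user.put("Movie A", 90);
        user.put("Movie B", 20);

        Map<String, Integer> x = new HashMap<>();
        x.put("Movie A", 85);
        x.put("Movie B", 30);
        x.put("Movie C", 95);

        Map<String, Map<String, Integer>> reviewers = new HashMap<>();
        reviewers.put("Reviewer X", x);

        System.out.println(recommend(user, reviewers));
    }
}
```

Using only the single closest reviewer keeps the sketch short. Averaging over the several closest reviewers, or requiring a minimum number of shared movies before trusting a match, would probably give steadier recommendations.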

I didn't have any trouble getting the crawler to work, but I kept getting memory errors in Processing (all told, there are about 5,000 movies there, with many more reviewers and almost 200,000 individual reviews). The current plan is to rebuild this with Perl and a MySQL database, so it may be a little while before that is up and running.


Posted November 7, 2005 09:36 PM. Categories: Week 9 | Permalink