Category Archives: Learning Bit by Bit

Reading and Writing Electronic Text – Final Project

新しい言葉 (New Words): A method for generating phonetically unique words from a corpus of existing words in the Japanese language, resulting in words that are statistically similar to Japanese. (project proposal) This project was also presented as a final project for Learning … Continue reading

Posted in Learning Bit by Bit, Reading and Writing Electronic Text, Spring 2011 | Leave a comment

Learning Bit by Bit – Final Project Proposal

Deep Search Explorer: Discovering the Unknown. Deep Search Explorer is an extension of a previous project titled IPExplore that randomly searches IP addresses. IPExplore was an attempt at regaining a less mediated experience of discovery on the internet; inspired in part … Continue reading

Posted in Learning Bit by Bit, Spring 2011, Uncategorized | Leave a comment

Learning Bit by Bit – Recommendation Engines

Recommendation Engines. There is no doubt a need for recommendation within information systems that are growing in volume and traffic as fast as they are. Recommendation can be thought of, in a sense, as subjective filtering. Taking the common … Continue reading

Posted in Learning Bit by Bit, Spring 2011 | Leave a comment

Learning Bit by Bit – Clustering

Clustering data from IPExplore. As an exercise I chose to use a database log from a previous project called IPExplore that visits IP addresses randomly. The full data comprised over 10,000 IP addresses, of which roughly 600 were “hits” or … Continue reading

Posted in Learning Bit by Bit, Spring 2011 | Leave a comment

Learning Bit by Bit – PageRank

Among the different texts on the topic of PageRank that we looked at, the two seminal papers by Larry Page and Sergey Brin, The Anatomy of a Large-Scale Hypertextual Web Search Engine and The PageRank Citation Ranking: Bringing Order to the … Continue reading

Posted in Learning Bit by Bit, Spring 2011 | Leave a comment

Learning Bit by Bit – Jaron Lanier

You Are Not A Gadget (Chapters 1 – 3) by Jaron Lanier. Lanier begins a section in his book with the statement: “The most important thing about a technology is how it changes people.” … In relative terms technology itself could be … Continue reading

Posted in Learning Bit by Bit, Spring 2011 | Leave a comment

Learning Bit by Bit – Hidden Markov Models

Part of Speech Tagging and HMMs. I thought it might be appropriate to begin this posting on Hidden Markov Models and POS with the above video… I came home last night to find my roommate watching this on YouTube. A … Continue reading

Posted in Learning Bit by Bit, Spring 2011 | Leave a comment

Learning Bit by Bit – Text Generation

Generating text with n-gram language models: using n-gram probabilistic language models for generating text. I chose a lengthy corpus, Tolstoy’s “War and Peace”, as my main text, later complementing it with “Paradise Lost” by Milton. Finally generating text from three poems by … Continue reading

Posted in Learning Bit by Bit, Spring 2011 | Leave a comment

Bit by Bit – Stop Tokenization

Stop Word Sets – Tokenization – Search. When thinking about the importance of a “stop list” or “stop word list”, essentially a low-level gateway through which initial tokenization takes place, it’s hard not to consider on a more general … Continue reading

Posted in Learning Bit by Bit, Spring 2011 | Leave a comment

Learning Bit by Bit – ELIZA

An ELIZA-style chatbot using Regular Expressions. Though I struggled with RegEx, I managed to put something together that is almost functional. I hope to grasp the potential power of using RegEx in different ways. package class2.regexes; import java.io.BufferedReader; … Continue reading

Posted in Learning Bit by Bit, Spring 2011 | Leave a comment