<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>ITPindia &#187; Mainstreaming Information</title>
	<atom:link href="http://itp.nyu.edu/~ia303/thunk/category/mainstreaming-information/feed/" rel="self" type="application/rss+xml" />
	<link>http://itp.nyu.edu/~ia303/thunk</link>
	<description>India’s ITP blog</description>
	<lastBuildDate>Fri, 10 Apr 2009 06:20:15 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.5</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Updated bookalator slides</title>
		<link>http://itp.nyu.edu/~ia303/thunk/2009/04/07/updated-bookalator-slides/</link>
		<comments>http://itp.nyu.edu/~ia303/thunk/2009/04/07/updated-bookalator-slides/#comments</comments>
		<pubDate>Tue, 07 Apr 2009 12:59:23 +0000</pubDate>
		<dc:creator>India</dc:creator>
				<category><![CDATA[A2Z]]></category>
		<category><![CDATA[Mainstreaming Information]]></category>
		<category><![CDATA[midterm]]></category>

		<guid isPermaLink="false">http://itp.nyu.edu/~ia303/thunk/?p=578</guid>
		<description><![CDATA[
(Click to download PDF, 236 KB)
Pursuant to a discussion with Christian, I made some changes to the project as described last month.
]]></description>
			<content:encoded><![CDATA[<p><a href='http://itp.nyu.edu/~ia303/thunk/wp-content/uploads/20090406-slides.pdf'><img src="http://itp.nyu.edu/~ia303/thunk/wp-content/uploads/mi_midterm-400x299.png" alt="Mainstreaming Information midterm slide" title="Mainstreaming Information midterm slide" width="400" height="299" class="alignnone size-medium wp-image-579" /></a><br />
(Click to download PDF, 236 KB)</p>
<p>Pursuant to a discussion with Christian, I made some changes to the project <a href="http://itp.nyu.edu/~ia303/thunk/2009/03/10/a2z-midterm-vocabu-lame/">as described last month</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://itp.nyu.edu/~ia303/thunk/2009/04/07/updated-bookalator-slides/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Comparalator</title>
		<link>http://itp.nyu.edu/~ia303/thunk/2009/03/24/comparalator/</link>
		<comments>http://itp.nyu.edu/~ia303/thunk/2009/03/24/comparalator/#comments</comments>
		<pubDate>Tue, 24 Mar 2009 10:48:43 +0000</pubDate>
		<dc:creator>India</dc:creator>
				<category><![CDATA[A2Z]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Mainstreaming Information]]></category>
		<category><![CDATA[final project]]></category>
		<category><![CDATA[homework]]></category>
		<category><![CDATA[midterm]]></category>

		<guid isPermaLink="false">http://itp.nyu.edu/~ia303/thunk/?p=560</guid>
		<description><![CDATA[
As you may recall, for my midterm project, I got stumped on several seemingly simple tasks. One of those&#8212;the most important, since upon it depends my semester-long assignment for Mainstreaming Information&#8212;was figuring out a way to compare one list of words to another and pull out the words that were unique to one of those [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.flickr.com/photos/nypl/3109979241/"><img src="http://itp.nyu.edu/~ia303/thunk/wp-content/uploads/date_merchant.jpg" alt="date merchant" title="date merchant" width="383" height="348" class="alignnone size-full wp-image-562" /></a></p>
<p>As you may recall, for my midterm project, I got <a href="/2009/03/10/a2z-midterm-vocabu-lame/">stumped</a> on several seemingly simple tasks. One of those&#8212;the most important, since upon it depends my semester-long assignment for Mainstreaming Information&#8212;was figuring out a way to compare one list of words to another and pull out the words that were unique to one of those lists. In my head, I can see very easily how this would be done. Given my special way of haphazardly flailing through code, however, I just couldn&#8217;t get it to work.</p>
<p>Until today!</p>
<p>In fiddling with the <a href="http://www.decontextualize.com/teaching/a2z/bayesed-and-confused/">Bayesian comparison code</a> for this week&#8217;s homework, I finally pulled out a list of unique words. Of course, this is a completely perverse misuse of that code&#8212;like using a steamroller to kill a pillbug&#8212;but as long as it works, I don&#8217;t fucking care.</p>
<p>So, here&#8217;s what I did. In BayesClassifier.java, I replaced the last two <code>for</code> loops with the following:</p>
<pre class="brush: java">for (String word: uniqueWords)
    {
      for (BayesCategory bcat: categories)
      {
        double wordProb = bcat.relevance(word, categories);
        if (wordProb &lt; 1)
        {
          println(word);
        }
      } // end for bcat
    } // end for word

    for (BayesCategory bcat: categories)
    {
      double score = bcat.score(uniqueWords, categoryWordTotal);
      println(&quot;---The following words were not found in &quot; + bcat.getName());
    } // end for bcat</pre>
<p>And in BayesCategory.java I replaced the percentage and relevance blocks with</p>
<pre class="brush: java"> public double percentage(String word)
  {
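    if (count.containsKey(word))
    {
      return count.get(word); // stored count for this word (no longer a percentage)
    } // end if
    else
    {
      return 0.001; // any value below 1 flags a word this category never saw
    } // end else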
  } // end percentage

  public double relevance(String word, ArrayList&lt;BayesCategory&gt; categories)
  {
    double percentageSum = 0;
    for (BayesCategory bcat: categories)
    {
      percentageSum += bcat.percentage(word);
    } // end for bcat
    return percentage(word);
  } // end relevance</pre>
<p>So now, if I run the command</p>
<blockquote><p><code>$ java BayesClassifier A2_unique.txt &lt; B1_unique.txt | sort &gt;results.txt</code></p></blockquote>
<p>I get a list of words that are in B1_unique.txt (<cite><a rel="nofollow" href="http://www.amazon.com/gp/product/0765313383?ie=UTF8&amp;tag=indink-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0765313383indink-20" >The Masada Scroll</a></cite> by Paul Block and Robert Vaughan, 2007) but not in A2_unique.txt (<cite><a href="http://books.google.com/books?id=ko0NAAAAYAAJ">Zuleika Dobson or, An Oxford Love Story</a></cite> by Max Beerbohm, 1911). For example,</p>
<blockquote><p>Akbar, Allah, Allahu, Apostolic, Ariminum, Arkadiane, Asmodeus, Astaroth, Barabbas, Beelzebub, Bellarmino, Blavatsky, Brandeis, Breviary, Byzantine, Caiaphas, Calpurnius, Catacombs, Charlemagne, Clambering, DNA, Diavolo, Franciscan, Freemasons, GPS, Gymnasium, Haddad, Hades, IDs, IRA, Jettisoning, Kathleen, Lefkovitz, MD, MRI, Masada, Masonic, Muhammad, Muhammadan, Nazarene, Nazareth, Olympics, Orthodoxy, Palatine, Palazzi, Palestine, Palestinian, Palestinians, Petrovna, Pleasant, Plenty, Plunge, Pocketing, Pontiff, Pontifical, Pontius, Praetorian, Prissy, Professors, Protestants, Rasulullaah, Ratsach, Revving, Rosicrucians, Satan, Scrolls, Seder, Shakespeare, Syracuse, Tacitus, Theosophical, Torah, Trastevere, Turkish, USB, Uzi, VAIO, VCR, Yeah, Yechida, Yeetgadal, Yiddish, adrenalin, agita, airliner, airport, ankh, awesome, bitch, bomb, bookstores, braked, breastplate, briefcase, broadsword, broiler, brotherhood, bulrushes, cellular, checkpoint, chuckling, chutzpah, combatant, computer, dashboard, database, departmental, desktop, divorce, dysentery, electricity, enabling, entrepreneurs, firearms, firestorm, fishtailed, flagon, forensics, goatskin, groggily, gunfire, gunman, gunshots, handbag, handball, handbrake, handgun, helicopter, helmets, highwaymen, hijinks, homeland, homeless, homespun, hometown, innkeeper, internship, journalist, kebob, kidnappers, kilometers, lab, laptop, lyre, mawkish, monitor, muezzin, nickname, nightfall, nonbeliever, northeaster, notebook, notepad, notepaper, numerology, paganism, password, pastries, phone, photo, photocopies, photocopy, photograph, photos, pig, pigeons, pistol, playback, police, quintessentially, recycles, redialed, roadblock, roadway, sandwich, screensaver, site, sites, submachine, superheating, synagogue, taped, taxi, terrorism, terrorist, terrorists, thousandfold, thrashing, toga, tortured, trigonometry, universe, unto, vegetables, vehicles, video, videotape, vinegar, violence, warehouses, 
waterfall, welfare, wholeheartedly, whoosh, whore, windshield, worker, workstation, worldwide, yardstick, yarmulkes, yeetkadash, zooming</p></blockquote>
<p>And if I run the comparison in the opposite direction, I come up with words such as</p>
<blockquote><p>Abernethy, Abiding, Abimelech, Abyssinian, Academically, Academy, Accidents, Achillem, Adam, Adieu, Admirably, Age, Agency, Agents, Alas, Albert, Alighting, America, Atlantic, Australia, Balliol, Baron, Baronet, Britannia, Broadway, Brobdingnagian, Colonials, Cossacks, Crimea, Devon, Dewlap, Duchess, Duke, Dukedom, Earl, Edwardian, Egyptians, Elizabethan, Englishmen, Englishwoman, Europe, Holbein, Ireland, Iscariot, Isis, Japanese, Kaiser, Liberals, London, Madrid, Meistersinger, Messrs, Monsieur, Napoleon, Novalis, Papist, Parnassus, President, Prince, Professor, Prussians, Romanoff, Segregate, Slavery, Socrates, Switzerland, Tzar, Victoria, Wagnerian, Waterloo, Whithersoever, Zeus, absinthes, acolyte, adventures, affrights, affront, afire, afoot, aforesaid, aggravated, album, analogy, anarchy, ankle, ape, aright, aristocracy, ataraxy, automatically, avalanche, avow, balustrade, bandboxes, bank, beastliest, beau, beauteous, billiards, biography, bodyguard, bosky, boyish, broadcast, bruited, bulldog, businesslike, bustle, calorific, casuistry, catkins, chaperons, chidden, cigarettes, clergyman, cloven, comet, compeers, coquetry, cricket, crinolines, custard, dandiacal, dapperest, decanter, devil, dialogue, diet, dipsomaniacal, disemboldened, disinfatuate, drunken, ebullitions, equipage, exigent, eyelashes, eyelids, farthingales, female, femininity, fishwife, fob, forefather, forerunners, freemasonry, furbelows, gallimaufry, goodlier, gooseberry, gorgeous, gypsy, haberdasher, halfpence, handicapped, handicraft, handiwork, handwriting, hearthrug, helpless, hip, hireling, honeymoon, housemaid, housework, hoyden, hussy, idiotic, impertinent, impudence, inasmuch, incognisant, insipid, insolence, insouciance, item, keyboard, landau, legerdemain, loathsome, luck, maid, maidens, manhood, manumission, matador, maunderers, model, mushroom, nasty, newspaper, noodle, nosegay, novel, oarsmen, omnisubjugant, ostler, otiose, parasol, pinafore, poetry, poltroonery, 
postprandially, prank, prestidigitators, propinquity, queer, romance, sackcloth, salad, sardonic, saucy, schoolmaster, seraglio, sex, skimpy, skirt, snuff, socialistic, streetsters, surcease, surcoat, swooned, teens, telegram, telegraphs, thistledown, thither, thou, threepenny, tomboyish, toys, tradesmen, treacle, ugly, uncouthly, unvexed, vassalage, waylay, welter, wigwam, witchery, withal, woe, woebegone, womanly, womenfolk, wonderfully, wonderingly, wretchedness, wrought, yacht, yesternight, zounds</p></blockquote>
<p>Exciting!</p>
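<p>For comparison, the same "in one list but not the other" result can be sketched without the Bayesian machinery, as a plain set difference. This is a minimal sketch and not the code used above: the class name <code>WordListDiff</code>, the method <code>onlyIn</code>, and the sample words are made up for illustration, and it assumes each list has already been read into a collection with one word per entry.</p>

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of the a2z course code.
class WordListDiff {
    // Returns the words in `candidates` that never appear in `reference`,
    // sorted the same way `sort` would order plain ASCII words.
    public static Set<String> onlyIn(Collection<String> candidates, Collection<String> reference) {
        Set<String> result = new TreeSet<>(candidates); // copy, dedupe, and sort
        result.removeAll(reference);                    // drop anything the other list has
        return result;
    }

    public static void main(String[] args) {
        // Tiny stand-ins for the two word lists (illustrative data only).
        List<String> older = Arrays.asList("duke", "parasol", "taxi");
        List<String> newer = Arrays.asList("laptop", "taxi", "helicopter");

        System.out.println(onlyIn(newer, older)); // [helicopter, laptop]
        System.out.println(onlyIn(older, newer)); // [duke, parasol]
    }
}
```

<p>Swapping the two arguments runs the comparison in the opposite direction, which mirrors swapping the two file names in the command above. The comparison is case-sensitive, so <code>Taxi</code> and <code>taxi</code> would count as different words.</p>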
]]></content:encoded>
			<wfw:commentRss>http://itp.nyu.edu/~ia303/thunk/2009/03/24/comparalator/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A2Z midterm: Vocabu-lame</title>
		<link>http://itp.nyu.edu/~ia303/thunk/2009/03/10/a2z-midterm-vocabu-lame/</link>
		<comments>http://itp.nyu.edu/~ia303/thunk/2009/03/10/a2z-midterm-vocabu-lame/#comments</comments>
		<pubDate>Tue, 10 Mar 2009 11:30:44 +0000</pubDate>
		<dc:creator>India</dc:creator>
				<category><![CDATA[A2Z]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Mainstreaming Information]]></category>
		<category><![CDATA[midterm]]></category>
		<category><![CDATA[sketch]]></category>

		<guid isPermaLink="false">http://itp.nyu.edu/~ia303/thunk/?p=551</guid>
		<description><![CDATA[
Apparently, I have learned absolutely nothing all semester, because what seemed like a very straightforward project proved to be completely beyond my abilities.
The overarching goal is to generate data for the visualization I&#8217;m making for Lisa Strausfeld and Christian Marc Schmidt&#8217;s Mainstreaming Information class. Click the image above to see some slides (PDF, 128 KB) [...]]]></description>
			<content:encoded><![CDATA[<p><a href='http://itpindia.files.wordpress.com/2009/03/slides.pdf'><img src="http://itpindia.wordpress.com/files/2009/03/vocabulap_7.png" alt="vocabulap, slide 7" title="vocabulap, slide 7" width="450" height="338" class="alignnone size-full wp-image-31" /></a></p>
<p>Apparently, I have learned absolutely nothing all semester, because what seemed like a very straightforward project proved to be completely beyond my abilities.</p>
<p>The overarching goal is to generate data for the visualization I&#8217;m making for Lisa Strausfeld and Christian Marc Schmidt&#8217;s <a href="http://www.christianmarcschmidt.com/NYU2009/index.html">Mainstreaming Information</a> class. Click the image above to see some slides (PDF, 128 KB) explaining the gist of the project, provisionally called Vocabulap (<em>vocabulary</em> + <em>overlap</em>; not a handsome coinage). My specific goals for the A2Z midterm were as follows (with subsequent comments in all caps):</p>
<blockquote><p>For A2Z midterm<br />
===============<br />
Prep<br />
----<br />
* Remove all blank lines<br />
    DONE<br />
* Remove all extra spaces<br />
    DONE<br />
* Break all lines - DONE<br />
* Rename all to number consecutively: A01, A02, . . . A10 (for old books); B01, B02, . . . B10 (for new books)<br />
    DONE</p>
<p>Compare major sets<br />
------------------<br />
* Extract the text from between the body tags in each file. Dump it out as a new file with the extension body.txt in the folder ../body.<br />
    THIS IS HARDER THAN IT LOOKS (FOR ME, AT LEAST). EASIER TO JUST CUT THEM OFF BY HAND.<br />
* Concatenate all the files in each set.<br />
    DID THIS FROM THE COMMAND LINE, USING CAT<br />
* Make a list of unique words in each concatenated set, with the number of times the word appears.<br />
    CAN GET THE UNIQUE WORDS, BUT NOT THE COUNT.<br />
* Strip out all words beginning with numerals.<br />
    DONE BY HAND<br />
* Create the following lists:<br />
    - Words shared by both major sets, with frequency counts<br />
    - Words unique to set 1, with frequency counts<br />
    - words unique to set 2, with frequency counts<br />
    I APPARENTLY CANNOT DO ANY OF THIS.</p>
<p>Find unique words in each book<br />
------------------------------<br />
For each book:<br />
* Concatenate all the files in that major set *except* the file for that book.<br />
* Make a list of the unique words, with frequency counts, in<br />
    - the current book<br />
    - the set of all books except the current one<br />
* Make three lists:<br />
    - Words shared by all books in the major set, with frequency counts<br />
    - Words that appear only in the current book, with frequency counts<br />
    - Words that appear only outside the current book, with frequency counts</p>
<p>Return lines surrounding specific words<br />
---------------------------------------<br />
For each word in a given list:<br />
* Get the line numbers on which it appears.<br />
For each appearance,<br />
* Print the line above<br />
* Print the line with the word, replacing it with itself wrapped in span tags to apply color<br />
* Print the line below</p></blockquote>
<p>The most essential piece of code that I could not get working is the comparison doodad. It almost worked for, like, five seconds, but it was generating a huge file of every unique word times however many words were in the document, or something like that. When I tried to fix it, it completely stopped working. The offending code is as follows:</p>
<pre class="brush: java">/*  1. Takes in a file name from the command line.
    2. Makes a string array out of the hard-coded comparison file.
    3. Imports the contents of the file whose name was passed in.
    4. For each line of the input file (i.e., each word), changes it to
       lowercase and checks to see if it&#039;s contained in the comparison file.
    5. If it&#039;s not in the comparison file, checks to see if it&#039;s in a hashset of
       unique words.
    6. If the word&#039;s not in the hashset, add it.
    7. Print the contents of the hashset.
*/

import java.util.ArrayList;
import java.util.HashSet;
import com.decontextualize.a2z.TextFilter;

public class CompareUnique extends TextFilter
{
    public static void main(String[] args)
    {
        new CompareUnique().run();
    } // end main

    private String filename = &quot;body/unique/allB_uci.txt&quot;;
    private HashSet&lt;String&gt; uniqueWords = new HashSet&lt;String&gt;();
    private HashSet&lt;String&gt; lowercaseWords = new HashSet&lt;String&gt;();

    // make a String array out of the contents of the comparison file
    String[] checkAgainst = new TextFilter().collectLines(fromFile(filename));

  public void eachLine(String word)
  {
    String wordLower = word.toLowerCase();
    for (int i = 0; i &lt; checkAgainst.length; i++)
    {
        if (checkAgainst[i] != null &amp;&amp; checkAgainst[i].contains(wordLower))
        {} // end if
        else if (checkAgainst != null)
        {
            if (lowercaseWords != null &amp;&amp; lowercaseWords.contains(wordLower))
            { } // end if
            else if (lowercaseWords != null)
            {
                uniqueWords.add(wordLower);
                lowercaseWords.add(wordLower);
            } // end else
        } // end else
    } // end for
  } // end eachLine

  public void end()
  {
    for (String reallyunique: uniqueWords) {
      println(reallyunique);
    } // end for
  } // end end

} // end class</pre>
<p><em>I know, it seems very simple, but you have no idea how long it took me to get this far.</em></p>
<p>So, basically, for the midterm I&#8217;ve got bupkis—just a big pile of text files, and a list of unique words for each.</p>
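<p>The three lists in the plan above (words shared by both sets, words unique to set 1, and words unique to set 2, each with frequency counts) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a fix of the code above: the class <code>VocabOverlap</code>, the methods <code>countWords</code> and <code>select</code>, and the sample words are all hypothetical, and it assumes each text has already been split into one word per entry.</p>

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names and data are illustrative only.
class VocabOverlap {
    // Counts how many times each word occurs, lowercased, keyed in sorted order.
    public static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            counts.merge(word.toLowerCase(Locale.ROOT), 1, Integer::sum);
        }
        return counts;
    }

    // Keeps the entries of `mine` whose word is in `other` (shared == true)
    // or is missing from `other` (shared == false). Counts come from `mine`.
    public static Map<String, Integer> select(Map<String, Integer> mine,
                                              Map<String, Integer> other,
                                              boolean shared) {
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : mine.entrySet()) {
            if (other.containsKey(entry.getKey()) == shared) {
                out.put(entry.getKey(), entry.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> set1 = countWords(Arrays.asList("Duke", "duke", "taxi", "parasol"));
        Map<String, Integer> set2 = countWords(Arrays.asList("taxi", "laptop", "Taxi"));

        System.out.println("shared, counts from set 1: " + select(set1, set2, true));
        System.out.println("only in set 1: " + select(set1, set2, false));
        System.out.println("only in set 2: " + select(set2, set1, false));
    }
}
```

<p>Each word is looked up by key in the other set's map, so nothing gets added once per line of the comparison file. The same two methods would also cover the per-book step, with one book as set 1 and the concatenation of all the other books as set 2.</p>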
]]></content:encoded>
			<wfw:commentRss>http://itp.nyu.edu/~ia303/thunk/2009/03/10/a2z-midterm-vocabu-lame/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Lies, damn lies, and statistics</title>
		<link>http://itp.nyu.edu/~ia303/thunk/2009/02/09/lies-damn-lies-and-statistics/</link>
		<comments>http://itp.nyu.edu/~ia303/thunk/2009/02/09/lies-damn-lies-and-statistics/#comments</comments>
		<pubDate>Tue, 10 Feb 2009 01:53:25 +0000</pubDate>
		<dc:creator>India</dc:creator>
				<category><![CDATA[A2Z]]></category>
		<category><![CDATA[Mainstreaming Information]]></category>
		<category><![CDATA[homework]]></category>

		<guid isPermaLink="false">http://itp.nyu.edu/~ia303/thunk/?p=479</guid>
		<description><![CDATA[
Today we presented ideas for our semester-long projects in Mainstreaming Information. The assignment, which apparently I was not the only person to be confused by, is over at Christian&#8217;s site (PDF, 36 KB).

Last week we had to bring in some &#8220;jaw-dropping statistics&#8221; to start considering working with, and because I&#8217;ve decided that I&#8217;ll get more [...]]]></description>
			<content:encoded><![CDATA[<p><a href='http://itp.nyu.edu/~ia303/thunk/wp-content/uploads/poster.pdf'><img style="border:1pt solid gray;" src="http://itp.nyu.edu/~ia303/thunk/wp-content/uploads/poster-400x600.png" alt="Mainstreaming Information project proposal poster" title="Mainstreaming Information project proposal poster" width="200" height="300" class="alignnone size-medium wp-image-480" /></a></p>
<p>Today we presented ideas for our semester-long projects in <a href="http://www.christianmarcschmidt.com/NYU2009/docu.html">Mainstreaming Information</a>. The assignment, which apparently I was not the only person to be confused by, is over at <a href="http://www.christianmarcschmidt.com/NYU2009/components/090202_semester_project.pdf">Christian&#8217;s site (PDF, 36 KB)</a>.<br />
<span id="more-479"></span><br />
Last week we had to bring in some &#8220;jaw-dropping statistics&#8221; to start considering working with, and because I&#8217;ve decided that I&#8217;ll get more out of the rest of my time at ITP if I keep my schoolwork linked to&#8212;duh&#8212;stuff I&#8217;m actually interested in, I selected a couple of tidbits from Dan Poynter&#8217;s mass of <a href="http://BookStatistics.com/">book industry statistics</a>—</p>
<blockquote><p>1993–2003: The number of titles published increased 58% while fiction readers declines 14%,<br />
—Malcolm Jones in <cite>Newsweek</cite>. Sources: NEA and RR Bowker.</p>
<p>2004. 56.6% of adult Americans said they read at least one book, fiction or non-fiction, between August 2001 and August 2002 compared to 60.9% ten years prior.</p>
<p>2002. 57% of the US population read a book. See report.<br />
<a href="http://www.nea.gov/pub/readingatrisk.pdf">http://www.nea.gov/pub/readingatrisk.pdf</a></p>
<p>Most readers do not get past page 18 in a book they have purchased.</p></blockquote>
<p>—and John Kremer&#8217;s <a href="http://bookmarket.com/statistics6.htm">Recent Statistics Related to Book Publishing and Marketing</a>—</p>
<blockquote><p>In a survey of 4,000 adults in the United Kingdom, 55% said “they buy books for decoration, and have no intention of actually reading them.” (Teletext) This is another important reason why your books should be well-designed. They should look good on a buyer&#8217;s coffee table, bookshelf, bedside stand, etc.</p></blockquote>
<p>These served the purpose at hand, but they&#8217;re all just isolated data points. So over the weekend I spent several hours digging around for more information, but for none of these could I find enough reliable numbers to support a semester-long project. I was also looking for any compelling information about e-book sales versus print or audio books, and this morning I spent a while rummaging around on <a href="http://www.teleread.org/">TeleRead</a>. They had all sorts of statistics, none of which quite fit my needs, though they did give me a few more ideas about stuff I&#8217;d <em>like</em> to have statistics about. So around 11 a.m., with 3.5 hours left until class, I <a href="http://twitter.com/indiamos/statuses/1192187395">lazytweeted it</a>, as a last resort. And I immediately got a bunch of responses from my nice, nice friends! Erin pointed me to the completely bitchen <a href="http://labs.timesonline.co.uk/bookscraper/">Book Scraper</a>, from the London <cite>Times</cite>&#8217;s R&#038;D labs, and reminded me that the <cite>New York Times</cite> has an <a href="http://developer.nytimes.com/docs"><acronym title="Application programming interface">API</acronym> for its best-seller lists</a>.</p>
<p>In the end, I decided I&#8217;d better scale down from the macro to the micro view, so that I could use data I might actually <em>get</em>: vocabulary statistics scraped (using my mad new <a href="/category/a2z/">Programming from A to Z</a> skillz) from <a href="http://www.gutenberg.org/">Project Gutenberg</a> e-books, compared with those from recent <em>Times</em> best sellers. And then I went and found a Jaw-Dropping Statistic (which, not coincidentally, <a href="http://www.straightdope.com/columns/read/2724/does-the-average-american-student-have-less-vocabulary-today-than-in-days-gone-by">is bullshit</a>; favorite line in the Straight Dope article: &#8220;At times it&#8217;s been attributed to Gallup polls or even entomologists.&#8221;) that went with the data I was planning to gather. Kind of bass-ackwards, but the result is the poster-style project proposal above, which was deemed Not Entirely Stupid during the classroom critique, despite its having been printed way too large, in fifteen 8.5 &times; 11-inch tiles, and glue-sticked-together in class using the second-worst glue stick in the universe (the worst being the one I had brought from home, which, it turned out, had dried&nbsp;up).</p>
<p>Now, of course, I&#8217;m not even sure I can get files of contemporary best sellers to scrape, because of stupid !@%# <acronym title="digital rights management">DRM</acronym>, so I&#8217;m kind of hoping that the Data Fairy will come to my aid. But my project is at least <em>theoretically</em> possible. Developing&nbsp;.&nbsp;.&nbsp;.</p>
<p><em>Bonus: Find the typo in the poster!</em></p>
]]></content:encoded>
			<wfw:commentRss>http://itp.nyu.edu/~ia303/thunk/2009/02/09/lies-damn-lies-and-statistics/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
