ITP Camp 2017

A Reasonable Introduction to Natural Language Processing & the Vectorized Word

Session Leaders: Allison P.


Tags: #programming #digital humanities #text #python #language #nlproc


Created By: Allison P.

Nothing is more essentially human than linguistic communication. But when programmers, data scientists, and computational linguists work with language, the abstractions they work with sometimes don't line up with your grade-school understanding of spelling and grammar. In this workshop, we'll investigate the state of the art of natural language processing, including:

  • a whirlwind tour of spaCy for parsing English into syntactic constituents;
  • a discussion of techniques for classifying and summarizing documents;
  • and an explanation and demonstration of "word vectors" (like Google's word2vec), an innovative language technology that allows computers to process written language less as discrete units and more like a continuous signal.
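
To give a rough feel for what "more like a continuous signal" means, here is a minimal sketch in plain Python. The three-dimensional vectors are invented for illustration (real word2vec vectors are learned from large amounts of text and typically have hundreds of dimensions), and the names TOY_VECTORS, cosine_similarity, and most_similar are hypothetical helpers, not part of word2vec or spaCy. The idea is only that once words are lists of numbers, "how similar are these two words?" becomes a matter of arithmetic.

```python
import math

# Toy vectors, made up for this example only. Real word vectors are
# learned from text; these numbers just place "cat" near "kitten".
TOY_VECTORS = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "car": [0.1, 0.2, 0.95],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors=TOY_VECTORS):
    """Return the other word whose vector points most nearly the same way."""
    target = vectors[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return max(scored, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    print(most_similar("cat"))
    print(round(cosine_similarity(TOY_VECTORS["cat"], TOY_VECTORS["car"]), 3))
```

With these toy numbers, "cat" lands closer to "kitten" than to "car", which is the kind of graded comparison that treating words as discrete, unrelated symbols cannot express.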

Workshop participants will develop a number of very small (and probably poorly conceived but nevertheless hilarious) projects in text analysis and poetics using a public domain text of their choice. By becoming familiar with contemporary techniques for computational language analysis, critics and researchers will be able to reason better about language-based media on the Internet. Artists and writers, meanwhile, might just learn a few new techniques to add to their creative palette.

No previous programming experience is required. But if you've never worked with Python before, please show up 20-30 minutes early so we can install the necessary software on your laptop and you can follow along.

