
October 29, 2007

NIME Progress Report #2: Reading and discussion (still no sound)

This week was spent deep in research, both in the function and form of interactive fiction. I'm knee-deep in source code and documentation of Inform interpreters (and the Inform 6 compiler), of course, but I also took some time to familiarize myself with some interactive fiction theory—and to actually play some games (consider this a form of rehearsal). There's still no sound coming out of my ChucK patch, but I'm getting close.

Theory

My main bit of breakfast reading is The Z-Machine Standards Document, which defines how implementations of the Z-Machine should behave (this information was reverse engineered from vintage-era Infocom story files and interpreters, and then later used as the basis of Inform). The opcode table is especially elucidating. I think I have a better handle on what information is available to me as the story file is running in the virtual machine.

I've also been skimming the Inform 6 Designer's Manual, which is short on the kind of low-level technical details I need but contains no end of helpful information about what to expect from games designed for the Z-Machine (particularly those programmed in Inform)—and the practice of designing interactive fiction.

Nick Montfort's Twisty Little Passages has proven to be an invaluable source of theoretical insights, and has helped to fill in my knowledge of interactive fiction history. (Further notes on this forthcoming.)

Practice

At this point, I think the key to my instrument is going to be the Z-Machine's object tree. This is the component of the virtual machine most specific to the creation and interpretation of interactive fiction; its presence distinguishes the Z-Machine from its more generic brethren (e.g., the Java virtual machine).

Here's how it works: every Z-Machine game has a number of objects, each of which represents something tangible in the game: everything from items to locations to people to the player himself/herself. These objects are contained in a hierarchy which, by and large, corresponds to physical containment. For example, the child objects of a room are the items contained in that room; the child objects of the player are the items in the player's inventory; the child objects of a box are the items contained within it, etc. Player movement through the map is, in fact, usually represented in code as detaching the player object from one room object and attaching it to the next. These actions are all conveniently performed using Z-Machine opcodes, making them highly visible to my hacked version of Frotz.
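To make the detach-and-attach idea concrete, here is a minimal Python sketch of a tree in which each object keeps parent, sibling, and child links, in the spirit of the Z-Machine's object table. It is an illustration only, not code from Frotz: the class, the method names (borrowed from the insert_obj and remove_obj opcodes), and the object numbers are all assumptions.

```python
# Sketch of a Z-Machine-style object tree. Each object stores three links
# (parent, sibling, child); 0 means "no object".

class ObjectTree:
    def __init__(self, count):
        # Objects are numbered from 1; index 0 is unused.
        self.parent = [0] * (count + 1)
        self.sibling = [0] * (count + 1)
        self.child = [0] * (count + 1)

    def remove_obj(self, obj):
        """Detach obj from its parent; obj keeps its own children."""
        parent = self.parent[obj]
        if parent == 0:
            return
        if self.child[parent] == obj:
            # obj was the first child, so the parent now points at its sibling.
            self.child[parent] = self.sibling[obj]
        else:
            # Otherwise unlink obj from the middle of the sibling chain.
            prev = self.child[parent]
            while self.sibling[prev] != obj:
                prev = self.sibling[prev]
            self.sibling[prev] = self.sibling[obj]
        self.parent[obj] = 0
        self.sibling[obj] = 0

    def insert_obj(self, obj, dest):
        """Make obj the first child of dest, detaching it from its old parent."""
        self.remove_obj(obj)
        self.sibling[obj] = self.child[dest]
        self.child[dest] = obj
        self.parent[obj] = dest

    def children(self, obj):
        """List the direct children of obj by following the sibling chain."""
        result = []
        node = self.child[obj]
        while node:
            result.append(node)
            node = self.sibling[node]
        return result


# Hypothetical object numbers, for illustration only.
KITCHEN, HALLWAY, PLAYER, LAMP = 1, 2, 3, 4

tree = ObjectTree(4)
tree.insert_obj(PLAYER, KITCHEN)   # the player starts in the kitchen
tree.insert_obj(LAMP, PLAYER)      # the lamp is in the player's inventory
tree.insert_obj(PLAYER, HALLWAY)   # walking = re-parenting the player object
```

Because the lamp is a child of the player, it travels along when the player is re-parented, which is why watching these two operations is enough to follow both location and inventory.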

This is a sufficiently generic method of getting my hands on the data that effectively makes up the "map" of the game; it's also probably enough information to automatically draw a graphical map of the game as it's traversed. But the fact that it's generic is also a drawback: for example, because the player object is just one among many, its identity can't be determined simply by looking at the object tree. (Heuristics could be used, such as looking for the object identified with the word "you," but this would fail in some cases.) If I want to focus the operation of my instrument on player movement, I'll either need some way to interactively define this, or else limit the scope of the instrument's functionality only to games specifically written for it. I'd rather not succumb to the latter course of action; I really like the idea of performing Zork.

October 25, 2007

Minim DriveBy

Examples

Minimal: the simplest possible Minim sketch.
Sequencer: uses the AudioSample class to build a simple sequencer.
Looper: uses the AudioPlayer class to loop a sample. Loop points are defined interactively.
Effect: creates a simple bit crunch effect using Minim's AudioEffect interface.
Synth: a simple oscillator that creates sound based on waveforms drawn to the applet with the mouse. Extends Minim's Oscillator class.

Sound samples used in these examples (put these in your sketch's data directory): kick, snare, hihat, bleep, snippet (from the Advantage's cover of the Bubble Bobble theme song).

Where to go next?

Minim home page
Minim Javadoc and examples (much more useful than the manual, at least right now)
JSyn is a more robust synthesis library for Java (for which Sonia provides a Processing-native interface)
oscP5 will let you send OSC from Processing to any OSC host (such as ChucK, SuperCollider, or, yes, even Max/MSP).

Algorithmic Composition: Final Project Idea

One of my interests is how we structure text in order to make it computer-legible. Probably the most pervasive way of structuring text on the Internet today is HTML, which has several interesting properties that make it relevant to music-making.

First, it's recursive: most elements can themselves contain other elements. In this sense, an HTML document resembles an L-System, and it's possible to draw a tree from the flattened data structure that looks very much like a classical L-System visualization. (See, for example, Websites as Graphs.)

In addition to being recursive, HTML is also repetitive: think of menu items in an unordered list, or table cells in a table. Most HTML documents contain small, repeating (though not necessarily identical) structures like this.
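Both properties show up quickly if you flatten a page into a list of (depth, tag) pairs. The sketch below does that with Python's standard html.parser module; the class name, the sample markup, and the short list of void tags are assumptions made for the example, and real-world pages with unclosed tags would need more care.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag, so they must not deepen the nesting.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class OutlineParser(HTMLParser):
    """Records (depth, tag) for every opening tag, in document order."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        self.outline.append((self.depth, tag))
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS:
            self.depth -= 1


# A made-up navigation menu: one container holding three similar items.
page = "<ul><li>Home</li><li>About</li><li><a href='#'>Contact</a></li></ul>"

parser = OutlineParser()
parser.feed(page)
for depth, tag in parser.outline:
    print("  " * depth + tag)
```

The depth values give the recursion (elements inside elements), and the run of three li entries at the same depth is the kind of small repeating structure described above.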

So here's my idea for a final project: a piece of software that sonifies HTML structure.

Specifically, I'd like to build a web browser plug-in/extension that generates sound from the web page that the user is currently looking at. This structure might be very simple or highly complex; it might change in the course of viewing the page (given the possibility of modifying the page in real-time with JavaScript). The idea is to provide another layer to the experience of browsing the web—an experience that HTML suggests, but is not (generally) planned for in the design of a web page. In this way, the piece will serve to expose and make more transparent the structure of the underlying data.

Challenges and extra credit after the jump.

The main technical challenge is simply to get some kind of sound-generating code inside of a browser extension. JavaScript doesn't natively support any kind of sound output, of course, and (from what I've read) getting any native code into a Firefox/Mozilla extension is pulling-teeth difficult. One possibility is using rtcmix within a Safari browser extension; Safari has a very clean interface for that kind of thing, but I'll have to learn Objective-C in order to use it.

Another challenge is figuring out how to map the data to sound. I'm pretty sure I'll follow a strategy like the one I used for Meditation #4, in which note values are stored in a stack and multiple layers of hierarchy are played against one another (I think a lot of methods used for sonifying L-Systems will be applicable to this project). As far as the aesthetics of the piece: I'm predisposed to droney, meditative stuff, but it might be fun to make something rhythmic and noisy. Maybe the piece can do both, or something in between, and this will be controlled by some aspect of the page being sonified.

Extra credit for myself: I'd like to make it a performable piece, in some sense. No one will want to watch me loading up web pages on stage (although I do think that would be kind of an awesome performance!), but maybe it could be made more interesting by (as suggested above) making changes to the structure of a page in real-time using JavaScript or a DOM browser.

Meditation #4

In this meditation, we were directed to sonify the string of an L-System that draws a space-filling curve. I chose the Sierpinski-Square L-System, as illustrated on p. 88 of The Computational Beauty of Nature:

Axiom: F-F-F-F
Rule: F=FF[-F-F-F]F

(I suppose that this isn't technically a space-filling curve, but I think it'll do for the purposes of the assignment.)

This Processing applet displays the curve and will also (if you download it and run it on your own computer) generate the score, according to the algorithm given below. (Here's the original csound file, including sample score and instrument definitions.)

A note is generated and time advances in the score every time an F is found in the string. The - character moves the current note up one step (using a pre-selected scale); [ and ] push and pop note values off a stack. The real trick of this piece is that all generations of the L-System are played simultaneously: the duration of each note is equal to the (predetermined) length of the song divided by the number of Fs in that generation's string. For the axiom/ruleset given above, this leads to the notes of generation 0 being six times as long as those of generation 1, which are in turn six times the length of generation 2, etc. This strategy leads to a sort of rhythmic play between generations, which I think does a good job of relating the fractal nature of the underlying data.
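The algorithm described above can be sketched in a few lines of Python. This is a reconstruction from the description, not the code in the Processing applet: the function names, the pentatonic scale, the choice to wrap around when the step runs past the end of the scale, and the 150-second song length are all assumptions.

```python
AXIOM = "F-F-F-F"
RULES = {"F": "FF[-F-F-F]F"}


def expand(string, rules):
    """Apply the rewriting rules once to every character of the string."""
    return "".join(rules.get(ch, ch) for ch in string)


def score_for(string, song_length, scale):
    """Turn one generation's string into (start, duration, pitch) notes."""
    duration = song_length / string.count("F")
    step = 0        # current position in the scale
    stack = []      # saved scale positions for [ and ]
    time = 0.0
    notes = []
    for ch in string:
        if ch == "F":
            # Assumption: wrap around when the step passes the top of the scale.
            notes.append((time, duration, scale[step % len(scale)]))
            time += duration
        elif ch == "-":
            step += 1
        elif ch == "[":
            stack.append(step)
        elif ch == "]":
            step = stack.pop()
    return notes


SCALE = [60, 62, 64, 67, 69]   # assumed: C major pentatonic as MIDI note numbers
SONG_LENGTH = 150.0            # assumed: seconds

# Build three generations and score each one over the same total length,
# so that all of them can be played simultaneously.
generations = [AXIOM]
for _ in range(2):
    generations.append(expand(generations[-1], RULES))
scores = [score_for(g, SONG_LENGTH, SCALE) for g in generations]
```

The rule replaces each F with six Fs, so the three generations contain 4, 24, and 144 notes, and each generation's notes are one sixth the length of the previous generation's.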

A thirty-second excerpt of the piece is embedded below, or you can download the whole thing (192kbps MP3, 2'30").


Meditation #3

For this meditation, I made a fairly simple sonification of global earthquake data (obtained from here). The score contains a note for every earthquake with a magnitude of five or higher in the past ten years, played in a compressed amount of time (about 200ms for every hour). The depth of the earthquake corresponds to the note's pitch (deeper depths equate to lower notes) and the magnitude corresponds to the number of overtones. I used csound's adsynt opcode, so the whole thing basically sounds like a mess of Tibetan singing bowls.

Here's the csound file and the python script I used to generate the score. Here's the data file I used (somewhat massaged from the output of the original USGS search).
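For readers who just want the shape of the mapping, here is a separate minimal sketch (not the script linked above). Only the direction of each mapping comes from the description: deeper means lower, stronger means more overtones, and time is compressed. The function name, the depth and frequency ranges, and the overtone formula are assumptions picked for illustration.

```python
def quake_to_note(hours, depth_km, magnitude,
                  seconds_per_hour=0.2, max_depth_km=700.0,
                  low_hz=110.0, high_hz=880.0):
    """Map one earthquake to (start_seconds, frequency_hz, partial_count)."""
    # Time compression: each hour of real time becomes a fraction of a second.
    start = hours * seconds_per_hour

    # Deeper quakes get lower pitches. Depth is clamped to the assumed range,
    # then interpolated on a logarithmic frequency scale.
    clamped = min(max(depth_km, 0.0), max_depth_km)
    shallowness = 1.0 - clamped / max_depth_km
    freq = low_hz * (high_hz / low_hz) ** shallowness

    # Stronger quakes get more overtones (assumed: four extra per magnitude unit).
    partials = max(1, 1 + int((magnitude - 5.0) * 4))
    return start, freq, partials


# A hypothetical event: 10 hours into the data set, 35 km deep, magnitude 6.5.
start, freq, partials = quake_to_note(10, 35.0, 6.5)
```

Each returned tuple would become one line of the score, with the partial count passed on to whatever additive instrument plays the note.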

An excerpt from the piece is included below. You can download the entire piece (192kbps MP3, 2'53") here.


October 21, 2007

NIME Progress Report #1: The Frotzophone

The screenshot above doesn't look all that exciting, but it's a big breakthrough: It shows my hacked version of Frotz sending Z-Code opcodes to ChucK using OSC. (The game being played is Adam Cadre's Photopia.)

Here's a breakdown of what's going on, for those of you whom the description above completely confounds (probably pretty much everybody). Frotz is a Z-Code interpreter; Z-Code is a variety of bytecode designed to run on a Z-Machine, which is a virtual machine (along the lines of the Java virtual machine) originally developed by Infocom to ease cross-platform deployment of their interactive fiction. (The format has remained popular with creators of contemporary interactive fiction thanks to Inform, which compiles to Z-Code.) A Z-Machine implementation, like Frotz, looks and works a lot like a microprocessor emulator: it interprets opcodes that affect the state of the virtual machine in some way. My instrument will work by hooking into the Z-Code interpreter and sending OSC messages to ChucK (or some other OSC server that makes noise) based on the current state of the game.
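For a sense of what actually goes over the wire, here is a small Python sketch that builds an OSC message by hand, following the OSC 1.0 layout: a null-terminated address padded to a multiple of four bytes, a type tag string padded the same way, then big-endian 32-bit integers. It is not the code inside the hacked Frotz; the address /frotz/opcode and the three argument values are made up for the example.

```python
import struct


def osc_string(text):
    """Encode an OSC string: ASCII, null-terminated, padded to 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    padding = (-len(data)) % 4
    return data + b"\x00" * padding


def osc_message(address, *ints):
    """Build an OSC message whose arguments are all 32-bit integers."""
    type_tags = "," + "i" * len(ints)
    payload = b"".join(struct.pack(">i", value) for value in ints)
    return osc_string(address) + osc_string(type_tags) + payload


# Hypothetical message: an opcode number followed by two operands.
packet = osc_message("/frotz/opcode", 14, 3, 2)

# To send it, write the packet to a UDP socket aimed at the OSC listener, e.g.
#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   sock.sendto(packet, (host, port))   # host and port depend on your setup
```

One message per interpreted opcode keeps the interpreter side simple and leaves every musical decision to the receiving patch.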

The basics of that process are in place, and I've at least settled on an architecture, which is a big step forward. I'm not making any sound yet, though. Over the next week, I'm going to be reading up some more on Z-Machine internals, and trying to pin down those aspects of the interpreter that will best allow me to realize my initial idea; hopefully, there's some way to get a consistent snapshot of the player's inventory and location in the simulated space.

One central conceptual problem that I've yet to solve: interactive fiction is inherently turn-based, in that the program responds only to a complete unit of user input (e.g., "get lamp"), and is essentially inert until further input is received. This means that my ChucK patch will be receiving a huge chunk of data every time I press "enter," and nothing in between. The challenge is how (whether?) to convert this data chunk into a data stream that produces sound in a manner both aesthetically pleasing and (relatively) transparent to the audience. My current idea is to use a combination of buffering and slow attack times (e.g., buffer x bytes, play a byte from the buffer every 100-500ms, the sound takes 5-10 seconds to reach full volume), but obviously the optimal solution will depend on which data I decide to capture and how I'm capturing it.
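The buffering half of that idea can be sketched without any audio code at all. The Python below is a hypothetical illustration (in practice this logic would probably live in the ChucK patch): the class and function names are invented, and the 250 ms interval and 5 second attack are just values picked from the ranges mentioned above.

```python
from collections import deque


class DripBuffer:
    """Turns bursts of bytes into evenly spaced (onset, value) events."""

    def __init__(self, interval=0.25):
        self.interval = interval    # seconds between successive bytes
        self.pending = deque()
        self.next_time = 0.0        # onset time of the next byte to play

    def push(self, now, chunk):
        """Queue a burst of bytes that arrived at time `now`."""
        self.pending.extend(chunk)
        # If the buffer had run dry, restart the clock at the arrival time.
        self.next_time = max(self.next_time, now)

    def pop_due(self, now):
        """Return the events whose onset time has been reached by `now`."""
        events = []
        while self.pending and self.next_time <= now:
            events.append((self.next_time, self.pending.popleft()))
            self.next_time += self.interval
        return events


def envelope(age, attack=5.0):
    """Linear fade-in: 0.0 at onset, 1.0 once `attack` seconds have passed."""
    return min(max(age / attack, 0.0), 1.0)


buf = DripBuffer(interval=0.25)
buf.push(0.0, b"\x01\x02\x03")     # one turn's worth of data arrives at once
first = buf.pop_due(0.3)           # only the bytes due so far come back out
later = buf.pop_due(1.0)
```

Polling pop_due from the audio loop spreads each turn's data over time, and the slow envelope means no single byte is heard as a sharp event.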

Oh yeah, and the working title is now the FROTZOPHONE—aside from being the name of the interpreter I'm hacking, frotz is also a hacker term for "an unspecified physical object, a widget" (as defined under the word "frobnitz" in the Hacker's Dictionary). So frotzophone is appropriate. Dorky, but appropriate.

October 18, 2007

Meditation #2

In Meditation #2, we were directed to make a musical piece that recreates a random "walking tour" of the Seven Bridges of Königsberg. In my piece, each island and bridge is treated as a separate node and each node is associated with a note. The notes are played in quick succession—around 20 per second—and have a long decay time. The idea was to create something that illustrates the structure of the graph, something architectural: a wash that contains many events, but appears to move slowly (or not at all).

This Python script generated the score and this csound file generates the output (includes some sample score data). You can play a sample run of the program (1200 steps, about 0:55) below (or download it here).
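For reference, here is an independent minimal sketch of the same idea (not the script linked above). It uses the standard layout of the puzzle, four land masses joined by seven bridges, and treats every land mass and every bridge as a node with its own note. The node names, the MIDI note assignment, and the fixed rate of 20 notes per second are assumptions.

```python
import random

# The seven bridges, each joining two of the four land masses.
# A is the central island, B and C are the river banks, D is the second island.
BRIDGES = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

NODES = ["A", "B", "C", "D"] + ["bridge%d" % i for i in range(len(BRIDGES))]

# Assumed pitch mapping: one MIDI note per node, rising from C3.
NOTE_FOR = {node: 48 + i for i, node in enumerate(NODES)}


def random_walk(steps, start="A", seed=None):
    """Return a list of nodes visited: land, bridge, land, bridge, ..."""
    rng = random.Random(seed)
    here = start
    path = [here]
    while len(path) < steps:
        # Pick any bridge that touches the current land mass and cross it.
        index = rng.choice([i for i, ends in enumerate(BRIDGES) if here in ends])
        a, b = BRIDGES[index]
        path.append("bridge%d" % index)
        here = b if here == a else a
        path.append(here)
    return path[:steps]


def score(path, notes_per_second=20.0):
    """Turn a walk into (start_time, midi_note) events at a fixed rate."""
    return [(i / notes_per_second, NOTE_FOR[node]) for i, node in enumerate(path)]


path = random_walk(1200, seed=1)
events = score(path)
```

With long decay times on each note, the individual steps blur together and what remains audible is which nodes the walk keeps returning to.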


October 10, 2007

Meditation #1

Procedure: Using Babelfish, translate a text from English into the language (other than English) with the largest number of speakers in the United States. Take the resulting translation, translate it back into English, and use this as the source text when repeating the process, this time using the language with the second largest number of speakers. Repeat until the text is mangled to your satisfaction. (If a language is missing from Babelfish, you can skip it.)

Suitable texts for this piece: any legislation or constitutional amendment that would make English the "official language" of the jurisdiction in question (municipal, state, federal). This is the text I used for my performance of the piece—an amendment to the United States Constitution, proposed during the 107th Congress:

The English language shall be the official language of the United States. As the official language, the English language shall be used for all public acts including every order, resolution, vote, or election, and for all records and judicial proceedings of the Government of the United States and the governments of the several States.

Here's the result:

Language of office S.U.A. It English. They entire danger of motions, is which it it does obtain by close one and that it benefits, language and the English witness of office are assumed? You will air order including/understanding differently, dissolution, voice or government S.U.A. of the document of the government of danger landscape architecture and also situation and a certain compensation everything certifyd it.

The translation chain: English -> Spanish -> Chinese -> French -> German -> Italian -> Korean -> Russian.

Here's a reading, performed by yours truly (download here).


October 03, 2007

Concept Presentation: Mapping mapping and playing playing

Here's my idea: A text adventure that makes noise. Why? Because I'm interested in maps. Let me explain.


From Somewhere Nearby is Colossal Cave

These two maps (pictured above) are both, after a manner of speaking, mapping the same thing: a cave in Kentucky—the Bedquilt region of Colossal Cave, to be precise. The top map is an overhead view, representing the spatial layout of the cave to scale. It's a map designed to help cavers navigate the space.

The bottom map is more abstract: it breaks the cave into specific rooms, and shows only the connections between them (not necessarily their physical layout). In fact, this map doesn't directly map Bedquilt Cave at all; it's actually an excerpt from a schematic representation of a game called Adventure, widely recognized as the first "text adventure" game (play online here). Adventure was originally designed by Will Crowther, himself an avid caver and co-creator of the first map of Bedquilt Cave. Although Adventure has significant game-like elements, at its heart it's essentially a Bedquilt Cave simulator: an intimate recreation of the cave, designed to evoke some of the wonder experienced in its exploration.

Adventure was one of the primary inspirations for Zork (Infocom, 1980), which popularized the text adventure genre. A screenshot from a Zork play session is shown below.


Screenshot from Zork I

While the geography of Zork is fanciful, rather than (mostly) based on reality (as in the case of Adventure), the structure of the game retains a map-like quality. For this reason, some critics (such as Julian Dibbell) have called the Zork games (and text adventures in general) "interactive maps." This may or may not be a fair summation of the entire genre. But at the very least, I find myself drawing maps in the process of playing text adventure games, mainly as a navigational aid, but also because it's fun.


An excerpt from a map that I drew while playing Zork

The interesting thing about these maps is that they show the topography underlying the game. As the game progresses, structures emerge. The most obvious structure is the map itself: the area of explored territory grows, and as it does the connections between nodes become apparent. They form branching structures, loops, and rhizomes.


A map of Dungeon, predecessor to the Zork series (warning: spoilers!)

Another equally important structure is the path that the player takes through the map. This path is constrained by the map, but not defined by it: the player must choose which direction to go. Given a map that permits doubling-back, an infinite number of paths are possible. It's even possible to get lost. Movement is further constrained by the game's rules: puzzles that must be solved, obstacles and enemies that can hinder the player's progress, etc. A text adventure, in essence, is a "playable" map.


A "heat map" of a Quake III level. Areas where players spend the most time (on average) are drawn with a brighter green color; purple dots indicate where shots were fired. From Orbus Gameworks.

The structures of such games (recursion, repetition, branching) are shared with music, as is the improvisatory nature of "playing" the space. My musical interface is about exploring these analogies. How can the process of exploring and making a map itself "map" onto musical expression? Can you play (in the musical sense) play (in the gameplay sense)?

The performance I imagine consists of the performer (me) playing a text adventure game, using software that has been prepared to react sonically to the state of the game. Possible variables for mapping include percentage of map explored, topology of the player's path, which items have been collected, location of items, etc. The text adventure map itself will be randomly generated, so each performance of the piece will be different. As part of the performance, I'll be drawing a pen-and-paper map of the game as I progress; this might be displayed to the audience (along with a real-time projection of the game in play).

From an aesthetic point of view, this process is appealing to me because it suggests sound with many elements that unfolds over a long period of time. Changes to the music can only be made by entering entire phrases of text, which confines the performer's control of the piece to broader, textural gestures. This is exactly what I'm after.

References:

A History of Zork
Somewhere Nearby is Colossal Cave
A Genealogy of Virtual Worlds
Orbus Gameworks: Quake III Heat Maps