NIME Progress Report #1: The Frotzophone
The screenshot above doesn't look all that exciting, but it's a big breakthrough: It shows my hacked version of Frotz sending Z-Code opcodes to ChucK using OSC. (The game being played is Adam Cadre's Photopia.)
Here's a breakdown of what's going on, for those of you whom the description above completely confounds (probably pretty much everybody). Frotz is a Z-Code interpreter; Z-Code is a variety of bytecode designed to run on the Z-Machine, a virtual machine (along the lines of the JVM) originally developed by Infocom to ease cross-platform deployment of their interactive fiction. (The format has remained popular with creators of contemporary interactive fiction thanks to Inform, which compiles to Z-Code.) A Z-Machine implementation, like Frotz, looks and works a lot like a microprocessor emulator: it interprets opcodes that affect the state of the virtual machine in some way. My instrument will work by hooking into the Z-Code interpreter and sending OSC messages to ChucK (or some other OSC server that makes noise) based on the current state of the game.
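To make the "opcode goes out as an OSC message" idea concrete, here's a rough sketch in Python of what one such message looks like on the wire. This is not the actual Frotz hack (that lives in the interpreter's C code); the address /frotz/opcode, the port number, and the function names are all made up for illustration. The only part taken from the OSC 1.0 format itself is the layout: a null-padded address string, a null-padded type tag string, then big-endian 32-bit integers.

```python
import socket
import struct


def osc_pad(raw: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC strings require."""
    padded_len = (len(raw) // 4 + 1) * 4
    return raw.ljust(padded_len, b"\x00")


def osc_message(address: str, *ints: int) -> bytes:
    """Build an OSC message whose arguments are all 32-bit integers."""
    type_tags = "," + "i" * len(ints)
    payload = b"".join(struct.pack(">i", value) for value in ints)
    return osc_pad(address.encode("ascii")) + osc_pad(type_tags.encode("ascii")) + payload


def send_opcode(sock: socket.socket, opcode: int,
                host: str = "127.0.0.1", port: int = 6449) -> None:
    """Send one opcode to an OSC listener (address and port are assumptions)."""
    sock.sendto(osc_message("/frotz/opcode", opcode), (host, port))


if __name__ == "__main__":
    # Pretend the interpreter just executed these three opcodes.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for opcode in (0x8C, 0xB2, 0xBB):
            send_opcode(sock, opcode)
```

In the real thing, the equivalent of send_opcode would be called from inside the interpreter's main loop each time it decodes an instruction, and the receiving end would listen on the same port and address.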
The basics of that process are in place, and I've at least settled on an architecture, which is a big step forward. I'm not making any sound yet, though. Over the next week, I'm going to be reading up some more on Z-Machine internals, and trying to pin down those aspects of the interpreter that will best allow me to realize my initial idea; hopefully, there's some way to get a consistent snapshot of the player's inventory and location in the simulated space.
One central conceptual problem that I've yet to solve: interactive fiction is inherently turn-based, in that the program responds only to a complete unit of user input (e.g., "get lamp"), and is essentially inert until further input is received. This means that my ChucK patch will be receiving a huge chunk of data every time I press "enter," and nothing in between. The challenge is how (whether?) to convert this data chunk into a data stream that produces sound in a manner both aesthetically pleasing and (relatively) transparent to the audience. My current idea is to use a combination of buffering and slow attack times (e.g., buffer x bytes, play one byte from the buffer every 100-500 ms, and let the sound take 5-10 seconds to reach full volume), but obviously the optimal solution will depend on which data I decide to capture and how I'm capturing it.
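Here's a small Python sketch of the buffering idea, just to show the shape of it. Everything in it is an assumption on my part: the function name, the choice to keep only the most recent max_bytes of a turn, and the linear fade-in standing in for "slow attack." It doesn't make sound or wait in real time; it just turns one turn's chunk of bytes into a list of (time in seconds, byte, gain) events that something like ChucK could play back.

```python
from collections import deque


def schedule_turn(data: bytes, max_bytes: int = 64,
                  interval: float = 0.25, attack: float = 5.0):
    """Spread one turn's burst of bytes out over time.

    Returns a list of (time_offset, byte, gain) tuples. Bytes are spaced
    `interval` seconds apart, and gain ramps linearly from 0.0 to 1.0 over
    `attack` seconds (a stand-in for a slow attack envelope).
    """
    # Assumption: when a turn produces more than max_bytes, keep the newest ones.
    buffer = deque(data, maxlen=max_bytes)
    events = []
    for index, byte in enumerate(buffer):
        time_offset = index * interval
        gain = min(1.0, time_offset / attack)
        events.append((time_offset, byte, gain))
    return events


if __name__ == "__main__":
    # One byte every 250 ms, reaching full volume after 5 seconds.
    for time_offset, byte, gain in schedule_turn(bytes(range(24))):
        print(f"{time_offset:5.2f}s  byte={byte:3d}  gain={gain:.2f}")
```

With these defaults, a 64-byte buffer drains in 16 seconds, so whether the output overlaps with the next turn depends entirely on how fast the player types. That's one of the things the numbers would have to be tuned around.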
Oh yeah, and the working title is now the FROTZOPHONE—aside from being the name of the interpreter I'm hacking, frotz is also a hacker term for "an unspecified physical object, a widget" (as defined under the word "frobnitz" in the Hacker's Dictionary). So frotzophone is appropriate. Dorky, but appropriate.