Provoke: Digital Sound Studies



A scholarly voice playground, Paperphone is an interactive audio application that processes voice and sound materials live and in-context. Designed to challenge the privileging of text over act in humanities scholarship, Paperphone is a performative platform for scholars to unravel the expressive potential of voice and audio in sharing academic works.

Paperphone is made to be discoverable. With Paperphone, we want to ignite a praxis-based opportunity for the scholarly community to play with poetic and experimental forms of utterance and to explore the semiotics of voice-based effects.


Paper presentation is an important ritual in academia. It is a site in which scholars sound and resonate with knowledge, physically and intellectually. Although it is deemed professionally significant, the practice of paper presentation is under-considered in almost all contexts of praxis. In humanities disciplines especially, scholars deliver their papers by reading them. The practice of reading prioritizes reason over emotion. The lack of conscious engagement with the expressive, emotive, sense-based potentials of speech often makes the audience’s experience of hearing a paper dull and uninteresting.

Reading reinforces the privileging of text over act, of print over speech. Print as a medium to express scholarly content connotes permanence; speech, by contrast, is a performance-based mode that is transient and inextricably tied to its temporal and spatial context. As a fixed object, a printed academic text can be used to construct a disembodied “reality”—the reality of an objective academic past, the grand narrative of the history of academic knowledge. The act of reading an academic text as it is written promotes the value of permanence and context-free interpretations.

A critical software project, Paperphone provokes emerging conversations about scholarly communication that foreground sound and voice. Enacting the performative possibility of knowledge transmission, this project encourages scholars to interpret the meaning of scholarly texts through a sense-based engagement. This modal redefinition will hopefully spark a fruitful interplay among various intellectual, emotional, and other imaginative and sense-based relationships to texts.

Designing Paperphone

Throughout our design process, we asked the following questions: How does sound rematerialize text? How does sound recontextualize the meaning of text? How do the sonic qualities of speech affect the meaning of text? We used these questions to guide our design decisions and to ensure that our design would eventually enable scholarly users to explore these questions in their own research.

We took a two-pronged approach to designing and developing Paperphone: sound design and interface design. These two processes mutually informed one another. In sound design, we adopted an experiential, trial-and-error method. We selected a number of audio effects that we thought would yield semiotically interesting sonic outcomes. We thought that effects like Distortion would add emphasis to authorial intent or position. A counterargument, for example, could be delivered with a small amount of Distortion. Further, we integrated Reverb and Echo because we thought an interaction with sonic resonance would yield interesting results related to rhetorical space and presence. For instance, reverberation, a technique that creates the illusion of a sound within a resonant space, could be used for provocations. A statement based on circular reasoning could be enhanced with the Echo effect at a high feedback setting. Rhetorical questions, potentially, could be emphasized with Echo. Additionally, the inclusion of an audio file player enables the user to play back recordings (field recordings, interviews, or archival recordings) in order to recreate specific sonic spaces relevant to the content of the paper.

Sound design ideas became further concretized when we developed the fixed presets, a set of preprogrammed effect combinations and settings, each with a suggested function or concept. In designing these presets, we imagined qualities of audio effects that are potentially useful for expressing humanistic scholarship. This phase required a thoughtful consideration of (the meaning of) physical, rhetorical, and associative qualities of audio effects. We created a preset called “Intercom” based on a combination of Reverb and Filter. This Intercom effect produces sounds as if they were coming out of small speakers in an empty room. The user could deploy this effect to evoke ideas associated with institutional buildings, analog broadcast systems, or other centralized structures. Furthermore, pre-established notions related to audio effects complicated our process of preset design. For instance, we named our vocoder preset “Robot.” The physical qualities of the vocoder technology, based in sound synthesis, don’t necessarily make this effect more robot-like than the other effect combinations. Quantitatively, there is no more technological intervention in Vocoder than in the other effects. But the association between the vocoded sound and machine (and other related concepts like artificiality) has been established in pop culture by movie soundtracks and iconic artists like Kraftwerk and Laurie Anderson. We are interested to see how users may use this preset and others like it to continue reinventing the meaning of vocoded sounds and interact with this semiotic history.

Our interface design matured in parallel to the development of Paperphone’s sound design. Interface design was largely informed by our observations of existing platforms of sound technology such as guitar pedals and Digital Audio Workstations (DAWs). The goal of this design process was to create an intuitive interface that invites users without previous experience in audio technology to play. After studying Ableton Live’s user interface, we moved the master controls to the right edge of the frame, and rotated the volume indicators and controls to be vertical. We also took suggestions from our beta testers, particularly Mike D’Errico, a PhD student in musicology researching audio technology, in making controls and instructional text more in line with user interface standards.

Extending the software capacity of Paperphone into the physical realm, we ensured that our release is compatible with Mira, Max’s controller app available for the iPad. We realized that not everyone has an iPad and $50 to spend on the Mira app, so we added a series of key commands so that the user can control the settings by striking keys on the computer keyboard. For instance, the user can press the first letter of an effect’s name to turn that effect on or off (e.g. press "r" to activate Reverb). The user can also navigate through the preset menu via the arrow keys on the keyboard. Specific instructions are included in the patch.
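The first-letter key commands can be illustrated outside of Max. What follows is a minimal Python sketch and not the logic of the Paperphone patch itself: the effect names are the ones mentioned in this article, while the names `EFFECTS`, `make_state`, and `handle_key` are hypothetical and introduced only for illustration.

```python
# Hypothetical sketch of first-letter key commands; this is not the Max patch.
# The effect names come from the article; everything else is assumed.
EFFECTS = ["Reverb", "Echo", "Distortion", "Filter", "Chorus", "Vocoder", "Pitchshift"]


def make_state():
    """Return a fresh on/off state with every effect switched off."""
    return {name: False for name in EFFECTS}


def handle_key(state, key):
    """Toggle the effect whose name starts with the pressed key.

    Returns the name of the toggled effect, or None if no effect matches.
    """
    for name in EFFECTS:
        if name[0].lower() == key.lower():
            state[name] = not state[name]
            return name
    return None


state = make_state()
handle_key(state, "r")  # Reverb switches on
handle_key(state, "e")  # Echo switches on
handle_key(state, "r")  # Reverb switches off again
```

The sketch works only because each effect name in the list starts with a different letter; a second effect sharing an initial would need its own key.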

As we experimented with Paperphone, we realized that this system would be useful for creating audio essays or podcasts. In response, we integrated a recording feature for the user to capture the audio output from Paperphone. This feature will hopefully spur implementation ideas beyond the mode of paper presentation.

Implementing Paperphone

We encourage creative application (and misuse!) of Paperphone. We believe that there are more creative ways of implementing Paperphone than what Jonathan and I could imagine. Please drop us a note to share how you experiment with Paperphone.

We offer the following implementation suggestions for Paperphone:

Space and Place

As you’re making a sonic space, think about how you are defining the space. Are you recreating a physical space? If you’re creating a rhetorical space (e.g. delivering a thesis statement, stating a provocation, or posing a counterargument), which sonic qualities can help you construct this space?

Paperphone’s suite of effects could create an illusion of a space or place. The "spacey" option on the fixed preset menu provides an instant large-room resonance to envelop your voice. If you want to emulate a sounding space of your own, play with the settings of the Reverb effect. Try Reverb in combination with the Echo effect to make a more complex sonic space. If you want to recreate a specific place (with temporal and geographic references), play back location recordings using the Audio File Player.

Medium, Transmission, and Temporality

Sounds carry qualities of their medium type and transmission process. The materiality and process of sonic transmission can connote temporal and locative ideas related to the sound source. Sounds that come out of a stadium, for instance, have a different rhetorical and affective impact from sounds coming out of a small telephone speaker. Emulating sound sources and transmission process through timbral manipulations could be a creative approach to composing scholarship based in sound.

Effects such as Filter, Distortion, and Chorus could generate sounds that emulate the medium of sound transmission. For instance, playing with the Filter settings could produce the illusion of a small speaker like a speakerphone or telephone. Adding Distortion to the voice would create vintage, analog sound qualities like those of an old radio or record player. Try a combination of these effects to experiment with ideas related to temporality and medium of transmission. The Artifact preset, for example, is a result of combining Reverb and Chorus. This preset emulates a small amount of unwanted noise resulting from sound manipulation. The Megaphone preset consists of an amount of Filter set to remove the low frequencies of the voice spectrum, in addition to a low level of Distortion.
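To make the Megaphone recipe concrete, here is a minimal Python sketch of the same idea: a filter that removes low frequencies followed by a light distortion. It is a generic textbook signal chain, not the processing used inside the Paperphone patch. The one-pole high-pass filter, the tanh soft clipper, and the default `cutoff_hz` and `drive` values are all assumptions chosen for illustration.

```python
import math


def high_pass(samples, cutoff_hz, sample_rate):
    """One-pole high-pass filter: attenuates content below cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out


def distort(samples, drive):
    """Soft clipping with tanh; a higher drive gives a harsher sound."""
    return [math.tanh(drive * x) for x in samples]


def megaphone(samples, sample_rate, cutoff_hz=800.0, drive=1.5):
    """Assumed Megaphone-like chain: remove lows, then distort lightly."""
    return distort(high_pass(samples, cutoff_hz, sample_rate), drive)


def sine(freq_hz, sample_rate, seconds=0.5, amp=0.5):
    """Generate a test tone as a list of float samples."""
    n = int(sample_rate * seconds)
    return [amp * math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


SAMPLE_RATE = 44100
low = peak(megaphone(sine(100.0, SAMPLE_RATE), SAMPLE_RATE))
high = peak(megaphone(sine(2000.0, SAMPLE_RATE), SAMPLE_RATE))
print(low < high)  # the 100 Hz tone comes out much quieter than the 2000 Hz tone
```

Raising `cutoff_hz` thins the voice further, and raising `drive` pushes the sound from a slight edge toward heavy clipping.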


Vocal register is highly gendered in our society. Queer scholarship has troubled gender connotations by deconstructing the physical qualities of sex, gender, and sexuality. Paperphone’s Pitchshift provides a context for users to experiment with the gendering (or queering) of voice and scholarly content as it is presented and received. This exploration could contribute to the richness of meanings related to gendered notions such as authorship and authority. It could also, per our tester Steph Ceraso's suggestion, engage notions such as genderqueer or speculate on gender-neutral voice or audio.

In Paperphone, start by experimenting with the Pitchshift effect by sliding the frequency control up and down. Further, play with Pitchshift in combination with other effects like Vocoder, Distortion, Reverb, or Echo to create possibilities of gendering beyond common binary notions. Experiment with the presets appropriately named “Butch” and “Femme.” The Femme preset comes with a mid-range Pitchshift value accompanied by a slight amount of Reverb for a touch of flair, elongating the tail of a phrase or sentence. The Butch preset comes with a low-register Pitchshift value with a bit of Distortion for a staccato coarseness.


Vocal synthesis can be a powerful tool to connote ideas related to machine, automation, synthesis, and artificiality. In addition to Paperphone’s Robot preset, the Vocoder and Chorus effects provide a playground to experiment with the relationship between sounds and meanings of machine/system. These effects could be useful in illustrating concepts related to human-machine interface, cyborg theory, and more broadly to the disciplines of digital humanities, software studies, and technology studies. We also recognize the cultural association between the vocoded sound and machine (and other related concepts like artificiality) established in pop culture, including movie soundtracks and iconic artists like Kraftwerk and Laurie Anderson. Given that, we encourage our users to continue reinventing the meaning of vocoded sounds and interact with this semiotic history through playing with Paperphone.

Intelligibility and Noise

Applying audio effects to the voice could challenge the intelligibility limits of speech. In spite of practical concerns, intelligibility can be a fresh avenue to explore the relationship between sound and meaning. This exploration perhaps can be framed by concepts related to transparency, opacity, density, obscurity, ambiguity, ambivalence, etc. In Paperphone, experiment with sliders as you speak into the microphone. An obvious example of this can be accomplished with the Chorus density slider set to the right, or with the Echo delay time control set toward the right side of the slider.
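Why a long Echo delay or a high feedback setting erodes intelligibility is easier to see in a small model. The following Python sketch implements a generic feedback delay line, a common textbook form of echo, and should not be read as Paperphone's own implementation. The function name `echo` and the parameters `delay_samples`, `feedback`, and `mix` are assumptions made for illustration.

```python
def echo(samples, delay_samples, feedback, mix=0.5):
    """Feedback delay line.

    Each repeat arrives delay_samples later than the previous one and is
    scaled by `feedback`. Keep feedback below 1.0 so the repeats die away.
    """
    if delay_samples < 1:
        raise ValueError("delay_samples must be at least 1")
    buffer = [0.0] * delay_samples  # circular buffer holding the delayed signal
    out = []
    pos = 0
    for x in samples:
        delayed = buffer[pos]
        out.append((1.0 - mix) * x + mix * delayed)
        buffer[pos] = x + feedback * delayed  # feed the echo back into the line
        pos = (pos + 1) % delay_samples
    return out


# A single click followed by silence: the echoes are easy to read off.
click = [1.0] + [0.0] * 9
print(echo(click, delay_samples=3, feedback=0.5))
# → [0.5, 0.0, 0.0, 0.5, 0.0, 0.0, 0.25, 0.0, 0.0, 0.125]
```

With speech in place of the click, every syllable is followed by its own fading copies. A longer delay or a feedback value closer to 1.0 lets those copies overlap with the words that follow, which is where intelligibility begins to dissolve.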

For a dramatic, noise-based outcome of intelligibility experimentation, play with the Feedback feature of Paperphone. While the Feedback feature is on, you can experiment with effect combinations to change the sound of the feedback loop. The Filter effect can be used to tune the frequency of the feedback. The Distortion can be used to make the feedback more intense. Reverb and Echo can be used to prolong the feedback indefinitely.

Part, Whole, and Layers

Richness in sound often comes with an experimentation with layering. Once layers of sounds are established, they resonate with one another and create a texture unique to the sonic content of each of the layers. Sampling as a technique can explore the dynamic between part and whole both in sound and in text. Think of a word or a phrase that takes a particular conceptual role in your text. Isolating it and then playing it back could transform the meaning of that word or phrase, thus changing the semiotic relationship between this word and the entire text.

With Paperphone’s recording feature, the user can record a word, phrase, or even a paragraph or an entire essay. Then, using the Audio File Player, the user can play back the recording while processing it with effects. Based on his or her semiotic intention, the user can further explore the sound with playback effects. For instance, looping the playback sound can yield a repetitive sound texture, potentially washing out the meaning of the recorded text. Additionally, experimenting with other effects can transform the semiotics of the recorded text. Using effects like Reverb or Echo can fabricate a distinct sound space. The user could talk over the played-back recording while keeping it at a low volume. This approach would create a separation between foreground and background, emulating depth in sound.
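The looping and layering described above can be modeled in a few lines. This Python sketch is illustrative only and does not reflect how the Audio File Player is built: it treats audio as plain lists of sample values, and the function names `loop` and `layer` and the `background_gain` parameter are hypothetical.

```python
def loop(samples, total_length):
    """Repeat a recorded clip until it fills total_length samples."""
    if not samples:
        return [0.0] * total_length
    return [samples[i % len(samples)] for i in range(total_length)]


def layer(foreground, background, background_gain=0.2):
    """Mix a looped, quieter background under the foreground signal."""
    bed = loop(background, len(foreground))
    return [f + background_gain * b for f, b in zip(foreground, bed)]


recorded_phrase = [1.0, -1.0]   # stand-in for a short recorded word
live_voice = [0.0, 0.0, 0.0]    # stand-in for the speaker's live signal
print(layer(live_voice, recorded_phrase, background_gain=0.5))
# → [0.5, -0.5, 0.5]
```

Lowering `background_gain` pushes the looped phrase further into the background, which corresponds to the foreground and background separation described above.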

Contexts of Implementation

There are other potential applications of Paperphone besides scholarly paper presentations. We encourage unconventional applications such as academic karaoke (e.g. "critical karaoke"), cross-genre production like scholarly poetry (i.e. blurring the boundaries between expository and creative writing), and collaborative performance modes like scholar-artist tag teams (e.g. a scholar talks while a sound artist interacts with the sound in real time).

With the Feedback feature, the user can take a live signal from the microphone and then feed that back into the system. Our tester Gabriele de Seta (@SanNuvola) suggested that this process, especially when working with sounds from the room, the audience, or a particular vocal gesture, could create a loop of "cumulative noises & silence." We imagine that this feedback capacity could radically question the distinction between academic presentation and sound performance, as it would heighten the audience’s awareness of sound during a performance/presentation.


The following video was filmed at UCLA, where Wendy Hsu and Jonathan Zorn launched Paperphone in May 2014.

Setup and Tutorials

First, download the latest version of Paperphone.

Getting started with Paperphone requires a few technical steps. Read our instructions to begin your setup.

Additionally, we made a series of tutorials—text and video—to help you get acquainted with the Paperphone app. These tutorials touch on the following features of the app:


We are committed to making this project as open-source as possible. We are making the Paperphone Max patch (both editable and standalone versions) available online. The work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

About the Contributors

The project's principal investigator is Wendy Hsu, an ethnographer, musician, and arts organizer who engages with multimodal research and performance practices informed by music from continental to diasporic Asia. She has published on Taqwacore, Asian American indie rock, Yoko Ono, Hedwig and the Angry Inch, Bollywood, and digital ethnography. As an ACLS Public Fellow, Hsu currently works with the City of Los Angeles Department of Cultural Affairs. She recently completed a Mellon Digital Scholarship Postdoctoral Fellowship in the Center of Digital Learning + Research at Occidental College, where she researched digital pedagogy and ethnographic methodology.

Lead developer Jonathan Zorn is a composer, performer, and curator of experimental, electronic, and improvised music. His electronic music pairs improvising musicians with interactive computer systems to create hybrid, human-machine ensembles. Zorn's interest in vocal utterance has resulted in a series of pieces in which spoken language is interrupted by electronic forces, drawing attention to the gap between speech and sound. Zorn has been active as an improvisor on bass and electronics for 15 years and has performed at Red Cat, Walker Art Center, Verona Jazz Festival, Library of Congress, Seattle Festival of Improvised Music, Line Space Line Festival, and Chelsea Art Museum.