Provoke: Digital Sound Studies

Susurrous Scholarship

Original Text Essays

Marvin, Carolyn. “Your Smart Phones Are Hot Pockets To Us: Context Collapse in a Mobilized Age.” Mobile Media & Communication, no. 1 (2013): 153-159. Translated into audio by Corrina Laughlin.

Jackson, Jr., John. “Peter Piper Picked Peppers, but Humpty Dumpty Got Pushed: The Productively Paranoid Stylings of Hip-Hop's Spirituality.” In Racial Paranoia: The Unintended Consequences of Political Correctness (New York: Basic Civitas Books, 2009). Translated into audio by Alejandro Gomez.

Geffen, Maria N., Judit Gervain, Janet F. Werker, and Marcelo O. Magnasco. “Auditory Perception of Self-Similarity in Water Sounds.” Frontiers in Integrative Neuroscience, no. 15 (2011). Translated into audio by Kevin Gotkin.

Mitchell, William J. “Boundaries/Networks.” In Me++: The Cyborg Self and the Networked City (Cambridge, MA: MIT Press, 2003). Translated into audio by Aaron Shapiro.

On Audio Translation & Digital Scholarship

The idea of “translation” is at the core of this project. Soundbox co-editor Darren Mueller moderated a conversation between three of the translators to discuss Susurrous Scholarship and their approach to digital sound scholarship.

Darren Mueller:

Calling these pieces translations strikes me as a statement of methodology and as a general statement about knowledge production in the digital humanities. It both embraces and productively uses the contradictions inherent in doing digital scholarship. Your original abstract stated that, “In each small decision about the translation, there is a profound ramification for the address, audience, and access of the new kind of scholarship.” Now that you’ve gone through the process, do any of you have a better sense of what those ramifications might be? And how are those ramifications related to a process of translation?

Kevin Gotkin:

I really like this idea of translation as both a practical tack and a way of framing digital scholarship generally. Translation can often imply some exact or neat order. It's a before-and-after process. But as you point out, Darren, it's not so simple in this project. I didn’t kid myself in thinking that translation is easy. In fact, precisely the opposite.

It's a central contradiction that guides this project: try to turn written work into sonic scholarship by preserving as much of the original as possible. This is an impossible starting place because we know that different vectors of knowledge themselves affect knowledge. Information is not the same between modalities. So you could say that what we did in this project was cautiously walk alongside this contradiction, knowing the whole time that we could never feel confident in calling our audio pieces "translations" of written material if "translation" usually suggests some formulaic correspondence between the original and its derivative. At the same time, we sensed that something really interesting might live in this contradiction. I hope what we've excavated here are some of the details (often prickly, hopefully productive) about what actually committing to digital scholarship means.

One of the most interesting things I faced in producing my sound piece was how layered the stakes were for each and every decision I made. When you produce a radio piece or a podcast or some other sonic text that has a discernible genre, there are accepted practices that you can lean on. Put music here and duck it under the narrator. Set the scene in a first-person account. What made our project so difficult was that we had no legible genre to turn to. And the only anchor we could rely on was the original written text, whose conventions we couldn't be sure actually existed in sound. What, for example, does a paragraph break sound like? When I was faced with the task of having to make decisions about each formal feature of the text, it was like picking up a paintbrush and suddenly not being able to find the ends of the canvas. I didn't even know where I was in relation to the canvas. I don't think it goes too far to say that I wasn't even sure there was a canvas. There was just an empty editing dock on my laptop screen. And a microphone.

Threaded through this doubt around how to make written features into sonic features was the ever-present echo of the real audience of the piece. Who would/will/might actually listen to this piece? And should I consider this hypothetical person as an actor on my decisions, even when the first imperative (just translate!) might dictate that I don't attend to the listener? This is a question of literacy. We have to train audiences to be able to consume new forms of digital scholarship. I think it's impractical to expect people to already know how to read a sonic text the way we read a written one, largely because we don't really know how this should be done yet.

The ramifications for our project, then, are contingent on the state of affairs elsewhere. They're contingent on whether journals are publishing sound. They're contingent on whether graduate students consider audio editing a research practice. They're contingent on whether we're assigning audio pieces on our own syllabuses and outlining listening rubrics as rigorous as our reading rubrics. All of these external factors affect the “address, audience, and access” of our project. But the tantalizing hope is that, conversely, we in turn affect those external factors too.

Corrina Laughlin:

Unlike Kevin, I initially (and naively) imagined that I could more or less directly translate a written article into audio. I chose Carolyn Marvin’s piece “Your Smart Phones Are Hot Pockets To Us: Context Collapse in a Mobilized Age” because I considered it a hypertextual work that referred to recent cultural phenomena, political sound bites, and social media memes in an almost list-like way. Because of this, I felt that the piece would lend itself particularly well to a straightforward translation into audio.

As soon as I began gathering clips, however, I realized that such a translation would be more difficult than I had anticipated. Logistically, finding sound bites that could act as adequate referents for the complex phenomena described in Marvin’s article proved complicated. Furthermore, reading the pertinent points and topic sentences aloud caused them to sound more declarative and pedantic than they appeared on paper. Plus, the tone of my voice imbued a discernible feeling into the piece. As much as I experimented with the recorder, I found I could not make my voice sound neutral.

As I put the piece together, I began to make concessions to my initial conception of direct translation. Like most translators, I was negotiating with two imagined audiences in my mind. First, I wanted the piece to be legible to the average academic listener. Second, I wanted to ensure I was honoring the author’s initial vision for the piece; Carolyn was a persistent, if imagined, interlocutor in my process.

Perhaps the most significant addition I made was that of music, which seemed like a necessary convention in the podcast format. We tend to expect music in audio pieces, and perhaps because of this, the piece without music did not have the flow it would have had on paper. I imagined that musical accompaniment would help to capture the listener’s attention and add a kind of continuity that would make listening to the piece more akin to the experience of reading the piece. But music also powerfully conveys a particular tone, and I was concerned it might overshadow Carolyn’s words. I experimented with several styles of music, and eventually settled on Brian Eno’s “Late Anthropocene,” as I felt it most nearly approximated the tone of Carolyn’s article. Admittedly, though, it perhaps appended a somber undertone.

Of the three pieces we produced for this project, mine was nonetheless the most direct “translation.” Almost immediately, however, I realized that my assumption that a written piece could be translated neatly into audio was deeply flawed. Instead of the process functioning like a language translation, in which ideas are kept mostly intact and tones preserved, my translation functioned more like an interpretive dance: it became a performance of Carolyn’s ideas that ended up taking on a life of its own.

Alejandro Gomez:

I find Corrina’s “interpretive dance” metaphor quite apt for describing my approach to the project. For my piece, I attempted to translate an academic intervention without incorporating any narration of its specific arguments. I challenged myself to represent the most compelling evidence within the book chapter, bound together not with rhetorical devices but with differing degrees of sonic adjacency. My hope was that through the careful ordering of key “samples” my composition could execute the same persuasive and informative ends as the original chapter. As the essay spoke of the important role of hip-hop as a vehicle for self-determination, it seemed an obvious choice to position the music to speak for itself.

John L. Jackson, Jr., author of Racial Paranoia, urged me to read the book Five Percenter Rap by Felicia Miyakawa in order to better grasp the content and cultural context of hip-hop’s spiritual and polemical undercurrents. This secondary text guided me through the selection of illustrative songs whose lyrics were characteristic of the genre’s socio-political subtext. Creating an “intellectual mixtape,” I selected examples of the empowering nationalistic rhetoric of Five Percenter rap, looping, blending and manipulating these samples to carry the listener through an oral history of racial identity and paranoia in hip-hop.

In this sense, I was also translating—representing, recapitulating, distilling—the dogma and identity of hip-hop Culture (the capital C here denotes the notion of a cohesive paradigm). The task of accurately conveying the nuances and major tenets of a cultural tradition so often misappropriated raised the political stakes of the project; the risk of mis-translation meant the potential to alienate the hip-hop community and tarnish my own reputation—as a scholar and as a DJ.

I felt the triangulation of the translation process (original text, key secondary reference, and songs-as-primary source) allowed me to trace a cycle between argument, artifact and history that reflected the illuminating nature of the book chapter as well as the pedagogic mode intrinsic to hip-hop’s designs. And in the process, I learned about the craft of constructing meaning from manipulated samples, much like the sonic discipline of hip-hop music; that is to say, I took a crash course in the DJ software program Ableton Live in order to produce this piece. Conscripting others’ artistic work to imbue my own composition with the message I was communicating offered me a glimpse into the collage-like methodology of “dropping knowledge” in hip-hop.

I've also attached some scanned notes to illustrate the process of distillation referred to above. Quite a challenging process and one which reminded me of the value of alternative musical notation, such as Turntablist Transcription Methodology. Perhaps I'll have to research that method further in order to better compose my next sound piece!


Darren Mueller:

These images are fascinating (and somewhat bewildering)! They also point to how the processes of sonic translation involve so many decisions about representation and organization. It also takes some knowledge of tools like Ableton Live, which nicely leads to another line of thought that I had. After listening to your essays, the first questions that came to mind were things like: What kind of equipment did you use? What program did you use to do the editing? What were your biggest production concerns as you made your essays?

I’m still curious about those questions, but I’ll approach them from a different direction. If someone were to listen to your essays and say, “wow, I want to do that,” what would your advice be? What equipment would you suggest? What things would people have to consider that they might not think of right away?


Kevin Gotkin:

One of the things I find most interesting about audio production is how it is a kind of practical knowledge. Some folks have theorized this as “mētis.” James Scott uses it in Seeing Like a State to refer to the practical knowledge of folks whose activities cannot be abstracted appropriately by the state’s “synoptic” measures. He writes that the term “descends from classical Greek and denotes the knowledge that can only come from practical experience[…]” (6). In a very different domain, Jay Dolmage has made mētis central to the rhetoric of disability and embodiment, using the term to mean a proper engagement with the body and embodiment that often gets lost as we abstract and theorize. I think the on-the-ground-ness of mētis helps describe one part of our project, which is to situate sensory (and maybe sensuous) knowledges into the production of the audio as a counterpart to the habits of textual scholarship that don’t implicate the body in this way.

I've found that command of the technical aspects of production is far easier to grasp than I thought it would be. And I think this owes generally to the principle that anything done well makes itself seem seamless, somehow wizardly. But it's not wizardly! A critical media literacy requires that we pull back the curtain on this stuff because it really is easier than you think.

So to answer the question more pointedly: I used a Zoom H2 Handy Recorder when I captured my own voice for the piece. I gathered audio from various online sources using Audio Hijack Pro, and I edited my piece in Adobe Audition. In addition to the many different ways to do what I did technically, there are also plenty of ways to have attempted my translations conceptually. And in general, I think writing a script is where production really starts.

Writing for audio, though, is a really hard thing to do because it forces you to use a textual mode to think through a different sense altogether. Writing for the voice is a really tricky thing to master. I've found NPR's Sound Reporting manual to be incredibly helpful in offering some examples of excellent work. also regularly features wonderful advice on writing. The more complicated thing about creating a script, I find, is that sometimes you can't write well without also hearing the piece. It's a style you write through, so I will often write a little bit (something I can "hear" as I write) and then actually record and edit it in with the other sounds or music I want to use. This gives me a little burst of inspiration and suddenly I can start hearing more of the piece as I write. The danger here is that your voice might sound a bit different with each take of the recording (especially with high-quality microphones that are sensitive to how far away your mouth is from the recorder). So it sometimes gets hard to split up the recording over a few days because you start to hear subtle changes in your voice.

The most important source of insight might be other people's scripts. I'm always happy to share mine if my production style matches what a new producer is after.

I suspect that my process here might be quite different from what Corrina and Alex were up to.


Corrina Laughlin:

I also used a Zoom recorder to record my voice. For the sound bites, I downloaded YouTube videos, which I was able to import directly into the editing software. For this project I used Hindenburg to edit. Hindenburg is more basic than Audition or GarageBand in terms of the features offered, and I probably will not use it again. However, I would recommend it for anyone aspiring to do (very) simple sound editing, as it is very easy to learn.

I too have been surprised by how easy it can be to put together a sound piece—that's not to say it is easy to put together a good one, but in terms of technological skill, the learning curve in my experience has not been exceptionally steep. My main advice for aspiring audio producers: do not be intimidated by the technology!

This piece came together fairly simply, and given that it was meant to be a direct translation of an existing piece, I did not write a script for it. Generally, I have not written formal scripts for my audio pieces. Instead, I typically start with a broad outline and experiment with the balance of sounds, music and analysis until it sounds right to my ear. This tends to be the way I write as well. I find I have to leave myself a lot of room to experiment, to add and delete, to improvise, across every step in the process. It is not a quick way to produce—or to write, for that matter—but it seems to be the only way that works for me.


Alejandro Gomez:

Envisioning is an important part of my creative process. I'd recommend developing a strong sense of what you'd like to produce, then researching what software is involved in the production of sounds that are aesthetically or structurally similar to the composition you have in your imagination.

Since I knew that I would need to be able to manipulate music tracks in such a way that certain elements would become more salient than in the songs’ original form, I elected to use Ableton Live. I was familiar with the software's prevalence at live DJ shows, events that are full of such dramatic manipulations. I first discovered the program when learning about the remix artist Girl Talk, an adept artist whose seamless blends of ostensibly clashing genres of music crack open the conventions of listenership and song meaning. My ambition was to create an experience like that, so I turned to the tool that I knew would be powerful enough to express the intertextuality of my source text, “Peter Piper Picked Peppers, but Humpty Dumpty Got Pushed.” As I got acquainted with Ableton throughout the composition process, I discovered new potentials for manipulating sound, which then influenced my subsequent vision for the piece.

That's another piece of advice: allow yourself to learn about the equipment as you create with it. This can help diversify your palette and deepen the internal drama of your composition.

Unfortunately it is tempting to allow such tinkering to lead to perfectionism, which can make it difficult to feel like a piece is ever "finished." To counter this, I'd recommend doing the following: once the piece is as close to complete as your skills will allow, listen to the entire composition without pausing or "fixing" something. Consider doing this with a friend who can help you determine if the piece is legible to outside ears (thanks Kevin!). Note the moments when you cringe, but do not return to your Digital Audio Workstation until you've heard the entire work.

Lastly, I'd like to add that Audio Hijack Pro was crucial for my sampling methodology. The ease of using this program helped me broaden the range of sources I imagined as cultural reference points for this sound collage. However, I'd also urge future sound scholars to consider the intended audience and mode of publication for their work; if your sources are protected by copyright and especially if these samples are collected by bootlegging, the ramifications for publication could put the creator at risk. If digital sound scholarship is meant to be shared with others through online publics it would behoove us to be familiar with the legal restrictions on the content of our compositions.

Transcript for "Context Collapse"

Music fades up. It is an upbeat song with electronic tones, but there is noise and sounds of audio decay. A female voice says:

Your Smart Phones Are Hot Pockets to Us: Context Collapse in a Mobilized Age by Carolyn Marvin.

The song hits a stride, the noise fades a bit and there is a catchy melody. The music fades out and the female voice says:

The digital transformation unfolding at different velocities and levels of awareness around us is a game-changer of historic proportions. It has taken time to cross the threshold of visibility; it will chew up the old ways for a long time to come.

Music fades up, slow, almost eerie tones. The narrator continues. Her voice sounds like she is speaking over a PA system or on a phone.

Social distance is a powerful social fact. Each of us, members of interlocking, overlapping tribes, occupies social coordinates that register our relative exposure to others who occupy coordinates of their own. Such locations flow from contingencies of class,

[Ambient sound from a party, a British male voice: "Every Christmas the Duke and Duchess of Devonshire invite the children who live on their Derbyshire estate to a party at the painted hall at Chatsworth."]

Strong and weak ties,

[Music and a male voice: "Had the workers been slaves Khufu would not have honored them with burial so close to himself."]

Role expectations,

[A female voice, sounding like it's from a long time ago: "I read somewhere that a marriage without parents' approval has two strikes on it from the start."]

And technological affordance,

[Commercial music and a male voice: "Radio Shack keeps you in constant communication with their affordable, transportable cellular telephone. [phone rings] Hello? Well yes, he's right here. It's for you!"]

They may be elastic or relatively fixed. They vary historically and across groups.

The music fades up again, electronic tones with glitchy spurts underneath.

Though our sense of stability is rooted in the reliability of the social geography we inhabit, social distances change in the ebb and flow of group experience. Technology is a highly visible player in this process. When new forms put familiar social distances at risk, terrible anxieties ripple through the body politic.

The precise unease generated speaks volumes about the meanings of the distances under siege. Michael Wesch's nervous metaphor, context collapse (2008), nicely captures the process I am describing.

[Voice of Michael Wesch: So if you meditate on this just for a second and think about what this means, every time you talk on a web cam you're talking to someplace that's unknown; you actually don't know who's going to be talking back to you. So you have sort of an invisible audience phenomenon. It's asynchronous so you never know when they're going to watch you. And you think about when you talk, every time you talk, you're sort of sizing up the context, and in this case you actually don't know what the context is; you could be launched into many, many different contexts. Including your video could be remixed by somebody. So you don't actually know what's going on and this is what we came to call context collapse.]

Narrator with an echo effect on her voice: Social distance is a powerful social fact.

Digitally recalibrated social distance enters public consciousness through dramatic misfires. Popular media circulate shocking instances of social distance violated and high flyers brought digitally to heel. A deeply human fascination with breached social distance alerts us to new vulnerabilities and humiliations.

Some unintentional:

[Voice of Anthony Weiner: The answer is I did not send that Tweet. My system was hacked, I was pranked. It was a fairly common one. People make fun of my name all the time; when your name is Weiner you kind of get that. I've asked a firm to take a look at this; they're hiring an internet security operation. We want to make sure that it doesn't happen again…]

Some impulsive:

[Voice of Charlie Sheen: Well of course I'm on a drug. It's called Charlie Sheen. Carlos Estevez. Carlos Sheen. Charlie Estevez. Janet Estevez. Ramon Sheen. Emilio Sheen. Mama-dad-me. Emili-Ra-mane-me. Carlos-me-Sheen. Sheen-a-me-ma.]

Some undeserved:

[Voice of a male newscaster: I want you to look at this young man. He's 18 years old. A freshman at Rutgers University. His name is Tyler Clementi. Well he went out and he killed himself. Did you hear? He jumped off the George Washington Bridge right into the Hudson River. Prosecutors say he did it after 2 of his classmates allegedly spied on him while he was having a sexual encounter and then they showed it to the whole world, they put it on the Internet. Live, in fact. They streamed it so everybody could see it.]

It also alerts us to new powers.

[Voice of a different male newscaster: Armed with little more than a laptop, Wael Ghonim does not look like the leader of a revolution, but look how the people react when they spot him in the streets. Along with a group of young Egyptian activists, this 30-year-old marketing director for Google worked in his spare time to organize the January 25th protest that sparked a grassroots revolt.]

[Voice of a female, nearly screaming: Everyone on the Internet can see what you are doing right now. This is a farce. The Texas Legislature is a bunch of liars who hate women (Judge attempts to speak over the woman who is yelling in the distance)]

New social hierarchies, new forms of openness and reticence, new etiquette styles and social obligations – all these emerge from context collapse.

Four elements of context collapse emerging from mobile communication may be conceptualized as follows:

One, deep connectivity.

[Narrator sounding like she's speaking over a walkie-talkie: Digital and physical interfaces become denser, more complexly reticulated, more broadly distributed. Deep connectivity may be deployed to levy a kind of battering ram against siloization by authority. Consider Sina Weibo.]

[Voice of a British male newscaster, ambient sound of a crowd underneath: January in Guangzhou and one of hundreds of daily protests across China. Just look at all the smartphones. Each one of these people is doing what the traditional and state-monitored can't or won't do and using an increasingly powerful web to expose their own suppression and their leaders' corruption.]

Eerie music returns.

Two. Temporal acceleration.

Narrator: Accountable time moves toward 24/7. Shrinking response time imperils the deliberative. Consider the search for the Boston bombers.

[From a newscast: This represents a whole new way of thinking.

Voice of a female newscaster: It was a stunning example of what can happen when actual information is released to Tweeters and Facebookers.

A male voice: They became investigators. Their stated goal was to find the bombers before the FBI.]

Three. Expanded legibilities.

Narrator with an effect on her voice: By ranking digital influence and participation, new aggregators of legibility create novel measures of social worth, performative obligations, and formats for surveillance. Consider: facial recognition.

[A male voice: VeriLook Surveillance STK is used for the development of biometric software that performs face identification in live video streams from high-resolution digital cameras. The STK is based on…]

Four. Asymmetric transparency.

There is certainly convenience and safety in digital filtering. But asymmetric transparency also tilts the surveillance advantage toward stalker/trackers and away from targets unaware of their own visibility. Consider India's biometric program.

[British female newscaster: Now the fingerprints and eye scans of every person in India are being gathered in the biggest biometric database in the world. More than 1.2 billion people's data will be in the system which will give access to the welfare state but opponents argue the scheme could be exploited by the Indian government…]

What is at stake is social trust itself, the fragile conviction that our shared world is manageable and safe.

Upbeat glitchy music fades up.

A vigorous response to the anxieties of context collapse means restraining the commercial and bureaucratic colonization of free human activity across an evolving landscape of embodied and disembodied spaces. It means resisting coercive manipulation by digital actors and surveillance by default – whether for profit or in the name of social order. It must regard with wariness the filter bubbles of digital gated communities, by turns heartless and paranoid. What is required above all are shared and visible commitments to dignified respect, obligations of care, and enforceable standards of public accountability for emerging digital architectures.

Music fades out.

Transcript for "Racial Paranoia"

Sounds from a playground. Children playing.

A man's voice screams: Yo! YO! A turntable scratches. Another man's voice says: You ain't any match for me and there is a sound of a whip. Another man says: I was born in Brooklyn.

A song fades up, with lyrics: I was sitting on the corner just wasting my time when I realized I was the king of the rhyme, I got on the microphone and what do ya see, the rest was my legacy. I was born to be the king of the Levi's swing, to have…

Another man's voice comes in but it is hard to hear him. The song behind continues: I'm like Shakespeare, I'm a pioneer because I made rap songs people wanted to hear, see before my reign it was the same old same, ba-ba-da-da that's me talking.

Another man comes in: He did not just play our music, he empowered us. The song continues underneath.

A woman says: It's like everybody else, everybody in this generation of hip-hop, you know we grew up taping Red Alert on the radio, like it was work, you had to stay up when Red was on, stay up late and make sure you tape this joint so you could have the rest of the week to listen to that and I did that and I was a straight fan.

A new song comes in, lyrics say: The party has already started. Let me introduce you to another type of rapper, emcee, well glamor and glitter don't matter gently.

A number of songs and voices overlap quickly.

A man laughs maniacally. A voice says: He was a mad scientist back in the day, he took all these Africans and he bred the lightest ones and inbred them more and more until it became the white man, which is the devil that walks the earth today.

Another song comes in. Over it, a voice says: We got our own rules and regulations and we know that God is alive and he is Black. These words repeat, warped.

A number of songs collide and repeat. A voice says: You are witnessing commitments in the world that are produced by real men, but you don't see the real cause of the effect of your own suffering because the blood-suckers of the poor make you think that God is some mystery.

We know that God is alive and he is Black.

Another voice says: The Honorable Elijah Muhammad said to us…don't believe the lies of the 10%.

A number of voices overlap. The music underneath starts to warp downward. So do the voices.

A voice repeats in differently warped repetitions: racial paranoia.

A man says: A new thing is happening today.

A new song comes in, faster. Well, it's time to start the revolution.

When the revolution comes.

The father…was a brother that felt that it was important that the urban community was being communicated to as far as you know the science of life in the way that was prioritized just as much as it was for Muslims that you know were born into being Muslim or has converted into the Nation of Islam. He felt like, you know, a lot kids in the community…that still didn't mean that they didn't deserve the science, that didn't mean that they still didn't deserve the truth, that still didn't mean that they didn't deserve the information that could empower them. So what he did was he went into the hood and he would see these kids on the corner. He felt it was necessary to sacrifice himself in the way where even if he had to partake in the activities that was improper to clean them up. If they smoked, he smoked with them. If they rolled dice, he rolled dice with them to feel comfortable to embrace him in their circle and be willing to listen to the information, you know what I'm saying? And that's what he did and he created the 5% Nation of Islam…

A man's voice says: Wake up! Wake up! Up you wake, up you wake, up you wake.

A different man says: Man, our school shit is a joke. The same people who control the schools control the prison system. And the whole social system. Ever since slavery. Know what I'm saying?

Lyrics of the song say: Schools ain't teaching us what we need to know to survive.

A man says: Schools train people to be ignorant with style. They give you the equipment you need to be a functional ignoramus. American schools do not equip you to deal with things like logic, they don't give you the criteria by which to judge between good or bad in any medium or format, and they prepare you to be a usable victim for a military-industrial complex that needs manpower. As long as you're just smart enough to do a job and just dumb enough to swallow what they feed you, you're going to be all right.

New song, new voice: I hate the way they portray us in the media. If you see a black family it says they're looting, if you see a white family it says they're looking for food and you know it's been five days because most of the people are black and even for me to complain about it I would be a hypocrite because I've tried to turn away the TV because it's too hard to watch. If I was down there and those were my people down there and anybody wants to do anything that we can help with the setup the way America is set up to help the poor, the less well-off, it's impossible. We already realized a lot of the people that could help are at war right now fighting another way and they've given a mission to go down and shoot us.

New song, people chanting. A number of voices and news reports overlap.

A man asks: Are you concerned that possibly it will affect box office or record sales because you're too close to the edge?

A man says: It's like this. The masses, the hungry people, they outweigh the rich. So as long as…it's all good. It's the rich people who are worried about…Everybody knows crime out there, everybody knows the situation we're in…Why get mad at the brother that's giving you news? A crowd cheers.

A man says: There's a weird game that goes on because now as a result of your art, you're becoming rich.

Yes, the man responds.

A lot of people would say the race wars of the 1960s haven't vanished, they've just put on sheep's clothes. [George Bush doesn't care about Black people.] So the idea of racial paranoia is trying to find a conceptual framework for making sense of how people mine seemingly innocuous interactions for glimpses at what racial wolves in sheep's clothing might look like.

And he goes, there's that rich nigger. And I was like, he said nigger! He said nigger! And everybody was like, so? And I was like oh my god! This is where I'm gonna be staying? They said nigger?

A woman replies, well you got niggers in one of your records!

Nig-gas. We're talking about nig-gers. Nig-gers was the ones on the rope, hanging up. Nig-gas is the one with gold rope, hanging out at clubs.

The woman says, well maybe not everyone's aware of the differentiation.

They don't have to be. If you're not a nigga…

But the pixie was in black face. But the reason I chose black face at the time was that it would be the visual impersonation of black face…and it was the first time I ever got a laugh that I was uncomfortable with.

What was it about the laugh? Would you say you're paranoid?

First of all, what is a black man without his paranoia at hand?

A number of songs overlap, concluding.

Transcript for "Water Sounds"

[The piece opens with what sounds like the floor of a rainforest. There are insects chirping and birds calling. These sounds continue as a male voice comes in.]

How do we know when we are listening to a sound of nature? Some research has suggested that the way our brains perceive natural stimuli mirrors the statistics that are characteristic of the environment. For example, imagine you're in the middle of a forest and looking at various trees and other objects around you. These things appear to have different sizes, but we know that the apparent sizes depend on how close we are to them. This visual scene is what we call “scale-invariant” because the features of the object do not change fundamentally, only in scale.

[The rainforest sounds fade out.]

Natural sounds have also been found to have scale-invariance, but in more complicated ways. And these sounds have been shown to have a neural correlate, which means the brain responds preferentially to sounds that exhibit scale-invariant features. We call this the efficient coding hypothesis. Determining the relationships between sounds and their perceptual correlates is essential in understanding sensory neural processing, that is, for understanding the way our brains react to our environment.

We want to know whether certain environmental sounds, like the sound of running water, are scale-invariant across a broad spectrum, where the sounds' structure is repeated on many levels. And if this is true, whether this internal architecture is important for recognizing the sound as natural.

In the first experiment, we took the sounds of running water at a brook. We played these sounds to a group of 30 people, 26 of whom were female, 4 were male, with an average age of 24.7, the youngest being 20 and the oldest being 36. We played them the water sounds at various speeds and the order of the sounds was alternated and counter-balanced across participants.

We played the original sound of the water for 7 seconds. [There is a sound of rushing water, heard very close as if you had leaned down next to a river. The water seems to be running fast and powerfully, but still shallow.]

And then we modified the playback speed, only adjusting for loudness. We played the water at one-quarter speed...[The sound is tinny, electric, unreal, almost like glitchy feedback.]

...half speed...[The sound is still tinny and electric, but faster now and sounds a little more natural.]

...double speed...[The sound is very similar to the original, almost indistinguishable from it.]

...and finally, four times as fast as the original. [The sound is still very similar to the original, but cluttered with many of the original sounds, like the water is rushing faster.]

Participants were asked to rate the quality of the sounds as natural on a 1 to 7 scale, with 1 being most unnatural and 7 being most natural. When a participant reported a 4 or above, s/he was asked to provide a qualitative description of the sound. All of the sounds were rated just above or below a 5, with the highly sped-up sound rated slightly lower. The qualitative descriptions confirmed water-like sounds.

This confirms that water sounds are perceived as natural when played at different speeds, which is important because it tells us that the sounds must be scale-invariant around frequency, which gets dramatically modified when the speed is changed. We found an equation that captures the structure of the sound around its perceptually relevant features.

Next we wanted to find out more about the scale-invariance of the sounds themselves and we used a number of statistical processes to modify the original sounds so that we could observe their properties. We produced a surrogate for the original sound. [It sounds like white noise, a constant noise that is almost like the audio static on a television channel that doesn't exist.]

We found that the sound is indeed self-similar across many dimensions of the sound like different spectral bands, but that the sound has a secondary statistical structure that makes the sounds' internal architecture rare and complex but still conforming to a coherent structure.

In our second experiment, we wanted to see which features of the natural sounds are essential for them to be perceived as natural. Based on what we found about the water sounds' structure, we created a library of synthetic sounds that could be systematically modified so we could compare them after our participants rated them. We used what's called a “random chirp stimulus” to identify certain features of the original sound and turn them into a new kind that could be tested as natural. The three sounds were: [The first sound is like a boiling liquid, except with a viscous material. The second one has an electronic twinge but sounds more like water, lighter. The last one seems to have an echo, almost electronic but in a different register.]

The listeners rated the synthetic sounds as natural on the whole, although there was a bell-curve distribution. The sounds modified intermediately were found to be the most natural and the ones modified at the highest levels were less so. We used control tests to make sure that the modified sounds were not violating the sound's scale-invariance.

Our research produces a new model of water sounds, unlike previous ones that looked at the physical effects of air bubbles in the water. Our model identifies an overarching statistical principle of scale-invariance that allows us to dramatically reduce the number of parameters we need to describe the full structure of the sound. Our research also tells us about important components of our auditory systems' processing of these sounds. We know that the auditory cortex in the brain responds most strongly to stimuli with natural statistics, which opens the door to a novel library of sounds for researchers doing neuroimaging through receptive field mapping.

[The rainforest sounds return and continue under the concluding lines.]

In our study, we've used formal properties of sound to extrapolate to the neural basis of human auditory perception, giving way to more advanced research both about the sounds themselves and also about our perception of them.

[The rainforest sounds fade out.]

Transcript for "Boundaries/Networks"

Consider me, if you will, Me++.

I consist of a biological core surrounded by extended, constructed systems of boundaries and networks. The boundaries define a space of containers and places, while the networks establish a space of links and flows. Walls, fences, and skins divide; paths, pipes, and wires connect.


My natural skin is just layer zero of a nested boundary structure. When I shave, I coat my face with lather. When I'm nearly naked in the open air, I wear—at the very least—a second skin of SPF-15 sunblock.

My clothing is a layer of soft architecture, shrink-wrapped around the contours of my body. Beds, rugs, and curtains are looser assemblages of surrounding fabric—somewhere between underwear and walls.

My room is cast into a more rigorous geometry, fixed in place, and enlarged in scale so that it encloses me at a comfortable distance. The building that contains it has a weatherproof exterior shell.

Before modern mobile artillery, fortified city walls would have provided a final, hardened, outermost crust. In the early years of the Cold War, outer defensive encasements reemerged, in extreme form, as domestic nuclear bunkers. The destruction of the Berlin Wall in 1989 marked the end of that edgy era. But still, if I end up in jail, an internment camp, or a walled retirement community, the distinction between intramural and extramural remains brutally literal.

All of my boundaries depend, for their effectiveness, upon combining sufficient capacity to attenuate flow with sufficient thickness.

If I want acoustic privacy, I can retreat behind a closed door, or I can simply rely on the attenuation of sound waves in air and move out of earshot.

In sparsely populated territories, distance creates many natural barriers, while in buildings and cities, efficient artificial barriers subdivide closely packed spaces.


Crossing the various boundaries that surround me, my enclosures are leaky.

I am, as Georg Simmel observed, a "connecting creature who must always separate and who cannot connect without separating."

To create and maintain differences between the interiors and exteriors of enclosures—and there is no point to boundaries and enclosures if there are no differences—I seek to control these networked flows. So the crossing points are sites where I can survey what's coming and going, make access decisions, filter out what I don't want to admit or release, express desire, exercise power, and define otherness. Directly and indirectly, I employ doors, windows, bug screens, gates, cattle grids, adjustable apertures, valves, filters, prophylactics, diapers, face masks, receptionists, security checkpoints, customs and immigration checkpoints, traffic signals, routers and switches to determine who or what can go where, and when they can go there. So do you, of course, and so do others with the capacity to do so in particular contexts.

Through the interaction of our efforts to effect and control transfers among enclosures and our competition for network resources, we mutually construct and constrain one another's realms of daily action. Within the relatively stable framework of our interconnecting, overlapping, sometimes shared transfer networks, our intricately interwoven demands and responses create fluctuating conditions of freedom and constraint.

And as networks become faster, more pervasive, and more essential, these dynamics become increasingly crucial to the conduct of our lives; we have all discovered that a traffic jam, a check-in line, a power outage, a server overwhelmed by a denial-of-service attack, or a market crash can create as effective a barrier as a locked door.

The more we depend upon networks, the more tightly and dynamically interwoven our destinies become.


The archetypal structure of the network, with its accumulation and habitation sites, links, dynamic flow patterns, interdependencies, and control points, is now repeated at every scale from that of neural networks (neurons, axons, synapses) and digital circuitry (registers, electron pathways, switches) to that of global transportation networks (warehouses, shipping and air routes, ports of entry).

And networks of different types and scales are integrated into larger network complexes serving multiple functions. Depending upon our relationships to the associated social and political structures, each of us can potentially play many different roles at nodes within these complexes—owner, authorized user, operator, occupant, occupier, tenant, customer, guest, sojourner, tourist, immigrant, alien, interloper, infiltrator, trespasser, snooper, besieger, cracker, hijacker, invader, gatekeeper, jailer, or prisoner. Power and political identity have become inseparable from these roles.

With the proliferation of networks and our increasing dependence upon them, there has been a gradual inversion of the relationship between barriers and links. As the ancient use of a circle of walls to serve as the ideogram for a city illustrates, the enclosing, dividing, and sometimes-defended boundary was once the decisive mechanism of political geography. Joshua got access the old-fashioned way; when he blew his righteous trumpet, the walls of Jericho came tumbling down. By the mid-twentieth century, though, the most memorable ideogram of London was its underground network, and that of Los Angeles was its freeway map; riding the networks, not dwelling within walls, was what made you a Londoner or an Angeleno.

The unbelievably intricate diagram of Internet interconnectivity has become the most vivid icon of globalization. Now you get access by typing in your password, and IT managers dissolve the perimeters between organizations by merging their network access authorization lists. Today the network, rather than the enclosure, is emerging as the desired and contested object: the dual now dominates. Extension and entanglement trump enclosure and autonomy. Control of territory means little unless you also control the channel capacity and access points that service it.


Different places may simply run on their own clocks, or their timekeeping systems may be standardized and synchronized.

When there was little communication between spatially separated settlements, local time sufficed, and there was no need for such coordination. But linkage by long-distance railroad and telegraph networks eventually made it imperative.

In 1851 the Harvard College Observatory began to distribute clock ticks, by telegraph, to the railroad companies. As transportation and telecommunication capacities have increased, we have entered the era of globalized network time.

Computers have added additional layers of complexity to the construction of time. The first computers—constructed according to the elegant principles of Turing and von Neumann—were strictly sequential machines, executing one operation at a time; programming was a matter of specifying these operations in precise order.

If you take advantage of fast machines to compress processes, you can elide the distinction between simultaneity and sequence. It no longer makes sense to think of a computer as a compact, discrete object, or to distinguish between computers and networks.

Eventually, we will approach the physical speed limit, and its associated paradox; information cannot travel faster than light, so spatially distributed events that seem simultaneous from one node in a light-speed network may seem sequential from another, and vice versa.

The more we interrelate events and processes across space, the more simultaneity dominates succession; time no longer presents itself as one damn thing after another, but as a structure of multiple, parallel, sometimes cross-connected and interwoven, spatially distributed processes that cascade around the world through networks. Once there was a time and a place for everything; today, things are increasingly smeared across multiple sites and moments in complex and often indeterminate ways.


In the fast-paced, digitally mediated world that we have constructed for ourselves, what exists between zero and one, a pixel and its neighbor, or a discrete time interval and the next? The answer, of course, is nothing—profoundly nothing.

Our networks are similarly discontinuous structures; they have well-defined access points, and between these points things are in a kind of limbo. We experience networks at their interfaces, and only worry about the plumbing behind the interfaces when something goes wrong.

If you transfer yourself through a network, you directly experience this limbo. It is, perhaps, most dramatic on intercontinental night flights. You have your headphones on, there is darkness all around, and there is no sensation of motion. The video monitor constructs a local reality, and occasionally interrupts it to display current times at origin and destination. It is best not to worry too much about how to set your watch right now, precisely where you are, or whose laws might apply to you.

The discontinuities produced by networks result from the drive for efficiency, safety, and security. Engineers want to limit the number of access points and provide fast, uninterrupted transfers among these points. So you can drink from a stream anywhere along its length, but you can only access piped water at a faucet. You can pause wherever you want when you're strolling along a dirt track, but you must use stations for trains, entry and exit ramps for freeways, and airports for airline networks—and your experience of the terrain between these points is very limited. You experience the architectural transitions between floors of a building when you climb the stairs, but you go into architectural limbo when you use the elevator.


We are not fully contained within our skins; our extended networks and fragmented habitats make us spatially and temporally indefinite entities.

The ancient distinctions between user and tool, building and inhabitant, or city and citizen, no longer serve us well. We would do better to take the unit of subjectivity, and of survival, to be the biological individual plus its extensions and interconnections.

So I am not the Vitruvian man, enclosed within a single perfect circle, looking out at the world from my personal perspective coordinates and, simultaneously, providing the measure of all things. Nor am I an autonomous, self-sufficient, biologically embodied subject encountering, objectifying, and responding to my immediate environment. I construct, and I am constructed, in a mutually recursive process that continually engages my fluid, permeable boundaries and my endlessly ramifying networks.

I am a spatially extended cyborg.

About the Contributors

Susurrous Scholarship is a project by doctoral students at the Annenberg School for Communication at the University of Pennsylvania. It is helmed by Kevin Gotkin, Corrina Laughlin, Aaron Shapiro, and Alex Gomez, whose research interests range from disability to religion to youth culture.