December 31, 2014 / Randy Coppinger

Machine Voices

A friend of mine was trying to make a voice sound like it was coming from a toy, and he referenced the Extreme EQ article I wrote a while back. Hearing about his project, combined with some recent projects of my own, inspired me to assemble more ideas about treating voice recordings for a machine-like effect.

I first heard the term futz from some film mixing colleagues, which refers to changing a recording so that it sounds like it’s on the phone, an intercom, a megaphone, or other mediated delivery of a voice. We can extend this idea to any kind of talking machine, whether it transmits a human voice, or represents a sentient machine like Hal 9000, C3PO, or Optimus Prime.

Some of the earliest practical voice treatments were made by placing telephone speakers, megaphones, etc. inside a sound isolation box with a microphone. The interior of the box was often lined with sound absorptive material to help reduce audible reflections. Sound was fed into the emitter of choice and then recorded using the microphone. You don't have to build the box, though, unless you plan to do this kind of re-recording often enough that a dedicated, isolated setup pays off.

I love re-amping, especially when I'm going for realism. Playing a sound file of someone speaking through my mobile phone sounds very convincingly like that person talking on my phone! Sometimes old school re-recording is the better option: a quick and convincing way to get a machine-like voice treatment. And you don't have to be a purist about it; you can add other forms of manipulation before and after.

(See also: Re-Amping Mix Tips One, Two, and Three)

Transducers found in machines often have a specific frequency response that we hear as machine-like. Using extreme roll offs and obnoxious, narrow boosts can help simulate an android, toy, or other talking machine. Listen to these kinds of sounds in the cinema, on TV, and real life talking devices to help you decide when your EQ settings help create a convincing treatment. For some specific ideas see: Extreme EQ.

Some of the earliest robot voices included the sound of the speaker inside the chassis of the machine. A really tight delay can simulate that kind of reflection. And you can create some great metallic resonances by cycling the result back into the delay again and again in a time smeared feedback loop. Delays under 30ms or so can create comb filters, also known as “phasing.” If you slowly increase the delay time then let it recover you can create flanging: comb filtering with variable notches and peaks. A closely related effect called chorusing also varies delay times back and forth for moving comb filters that can sound synthetic and hollow. @r0barr likes to use a ring modulator, which could be considered a frequency effect plus a time effect, because it smears specific frequency ranges over time by driving a filter into oscillation.
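To make the comb-filter idea concrete, here is a minimal sketch of a feedback comb filter in Python. The function name and parameter values are just illustrative, not any particular plugin; the point is that a short delay fed back on itself produces the metallic resonances described above.

```python
import numpy as np

def feedback_comb(x, sr, delay_ms=8.0, feedback=0.7):
    """Feedback comb filter: mix each sample with a delayed copy of the output.
    Delays under ~30 ms aren't heard as discrete echoes; they notch the
    spectrum instead, and the feedback loop rings at the notch spacing."""
    d = max(1, int(sr * delay_ms / 1000.0))
    y = np.zeros(len(x))
    for n in range(len(x)):
        delayed = y[n - d] if n >= d else 0.0
        y[n] = x[n] + feedback * delayed  # recirculate for metallic resonance
    return y
```

Sweeping `delay_ms` slowly over time would turn this static comb filter into the flanging effect mentioned above.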

All of these forms of delay can have a synthetic, manufactured kind of sound. If you're going for a vintage machine, or something subtle, a simple time based effect may be all you need. Or combine it with other manipulations to mash up old and new sonic characteristics.

Some of my favorite plugins for these kinds of legacy effects are made by Soundtoys: EchoBoy, Phase Mistress, and Crystallizer.

(See also: Phase)

Both @daviddas and @recordingreview mentioned the TAL Vocoder by name. Vocoding, which breaks speech into smaller frequency components, was originally created to reduce the bandwidth needed to transmit a voice. It was also used to encrypt voice communications, including for military purposes. By the 1960s artists and technicians had collaborated to make several different models of a "talking synthesizer", or put another way, a singing machine. Because the artistic use of vocoders grew out of music, using one may benefit from musical knowledge. If you're not a musician yourself, consider collaborating with your composer and musician friends.

There’s a fascinating history about vocoders on this wiki page if you’d like to read more.
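For a sense of how a channel vocoder works under the hood, here is a crude sketch in Python. It is a toy, not the TAL Vocoder's algorithm: per frame, it measures the voice's energy in a handful of frequency bands and imposes that envelope on a carrier signal (a synth pad, for example).

```python
import numpy as np

def channel_vocoder(voice, carrier, sr, n_bands=16, frame=512):
    """Toy channel vocoder: impose the voice's per-band energy envelope
    onto the carrier, frame by frame."""
    out = np.zeros(len(voice))
    edges = np.linspace(0, frame // 2 + 1, n_bands + 1, dtype=int)
    for start in range(0, len(voice) - frame, frame):
        V = np.fft.rfft(voice[start:start + frame])
        C = np.fft.rfft(carrier[start:start + frame])
        for b in range(n_bands):
            lo, hi = edges[b], edges[b + 1]
            env = np.abs(V[lo:hi]).mean()          # voice energy in this band
            mag = np.abs(C[lo:hi]).mean() + 1e-12  # carrier energy in this band
            C[lo:hi] *= env / mag                  # impose the voice envelope
        out[start:start + frame] = np.fft.irfft(C)
    return out
```

Real vocoders use smoother filter banks and overlapping frames, but the principle is the same: the voice only shapes the carrier, so the carrier's pitch and timbre dominate the result.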

@MikeHillier made a really obvious suggestion that I had completely overlooked: “get Appletalk to say it.” @chewedrock recommended “Atari speech synthesis – software automatic mouth.” Speech synthesis is a great option for a machine voice.

Intentional misuse and reinvention can be incredibly fun. Some of our beloved music processors such as pitch correction can be applied to speaking dialog instead of music for some very tasty synthetic voices. @grhufnagl said, “I really love using Melodyne to control pitch & time, alongside its formant control.”

Convolution and noise reduction were intended to emulate real spaces and to clean noise out of recordings, respectively. But we can apply them in creative ways to generate interesting artifacts. Freeware and low cost software tend to cut corners, making them more prone to audible errors that sound unnatural and weird. Almost any audio tool can be used in ways it wasn't intended, producing ear catching flaws.

Good sound design often features the prominent sound that we notice, with layers of quieter elements adding color and flavor. This can work for machine voices too. We can use the voice signal as a key to open a gate on other sounds — static, digitization artifacts, droning guitars, and many, so many more sonic clues that the voice is being mediated. Add samples, such as servos to move a mechanical mouth, or the hum of a power supply. These finishing touches are like highlights and shadows in visual art that add believable, three dimensional characteristics.
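The keyed-gate layering idea above can be sketched in a few lines of Python. This is a simplified illustration (the threshold and window values are arbitrary): the voice acts only as a key, opening the gate on a second layer such as static or a drone.

```python
import numpy as np

def keyed_gate(key, layer, sr, threshold=0.05, window_ms=10.0):
    """Open a gate on `layer` (static, drones, etc.) only while the `key`
    signal (the voice) is above a level threshold."""
    w = max(1, int(sr * window_ms / 1000.0))
    # short-term level of the key signal
    env = np.convolve(np.abs(key), np.ones(w) / w, mode="same")
    gate = (env > threshold).astype(float)
    return layer * gate
```

Mixed under the voice, the gated layer swells only while the character speaks, which sells the idea that the machine itself is producing the sound.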

Next time: more treatments, more plugins, plus voice acting ideas and tips/tricks in Part 2

October 27, 2014 / Randy Coppinger

Cool Gear from the 137th AES 2014 Los Angeles

Here are some of the interesting things I saw on the exhibit floor.

(1) Triad Orbit was showing a very clever clamp with 5/8” threads for putting microphones in less traditional locations. The foam inside the clamp makes it safer to crank down on pretty fixtures, plus adds gripping power to keep it from sliding. It’s called the IO-C Mounting Clamp and I need several!
New Triad Orbit IO-C Mounting Clamp

(2) I like to stop by the Latch Lake booth in case they are giving away their fabulous Jam Nuts, which they were. I used both of them on a recording gig immediately following the convention. Latch Lake introduced a burly new tripod mic stand with the same boom clutch found on their weighted base models. Want.
Latch Lake introduce the Mic King 1100 stand at AES 2014 Los Angeles

(3) I saw and heard the new Cliff Mics ribbon. It was impressive on a number of levels. The magnets were so massive and strong, I thought they were going to pull the hair off my face. Interestingly, the cover was made of mesh cloth rather than metal.
The new ribbon microphone by Cliff Mics unveiled at AES 2014 Los Angeles

(4) On recommendation I took some time to check out Miktek. Apparently the late, great Oliver Archut of TAB Funkenwerk designed most of their microphones. I was especially interested to hear the figure 8 of their multi-pattern mikes, with insanely good off axis rejection and an even transition from on to off axis. Impressive.
Miktek C7e Large Diaphragm Multi-Pattern FET Condenser with highly accurate bi-directional pattern

See also New Microphones at AES 2014 from RecordingHacks
Bobby Owsinski’s AES Show New Gear Wrap Up: Part 1, Part 2, Part 3

Did you see something at AES that belongs on this list? Let me know, won’t you?

October 22, 2014 / Randy Coppinger

Dynamic Mixing for Games

My notes from Game Audio 7 Dynamic Mixing for Games at 137th AES Convention 2014, Los Angeles
Presented Oct 10 by Simon Ashby

Dynamic Mixing defined: A system that dynamically changes the audio mix based on currently playing sounds and game situations.

Middleware such as Wwise creates channels between the game engine and the audio engine. In Wwise these are called Sends.

Simon Ashby of Audiokinetic discusses Dynamic Mixing for Game Audio

Dynamic mixing can help keep things interesting by modifying sounds on the fly to keep from hearing the exact same thing repeatedly. But it can also provide feedback to the user such as volume going down to indicate greater distance from the player / camera.

In the same way live sound mixers can use snapshots to quickly go from cue to cue, middleware mix snapshots can be attached to triggers / mechanics in the game.

Side-chain is not just for ducking. It can drive other parameters such as EQ, pitch, sends, etc. In other words, you can drive a parameter setting based on the audio level of a different channel.
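The generalized side-chain idea can be sketched like this in Python. It is an illustration, not Wwise code: one channel's level becomes a normalized control stream, here mapped to a hypothetical filter cutoff, but it could just as easily drive pitch or a send level.

```python
import numpy as np

def sidechain_param(source, sr, lo=200.0, hi=2000.0, window_ms=50.0):
    """Side-chain as a modulator: turn the level of one channel into a
    control stream for any parameter (here, a filter cutoff in Hz)."""
    w = max(1, int(sr * window_ms / 1000.0))
    env = np.convolve(np.abs(source), np.ones(w) / w, mode="same")
    env = env / (env.max() + 1e-12)      # normalize control to 0..1
    return lo + env * (hi - lo)          # louder source -> higher cutoff
```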

HDR: High Dynamic Range, the audio version of HDR photos. Inputs arrive at the HDR buss with wide dynamic range. The delivery system ducks lower volume inputs so the louder ones can be heard. It is more complex than buss compression or ducking. The result actually has lower dynamic range, yet it seems to have more range than compressed or unmastered audio.
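A toy version of that HDR behavior might look like the following Python sketch. This is my own simplified illustration, not Audiokinetic's algorithm: per window, the loudest input passes at full level and quieter inputs are ducked relative to it.

```python
import numpy as np

def hdr_bus(tracks, window=256, ratio=0.5):
    """Toy HDR mix bus: per window, find the loudest input and duck the
    quieter ones toward it, so the loudest sound always reads clearly."""
    n = min(len(t) for t in tracks)
    out = np.zeros(n)
    for s in range(0, n, window):
        frame = [t[s:s + window] for t in tracks]
        peaks = [np.abs(f).max() + 1e-12 for f in frame]
        loudest = max(peaks)
        for f, p in zip(frame, peaks):
            # quieter inputs get extra attenuation; the loudest passes clean
            gain = 1.0 if p == loudest else (p / loudest) ** ratio
            out[s:s + window] += gain * f
    return out
```

A real implementation would smooth the gains between windows to avoid zipper noise, and would work from perceptual loudness rather than raw peaks.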

Audiokinetic has a YouTube Channel that includes some information about HDR.

Adaptive Loudness and Compression, as heard in mix by Rob Bridgett in Zorbit’s Math Adventure. Mix snapshots are triggered by output state of device: headphones, speaker, AirPlay. This can help protect user’s hearing and otherwise optimize for the listening scenario. Compression was also applied based on the volume measured at the mic input of the device. This was designed to help the listener hear better when playing in a loud environment. This was only applied to headphones because speaker and AirPlay would form a loop back into the microphone.

October 20, 2014 / Randy Coppinger

Game Audio Middleware

My notes from Game Audio 5 Audio Middleware for the Next Generation at 137th AES Convention 2014, Los Angeles
Presented Oct 10 by Steve Horowitz and Scott Looney

The key thing that separates linear media production from interactive is: indeterminacy. Middleware helps us manage this difference.

Justification of middleware:
(1) Puts more audio control in the hands of audio people, and
(2) Simplifies work for coders.

Steve Horowitz of Game Audio Institute presents at AES 2014 Los Angeles

Middleware for multiple development platforms: FMOD, Wwise
Unity specific middleware: Fabric, Master Audio

FMOD Studio is now sample based, not frame based.

It was suggested during this discussion that Master Audio seems ideally suited for 2D and casual games. It supports all systems to which Unity can publish, including web. It has better documentation than Fabric.
UPDATE Jan 5, 2015:
I originally reported Master Audio as Open Source, but it is not. When you buy, you get access to all of the source code — true. But game makers do not submit new code to Dark Tonic to update the product, rather Dark Tonic takes responsibility to write and publish Master Audio and allow game makers access in case they want to add code for their game. Brian Hunsaker of Dark Tonic clarified that Master Audio is used by AAA game studios, not merely 2D or indie developers. It is intended for any Unity based product that does not require realtime audio parameter changes.

Middleware Resources: Game Audio Institute, IGDA, Game Sound Con

October 9, 2014 / Randy Coppinger

Game Audio Sound Design and Mix

My notes from Game Audio 3 Sound Design and Mix: Challenges and Solutions – Games, Film, Advertisement at 137th AES Convention 2014, Los Angeles
Presented Oct 9 by a panel of industry veterans: Charles Deenen, John Fasal, Tim Gedemer, Csaba Wagner, and Bryan Watkins.

There were many comparisons in working between game audio and film sound.

The short timelines and high quality standards seem similar for both.

Game audio folks seem less set in their ways, and more collaborative with sound professionals, than people who make film trailers. This may be related to the veteran status of film trailer folks (typically 20+ years of experience) compared to game audio folks (typically under 20 years). One area of overlap: when people who are good at game trailers expand their career, there is a somewhat natural transition to film trailers.

Game audio source is typically pretty clean: studio recordings, ADR. Film source is typically noisy production audio that may need significant cleanup.

The expected playback system for a game trailer is a desktop or laptop computer, with limited frequency response, especially in the low end. Film trailers enjoy higher fidelity in movie theaters, home theaters. Also, volume measurement standards and best practices for loudness are different for theaters than the internet.

Deenen mentioned that a unifying concept for his work in both film and game audio is trying to reduce source elements to cleaner, more fundamental sounds. Layers seem to combine better when they are stripped down and simplified.

Fasal showed a picture where a bike rack was attached to a car as a microphone mounting system. He also said that every car seems to need attention to where mikes are placed to get the best sounds… that there is no “tried and true” recipe. Though recording technology has changed in the last 30 years, the opportunity to listen, choose great sources, and carefully place microphones remains.

October 9, 2014 / Randy Coppinger

Interactive Music Systems

My notes from Game Audio 2 Effective Interactive Music Systems: The Nuts and Bolts of Dynamic Musical Content at 137th AES Convention 2014, Los Angeles
Presented Oct 9 by Winifred Phillips

We tend to think interactive music started with video games like Frogger, but Mozart held public concerts where musicians would play a score arranged by the audience throwing dice (“Musikalisches Würfelspiel”). What was true then is also true now: people enjoy influencing how music is played, or interacting with proxies that cause changes in music. Interactive music is often employed to reduce listener fatigue, because people tend to spend more time in a given play pattern than they would listening to the same music cue in linear media.

Horizontal Sequencing: using crossfades to switch between two different streams, or rearranging the order in which different musical “chunks” are presented.
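The crossfade half of horizontal sequencing can be sketched as an equal-power fade between two streams. A minimal illustration in Python (the fade length is arbitrary):

```python
import numpy as np

def crossfade(a, b, sr, fade_ms=500.0):
    """Equal-power crossfade from the end of stream a into the start of b."""
    n = int(sr * fade_ms / 1000.0)
    t = np.linspace(0.0, np.pi / 2, n)
    # cosine/sine gains keep the combined power roughly constant
    fade = a[-n:] * np.cos(t) + b[:n] * np.sin(t)
    return np.concatenate([a[:-n], fade, b[n:]])
```

In a game, the transition point is usually quantized to a bar or beat so the switch lands musically.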

Vertical Layering: Additive, where one, some or all layers can play simultaneously and everything still works; or Interchange, where some layers are mutually exclusive.

Music Data: Individual notes/samples are available to play and a separate instruction stream (a la player piano roll) sequences how to play them. Examples include MIDI and MOD.

Generative Music, also known as Algorithmic Composition
Some random element is introduced to support indeterminacy, like rolling dice. Rules govern how likely different musical events are to happen. Wind chimes are a kind of generative music system.
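The wind-chime idea reduces to a few lines of Python. This is an illustrative toy (the note numbers and probability are arbitrary): a rule decides whether a chime sounds at each step, and a random choice picks which note.

```python
import random

def wind_chimes(notes, steps=16, hit_prob=0.3, seed=None):
    """Generative 'wind chime': each step, a probability rule decides
    whether a chime sounds, and a random choice picks which note."""
    rng = random.Random(seed)
    return [rng.choice(notes) if rng.random() < hit_prob else None
            for _ in range(steps)]
```

More sophisticated generative systems just add more rules: weighted note choices, constraints on intervals, or probabilities that shift with the game state.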

As powerful and compelling as these interactive music forms are, linear music continues to play an important role, often being the best solution for a given situation in a video game.

Winifred Phillips presents about Interactive Music

Dig deeper in Winifred’s new book A Composer’s Guide to Game Music. You can also follow her on Twitter and read her blog.

October 9, 2014 / Randy Coppinger

All About The Decibel

My notes from Tutorial 7 All About The Decibel at 137th AES Convention 2014, Los Angeles
Presented Oct 9 by Alex U. Case

In the same way that an octave describes a ratio between two frequencies, a decibel describes a ratio between two quantities.

Although logarithms may seem intimidating, they do something really valuable for recordists as part of the decibel formula: they convert multiplication to addition and exponents to multiplication. Vast ranges of quantities are “ranged down” by the logarithm, providing measurements that audio engineers can more readily communicate.
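That multiplication-to-addition property is easy to verify with the standard amplitude-ratio formula:

```python
import math

def db(ratio):
    """Decibels from an amplitude ratio: 20 * log10(ratio)."""
    return 20 * math.log10(ratio)

# Doubling amplitude is about +6 dB; doubling twice (a x4 ratio) is the
# same as adding ~6 dB twice, because log turns multiplication into addition.
```

So instead of saying one signal is 1,000,000 times more powerful than another, an engineer can say it is 60 dB hotter.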

The decibel is useful for people concerned with intricate amplitude variations over time (recording engineers), not necessarily artists. Recordists bridge the gap between art and technology, between artists and the folks who make our recording gear. The decibel is the practical measure for those of us whose job it is to be technically minded.

Alex U. Case presentation on decibel

Did Alex post more details from his presentation on his blog recordingology? You can also follow him on Twitter.

July 1, 2014 / Randy Coppinger

Headphone Alternatives for Voice Actors

If an actor is self recording an audition, the choice to use headphones is pretty simple: use them or don’t. But if there are other people involved — voice director, client, engineer, etc. — headphones may be useful for hearing those folks. As previously discussed, headphones may be causing some problems, but they might also help an actor hear collaborators. The decision to completely ditch headphones may not be simple. Some additional options would be helpful.

If the headphones let one hear the actor’s mic, the director’s talkback mic, and the engineer’s talkback mic, then one simple solution would be to lower or mute the signal from the actor’s mic. Most of the distractions from hearing oneself go away if the signal from the actor’s mic doesn’t feed the actor’s headphones. A professional recording engineer can configure it if you simply ask.

Sometimes the headphones just need to go away. A speaker in the room with the actor can be used to replace the headphones. BUT — and this is important — the audio from the actor’s mic should NOT feed the speaker or feedback may occur. Again, the recording engineer is responsible to set this up correctly.

On some big sound stages I’ve seen a table for the director, script supervisor, and others. The actor doesn’t need headphones to hear these people because they’re all in the same room. The engineer (in another room) may use a talkback speaker and/or headphones may still be helpful, but much of the critical communication over headphones goes away with lots of folks together in the same room. And you don’t need a major film studio budget. You can ditch the actor’s headphones in small scale recording setups if you just put everyone in the same room.

ONE EAR OFF
Another compromise is to work with one ear off, or use a single eared headphone. This may be enough of a change to minimize the distraction while still being able to hear collaborators. Just taking a side off is a quick fix that a voice actor can make without help from others and without taking time away from the session. Studio singers, musicians, and radio announcers have been doing this effectively for years.

When someone takes one ear off of stereo headphones while sound is still coming out of that side, feedback becomes more likely. If the actor “wears” the open speaker against the head behind the ear, it stays covered much as if it were worn on the ear. Placing the open speaker behind the ear helps the actor and engineer by keeping feedback to a minimum.

An engineer may proactively mute the signal to the open ear using a pan, mute, dead patch, etc. No output from the unused headphone means the actor doesn’t have to cover it to help prevent feedback.

June 24, 2014 / Randy Coppinger

Voice Actors and Headphones

Headphones are commonly used in recording situations. When recording voice, what are some effective uses for headphones? What are some pitfalls to avoid?

Sometimes recording technology can exaggerate the sounds that voice actors make: plosives, sibilance, proximity effect, mouth noise. If it isn’t too distracting, it may be valuable for a voice actor to hear these undesirable sounds. Actors can perform some sounds more softly, change their mic technique, or have a drink of water if they can hear in their headphones that they are contributing to an issue. When actors are recording themselves (for an audition, for example) headphones can save time by allowing the actor to adjust during the recording, rather than re-recording because the problem was heard afterward.

Veteran voice actor and favorite human Jennifer Hale made the point that headphones are important for actors to monitor their performance when voice matching. If actors can listen to the target voice then hear their own performance in headphones it helps them get closer.

But if voice actors are not voice matching, monitoring their own performance may be more of a distraction than an aid. Jennifer reminded us that voice director Kris Zimmerman intentionally asks recording engineers NOT to put out headphones for actors so that they give a better performance. But why? I believe it is because active listening requires brain power. If actors can be free from the burden of listening, they have more attention to give their acting.

Likewise, hearing technical problems and worrying about them can be distracting. First and foremost actors need the space and comfort to act. Instead of helping, headphones may work against a great performance by focusing attention on problematic plosives, sibilance, proximity effect, mouth noise, etc. instead of crafting a believable character.

In addition, headphones may provide the illusion that an actor is speaking loudly. Some people find it difficult to project while wearing headphones. Sometimes the engineer can lower the actor’s level in the headphones to encourage the actor to perform more loudly, but projection is typically restored by simply removing the headphones altogether.

For any of these distractions the simple solution may be: take off the headphones. If headphones are uncomfortable (don’t fit well, are too hot, cause listening fatigue) then removing them may be appropriate. Don’t let anyone tell you that you should wear headphones. Use them if they are helpful, make a change if they are not helpful.

Of course there are situations where headphones seem like a problem, but there are also good reasons to wear them. Are there any good options? Next: Headphone Alternatives for Voice Actors.

Additional topics: dealing with headphone cables, headphone sizing, clicky/rattling headphones, and more.

On Facebook, Sean Hebert brought Balanced Armature technology to my attention. It’s fascinating stuff you can read about, along with other useful information about how headphones work, on this Wikipedia entry.

April 29, 2014 / Randy Coppinger

Headphone Comparison: Audio Technica ATH-M50x vs. Shure SRH 840

A friend of mine received a pair of Audio Technica ATH-M50x headphones and asked me to evaluate them. My Shure SRH 840 headphones were handy, so I used them as a comparison to the AT headphones.

The first thing I noticed was the smaller ear pieces. They were just big enough to be circumaural: pads resting against my head and completely covering the outer ear. From inside the ear piece my pinna touched the pads top and bottom; not as roomy as my SRH 840s.

Listening back and forth I found the ATH-M50x headphones a bit brighter. They seemed like they could be harsh and fatiguing over time, though I did not listen to them for an extended period to confirm this.

The Shure headphones had fairly even bass extension. Nothing amazing, but balanced and fairly true. By comparison the Audio Technica headphones were thin in the deep bass, and seemed to compensate with noticeably louder upper bass. On spoken word most voices had a honk, sounding chesty or nasal. The ATs exaggerated mild room resonances, making them seem like a bigger problem than they really were on the Shure headphones or on speakers.

I noticed outside noises were easier to hear wearing the ATH-M50xes than the SRH 840s. I prefer the isolation of the Shure headphones, which was much better than that of this Audio Technica model.

Audio Technica does a good job with build quality in their products, from their cheapest to their highest quality models. The ATH-M50x headphones looked sturdy. The detachable cable would make it fairly easy to replace for those who put a lot of wear and tear on wires.

At nearly $50 less, the Audio Technica ATH-M50x headphones made some noticeable compromises to reach their price point. They are certainly usable and, like most transducers, one could learn to listen reliably on them with enough time and experience.

See also: Shure SRH 840 vs. Sony MDR-7506, and
18 Headphone Brands Ranked from Worst to First