How Good Can Stereo Be? – Try Ambiophonics and See

Speaker Stereo v. Speaker Binaural

by Robin Miller ©2008

We’re not fooled; recordings aren’t real. With two speakers spaced 60° apart in front of us – let’s call that “speaker stereo” – the three key qualities of audio reproduction are distorted. The defects are due not only to your listening equipment or acoustics, but simply to the positional relationship between the speakers and your ears. Unless that relationship is corrected so that it produces no acoustic artifacts, correct localization, spatiality, and tone color remain imprisoned in much of your recorded collection. Even 5.1 surround is still 60° stereo in front, plus two side-rear speakers to provide envelopment. We still do not experience being “transported” to the point that captured auditory scenes are unlocked and our speakers and listening space “disappear.”

Speaker-stereo, invented in the early 1930s and popularized after World War II, initially wowed listeners. Today we’ve had to settle, or continue trying this and that, wishing for better sound. Yet the myriad recordings made over this period have often in fact captured auditory scenes that remain unauralized when reproduced using the conventional stereo triangle, because of crosstalk. Blumlein knew this, but in 1932 the compromise was small. Today we have orders-of-magnitude more linear transducers and electronics. Most weak links in the chain have been strengthened, so that crosstalk is now by far the weakest, and it can be eliminated by using Ambiophonic systems.

Are you asking: What is Ambiophonics? And how can I have it in my home? The short answers are: It’s “speaker-binaural” – and it’s a better way to listen to stereo. Ambiophonics is so named as to suggest that it is the ultimate replacement for stereophonics, just as stereophonics replaced monophonics. All the detailed papers and tools you need are available free for personal use on this web site. Read here how Ambiophonics unlocks worlds of sonic reality waiting in your present CD/LP/DVD collection.

60 Degrees of Separation

Unlike headphone listening, with separated stereo speakers, each ear hears both speakers. Each reproduced channel suffers from crosstalk in the contra-lateral (opposite) ear. With conventional 60° positioning, each single sonic arrival is accompanied by a delayed artifact that confuses our perception of direction and discolors timbre (Fig.2).

For important central sources, the equal delays in both ears result in comb-filtering, notching mid frequencies on up. Moreover, the brain is confused because the outer ears (pinnae) tell it centered signals are actually coming from speakers at the sides. Sounds at either side are confused by extra “early reflections” from the added contra-lateral delays. And everything is unnaturally narrowed to 60° in front. Listening on iPod®-like devices may satisfy more for lack of crosstalk and therefore binaural purity rather than for high fidelity – a case of psycho-acoustic perception trumping high data rates or high budget speakers. 
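To put rough numbers on the comb filtering described above, the following Python sketch estimates the crosstalk delay for speakers at ±30° and the resulting notch frequencies for a centered source. It is an illustration only: the spherical-head (Woodworth) approximation, the head radius of 8.75 cm, and the speed of sound of 343 m/s are assumptions not taken from this article, and real heads and rooms will differ.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed room-temperature value
HEAD_RADIUS = 0.0875     # m, assumed average head radius


def crosstalk_delay(speaker_angle_deg):
    """Extra travel time (s) to the far ear, Woodworth spherical-head model."""
    theta = math.radians(speaker_angle_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))


def comb_notches(delay_s, count=3):
    """First `count` notch frequencies (Hz) when a signal sums with a delayed copy."""
    return [(2 * k + 1) / (2.0 * delay_s) for k in range(count)]


delay = crosstalk_delay(30.0)   # each speaker sits 30 degrees off center
notches = comb_notches(delay)
print(f"crosstalk delay: {delay * 1e6:.0f} microseconds")
print("notch frequencies (Hz):", [round(f) for f in notches])
```

Under these assumptions the delay comes out at roughly a quarter of a millisecond, which puts the first notch near 2 kHz, in line with the “mid frequencies on up” noted above. Because the head shadows the crosstalk signal, the real notches are dips rather than complete nulls.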

Turning Stereo Inside-Out

Ambiophonics, championed by Ralph Glasgal and the Ambiophonics Institute, accomplishes “speaker binaural” by advantageously positioning speakers and by crosstalk cancellation. Early on, this was accomplished simply, if uncomfortably, by using a physical barrier to isolate each speaker-to-ear path. Instead of confining phantom images between two stereo speakers, Ambiophonics turns stereo inside out, placing speakers close together and creating more correctly localized and uncolored sonic images outside the speakers (Fig. 3 & 4). For surround sound, a second pair of speakers is located in back. Since front imaging is excellent, no separate center speaker is required for 5.1 (the DVD player is set for “no center” in order to mix the C signal to L & R channels).

Digital signal processing (DSP) technology enables quite elegant methods for crosstalk cancellation using an ordinary PC and soundcard. Newly developed is Recursive Ambiophonic Crosstalk Elimination (RACE), with MIDI control of real-time adjustable parameters, and available free for noncommercial use on this web site (see menu above for “Software Tools”), or implemented for a price in the TACT RCS 2.2XP pre/processor.
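The recursive idea behind this kind of crosstalk cancellation can be sketched in a few lines of Python. This is a simplified illustration, not the RACE software itself: each output channel is the input minus a delayed, attenuated copy of the opposite output, so every cancellation signal is itself cancelled in turn. The function name `recursive_xtc`, the attenuation and delay values, and the omission of any frequency band limiting are assumptions made for this example.

```python
def recursive_xtc(left, right, attenuation=0.7, delay_samples=3):
    """Simplified recursive crosstalk canceller (illustrative only).

    attenuation   -- linear gain of the cancellation signal; below 1 for stability
    delay_samples -- delay of the cancellation signal in whole samples (at least 1)
    """
    if not 0.0 <= attenuation < 1.0:
        raise ValueError("attenuation must be in [0, 1)")
    if delay_samples < 1:
        raise ValueError("delay_samples must be at least 1")
    n = min(len(left), len(right))
    out_l = [0.0] * n
    out_r = [0.0] * n
    for i in range(n):
        j = i - delay_samples
        # Earlier outputs of the opposite channel feed the cancellation path.
        prev_l = out_l[j] if j >= 0 else 0.0
        prev_r = out_r[j] if j >= 0 else 0.0
        out_l[i] = left[i] - attenuation * prev_r
        out_r[i] = right[i] - attenuation * prev_l
    return out_l, out_r


# A single click in the left channel only (hypothetical test signal).
click_l = [1.0] + [0.0] * 7
click_r = [0.0] * 8
xtc_l, xtc_r = recursive_xtc(click_l, click_r, attenuation=0.5, delay_samples=2)
print(xtc_l)
print(xtc_r)
```

The click appears in the left output, followed by an inverted, weaker copy in the right output, then a still weaker correction back in the left, and so on. Raising the attenuation value or changing the delay corresponds loosely to the kind of real-time adjustment described above, though the actual parameters and filtering of RACE are not specified here.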

While Ambiophonic DSP solves stereo’s biggest problem, like stereo and 5.1, it degrades away from the sweet spot. Wavefield Synthesis (WFS) notwithstanding, any practical reproduction system for home use has a sweet spot or area. The more capable the reproduction and precise the recording, the more clearly we perceive a sweet point, line, or plane. Ambiophonics achieves crosstalk cancellation only along the median line between the speakers, but you want to be on that sweet line because it sounds best there in any case. (Moving off the median line is helped by ambience convolution as described later.) A workstation or PC fixes the listener in the sweet spot.

To leave the choice of compromises to the user, RACE is adjustable in real time. Settings trade coloration artifacts with stage width, which can exceed 120° – twice the width of conventional stereo, and with no bunching of sounds at the speakers (or hole-in-the middle). Minimizing side-to-center coloration differences, but with fixed parameters, is the Choueiri optimized crosstalk cancellation convolution solution that uses separated frequency bands.

What about surround sound? Experimental PanAmbio 4-channel/4-speaker surround recordings are able to localize sounds around 360° to within ±5°. With that comes improved spatiality and undistorted tone color in 2D. Experimental full-sphere recordings based on Ambiophonics deliver complete natural hearing with 3D realism, especially compared to stereo, 5.1, 7.1, etc. (Demonstrations of RACE, Choueiri optimized crosstalk cancellation, or High Sonic Definition 3D are available by appointment at Filmaker Technology.)

Restoring Ambience

For reasons described in the sidebar “Microphones for Stereo and Ambio,” commercial recordings must be produced with lower ambience than would be heard in reality. For restoring ambience, Ambiophonics as an option uses extra speakers and convolution with acoustic impulse responses (IR) of your choice, collected from concert halls, caves, etc. Ambience convolution is especially “green” because ambience channels do not have to be recorded or take up delivery bandwidth. Many released stereo recordings benefit from such ambience enhancement, especially if the IR is available of the exact or similar space in which the recording was made. For example, a necessarily “dry” release made in Der Musikverein in Vienna can be restored using that very hall’s IR.

Note that hall IR convolution is decidedly not just “reverb.” Actual spaces are measured, made available for purchase, downloadable from Websites, then uploaded to the user’s DSP/PC where it is selected and adjusted to merge perfectly with the recording.
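As a minimal sketch of what convolving a recording with a hall impulse response means, the Python below applies a direct-form convolution and scales the result for an ambience speaker feed. The function names, the gain parameter, and the tiny made-up impulse response are hypothetical and chosen only to keep the arithmetic visible. A real hall IR is typically seconds long, so practical systems use much faster FFT-based convolution.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution: every input sample launches a scaled copy of the IR."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def ambience_feed(dry, hall_ir, gain=0.5):
    """Signal for one ambience speaker: the dry recording convolved with a hall IR."""
    return [gain * v for v in convolve(dry, hall_ir)]


dry = [1.0, 2.0]             # two samples of a "dry" recording (made up)
hall_ir = [1.0, 0.0, 0.5]    # toy IR: direct sound plus one reflection (made up)
print(convolve(dry, hall_ir))
print(ambience_feed(dry, hall_ir, gain=0.5))
```

Each extra ambience speaker would receive its own feed, usually from a different impulse response measured in the same hall, which is why no additional recorded channels are needed.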

While the benefits of ambience IR convolution are many, it applies to staged performances, where all source voices are in front. Direct surround recording and reproduction is preferable if direct sounds come from all around. Exact early reflections obtained this way are integrated by the pinna-brain system into lifelike localization, spatiality, and tone color.

Ambiophonics for All

The results? Scientific studies with blind listening tests concluded that Ambiophonics produces preferred localization, spatiality, and tone color compared to speaker stereo (Fig.5). Ambiophonics has been demonstrated and well received at the Ambiophonics Institute outside New York City and at Filmaker Technology in Bethlehem PA USA, at trade shows CES and HE2007, and club meetings such as the New York Audiophile Society and Boston Audio Society, and in papers presented at engineering conferences of AES, ASA, SMPTE, and VDT Leipzig.

Is Ambiophonics the perfect reproduction we seek? Perhaps not yet – work toward “perfection” is ongoing. However, for serious listening by one or two seated on the median line between Ambiodipole speakers, 2-channel Ambiophonics can unlock qualities captured in most of your recorded collection, including LPs and CDs. 4-channel PanAmbio produces more exciting surround sound for 5.1 DVD movies and games, and is especially compelling for opera and other multi-channel concert DVDs.

As for the emerging formats, the high definition sound of Ambiophonics is desirable to complement high definition pictures. But many recordings over stereo’s 75-year history await your rediscovery. And your renewed enjoyment.

Fig.1 – Ambiophonics in a home environment. To speakers in conventional stereo positions is added a closely spaced Ambiodipole in front. Speaker switching in receiver selects mode. A second pair in back adds 5.1 compatible surround.

Fig.2 – Although recording captures properly timed stereo signals at top, during conventional stereo replay, speaker crosstalk creates pinna confusion and comb filtering of the center voice – artifacts avoided using Ambiophonics.

Fig.3 – Ambiophonics for “speaker binaural” replay of stereo recordings using LF & RF. PanAmbio 4.0 adds speakers LB & RB for 5.1-compatible surround, or for enhancing stage width when playing existing two channel CDs or LPs.

Fig.4 – Ambiodipole FL & FR “speaker binaural” broadens stage to 120° or more. Adding BL & BR envelops 360° (allowing for “cone of confusion” at sides) for 2D surround.

Fig.5 – Blind study comparing a live oboe trio to replay of recording in stereo, Ambiophonics, and front+back Ambiophonics shows significant preference for Ambio.

Appendix 

Microphones for Stereo, Ambio, & Surround 

by Robin Miller ©2008

Suspended over the tenth row of a moderately live hall, the “dummy head” microphone was well outside the “critical radius.” Live, the concert sounded glorious. Back home, the binaural recording would recreate the event with compellingly lifelike reproduction over headphones.

However, a binaural recording (one involving that part of the HRTF due to the pinnae) would be unusable over speakers conventionally positioned in the stereo “triangle.” For one thing, the recording would be confusingly “encoded” with filtration by pinnae other than your own. Moreover, the recording would capture more ambience than would make sense to our ear-brain system when it comes entirely from only 60° in front instead of from all around. Reproducing a natural level of ambience using surround channels makes sense to the ear-brain. Natural hearing includes not only the entire horizontal circle, but a sphere of sound (including height cues) – anything less isn’t totally real. Thirdly, while binaural recordings work over headphones, they may not work for more comfortable speaker listening, where the soundstage is external to the listener’s head and stays put with head rotation. (We unconsciously perform tiny head rotations to confirm localization.)

No wonder that recording acceptable stereo for speakers has “devolved” from use of a main microphone, acting as surrogate ears for the listener, to many close-in microphones panned channel to channel, as though interaural level differences (ILD) alone made for the spatial sound we experience in real life. An entire class of coincident main microphones, from Gerzon’s ingenious Ambisonics to its mathematical subsets, including Blumlein’s XY figure eights and Mid-Side (MS), also lacks the interaural time differences (ITD) of natural hearing. At the other extreme, widely spaced pairs and moderately spaced Decca-Trees and variants capture time differences many multiples of the 640µs between our ears, and so deliver highly spacious but ambiguous, phasey sound that diffuses the image so totally as to make any “sweet spot” undetectable (i.e., the result sounds imperfect everywhere, and no better in any one spot). Popularly employed, these microphones are usually placed well within the critical radius of acoustic spaces, where no one in the audience would want to be (often over the conductor’s head). Here, the ambience impression is purposely weakened for normal stereo release. But by the relative inverse squares of their distances, close instruments are too close and distant ones too distant. Fixing these resulting imbalances has led to a cure often worse than the illness – namely many spot microphones that increase costs for producers and artifacts for consumers.
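The inverse-square imbalance mentioned above is easy to quantify. The Python sketch below compares the level difference between a near and a far instrument as picked up by a close microphone and as heard from an audience seat, assuming free-field point sources of equal power. The function name and all distances are hypothetical values chosen only for illustration.

```python
import math


def level_difference_db(near_m, far_m):
    """Level advantage (dB) of the nearer of two equal sources under the inverse-square law."""
    return 20.0 * math.log10(far_m / near_m)


# Hypothetical layout: front desk and back row of an orchestra are 6 m apart.
mic = level_difference_db(2.0, 8.0)      # microphone 2 m from the front desk
seat = level_difference_db(12.0, 18.0)   # listener 12 m from the front desk
print(f"close microphone: {mic:.1f} dB, audience seat: {seat:.1f} dB")
```

With these made-up distances the close microphone hears the front desk about 12 dB louder than the back row, whereas a listener in the hall hears only about 3.5 dB of difference. That gap is the imbalance that spot microphones are then brought in to repair.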

Truthful microphone techniques for stereo reproduction lie somewhere in between. “Pinnaless dummy head” recordings played over stereo speakers are still too ambient: the auditory scene seems receded. The ORTF technique is an outstanding compromise, as the cardioid microphones it employs attenuate offstage ambience, although cardioids also lack bass response. Bruck’s pinnaless sphere microphone with bi-directional outriggers (Schoeps KFM-360) separates soundstages front and back, compatible with 5.1 surround. The baffled ellipsoidal Ambiophone developed by the writer separates the front 120° from the back 240° and ceiling for precise stereo and surround localizing, spatiality, and tone color, especially in PanAmbio (4-channel) as well as 5.1 surround. Sonic reality is completed by full-sphere 3D (with height) recordings, such as the author’s High Sonic Definition 3D.


An internationally recognized engineering consultant and Peabody award-winning producer, Robin Miller has presented advanced 2D and 3D audio solutions worldwide to the Audio Engineering Society, Society of Motion Picture & Television Engineers, Acoustical Society of America, Canadian Acoustical Association, and German Tonmeisters. As an invited panelist at the AES 2007 Italia conference in Parma, he demonstrated Ambiophonics, 5.1-compatible PanAmbio 2D surround, and full-sphere 3D over ten speakers using his original recordings. His company, Filmaker Technology, engages in applied research, systems design & integration, and has a patent for a system of full-sphere 3D recording & reproduction.
