Comparison of Cross-Talk Cancellation Filters

By Angelo Farina, May 28, 2010

Preamble

The following technical note on the nature of crosstalk cancellation was prepared by Professor Angelo Farina of the University of Parma, Italy. Dr. Farina, a fellow of the Audio Engineering Society, has been an active supporter of Ambiophonics from its conception and has written many papers on this and similar topics. His website, http://pcfarina.eng.unipr.it, is an invaluable resource for anyone doing research in this field. In the following paper, RACE stands for Recursive Ambiophonic Crosstalk Elimination.

Unfortunately there are no frequency or dB numbers on the graphs shown. However, the first peak is normally at about 4000 Hz and the bass droop begins at about 100 Hz. To avoid confusion, ignore the dashed curves in all the graphs. The solid curves represent what each competing crosstalk cancelling algorithm can do. The blue solid curve is essentially what the left ear will hear with a left-only input, and the red solid curve is what the right ear will hear with the same left-only input. Ideally the blue curve should be a flat straight line and the red curve (the undesirable crosstalk) should approach inaudibility. These curves only show the improvement in ILD (Interaural Level Difference) and not the equally important improvement in ITD (Interaural Time Difference) or the improvements in pinna functionality due to the reduction in comb filtering.

In essence, this paper confirms that it is indeed possible to have excellent crosstalk cancellation without an audible or significant loss of dynamic range. The postamble below contains some additional observations on the audibility and applicability of some aspects of these types of measurements or calculations.

Ralph Glasgal, December, 2010

This is the typical flow diagram of a XTLK cancellation filtering system:

This is the typical frequency-response of acoustical transfer functions h

The traditional Stereo Dipole cross-talk cancellation filters are designed to flatten the ipsilateral signal and to cancel the contralateral (crosstalk) signal completely at every frequency:

This results in a significant loss of gain, which compromises the dynamic capabilities of the system. The loss of gain arises because at low frequency the ipsilateral and contralateral responses are very similar, so cancelling the contralateral signal requires weakening the ipsilateral signal significantly.
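The origin of this gain loss can be illustrated with a minimal sketch. It assumes a simplified free-field model that is not part of the original note: the ipsilateral path is taken as 1 and the contralateral path as an attenuated, delayed copy `c = g * exp(-j*2*pi*f*tau)`. The values of `crosstalk_gain` and `crosstalk_delay_s` are hypothetical and chosen only for illustration.

```python
import cmath
import math


def inversion_gain_db(freq_hz, crosstalk_gain=0.9, crosstalk_delay_s=100e-6):
    """Gain in dB of the direct term of the exact 2x2 inverse.

    Assumed model (illustrative only): ipsilateral path = 1,
    contralateral path c = g * exp(-j*2*pi*f*tau).
    """
    c = crosstalk_gain * cmath.exp(-2j * math.pi * freq_hz * crosstalk_delay_s)
    # The inverse of [[1, c], [c, 1]] is [[1, -c], [-c, 1]] / (1 - c^2),
    # so the direct (ipsilateral) filter is 1 / (1 - c^2).
    direct = 1 / (1 - c * c)
    return 20 * math.log10(abs(direct))


for f in (50, 200, 1000, 2500):
    print(f"{f:5d} Hz: exact inverse needs {inversion_gain_db(f):6.1f} dB")
```

In this model the exact inverse demands its largest boost at low frequency, where `c` is nearly equal to the ipsilateral path. To avoid clipping, the whole filter has to be scaled down by roughly that peak boost, which is the gain loss described above.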

The possible solutions are:

1) Employ a separate pair of powerful woofers for low frequencies; they can be spaced more widely than the main loudspeakers (OSD).
It must be noted that this causes a gain loss which is still much greater than RACE, albeit not as large as with the original Kirkeby approach.

2) Employ a processing algorithm which does not cause a significant gain loss, but with reduced crosstalk cancellation at low frequency (RACE), as shown here below:

It must be noted how RACE leaves the ipsilateral response substantially unchanged, with just a minimal loss of gain, whilst the contralateral (crosstalk) signal is strongly attenuated at mid-high frequencies.

3) The modified Kirkeby inversion as proposed by Farina, where the regularization parameter is increased at low and high frequencies, so that the inversion is less accurate in these regions:

It must be noted that this causes a gain loss which is still much greater than RACE, albeit not as large as with the original Kirkeby approach.

4) And now, finally, the new Farina-Binelli approach, in which Kirkeby’s inversion is applied again, but with a changed target function, which is no longer flat and does not attempt to cancel the crosstalk at low frequencies.

It must be noted that the new ipsilateral target function, with bass boosted between 80 and 300 Hz, corresponds to the “Italian taste”, the result of psychoacoustic tests conducted in Italy. It is known that such a bass boost is not so appreciated in northern countries, whilst in Brazil they love it even more than in Italy. They probably would like to have the bass boost extended down to 30 Hz and raised by a few more dB. The gain loss is now similar to RACE at low frequency and, similarly, the XTLK cancellation is poor at low frequency. At high frequency, the response is now flattened, but of course the gain must be lower, as the equalized curve always has to lie below the original one. Listening tests are planned during summer 2009, for comparatively assessing the performance of all these cross-talk cancellation filters in real conditions.
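The frequency-dependent regularization of solution 3 can be sketched as follows. This is only an illustration under stated assumptions, not Farina's actual filter design: it reuses a hypothetical symmetric free-field model (ipsilateral path 1, contralateral path `c = g * exp(-j*2*pi*f*tau)`), and the band edges and regularization values in `regularization` are invented for the example. Each mode of the system is inverted with the usual regularized form `conj(m) / (|m|^2 + beta)`.

```python
import cmath
import math


def regularization(freq_hz, low_hz=100.0, high_hz=8000.0,
                   beta_in=1e-3, beta_out=0.5):
    """Hypothetical regularization: small in band, large at low and high frequency."""
    return beta_in if low_hz <= freq_hz <= high_hz else beta_out


def regularized_inverse(freq_hz, g=0.9, tau=100e-6):
    """Return (direct, cross) inverse filters for the plant [[1, c], [c, 1]]."""
    c = g * cmath.exp(-2j * math.pi * freq_hz * tau)
    beta = regularization(freq_hz)
    # A symmetric 2x2 plant decouples into a sum mode (1 + c) and a
    # difference mode (1 - c); each is inverted with regularization.
    inv_sum, inv_diff = (m.conjugate() / (abs(m) ** 2 + beta)
                         for m in (1 + c, 1 - c))
    return (inv_sum + inv_diff) / 2, (inv_sum - inv_diff) / 2


for f in (50, 1000, 16000):
    direct, cross = regularized_inverse(f)
    print(f"{f:6d} Hz: direct filter gain {20 * math.log10(abs(direct)):6.1f} dB")
```

With `beta` set to zero the two filters reduce to the exact inverse, `1 / (1 - c^2)` and `-c / (1 - c^2)`. A larger `beta` outside the band caps the boost there at the cost of less accurate cancellation, which is the trade-off the text describes.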

Postamble

Since the ear cannot localize low bass frequencies from about 90 Hz down, there is no reason to do crosstalk cancellation in this frequency region. Thus in most RACE implementations, this band is simply bypassed around the crosstalk elimination processor. However, in practice this bass response irregularity is much less than is found in standard stereo, where bass is boosted by 3 dB in power because both speakers illuminate each ear equally, there being no attenuation around the head. Using RACE, the bass variation is about plus or minus 1.5 dB in the lowest octave, depending on the bass channel distribution in the recording. Room modes easily swamp such variations; such a small change in low bass level is not detectable and is much less of a variation than is found in concert halls. But, again, the bass can be bypassed and is then heard as in normal stereo. Subwoofer controls can also be used to compensate for any perceived bass error in Ambiophonics or stereo.

The basic problem with all such curves is that they do not indicate what actually happens psychoacoustically. These curves represent the time-averaged sum of the two acoustic signals reaching each ear. In essence, imagine a microphone at an ear measuring pink noise coming from each speaker and displaying the response curve on a screen. This measurement or calculation scheme is an averaging method. But is that the way the ear works? In RACE, for, say, a left-only input, the left speaker delivers this left channel input unaltered to the left ear. This is followed by some delayed and about 6 dB attenuated acoustic energy from the right speaker, then by even longer delayed and more attenuated energy from the left speaker, and so on for milliseconds for each sample, overlapping with similar trains from earlier samples. These late-arriving, decaying replicas of the original left input are like early reflections off heads and seats at a concert.

There is no evidence that the brain hears these later arrivals as changes in frequency response, even if a meter shows them as such. This is like a precedence effect, where early echoes do not confuse the brain as to the position of a sound source. Does the brain recognize trains of samples delayed on the order of 400 microseconds to milliseconds and greatly attenuated as changes in timbre? One can easily see that the answer is no. Just move up toward the speakers in a RACE system until the 60 degree stereo loudspeaker angle is reached and you will hear normal stereo. More research is needed to document this phenomenon more fully, but the issue is easily avoidable.

At higher frequencies, above 2000 Hz or so, the pinnae begin to control localization exclusively. It thus makes no sense to do crosstalk cancellation at frequencies over about 4000 Hz. So if this region is not crosstalk cancelled, one hears these frequencies as in stereo. But in the normal stereo case, the speakers, being at 30 degrees to the side, provide a false pinna localization cue, especially for central sounds, and also engender comb filtering starting at about 1000 Hz. With RACE the speakers are close together and the pinna cues are correct for much of the central region. Also, the start of the comb filtering is moved up more than two octaves, which helps the pinna. It is also possible to eliminate the high frequency peak via equalization or by correcting the inverter/delay of the RACE system so that it truly inverts at higher frequencies. That is, one compensates the inverter or adjusts the delay to compensate for the phase shift engendered by a fixed delay, so that the crosstalk is always completely cancelled rather than enhanced accidentally at high frequencies. Obviously, simple bypass is more practical.
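The train of delayed, attenuated replicas described above for a left-only input can be reproduced with a minimal sketch of the recursion. It is a simplification under stated assumptions, not a complete RACE implementation: there are no bass or treble bypass filters, and the cross-feed `gain` of 0.5 (about 6 dB of attenuation) and `delay` of 3 samples are hypothetical values chosen for readability.

```python
def race_impulse(gain=0.5, delay=3, length=13):
    """Speaker feeds for a left-only unit impulse through a simplified RACE loop.

    Assumptions (illustrative only): full-band recursion, no bypass filters,
    cross-feed gain 0.5 (about -6 dB), delay of 3 samples.
    """
    left_in = [1.0] + [0.0] * (length - 1)
    right_in = [0.0] * length
    left_out = [0.0] * length
    right_out = [0.0] * length
    for n in range(length):
        # Each output is its own input minus a delayed, attenuated copy of
        # the opposite output (the inverted cross-feed).
        from_right = right_out[n - delay] if n >= delay else 0.0
        from_left = left_out[n - delay] if n >= delay else 0.0
        left_out[n] = left_in[n] - gain * from_right
        right_out[n] = right_in[n] - gain * from_left
    return left_out, right_out


left, right = race_impulse()
for n, (l, r) in enumerate(zip(left, right)):
    if l or r:
        print(f"n={n:2d}  left speaker {l:+.4f}  right speaker {r:+.4f}")
```

In this sketch the left speaker emits the unaltered impulse first, the right speaker follows one delay later with an inverted copy at half amplitude, the left speaker follows one further delay later at a quarter amplitude, and so on, alternating between speakers while decaying.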

Ralph Glasgal, December, 2010
