Cancellation is also an effective strategy for exploiting binaural (two-channel) cues. According to the Equalization-Cancellation theory of Durlach, it is the strategy used by humans. Cancellation can in principle provide infinite interference suppression (for point sources, in the absence of reverberation), whereas 2-channel beamforming (the binaural equivalent of "harmonic enhancement") is limited to a 6 dB improvement at best. In the presence of multiple interfering sources or reverberation, however, 2-channel cancellation is much less effective.
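The contrast between cancellation and beamforming can be illustrated with a minimal sketch. It rests on assumptions that are not in the text: an idealized anechoic setting, a single white-noise interferer whose right-ear copy lags the left by a known integer number of samples, and a target with zero interaural delay. All function and variable names are hypothetical. Because both operations are linear, the interferer and target components are processed separately to show what happens to each.

```python
import random

def delay(x, d):
    """Delay sequence x by d samples (zero-padded, same length)."""
    return [0.0] * d + x[:len(x) - d]

def power(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)

def cancel(left, right, d):
    """Equalize (delay the left channel by d), then subtract the right channel.
    A source whose right-ear copy lags the left by d samples is removed."""
    return [a - b for a, b in zip(delay(left, d), right)]

def beamform(left, right):
    """Two-channel delay-and-sum steered at a source with zero interaural delay."""
    return [(a + b) / 2 for a, b in zip(left, right)]

rng = random.Random(0)
n, d = 4000, 5  # assumed signal length and interferer delay, in samples

interferer = [rng.gauss(0, 1) for _ in range(n)]
target = [rng.gauss(0, 1) for _ in range(n)]

# Interferer as received at each ear (point source, no reverberation).
intf_left, intf_right = interferer, delay(interferer, d)

residual_cancel = power(cancel(intf_left, intf_right, d))  # interferer after cancellation
residual_beam = power(beamform(intf_left, intf_right))     # interferer after beamforming
target_kept = power(cancel(target, target, d))             # target survives, but filtered

print(residual_cancel, residual_beam, target_kept)
```

In this idealized case the interferer is removed exactly by cancellation, while beamforming only reduces its power (roughly halving it for this white-noise example). The target passes through the canceller as the difference between itself and a delayed copy, so it is retained but spectrally altered.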
Before discounting binaural cancellation, it is worth investigating a new model by Culling and Summerfield (1995), in which cancellation is performed independently in different frequency channels. In principle, this could allow simultaneous cancellation of multiple interfering sources, as long as they do not all have the same spectral envelope.
Multiple microphones allow much more flexibility: beamforming can have more gain, and multiple sources can be cancelled. The problem is perhaps best viewed in terms of blind separation [using techniques designed for delayed and convolved, as opposed to statically mixed, sources]. Blind separation can be viewed as a technique for finding the separation matrix that produces N outputs that are well decorrelated. Supposing that one output corresponds to the target and each of the N - 1 others to an interfering source (or part of the reverberation), the best decorrelation occurs when all interfering sources are cancelled in the target's channel. Blind separation thus serves as a technique to find the best cancellation parameters.
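The link between decorrelation and cancellation can be sketched in a deliberately reduced setting. The assumptions here go well beyond the text: two channels, static (instantaneous) mixing, and a second channel that contains only the interferer, which makes this closer to classical adaptive noise cancellation than to blind separation proper. A full N-by-N separation matrix, and delayed or convolved mixtures, need considerably more machinery. The names `leak`, `x1`, `x2`, and `w` are hypothetical.

```python
import random

def mean_product(x, y):
    """Sample estimate of the correlation <x y> (signals assumed zero-mean)."""
    return sum(a * b for a, b in zip(x, y)) / len(x)

rng = random.Random(1)
n = 5000
target = [rng.gauss(0, 1) for _ in range(n)]      # hypothetical source 1
interferer = [rng.gauss(0, 1) for _ in range(n)]  # hypothetical source 2

leak = 0.8  # assumed gain of the interferer in channel 1, unknown to the algorithm
x1 = [t + leak * i for t, i in zip(target, interferer)]  # target plus interference
x2 = interferer                                          # interferer-only channel (assumption)

# Choose the weight that makes the output y = x1 - w * x2 uncorrelated with x2.
w = mean_product(x1, x2) / mean_product(x2, x2)
y = [a - w * b for a, b in zip(x1, x2)]

residual_correlation = mean_product(y, x2)  # close to zero by construction
print(w, residual_correlation)
```

The weight that decorrelates the two outputs lands close to the true leakage gain, so in this toy case the best-decorrelated solution is also the one that cancels the interferer from the target's channel.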
It is likely that, besides the many approaches to blind separation, there are others that are less mathematical but nevertheless useful. For example, one can exploit the statistics of temporal speaking patterns, and the fact that, in normal multi-speaker conversation, most speakers are silent most of the time. When a single person speaks, the system can adjust to cancel that speaker more easily than when several people speak (with multiple microphones this can be done for a subspace of parameters). This may make cancellation of multiple speakers easier. Alternatively, the system can learn the position parameters of each speaker (or interfering source). It then only has to search the relatively small space of speaker combinations to find appropriate separation parameters. Audiovisual methods might be quite useful here.
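The idea of learning a speaker's parameters while that speaker talks alone can be sketched as follows. This is an illustration under assumptions not found in the text: two channels, an anechoic point source, a non-negative integer interaural delay as the only "position parameter", and some external mechanism (not shown) that has already flagged the interval as single-speaker. The function names are hypothetical.

```python
import random

def delay(x, d):
    """Delay sequence x by d samples (zero-padded, same length)."""
    return [0.0] * d + x[:len(x) - d]

def residual_power(left, right, d):
    """Power remaining after delaying `left` by d samples and subtracting `right`."""
    diff = [a - b for a, b in zip(delay(left, d), right)]
    return sum(v * v for v in diff) / len(diff)

def estimate_delay(left, right, max_delay):
    """Pick the candidate delay that best cancels the single active source."""
    return min(range(max_delay + 1),
               key=lambda d: residual_power(left, right, d))

rng = random.Random(2)
true_delay = 7  # assumed interaural delay of the lone speaker, in samples

# Interval in which only one speaker is active (white noise stands in for speech).
solo = [rng.gauss(0, 1) for _ in range(2000)]
left, right = solo, delay(solo, true_delay)

learned = estimate_delay(left, right, max_delay=20)
print(learned)
```

Once a delay has been learned for each speaker in this way, later mixtures only require choosing among the stored values rather than searching the full parameter space, which is the reduction in search effort described above.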
It should be stressed that, whatever the technique, the separation matrix introduces spectral distortion, as was the case for harmonic cancellation. Cancellation, whether it involves blind separation or not, should therefore be used together with missing-feature methods.
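As a worked illustration of this distortion, consider the simplest case, which is an assumption made here rather than a configuration taken from the text: a two-channel delay-and-subtract canceller in which the target's interaural delay differs from the interferer's by Δ seconds (Δ is a symbol introduced for this example). After the interferer has been equalized and subtracted, the target is left as the difference between itself and a copy delayed by Δ, so it passes through the filter

```latex
\[
H(\omega) = 1 - e^{-j\omega\Delta}, \qquad
|H(\omega)| = 2\left|\sin\!\left(\frac{\omega\Delta}{2}\right)\right| .
\]
```

The gain is zero at frequencies that are integer multiples of 1/Δ Hz and reaches 2 (about +6 dB) midway between them, so the target's spectrum is comb-filtered. Spectral regions near the zeros carry little usable target information, which is one way to see why such regions may be best treated as missing.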