I'll review several recent models of pitch perception, multiple pitch
perception (of concurrent instruments), and source segregation. All are
based on a hypothetical neural mechanism involving temporally precise
inhibition. In their simplest form, the models involve a gating neuron fed
by two pathways, one direct and excitatory, and the other indirect (delayed) and
inhibitory. Every spike arriving along the direct pathway is transmitted
unless it coincides with a spike arriving along the delayed pathway. This
affects the statistics of inter-spike intervals, and suppresses the
correlates of periodic sources with a period equal to the delay. Tuning
this "cancellation filter" to the period of a source suppresses that source
and allows other sources to emerge. The filter can also be used to
estimate the period (and therefore the pitch) of a sound, by searching the
delay parameter space for a minimum of the filter's output. In that case, the model is
formally equivalent to the classic autocorrelation pitch perception model
of Licklider. However, the cancellation model can easily be extended to
explain the perception of multiple pitches evoked by simultaneously playing
instruments. Recent psychophysical results on the pitch shifts of mistuned
harmonics can be used to predict the topology of the neural circuits
involved.
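
The gating mechanism described above can be illustrated with a minimal sketch. It is not the implementation from the papers listed below: it assumes time is discretized into bins, spike trains are lists of 0s and 1s, and "coincidence" means a spike in the same bin. The names cancellation_gate, estimate_period and periodic_train are hypothetical and introduced only for this illustration.

```python
def cancellation_gate(spikes, delay):
    """Transmit each direct-path spike unless a spike arrives in the same
    time bin along the inhibitory path, which carries the same input
    delayed by `delay` bins (assumed discretization)."""
    out = []
    for t, s in enumerate(spikes):
        inhibited = t >= delay and spikes[t - delay] == 1
        out.append(1 if s == 1 and not inhibited else 0)
    return out


def estimate_period(spikes, max_delay):
    """Search delays 1..max_delay and return the one that leaves the
    lowest residual spike rate at the output of the gate."""
    best_delay, best_rate = None, None
    for delay in range(1, max_delay + 1):
        # Skip the first `delay` bins, where the delayed path is still empty.
        residual = cancellation_gate(spikes, delay)[delay:]
        rate = sum(residual) / len(residual)
        if best_rate is None or rate < best_rate:
            best_delay, best_rate = delay, rate
    return best_delay


def periodic_train(period, n_bins, pattern):
    """Hypothetical test input: a spike in every bin whose phase
    (t modulo period) belongs to `pattern`."""
    return [1 if (t % period) in pattern else 0 for t in range(n_bins)]


# Two idealized periodic sources and their superposition.
a = periodic_train(10, 200, {0, 3})   # period of 10 bins
b = periodic_train(7, 200, {1})       # period of 7 bins
mixture = [max(x, y) for x, y in zip(a, b)]

print(estimate_period(a, 15))         # 10

# Tuning the gate to the period of source a removes its spikes from the
# mixture; the spikes that remain all belong to source b.
residual = cancellation_gate(mixture, 10)
print(sum(residual[10:]), "spikes remain after cancellation")
```

In this toy setting the search stops at the smallest delay that reaches the minimum, so multiples of the period (which would cancel the source equally well) are not reported. Real spike trains are stochastic, so the residual rate would show a dip rather than an exact zero at the period.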
References:
(1) A. de Cheveigné (1999). "Pitch shifts of mistuned partials: a time-domain model," J. Acoust. Soc. Am. 106, 887-897. [abstract]
(2) A. de Cheveigné and H. Kawahara (1999). "Multiple period estimation and pitch perception model," Speech Communication 27, 175-185. [abstract and PostScript]
(3) A. de Cheveigné (1998). "The auditory system as a separation machine," Proc. ATR workshop on events and auditory temporal structure, 1-7. [PDF]
(4) A. de Cheveigné (1998). "Cancellation model of pitch perception," J. Acoust. Soc. Am. 103, 1261-1271. [abstract]
(5) A. de Cheveigné (1997). "Concurrent vowel identification III: A neural model of harmonic interference cancellation," J. Acoust. Soc. Am. 101, 2857-2865. [abstract]
(6) A. de Cheveigné (1993). "Time-domain comb filtering for speech separation," ATR Human Information Processing Laboratories technical report TR-H-016.