Perceptual pitch shift of sounds with a similar waveform autocorrelation

KXX-ABX sound examples

These examples illustrate all the stimuli studied in the ARLO paper "Perceptual pitch shift of sounds with a similar waveform autocorrelation", by Daniel Pressnitzer, Alain de Cheveigné and Ian M. Winter. The paper investigated the perception of two types of sounds, termed KXX and ABX. Please refer to the full text for details.

Two versions of the sound examples are provided here: the sounds used in the actual experiment, which were sampled at 20 kHz, and the same sounds resampled at 44.1 kHz using Matlab. The 20 kHz version is probably best suited for displaying and analysing the sounds; the 44.1 kHz version should play on most platforms.
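For readers who want to redo the conversion themselves, the arithmetic behind the two sampling rates can be sketched as follows. This is only an illustration in Python, not the Matlab procedure used for the files on this page; the function names `resampled_length` and `period_in_samples` are made up for the example. A rational resampler (for instance `scipy.signal.resample_poly(x, up, down)`) would take the two factors computed here.

```python
from fractions import Fraction

ORIGINAL_RATE_HZ = 20_000   # rate used in the experiment (from the text)
TARGET_RATE_HZ = 44_100     # rate of the resampled versions (from the text)

# Reduce the rate ratio to lowest terms: upsample by `up`, downsample by `down`.
ratio = Fraction(TARGET_RATE_HZ, ORIGINAL_RATE_HZ)
up, down = ratio.numerator, ratio.denominator


def resampled_length(n_samples: int) -> int:
    """Number of output samples after resampling by up/down (rounded up)."""
    return -(-n_samples * up // down)   # integer ceiling division


def period_in_samples(periodicity_hz: int, rate_hz: int) -> Fraction:
    """Exact length of one period, in samples, at the given sampling rate."""
    return Fraction(rate_hz, periodicity_hz)


if __name__ == "__main__":
    print("up/down factors:", up, down)
    print("one second of audio:", resampled_length(ORIGINAL_RATE_HZ), "samples")
    for f0 in (100, 200, 400):
        print(f0, "Hz:",
              period_in_samples(f0, ORIGINAL_RATE_HZ), "samples at 20 kHz,",
              float(period_in_samples(f0, TARGET_RATE_HZ)), "samples at 44.1 kHz")
```

At 20 kHz each of the three periodicities corresponds to a whole number of samples, whereas at 44.1 kHz the 200 and 400 Hz periods fall between samples. That may be one practical reason to prefer the 20 kHz files for analysis.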

All sound sequences have the same structure: [KXX-ABX-pause] played three times, with a different random realisation of KXX and ABX each time. According to the pitch shift effect described in the paper, KXX should sound lower than ABX at the two higher periodicities, 200 and 400 Hz. This happens even though the waveform autocorrelation peaks of the two stimuli are very similar.
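The sequence layout can be sketched in a few lines of Python. This is a hypothetical illustration of the [KXX-ABX-pause] structure only: the generators below produce plain Gaussian noise as stand-ins, not the actual KXX and ABX stimuli (see the full text for those), and the durations and pause length are arbitrary assumptions.

```python
import random
from typing import Callable, List

Signal = List[float]


def build_sequence(make_kxx: Callable[[], Signal],
                   make_abx: Callable[[], Signal],
                   pause_samples: int,
                   repeats: int = 3) -> Signal:
    """Concatenate [KXX, ABX, pause] `repeats` times.

    The generators are called anew on every repetition, so each
    repetition gets a different random realisation of both stimuli.
    """
    sequence: Signal = []
    for _ in range(repeats):
        sequence.extend(make_kxx())
        sequence.extend(make_abx())
        sequence.extend([0.0] * pause_samples)   # silence
    return sequence


def placeholder_stimulus(n_samples: int, rng: random.Random) -> Signal:
    """Hypothetical stand-in: Gaussian noise, NOT a real KXX or ABX sound."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


rng = random.Random(0)   # fixed seed so the sketch is reproducible
seq = build_sequence(
    make_kxx=lambda: placeholder_stimulus(1000, rng),   # assumed duration
    make_abx=lambda: placeholder_stimulus(1000, rng),   # assumed duration
    pause_samples=500,                                  # assumed pause
)
```

Passing generator functions rather than fixed arrays is what guarantees a new realisation on each of the three repetitions.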

All conditions, 20 kHz sampling frequency
Broadband, 100 Hz periodicity	3 kHz high-pass, 100 Hz periodicity	6 kHz high-pass, 100 Hz periodicity
Broadband, 200 Hz periodicity	3 kHz high-pass, 200 Hz periodicity	6 kHz high-pass, 200 Hz periodicity
Broadband, 400 Hz periodicity	3 kHz high-pass, 400 Hz periodicity	6 kHz high-pass, 400 Hz periodicity

All conditions, 44.1 kHz sampling frequency
Broadband, 100 Hz periodicity	3 kHz high-pass, 100 Hz periodicity	6 kHz high-pass, 100 Hz periodicity
Broadband, 200 Hz periodicity	3 kHz high-pass, 200 Hz periodicity	6 kHz high-pass, 200 Hz periodicity
Broadband, 400 Hz periodicity	3 kHz high-pass, 400 Hz periodicity	6 kHz high-pass, 400 Hz periodicity

Why would this be interesting? Even though we know that human pitch perception is rather complicated, it is intuitively appealing to believe that pitch information is visible somewhere in power-spectrum-like or autocorrelation-like representations of the sounds. In the present case, these representations show almost no difference between the high-pass filtered versions of KXX and ABX. We do recover a difference, however, after passing the stimuli through a very crude auditory model (see full text).

KXX, 3 kHz high-pass, 200 Hz periodicity	ABX, 3 kHz high-pass, 200 Hz periodicity
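To inspect the waveform autocorrelation of the downloaded sounds, something along the following lines could be used. It is a minimal sketch in plain Python, not the analysis from the paper: the normalisation (dividing by the signal energy) is an assumption, the function names are invented for the example, and the demonstration signal is a perfectly regular click train rather than a KXX or ABX stimulus.

```python
from typing import List


def normalised_autocorrelation(x: List[float], max_lag: int) -> List[float]:
    """Return r[0..max_lag], scaled so that r[0] == 1."""
    energy = sum(v * v for v in x)
    if energy == 0.0:
        raise ValueError("signal is silent")
    n = len(x)
    return [
        sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        for lag in range(max_lag + 1)
    ]


def best_lag(r: List[float], min_lag: int) -> int:
    """Lag (in samples) of the largest value of r at or beyond min_lag."""
    return max(range(min_lag, len(r)), key=lambda lag: r[lag])


SAMPLE_RATE_HZ = 20_000
PERIODICITY_HZ = 200
period = SAMPLE_RATE_HZ // PERIODICITY_HZ   # 100 samples, i.e. 5 ms

# Illustrative signal only: one click every `period` samples, 0.1 s long.
clicks = [1.0 if i % period == 0 else 0.0 for i in range(2000)]

r = normalised_autocorrelation(clicks, max_lag=150)
lag = best_lag(r, min_lag=10)   # skip the trivial peak at lag 0

if __name__ == "__main__":
    print("peak lag:", lag, "samples =", 1000 * lag / SAMPLE_RATE_HZ, "ms")
```

Running the same two functions on the KXX and ABX files, after loading the samples into lists, gives the peak lags to compare. The point made above is that such a comparison shows almost no difference for the high-pass filtered versions.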

Daniel.Pressnitzer@ircam.fr
