Ten experiments on vowel segregation

de Cheveigné, A. (1997), "Ten experiments in concurrent vowel segregation," ATR Human Information Processing Research Labs technical report, TR-H-217.

This report describes ten experiments on concurrent vowel segregation and identification. Experiments are numbered from 1 to 10.

Experiment 1 was designed to be sensitive to a variety of hypothetical mechanisms by which frequency modulation (FM) might affect identification. The results were mostly negative, in the sense that no effect was found that could not be attributed to other factors. The only "FM effect" observed was that identification was better for incoherent than for coherent modulation. However this effect was small, and one cannot rule out that it was caused by unavoidable differences in the pattern of instantaneous ΔF0 between FM conditions.

Experiment 2 explored the identification of 3 concurrent vowels. As in the case of 2 concurrent vowels, a difference in F0 between vowels aided identification.

Experiment 3 explored the effects of ΔF0 and amplitude differences between vowels over a relatively wide range. Presence of a ΔF0 helped identification when the target/competitor amplitude ratio was low (down to -25 dB). The effect disappeared at -35 dB. In general identification was better at 3% than at 0%, but there was little difference between ΔF0 = 3%, 6% or 12%. One might have expected larger ΔF0s to be more effective at low target amplitudes. Such was not the case.
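To give a sense of scale for the level and ΔF0 values quoted for Experiment 3, the following minimal Python sketch converts a level difference in dB to a linear amplitude ratio, and a ΔF0 in percent to a frequency in Hz. The 100 Hz base F0 and the function names are illustrative assumptions and are not taken from the report. The dB conversion assumes the ratio refers to amplitude (20 log10) rather than power.

```python
def db_to_amplitude_ratio(level_db: float) -> float:
    """Convert a level difference in dB to a linear amplitude ratio (20*log10 convention)."""
    return 10.0 ** (level_db / 20.0)


def shifted_f0(base_f0_hz: float, delta_percent: float) -> float:
    """Return the F0 that lies delta_percent above base_f0_hz."""
    return base_f0_hz * (1.0 + delta_percent / 100.0)


# Hypothetical base F0, chosen only for illustration (not stated in the summary).
BASE_F0_HZ = 100.0

if __name__ == "__main__":
    # Target/competitor level differences mentioned for Experiment 3.
    for level_db in (-25.0, -35.0):
        ratio = db_to_amplitude_ratio(level_db)
        print(f"{level_db:+.0f} dB -> amplitude ratio {ratio:.4f}")

    # F0 differences mentioned for Experiment 3.
    for delta_percent in (0.0, 3.0, 6.0, 12.0):
        f0 = shifted_f0(BASE_F0_HZ, delta_percent)
        print(f"delta F0 = {delta_percent:g}% -> {f0:.2f} Hz")
```

Under these assumptions, -25 dB corresponds to a target amplitude of about 5.6% of the competitor's, and -35 dB to about 1.8%.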

Experiment 4 explored the region of very small ΔF0s, while controlling for phase effects and beats. As it turned out, the smallest ΔF0 used, 0.375%, was sufficient to cause segregation. This did not seem to be the consequence of beat patterns caused by the ΔF0.
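As a rough illustration of why beats are a concern at very small ΔF0s, the sketch below computes the rate at which corresponding harmonics of the two vowels would beat. The 100 Hz base F0, the 250 ms duration and the function names are assumptions chosen for illustration; this summary does not state the values used in Experiment 4. The second F0 is taken to be f0 * (1 + ΔF0/100), so the n-th harmonics of the two vowels differ by n * f0 * ΔF0/100 Hz.

```python
def harmonic_beat_rate_hz(base_f0_hz: float, delta_percent: float, harmonic: int) -> float:
    """Frequency difference (Hz) between the n-th harmonics of two vowels
    whose F0s differ by delta_percent."""
    return harmonic * base_f0_hz * delta_percent / 100.0


def beat_cycles_in_stimulus(base_f0_hz: float, delta_percent: float,
                            harmonic: int, duration_s: float) -> float:
    """Number of beat cycles completed by the n-th harmonic pair within duration_s."""
    return harmonic_beat_rate_hz(base_f0_hz, delta_percent, harmonic) * duration_s


# Hypothetical values, chosen only for illustration (not stated in the summary).
BASE_F0_HZ = 100.0
DURATION_S = 0.25

if __name__ == "__main__":
    delta_percent = 0.375  # smallest delta F0 reported for Experiment 4
    for harmonic in (1, 5, 10, 20):
        rate = harmonic_beat_rate_hz(BASE_F0_HZ, delta_percent, harmonic)
        cycles = beat_cycles_in_stimulus(BASE_F0_HZ, delta_percent, harmonic, DURATION_S)
        print(f"harmonic {harmonic:2d}: beat rate {rate:.3f} Hz, "
              f"{cycles:.3f} cycles in {DURATION_S * 1000:.0f} ms")
```

With these assumed values, the fundamentals complete less than a tenth of a beat cycle at ΔF0 = 0.375%, while the 10th harmonics complete close to one full cycle.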

Experiment 5 explored ΔF0 effects at short durations (125 and 62.5 ms), while again controlling for phase effects. ΔF0 effects were somewhat weaker at 62.5 and at 125 ms than at 250 ms, but they were still quite large and significant.

Experiment 6 attempted to find evidence for harmonic enhancement. Double-vowel stimuli were divided into two short pulses separated by a silence. The F0s of the target and competitor shared the same value in the first pulse, and one, the other or both could differ from this value by 6% in the second pulse. It was expected that a jump in target F0 might impair harmonic enhancement and reduce the identification rate. No such effect was found.

Experiment 7 reproduced the 3-vowel experiment with a 3-vowel forced response task, instead of the 1, 2 or 3 response task of Exp. 2. The 3-vowel forced response task is less affected by "multiplicity" cues. A comparison between Exp. 2 and Exp. 7 allows other cues to be factored out, so the role of multiplicity cues can be assessed.

Experiment 8 was an extension of Exp. 4 to even smaller ΔF0s (0.1 and 0.2%). An additional intervowel phase relation (antiphase) was also included. As in Exp. 4, effects of ΔF0 were observed at 0.8 and 0.4% (equivalent to the 0.75 and 0.375% conditions of Exp. 4) but not at 0.2 or 0.1%. Phase (same phase vs antiphase) had little effect at 0.2, 0.4 or 0.8%. It had some effect at 0 and 0.1%.

Experiment 9 investigated the effect of formant bandwidth on segregation. Formant bandwidth is known to have surprisingly little effect on vowel identification, but it affects the "peakiness" of the spectrum and so is likely to affect the way a vowel's features emerge from the spectrum of a concurrent vowel pair. Such was indeed the case: in general a vowel was much better identified if its formant bandwidths were narrower than normal (by a factor of 2), rather than wider than normal (by a factor of 2). Somewhat unexpectedly, identification was better if the interfering vowel had wide bandwidths rather than narrow ones. Narrowing the formant bandwidths of a vowel has effects similar to raising its RMS amplitude.

Experiment 10 attempted to find evidence of harmonic enhancement (improved identification based on the harmonic structure of a target) by measuring identification of static or frequency modulated diphthongs (sequential vowel pairs) that were partially masked by a noise (harmonic or inharmonic) with a vowel-like spectrum. Enhancement was expected to cause better identification of targets with a static F0. No such effect was observed.
