
Optimal FFT window

In the previous scheme, temporal resolution was limited by two factors: the fixed size of the FFT window and the variable size (1/F0) of the smoothing window. Slightly better resolution can be obtained by setting the FFT window itself to one period, 1/F0, with a square (rectangular) window, and removing the subsequent temporal smoothing stage. In practice, the signal must be resampled so that the period 1/F0 spans a power-of-two number of samples. After the FFT, the spectrum, whose coefficients fall at multiples of F0, must also be resampled.
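The steps just described (resample one period to a power-of-two length, transform, resample the spectrum) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: all function names are hypothetical, linear interpolation stands in for a proper band-limited resampler, a naive DFT stands in for the FFT so that the sketch has no dependencies, the power of two is taken as the smallest one not below the period, and the target of spectral resampling is assumed to be a fixed grid in Hz.

```python
import bisect
import cmath
import math


def resample_linear(x, start, length, n_out):
    """Read n_out evenly spaced values from x over [start, start + length)."""
    out = []
    for k in range(n_out):
        t = start + k * length / n_out
        i = int(t)
        frac = t - i
        out.append((1.0 - frac) * x[i] + frac * x[min(i + 1, len(x) - 1)])
    return out


def dft_magnitude(frame):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2 (stand-in for an FFT)."""
    n = len(frame)
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * m / n)
                    for m, v in enumerate(frame))) / n
            for k in range(n // 2 + 1)]


def one_period_spectrum(signal, start, f0, fs):
    """Spectrum of exactly one period (1/F0), rectangular window.

    Returns (freqs_hz, mags); bin k sits at k * f0. Only bins below
    fs / 2 are kept, since the rest carry no information.
    """
    period = fs / f0                            # in samples, usually fractional
    n_fft = 1 << math.ceil(math.log2(period))   # assumed choice of power of two
    frame = resample_linear(signal, start, period, n_fft)
    n_keep = int(fs / (2.0 * f0)) + 1
    mags = dft_magnitude(frame)[:n_keep]
    freqs = [k * f0 for k in range(len(mags))]
    return freqs, mags


def resample_spectrum(freqs, mags, grid_hz):
    """Linearly interpolate the harmonic-spaced spectrum onto grid_hz."""
    out = []
    for f in grid_hz:
        if f <= freqs[0]:
            out.append(mags[0])
        elif f >= freqs[-1]:
            out.append(mags[-1])
        else:
            j = bisect.bisect_right(freqs, f) - 1   # freqs[j] <= f < freqs[j + 1]
            w = (f - freqs[j]) / (freqs[j + 1] - freqs[j])
            out.append((1.0 - w) * mags[j] + w * mags[j + 1])
    return out


# Synthetic voiced signal: fundamental plus a half-amplitude second harmonic.
fs, f0 = 16000.0, 210.0
signal = [math.cos(2 * math.pi * f0 * n / fs)
          + 0.5 * math.cos(2 * math.pi * 2 * f0 * n / fs)
          for n in range(400)]
freqs, mags = one_period_spectrum(signal, start=100, f0=f0, fs=fs)
# mags[1] and mags[2] come out close to 0.5 and 0.25 (half the cosine amplitudes).
envelope = resample_spectrum(freqs, mags, [100.0 * i for i in range(81)])
```

Because the window spans exactly one period, each retained coefficient coincides with one harmonic, which is why the result depends so directly on the F0 value passed in.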

Advantages of this scheme are: (a) Voice-related fluctuations in the time domain are removed perfectly, at the smallest possible cost in terms of temporal resolution. (b) Voice-related ripple of the spectrum is also eliminated: the spectral envelope is accurately described by the spectrum produced by the FFT. (c) There are no window artefacts.

The scheme also has disadvantages. Resampling the waveform and the spectrum introduces extra computational cost. More troublesome, the scheme requires much more accurate F0 estimation. A small error in the F0 estimate shifts the zeros of the integration window in a way that can severely distort the spectrum in the high-frequency region. A subharmonic error (doubling or tripling of the period estimate, which is hard to avoid) introduces zero-amplitude coefficients in the spectrum, which then looks spiky rather than smooth. This interferes with spectral resampling.
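Both failure modes can be reproduced on a synthetic frame. The sketch below is not from the original text: the helper names (`dft_magnitude`, `harmonic_frame`), the three harmonic amplitudes, the 64-point frame and the 3% error are all assumptions chosen for illustration, and a naive DFT stands in for the FFT. When the window spans two periods, every odd-numbered coefficient vanishes. When it spans 1.03 periods, harmonic h is offset from its bin by 0.03 h, so the mismatch grows with frequency and energy leaks into coefficients that should be empty.

```python
import cmath
import math


def dft_magnitude(frame):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2 (stand-in for an FFT)."""
    n = len(frame)
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * m / n)
                    for m, v in enumerate(frame))) / n
            for k in range(n // 2 + 1)]


def harmonic_frame(n_periods, n_points, amps=(1.0, 0.6, 0.3)):
    """n_points samples covering n_periods periods of a 3-harmonic signal.

    n_periods = 1.0 models a correct F0 estimate; other values model an
    analysis window whose length is wrong by that factor.
    """
    return [sum(a * math.cos(2 * math.pi * (h + 1) * n_periods * m / n_points)
                for h, a in enumerate(amps))
            for m in range(n_points)]


correct = dft_magnitude(harmonic_frame(1.0, 64))    # window = one period
doubled = dft_magnitude(harmonic_frame(2.0, 64))    # period estimate doubled
detuned = dft_magnitude(harmonic_frame(1.03, 64))   # 3% error in the period

# With the correct period, harmonics land on bins 1, 2, 3; the rest are empty.
# With the doubled period, they land on bins 2, 4, 6 and odd bins are zero.
# With the detuned period, bins well above the third harmonic are no longer empty.
for k in range(12):
    print(f"{k:2d}  {correct[k]:.4f}  {doubled[k]:.4f}  {detuned[k]:.4f}")
```

Interpolating across the `doubled` column would alternate between harmonic amplitudes and zeros, which is one way to picture why such a spectrum is awkward to resample.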


Alain de Cheveigne
1998-02-16