
9.2 Synthesis of the Singing Voice

One of the primary applications of the spectral envelope handling developed in this project is the high-quality synthesis of the singing voice. In the additive synthesis paradigm, synthesis means resynthesising a voice that has previously been analysed and then modified. For these modifications to sound plausible, the constraints imposed by the speech organs have to be taken into account.

For example, as demonstrated in section 2.3.1, a transposed voice quickly sounds very unnatural when its spectral envelope is not corrected, because the envelope reflects the configuration of the vocal tract, especially its length, and this does not change with pitch. To avoid this, the transposition program that is part of the spectral envelope library can automatically estimate the spectral envelope of the original sound and restore it by applying it to the transposed sound.
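The envelope correction described above can be sketched for a single frame of harmonic partials. This is a minimal illustration and not the library's transposition program: the function names `envelope_at` and `transpose_partials` are hypothetical, amplitudes are assumed to be linear, and the envelope is assumed to be a simple piecewise-linear curve through the partial peaks rather than one of the estimation methods developed in this project.

```python
from bisect import bisect_right


def envelope_at(freqs, amps, f):
    """Evaluate a piecewise-linear envelope through the points (freqs, amps).

    Assumption for illustration: the envelope is the straight-line
    interpolation between partial peaks, held constant outside their range.
    freqs must be sorted in ascending order.
    """
    if f <= freqs[0]:
        return amps[0]
    if f >= freqs[-1]:
        return amps[-1]
    i = bisect_right(freqs, f)          # freqs[i-1] <= f < freqs[i]
    f0, f1 = freqs[i - 1], freqs[i]
    a0, a1 = amps[i - 1], amps[i]
    t = (f - f0) / (f1 - f0)
    return a0 + t * (a1 - a0)


def transpose_partials(freqs, amps, ratio, keep_envelope=True):
    """Transpose partial frequencies by `ratio`.

    With keep_envelope=True the new amplitudes are read from the envelope
    of the ORIGINAL sound at the new frequencies, so the formants stay in
    place. With keep_envelope=False the amplitudes travel with the
    partials, which shifts the formants along with the pitch.
    """
    new_freqs = [f * ratio for f in freqs]
    if keep_envelope:
        new_amps = [envelope_at(freqs, amps, f) for f in new_freqs]
    else:
        new_amps = list(amps)
    return new_freqs, new_amps


if __name__ == "__main__":
    freqs = [100.0, 200.0, 300.0, 400.0]
    amps = [1.0, 0.5, 0.8, 0.2]
    print(transpose_partials(freqs, amps, 1.5))
    print(transpose_partials(freqs, amps, 1.5, keep_envelope=False))
```

In the uncorrected case the peak at the third partial moves from 300 Hz to 450 Hz together with the pitch; in the corrected case the amplitude at 300 Hz stays at 0.8, which is the behaviour the text asks for.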

Also, many aspects of the expressivity of the singing voice (as well as prosody in speech, see section 5.2) depend on the spectral envelope, i.e. on timbral variations, rather than on pitch and amplitude alone.

With the methods of interpolation between spectral envelopes and formants, a new type of high-quality additive synthesis of the voice becomes possible. Transients (e.g. plosives) change rapidly, and the noise spectral envelopes of fricatives are not formant-shaped; to preserve both, these sounds are best synthesised with the harmonic sinusoids + noise model, controlled by envelopes in the spectral representation. Where precise formant locations are needed, as in the steady part of vowels, the formant representation of spectral envelopes can be used instead.
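The difference between the two representations can be sketched with two small interpolation helpers. This is only an illustration under stated assumptions: the function names are hypothetical, a sampled envelope is assumed to be a list of amplitudes on a common frequency grid, and a formant is assumed to be a `(frequency_hz, amplitude, bandwidth_hz)` tuple with formants already paired between the two sounds. Neither is the data format of the spectral envelope library.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + t * (b - a)


def interpolate_envelopes(env_a, env_b, t):
    """Interpolate two sampled envelopes value by value.

    Both envelopes must be sampled on the same frequency grid. A peak in
    env_a fades out while a peak in env_b fades in; peaks do not move.
    """
    if len(env_a) != len(env_b):
        raise ValueError("envelopes must share the same frequency grid")
    return [lerp(a, b, t) for a, b in zip(env_a, env_b)]


def interpolate_formants(formants_a, formants_b, t):
    """Interpolate paired formants parameter by parameter.

    Each formant is a hypothetical (frequency_hz, amplitude, bandwidth_hz)
    tuple. The centre frequency itself is interpolated, so the formant
    glides from one location to the other.
    """
    if len(formants_a) != len(formants_b):
        raise ValueError("formants must be paired one to one")
    return [
        tuple(lerp(pa, pb, t) for pa, pb in zip(fa, fb))
        for fa, fb in zip(formants_a, formants_b)
    ]


if __name__ == "__main__":
    # Sampled envelopes: halfway, both peaks are present at half height.
    print(interpolate_envelopes([0.0, 1.0, 0.0], [0.0, 0.0, 1.0], 0.5))
    # Formants: halfway, a single formant sits between the two locations.
    print(interpolate_formants([(500.0, 1.0, 80.0)], [(700.0, 0.5, 100.0)], 0.5))
```

Halfway between the two sampled envelopes, two half-height peaks coexist, whereas the formant version yields one formant at 600 Hz. This is one reason the formant representation suits the steady part of vowels, where the formant location matters.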

Diemo Schwarz
1998-09-07