Voice-transformation examples presented at the
35th International Conference of the Audio Engineering Society "Audio for Games", London, 11-13 February 2009

Natural transformation of type and nature of the voice for extending vocal repertoire in high-fidelity applications

Snorre Farner, Axel Röbel, and Xavier Rodet

Analysis/Synthesis Group, IRCAM


Abstract

Natural voice transformation will reduce the need for authentic voices in many situations, ranging from vocal services through education and entertainment to artistic applications. Transformation of one voice to correspond to that of another person has been studied for decades but still suffers from limitations that we propose to overcome by an alternative approach. It consists of modifying pitch, spectral envelope, durations, etc. in a global way. While it sacrifices the possibility of attaining a specific target voice, the approach allows the production of new voices of a high degree of naturalness with different sex and age, modified vocal quality (soft, breathy, and whisper), or another speech style (dullness and eagerness). The transformation of sex and age has been evaluated by a listening test.
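As a rough illustration of what modifying pitch, spectral envelope, and durations "in a global way" can mean, the sketch below applies one fixed set of parameters to every analysis frame. It is not the authors' implementation: the class `GlobalTransform`, the function names, and the frame representation (an f0 value per frame, an envelope sampled on frequency bins, frame times in seconds) are all assumptions made for this example, and a real system would also have to resynthesise the signal.

```python
from dataclasses import dataclass


@dataclass
class GlobalTransform:
    """Hypothetical global settings; names and defaults are illustrative only."""
    pitch_semitones: float = 0.0  # transposition of the fundamental frequency
    envelope_scale: float = 1.0   # > 1 moves spectral peaks up, < 1 moves them down
    duration_scale: float = 1.0   # > 1 slows speech down, < 1 speeds it up


def transpose_f0(f0_hz, semitones):
    """Scale each voiced f0 value by 2**(semitones / 12).

    A value of 0.0 is assumed to mark an unvoiced frame and is left unchanged.
    """
    ratio = 2.0 ** (semitones / 12.0)
    return [f * ratio if f > 0.0 else 0.0 for f in f0_hz]


def warp_envelope(envelope, scale):
    """Stretch a sampled spectral envelope along the frequency axis.

    envelope[k] is the amplitude at bin k. The output at bin k reads the input
    at position k / scale with linear interpolation, so peaks move up when
    scale > 1. Positions beyond the last input bin reuse the last value.
    """
    last = len(envelope) - 1
    warped = []
    for k in range(len(envelope)):
        pos = min(k / scale, last)
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        warped.append((1.0 - frac) * envelope[lo] + frac * envelope[hi])
    return warped


def scale_durations(frame_times_s, scale):
    """Map analysis frame times to new synthesis times by a constant factor."""
    return [t * scale for t in frame_times_s]


def apply_transform(f0_hz, envelopes, frame_times_s, params):
    """Apply the same global parameters to every frame of an utterance."""
    return (
        transpose_f0(f0_hz, params.pitch_semitones),
        [warp_envelope(env, params.envelope_scale) for env in envelopes],
        scale_durations(frame_times_s, params.duration_scale),
    )
```

Because the same parameters are applied to the whole utterance, such a transformation cannot aim at a specific target speaker, which matches the trade-off described in the abstract.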

[PDF version of the presentation], sound examples below.

Signal transformation

Original sound

Modification of

Transformation of sex and age

See also demo below.

Transformations of voice quality and speech style

Video demonstration of voice transformation

© 2008. Characters by Cantoche and voices by Ircam. Flash player by LongTail Video

The video above shows the effect of applying four different voice transformations (man, old woman, little girl, and young man) to a single voice and assigning each result to a cartoon character.

Acknowledgements

The work was done as part of the projects VIVOS and Affective Avatars with financial support from the Agence Nationale de la Recherche (ANR). We are grateful to Chinkel (SonicVille) for recording the actors, and not least to the two actors, Christine Paris and Matthieu Rivolier, for their engaged interpretations of a great number of voice qualities.
Home: http://recherche.ircam.fr/equipes/analyse-synthese
Snorre Farner: http://www.pvv.ntnu.no/~farner