A HMM-Based Speech Synthesis System using a New Glottal Source and Vocal-Tract Separation Methods (with G. Degottex)
- This work introduces an HMM-based speech synthesis system built on a new method proposed by G. Degottex: Separation of the Vocal-tract and Liljencrants-Fant model plus Noise (SVLN).
- The glottal source is separated into two components: a deterministic glottal waveform, described by the Liljencrants-Fant (LF) model, and a modulated Gaussian noise.
- This glottal source is first estimated and then used in the vocal-tract estimation procedure.
- The parameters of the source and of the vocal tract are then included in HMM contextual models of phonemes.
- The synthesis results were subjectively evaluated here
- A HMM-Based Synthesis System Using a New Glottal Source and Vocal-Tract Separation Method,
P. Lanchantin, G. Degottex and X. Rodet,
ICASSP2010 Proceedings, Dallas, USA, 2010.
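The two-component source described above (a deterministic pulse plus modulated Gaussian noise) can be illustrated with a minimal sketch. Several things here are assumptions made for illustration and are not taken from the paper: a Rosenberg-style pulse is swapped in for the Liljencrants-Fant model to keep the code short, the noise is modulated by the pulse amplitude itself, and the names `rosenberg_pulse`, `glottal_source`, `open_quotient`, `speed_quotient` and `noise_gain` are hypothetical.

```python
import math
import random


def rosenberg_pulse(n_samples, open_quotient=0.6, speed_quotient=2.0):
    """One period of a Rosenberg-style glottal flow pulse.

    Stand-in for the LF model (assumption): a raised-cosine opening phase,
    a quarter-cosine closing phase, then a closed phase at zero.
    """
    n_open = max(2, int(round(n_samples * open_quotient)))
    n_rise = max(1, int(round(n_open * speed_quotient / (1.0 + speed_quotient))))
    n_fall = max(1, n_open - n_rise)
    pulse = []
    for n in range(n_samples):
        if n < n_rise:
            pulse.append(0.5 * (1.0 - math.cos(math.pi * n / n_rise)))
        elif n < n_rise + n_fall:
            pulse.append(math.cos(0.5 * math.pi * (n - n_rise) / n_fall))
        else:
            pulse.append(0.0)
    return pulse


def glottal_source(f0_hz, sample_rate, n_periods, noise_gain=0.05, seed=0):
    """Return (deterministic, noise, source) for a constant-F0 voiced segment."""
    rng = random.Random(seed)
    period = int(round(sample_rate / f0_hz))
    # Deterministic component: the same pulse repeated every period.
    deterministic = rosenberg_pulse(period) * n_periods
    # Noise component: Gaussian noise whose amplitude follows the pulse,
    # so it is pitch-synchronous and vanishes in the closed phase (assumption).
    noise = [noise_gain * p * rng.gauss(0.0, 1.0) for p in deterministic]
    source = [d + r for d, r in zip(deterministic, noise)]
    return deterministic, noise, source


deterministic, noise, source = glottal_source(100.0, 16000, n_periods=3)
```

In a complete system the resulting source would excite a vocal-tract filter; that step is left out here because its form depends on the estimation procedure.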
(:table border=1 cellpadding=2 cellspacing=0 align=center:)
(:cellnr bgcolor=#cccc99 align=center:) Pair
(:cell bgcolor=#cccc89 align=center:) Pulse
(:cell bgcolor=#cccc89 align=center:) STRAIGHT
(:cell bgcolor=#cccc89 align=center:) SVLN
(:cellnr bgcolor=#cccc99 align=center:) 2
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.2.bu.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.2.st.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.2.lf.mp3 width=62 height=18:)
(:cellnr bgcolor=#cccc99 align=center:) 4
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.4.bu.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.4.st.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.4.lf.mp3 width=62 height=18:)
(:cellnr bgcolor=#cccc99 align=center:) 6
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.6.bu.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.6.st.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.6.lf.mp3 width=62 height=18:)
(:cellnr bgcolor=#cccc99 align=center:) 8
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.8.bu.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.8.st.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.8.lf.mp3 width=62 height=18:)
(:cellnr bgcolor=#cccc99 align=center:) 11
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.11.bu.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.11.st.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.11.lf.mp3 width=62 height=18:)
(:tableend:)
Transformation examples
- SVLN is promising for voice transformation in the synthesis of expressive speech, since it allows independent control of vocal-tract and glottal-source properties.
F0 scale | VTF scale | Rd scale | Audio (HTS) | Resulting voice |
---|---|---|---|---|
1 | 1 | 1 | (:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/icassp01.mp3 width=200 height=18:) | Original voice |
0.6 | 1 | 1 | (:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/icassp02.mp3 width=200 height=18:) | |
0.6 | 0.85 | 1 | (:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/icassp03.mp3 width=200 height=18:) | |
0.6 | 0.85 | 0.5 | (:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/icassp04.mp3 width=200 height=18:) | Baritone voice |
2.5 | 1 | 1 | (:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/icassp05.mp3 width=200 height=18:) | |
2.5 | 1.7 | 1 | (:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/icassp06.mp3 width=200 height=18:) | |
2.5 | 1.7 | 3 | (:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/icassp07.mp3 width=200 height=18:) | Little girl voice |
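The scale factors in the table above can be read as three independent controls applied to the generated parameter tracks. The sketch below shows one way this might look. It assumes that the F0 and Rd scales are plain multiplications of per-frame values and that the VTF scale stretches the frequency axis of each vocal-tract envelope. The names `Transform`, `warp_envelope` and `apply_transform` are hypothetical, and the system's actual processing may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transform:
    """Scale factors in the same order as the table columns."""
    f0_scale: float = 1.0
    vtf_scale: float = 1.0
    rd_scale: float = 1.0


def warp_envelope(envelope, scale):
    """Stretch the frequency axis so the value at bin k moves to bin k * scale.

    Uses linear interpolation between bins; bins that would read beyond the
    last input bin repeat its value (assumption).
    """
    last = len(envelope) - 1
    warped = []
    for k in range(len(envelope)):
        src = min(k / scale, float(last))
        lo = int(src)
        hi = min(lo + 1, last)
        frac = src - lo
        warped.append((1.0 - frac) * envelope[lo] + frac * envelope[hi])
    return warped


def apply_transform(f0, rd, envelopes, transform):
    """Scale per-frame F0 and Rd tracks and warp each vocal-tract envelope."""
    new_f0 = [value * transform.f0_scale for value in f0]  # unvoiced zeros stay zero
    new_rd = [value * transform.rd_scale for value in rd]
    new_envelopes = [warp_envelope(env, transform.vtf_scale) for env in envelopes]
    return new_f0, new_rd, new_envelopes


# Settings copied from the last row of each group in the table.
BARITONE = Transform(f0_scale=0.6, vtf_scale=0.85, rd_scale=0.5)
LITTLE_GIRL = Transform(f0_scale=2.5, vtf_scale=1.7, rd_scale=3.0)

# Toy input: two frames (one voiced, one unvoiced) with 5-bin envelopes.
f0, rd, envs = apply_transform(
    [100.0, 0.0], [1.0, 1.0], [[0.0, 1.0, 2.0, 3.0, 4.0]] * 2, LITTLE_GIRL
)
```

Because each factor touches a different parameter stream, the intermediate rows of the table correspond to leaving one or two fields of `Transform` at 1. A real system would probably also need to keep the scaled Rd values within the range the glottal model accepts, which this sketch does not do.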