February 24, 2011, at 11:50 AM by 129.102.64.91 -

A HMM-Based Speech Synthesis System using a New Glottal Source and Vocal-Tract Separation Methods (with G. Degottex)

  • This work introduces an HMM-based speech synthesis system built on SVLN (Separation of the Vocal-tract and the Liljencrants-Fant model plus Noise), a new separation method proposed by G. Degottex.
  • The glottal source is split into two components: a deterministic glottal waveform, described by the Liljencrants-Fant (LF) model, and a modulated Gaussian noise.
  • The glottal source is estimated first and is then used in the vocal-tract estimation procedure.
  • The source and vocal-tract parameters are then included in contextual HMM models of phonemes.
  • The synthesis results were subjectively evaluated here
  • A HMM-Based Synthesis System Using a New Glottal Source and Vocal-Tract Separation Method,
    P. Lanchantin, G. Degottex and X. Rodet,
    ICASSP2010 Proceedings, Dallas, USA, 2010.
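
The two-component source described above (a deterministic waveform plus modulated Gaussian noise) can be illustrated with a minimal sketch. This is not the SVLN implementation: a Rosenberg-style raised-cosine pulse stands in for the LF waveform, the noise is simply amplitude-modulated by that pulse, and the names `stand_in_pulse`, `source_period`, `open_fraction`, `closing_fraction` and `noise_gain` are hypothetical, chosen only for this illustration.

```python
import math
import random


def stand_in_pulse(n_samples, open_fraction=0.6, closing_fraction=0.2):
    """One period of a Rosenberg-style glottal pulse.

    Assumption: this simple pulse only stands in for the LF waveform;
    it is NOT the Liljencrants-Fant model used by SVLN.
    """
    n_open = int(round(open_fraction * n_samples))      # rising phase length
    n_close = int(round(closing_fraction * n_samples))  # falling phase length
    pulse = []
    for n in range(n_samples):
        if n < n_open:
            # Raised-cosine rise from 0 to 1.
            pulse.append(0.5 * (1.0 - math.cos(math.pi * n / n_open)))
        elif n < n_open + n_close:
            # Quarter-cosine fall from 1 to 0.
            pulse.append(math.cos(math.pi * (n - n_open) / (2.0 * n_close)))
        else:
            # Closed phase: no flow.
            pulse.append(0.0)
    return pulse


def source_period(n_samples, noise_gain=0.05, rng=None):
    """Return (total, deterministic, noise) for one source period.

    Assumption: the Gaussian noise is modulated by the pulse amplitude,
    so it vanishes during the closed phase. The modulation used in the
    actual system may differ.
    """
    rng = rng or random.Random(0)
    deterministic = stand_in_pulse(n_samples)
    noise = [noise_gain * d * rng.gauss(0.0, 1.0) for d in deterministic]
    total = [d + e for d, e in zip(deterministic, noise)]
    return total, deterministic, noise


if __name__ == "__main__":
    total, deterministic, noise = source_period(100)
    print(max(deterministic), max(abs(e) for e in noise))
```

Keeping the two components separate in the return value mirrors the idea that the deterministic part and the noise part are parameterised independently.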

(:table border=1 cellpadding=2 cellspacing=0 align=center:)
(:cellnr bgcolor=#cccc99 align=center:) Pair
(:cell bgcolor=#cccc89 align=center:) Pulse
(:cell bgcolor=#cccc89 align=center:) STRAIGHT
(:cell bgcolor=#cccc89 align=center:) SVLN
(:cellnr bgcolor=#cccc99 align=center:) 2
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.2.bu.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.2.st.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.2.lf.mp3 width=62 height=18:)
(:cellnr bgcolor=#cccc99 align=center:) 4
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.4.bu.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.4.st.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.4.lf.mp3 width=62 height=18:)
(:cellnr bgcolor=#cccc99 align=center:) 6
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.6.bu.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.6.st.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.6.lf.mp3 width=62 height=18:)
(:cellnr bgcolor=#cccc99 align=center:) 8
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.8.bu.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.8.st.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.8.lf.mp3 width=62 height=18:)
(:cellnr bgcolor=#cccc99 align=center:) 11
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.11.bu.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.11.st.mp3 width=62 height=18:)
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/test_mp3/Xavier2007.11.lf.mp3 width=62 height=18:)
(:tableend:)

Transformation examples

  • SVLN is promising for voice transformation in expressive speech synthesis, since it allows independent control of vocal-tract and glottal-source properties.
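(:table border=1 cellpadding=2 cellspacing=0 align=center:)
(:cellnr align=center:) F0 scale
(:cell align=center:) VTF scale
(:cell align=center:) Rd scale
(:cell align=center:) Audio (HTS)
(:cell align=center:)
(:cellnr align=center:) 1
(:cell align=center:) 1
(:cell align=center:) 1
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/icassp01.mp3 width=200 height=18:)
(:cell align=center:) Original voice
(:cellnr align=center:) 0.6
(:cell align=center:) 1
(:cell align=center:) 1
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/icassp02.mp3 width=200 height=18:)
(:cell align=center:)
(:cellnr align=center:) 0.6
(:cell align=center:) 0.85
(:cell align=center:) 1
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/icassp03.mp3 width=200 height=18:)
(:cell align=center:)
(:cellnr align=center:) 0.6
(:cell align=center:) 0.85
(:cell align=center:) 0.5
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/icassp04.mp3 width=200 height=18:)
(:cell align=center:) Baritone voice
(:cellnr align=center:) 2.5
(:cell align=center:) 1
(:cell align=center:) 1
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/icassp05.mp3 width=200 height=18:)
(:cell align=center:)
(:cellnr align=center:) 2.5
(:cell align=center:) 1.7
(:cell align=center:) 1
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/icassp06.mp3 width=200 height=18:)
(:cell align=center:)
(:cellnr align=center:) 2.5
(:cell align=center:) 1.7
(:cell align=center:) 3
(:cell align=center:)(:flash http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/dewplayer.swf?son=http://recherche.ircam.fr/equipes/analyse-synthese/lanchant/uploads/Main/icassp07.mp3 width=200 height=18:)
(:cell align=center:) Little girl voice
(:tableend:)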
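
The three scale factors in the examples above act on separate parameter streams, which is what makes independent control possible. The sketch below is only an illustration under stated assumptions: the F0 and Rd scales are taken to be plain multiplications of per-frame values, and the VTF scale is taken to be a linear stretch of the frequency axis of a sampled vocal-tract amplitude envelope. The function names (`scale_f0`, `scale_rd`, `warp_envelope`) and the toy data are hypothetical; how the actual system applies these factors is not specified here.

```python
def scale_f0(f0_track, factor):
    """Multiply per-frame F0 values by a factor (unvoiced frames at 0 stay 0)."""
    return [f * factor for f in f0_track]


def scale_rd(rd_track, factor):
    """Multiply per-frame Rd values (glottal shape parameter) by a factor."""
    return [r * factor for r in rd_track]


def warp_envelope(envelope, factor):
    """Stretch the frequency axis of a sampled amplitude envelope.

    Assumption: out[k] = envelope[k / factor], linearly interpolated,
    so factor > 1 moves spectral peaks up and factor < 1 moves them down.
    Positions past the last bin reuse the last bin value.
    """
    last = len(envelope) - 1
    warped = []
    for k in range(len(envelope)):
        pos = k / factor
        if pos >= last:
            warped.append(envelope[last])
        else:
            lo = int(pos)
            frac = pos - lo
            warped.append((1.0 - frac) * envelope[lo] + frac * envelope[lo + 1])
    return warped


if __name__ == "__main__":
    # Toy frame data, invented for illustration only.
    f0 = [100.0, 0.0, 120.0]
    rd = [1.0, 1.2, 0.9]
    env = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # single peak at bin 2

    # Factors taken from the "Baritone voice" row: F0 0.6, VTF 0.85, Rd 0.5.
    print(scale_f0(f0, 0.6))
    print(scale_rd(rd, 0.5))
    print(warp_envelope(env, 0.85))
```

Because each function touches only one stream, any row of the table corresponds to one choice of the three factors applied independently.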