A HMM-Based Speech Synthesis System using a New Glottal Source and Vocal-Tract Separation Methods (with G. Degottex)

  • This work introduces an HMM-based speech synthesis system that uses a new method, Separation of Vocal-tract and Liljencrants-Fant model plus Noise (SVLN), proposed by G. Degottex.
  • The glottal source is separated into two components: a deterministic glottal waveform, described by the Liljencrants-Fant (LF) model, and a modulated Gaussian noise.
  • This glottal source is first estimated and then used in the vocal-tract estimation procedure.
  • The parameters of the source and of the vocal tract are then included in HMM contextual models of phonemes.
  • The synthesis results were evaluated subjectively.
  • A HMM-Based Synthesis System Using a New Glottal Source and Vocal-Tract Separation Method,
    P. Lanchantin, G. Degottex and X. Rodet,
    ICASSP2010 Proceedings, Dallas, USA, 2010.
Audio samples (Pulse, STRAIGHT and SVLN) for pairs 2, 4, 6, 8 and 11.
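
The two-component source described in the summary above (a deterministic waveform plus a modulated Gaussian noise) can be illustrated with a minimal sketch. This is not the SVLN implementation: a raised-cosine pulse stands in for the Liljencrants-Fant waveform, and the pulse-synchronous amplitude modulation of the noise is an assumption made only for illustration. The function name `glottal_source` and its parameters are hypothetical.

```python
import numpy as np

def glottal_source(f0, fs, duration, noise_gain=0.1, seed=0):
    """Return (deterministic, noise, total) source signals.

    Stand-in only: a raised-cosine pulse replaces the LF waveform,
    and the noise modulation scheme is an illustrative assumption.
    """
    rng = np.random.default_rng(seed)
    n = int(round(duration * fs))               # number of samples
    period = int(round(fs / f0))                # samples per glottal cycle
    phase = (np.arange(n) % period) / period    # position in each cycle, in [0, 1)
    # Deterministic component: one smooth pulse per cycle, values in [0, 1].
    pulse = 0.5 * (1.0 - np.cos(2.0 * np.pi * phase))
    # Noise component: Gaussian noise amplitude-modulated by the pulse shape.
    noise = noise_gain * pulse * rng.standard_normal(n)
    return pulse, noise, pulse + noise

# Example: 10 ms of a 100 Hz source sampled at 16 kHz.
pulse, noise, source = glottal_source(f0=100.0, fs=16000, duration=0.01)
```

Keeping the two components separate until the final sum mirrors the idea that each one can be estimated and modified on its own.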

Transformation examples

  • SVLN is promising for voice transformation in the synthesis of expressive speech, since it allows independent control of vocal-tract and glottal-source properties.
F0 scale   VTF scale   Rd scale   Audio (HTS)
1          1           1          Original voice
0.6        1           1
0.6        0.85        1
0.6        0.85        0.5        Baritone voice
2.5        1           1
2.5        1.7         1
2.5        1.7         3          Little girl voice
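
As a rough illustration of how the three scale factors in the table could act independently on one analysis frame, here is a minimal sketch. It assumes that the F0 and Rd scales are plain multipliers and that the VTF (vocal-tract filter) scale warps the frequency axis of a spectral envelope, so that a resonance at frequency f moves to f times the scale. That interpretation, the function name `transform_frame`, and the frame representation are assumptions, not the system's actual code. The two presets reuse the values from the table.

```python
import numpy as np

# Scale factors taken from the table: (F0 scale, VTF scale, Rd scale).
PRESETS = {
    "original": (1.0, 1.0, 1.0),
    "baritone": (0.6, 0.85, 0.5),
    "little_girl": (2.5, 1.7, 3.0),
}

def transform_frame(f0, rd, vtf_env, f0_scale=1.0, vtf_scale=1.0, rd_scale=1.0):
    """Apply the three scale factors to one frame of hypothetical parameters.

    vtf_env is a sampled spectral envelope on a uniform frequency grid.
    """
    vtf_env = np.asarray(vtf_env, dtype=float)
    freqs = np.linspace(0.0, 1.0, len(vtf_env))  # normalized frequency axis
    # Assumed warping: the output at frequency f reads the input at f / vtf_scale,
    # so envelope features move from f to f * vtf_scale.
    # np.interp holds the edge value for points outside the original grid.
    warped = np.interp(freqs / vtf_scale, freqs, vtf_env)
    return f0 * f0_scale, rd * rd_scale, warped

# Example: a single-peak envelope transformed with the "little_girl" preset.
env = np.zeros(101)
env[20] = 1.0
new_f0, new_rd, new_env = transform_frame(120.0, 1.0, env, *PRESETS["little_girl"])
```

Because each factor touches only one parameter, any row of the table can be reproduced by changing a single argument at a time.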