A major drawback of current hidden Markov model (HMM)-based speech synthesis is the monotony of the generated speech, which is closely related to the monotony of the generated prosody.
This work presents a linguistically oriented approach in which high-level linguistic features are extracted from text in order to improve prosody modeling.
A linguistic processing chain based on linguistic preprocessing, morpho-syntactical labeling, and syntactical parsing is used to extract high-level syntactical features from an input text.
Rich linguistic features are then introduced into an HMM-based speech synthesis system to model prosodic variations (f0, duration, and spectral variations).
Subjective evaluation reveals that the proposed approach significantly improves speech synthesis compared to a baseline model, although the improvement depends on the linguistic phenomenon observed.
Toward Improved HMM-Based Speech Synthesis Using High-Level Syntactical Features,
N. Obin, P. Lanchantin, M. Avanzi, A. Lacheret-Dujour and X. Rodet,
Speech Prosody 2010 Proceedings, Chicago, USA, 2010.

example 1
example 2