Dynamic Model Selection for Spectral Voice Conversion
- Statistical methods for voice conversion are usually based on a single model selected in order to represent a tradeoff between goodness of fit and complexity.
- In this work we assume that the best model may change over time, depending on the source acoustic features.
- We present a new method for spectral voice conversion called Dynamic Model Selection (DMS), in which a set of potential best models of increasing complexity, including mixtures of Gaussians and mixtures of probabilistic principal component analyzers, is considered during the conversion of a source speech signal into a target speech signal.
- This set is built during the learning phase according to the Bayesian information criterion (BIC). During conversion, the best model is selected dynamically from the set according to the acoustic features of each frame.
- Subjective tests show that the method improves the conversion in terms of proximity to the target and quality.
- Dynamic Model Selection for Spectral Voice Conversion,
P. Lanchantin and X. Rodet,
Interspeech 2010 Proceedings, Makuhari, Japan, Sept. 2010.
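
The two steps described in the abstract (ranking models with the Bayesian information criterion during learning, then choosing one model per frame during conversion) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the restriction to one-dimensional Gaussian mixtures, and the per-frame rule (pick the candidate with the highest log-likelihood for the frame) are all assumptions made for illustration; the page does not state the selection criterion actually used by DMS.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def gmm_log_pdf(x, mixture):
    """Log-density of a scalar x under a 1-D Gaussian mixture.

    `mixture` is a list of (weight, mean, variance) tuples.
    A log-sum-exp is used to avoid underflow for unlikely frames.
    """
    terms = [
        math.log(w) - 0.5 * math.log(2.0 * math.pi * var) - (x - mu) ** 2 / (2.0 * var)
        for w, mu, var in mixture
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def select_model_per_frame(frames, models):
    """For each frame, return the index of the candidate model that scores it best.

    Assumption: the score is the frame log-likelihood; this is a stand-in
    rule, not necessarily the criterion used in the paper.
    """
    return [
        max(range(len(models)), key=lambda i: gmm_log_pdf(x, models[i]))
        for x in frames
    ]


# Hypothetical candidate set: a simple model and a more complex one.
models = [
    [(1.0, 0.0, 4.0)],                       # one broad Gaussian
    [(0.5, -2.0, 0.25), (0.5, 2.0, 0.25)],   # two narrow Gaussians
]
frames = [0.0, 2.0, -2.0]                    # toy scalar "features"
choices = select_model_per_frame(frames, models)
```

With these toy values, the frame at 0.0 is better explained by the broad single Gaussian, while the frames at 2.0 and -2.0 fall on the modes of the two-component mixture, so the selected model changes from frame to frame.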
VC from real source voice
- Converted source envelope
VC from a commercial TTS source voice
Warning: many of the audible artefacts are already present in the speech generated by the TTS system.
- Converted source envelope