
13   TCTS Circuit Theory and Signal Processing Lab

MISCtcts:www [TCTS99]
KeyTCTS
TitleTCTS (Circuit Theory and Signal Processing) Lab, Faculté Polytechnique de Mons
HowpublishedWWW page
Year1999
urlhttp://tcts.fpms.ac.be
group-urlhttp://tcts.fpms.ac.be/synthesis/synthesis.html
pub-urlhttp://tcts.fpms.ac.be/publications.html
Notehttp://tcts.fpms.ac.be


INPROC.tcts:euspico98 [DMD98]
Author
O. Deroo, F. Malfrere, T. Dutoit
TitleComparison of two different alignment systems: speech synthesis vs. Hybrid HMM/ANN
BooktitleProc. European Conference on Signal Processing (EUSIPCO'98)
AddressGreece
Year1998
Pages1161--1164
Notewww [TCTS99], same content as [MDD98] (but fewer references)
urlhttp://tcts.fpms.ac.be/publications/papers/1998/eusipco98_odfmtd.zip
AbstractIn this paper we compare two different methods for phonetically labeling a French database. The first one is based on the temporal alignment of the speech signal with a high-quality synthetic speech pattern, and the second one uses a hybrid HMM/ANN system. Both systems have been evaluated on manually segmented French read utterances from a single speaker never seen in the training stage of the HMM/ANN system. This study outlines the advantages and drawbacks of both methods. The high-quality speech synthesis system has the great advantage that no training stage (hence no labeled database) is needed, while the classical HMM/ANN system easily allows multiple phonetic transcriptions (phonetic lattice). We deduce a method for the automatic constitution of large phonetically and prosodically labeled speech databases, based on using the synthetic speech segmentation tool to bootstrap the training process of our hybrid HMM/ANN system. Such segmentation tools will be key to the development of improved speech synthesis and recognition systems. All the experiments reported in this article related to the hybrid HMM/ANN system were carried out with the STRUT [3] software.


INPROC.tcts:tsd98 [DMP+98]
TitleEULER: Multi-Lingual Text-to-Speech Project
Pages27--32
Author
T. Dutoit, F. Malfrère, V. Pagel, M. Bagein, P. Mertens, A. Ruelle, A. Gilman
BooktitleProceedings of the First Workshop on Text, Speech, Dialogue --- TSD'98
Year1998
Editor
Petr Sojka, Václav Matousek, Karel Pala, Ivan Kopecek
AddressBrno, Czech Republic
MonthSeptember
PublisherMasaryk University Press
Notewww [TCTS99]. Electronic version: tcts/tsd98tdfmvppmmbarag.ps.*
Remarksmodularity
AbstractText-to-speech systems simultaneously require an abstract linguistic analysis, an acoustic linguistic analysis and a final digital processing stage. The aim of the project presented in this paper is to obtain a set of text-to-speech synthesizers for as many voices, languages and dialects as possible, free of use for non-commercial and non-military applications. This project is an extension of the MBROLA project. MBROLA is a speech synthesizer that is freely distributed for non-commercial purposes. A multi-lingual speech segmentation and prosody transplantation tool called MBROLIGN has also been developed and freely distributed. Other labs have also recently released important speech synthesis tools for free, such as Festival from the University of Edinburgh or the MULTEXT project of the Université de Provence. The purpose of this paper is to present the EULER project, which will try to integrate all these results, to potential Eastern European partners, so as to increase the dissemination of the important results of the MBROLA and MBROLIGN projects and stimulate East/West collaboration on TTS synthesis.


INPROC.tcts:icslp98-fmodtd [MDD98]
Author
F. Malfrere, O. Deroo, T. Dutoit
TitlePhonetic Alignment: Speech Synthesis Based Vs. Hybrid HMM/ANN
BooktitleProc. International Conference on Speech and Language Processing
AddressSydney, Australia
Year1998
Pages1571--1574
Notewww [TCTS99], same content as [DMD98] (with more references)
urlhttp://tcts.fpms.ac.be/publications/papers/1998/icslp98_fmodtd.zip
AbstractIn this paper we compare two different methods for phonetically labeling a speech database. The first approach is based on the alignment of the speech signal with a high-quality synthetic speech pattern, and the second one uses a hybrid HMM/ANN system. Both systems have been evaluated on manually segmented French read utterances from a speaker never seen in the training stage of the HMM/ANN system. This study outlines the advantages and drawbacks of both methods. The high-quality speech synthesis system has the great advantage that no training stage is needed, while the classical HMM/ANN system easily allows multiple phonetic transcriptions. We deduce a method for the automatic constitution of phonetically labeled speech databases, based on using the synthetic speech segmentation tool to bootstrap the training process of our hybrid HMM/ANN system. Such segmentation tools will be key to the development of improved speech synthesis and recognition systems.


INPROC.tcts:iscas97 [MD97a]
Author
F. Malfrere, T. Dutoit
TitleSpeech Synthesis for Text-To-Speech Alignment and Prosodic Feature Extraction
BooktitleProc. ISCAS 97
AddressHong Kong
Year1997
Pages2637--2640
Notewww [TCTS99]
urlhttp://tcts.fpms.ac.be/publications/papers/1997/iscas97_fmtd.zip
RemarksRecent developments in prosody generation have highlighted the potential interest of machine learning techniques such as multilayer perceptrons [Tra92], linear regression techniques [SK92], classification and regression trees [Hir91], or statistical techniques [MPH93], based on the automatic analysis of large prosodically labeled corpora. Only the segmental features of the reference signal are used in alignment. Assumption: the segmental and suprasegmental features are approximately uncorrelated. Keep only the perceptually relevant F0 cues: perceptual stylization, based on a model of tonal perception [alessandro95]. Robust cepstrum by sinusoidal weighting [GL88]. Derivative of cepstrum [SR88].
AbstractThe aim of this paper is to present a new and promising approach to the text-to-speech alignment problem. For this purpose, an original idea is developed: a high-quality digital speech synthesizer is used to create a reference speech pattern used during the alignment process. The system has been used and tested to extract the prosodic features of read French utterances. The results show a segmentation error rate of about 8%. This system will be a powerful tool for the automatic creation of large prosodically labeled databases and for research on automatic prosody generation.


INPROC.tcts:eurosp97 [SDS97]
Author
Yannis Stylianou, Thierry Dutoit, Juergen Schroeter
TitleDiphone Concatenation Using a Harmonic Plus Noise Model of Speech
BooktitleProc. Eurospeech '97
AddressRhodes, Greece
MonthSeptember
Year1997
Pages613--616
Notewww [TCTS99]. Electronic version: tcts/hnmconc.ps.*
RemarksImportant! HNM (Marine) basis paper, pitch synchronous. Diphone smoothing in region of quasi-stationarity. Additive better for concatenation than PSOLA. References: [DG96] (non pitch-synchronous hybrid harmonic/stochastic synthesis, real-time generation of signals from spectral representation), [SLM95] (phase treatment, modifications), [Mac96] (non pitch synchronous harmonic modeling).
AbstractIn this paper we present a high-quality text-to-speech system using diphones. The system is based on a Harmonic plus Noise Model (HNM) representation of the speech signal. HNM is a pitch-synchronous analysis-synthesis system but does not require pitch marks to be determined, as is necessary in PSOLA-based methods. HNM assumes the speech signal to be composed of a periodic part and a stochastic part. As a result, different prosody and spectral envelope modification methods can be applied to each part, yielding more natural-sounding synthetic speech. The fully parametric representation of speech using HNM also provides a straightforward way of smoothing diphone boundaries. Informal listening tests, using natural prosody, have shown that the synthetic speech quality is close to the quality of the original sentences, without smoothing problems and without buzziness or other oddities observed with other speech representations used for TTS.


INPROC.tcts:speechcomm96 [DG96]
Author
T. Dutoit, B. Gosselin
TitleOn the use of a hybrid harmonic/stochastic model for TTS synthesis by concatenation
BooktitleSpeech Communication
Number19
Pages119--143
Year1996
RemarksCited in [SDS97] for non pitch-synchronous hybrid harmonic/stochastic synthesis, real-time generation of signals from spectral representation. TO BE FOUND


INPROC.macon-thesis96 [Mac96]
Author
Michael W. Macon
TitleSpeech Synthesis Based on Sinusoidal Modeling
BooktitlePhD thesis
PublisherGeorgia Institute of Technology
MonthOctober
Year1996
RemarksCited in [SDS97] for non pitch synchronous harmonic modeling. TO BE FOUND


INPROC.stylianou:eurospeech95 [SLM95]
Author
Y. Stylianou, J. Laroche, E. Moulines
TitleHigh Quality Speech Modification based on a Harmonic+Noise Model
BooktitleProc. EUROSPEECH
Year1995
RemarksCited in [SDS97] for phase treatment, modifications, maximum voice frequency. TO BE FOUND


INPROC.Malfrere_HighQual_EURO97 [MD97b]
Author
Fabrice Malfrere, Thierry Dutoit
TitleHigh Quality Speech Synthesis for Phonetic Speech Segmentation
BooktitleProc. Eurospeech '97
AddressRhodes, Greece
MonthSeptember
Year1997
Pages2631--2634


INPROC.Olivier_SimpAnd_EURO97 [vdVOPD+97]
Author
Olivier van der Vrecken, Nicolas Pierret, Thierry Dutoit, Vincent Pagel, Fabrice Malfrere
TitleA Simple and Efficient Algorithm for the Compression of MBROLA Segment Databases
BooktitleProc. Eurospeech '97
AddressRhodes, Greece
MonthSeptember
Year1997
Pages421--424


INPROC.Dutoit_TheMbro_ICSLP96 [DPP+96]
Author
T. Dutoit, V. Pagel, N. Pierret, F. Bataille, O. van der Vrecken
TitleThe MBROLA project: Towards a Set of High Quality Speech Synthesizers Free of Use for Non Commercial Purposes
BooktitleProc. ICSLP '96
AddressPhiladelphia, PA
MonthOctober
Year1996
Volume3
Pages1393--1396


INPROC.Dutoit_HighQual_ICASSP94 [Dut94]
Author
T. Dutoit
TitleHigh Quality Text-to-Speech Synthesis: a Comparison of four Candidate Algorithms
BooktitleProc. ICASSP '94
AddressAdelaide, Australia
MonthApril
Year1994
PagesI--565--I--568


