Finally, here are some demos of my latest research and applications in creative speech technologies.

Voice Conversion

The True Voice of Marilyn Monroe

"The ghost of the American icon Marilyn Monroe haunts a new video by the artist Philippe Parreno, which is due to be shown at the Fondation Beyeler in Basel this summer (2012, 10 June-30 September)", reports The Art Newspaper.

The project consisted of resurrecting the voice of Marilyn Monroe for the video by the artist Philippe Parreno, which also features a mechanical robot that mimics Marilyn Monroe's handwriting.

The project required the latest advances in speech technologies to recreate the voice of Marilyn Monroe from two kinds of recordings: authentic historical recordings (movies, interviews) and imitations performed by actresses. Voice conversion techniques were used to mimic the timbre, voice quality, and intonation that were unique to Marilyn Monroe.

conversion of intonations
Dans la Vallée de l'Etrange, Conversation entre Anri Sala et Philippe Parreno, Les Cahiers du Cinéma, November 2012.

Text-To-Speech Synthesis

Expressive Speech Synthesis

Text-To-Speech synthesis (TTS) consists of converting a text into the corresponding speech signal. The two main approaches to speech synthesis are unit-selection speech synthesis and parametric speech synthesis. In unit-selection speech synthesis, the speech signal is obtained by concatenating speech units selected from a large speech database; in parametric speech synthesis, the speech signal is generated from a statistical model of the characteristics of speech over time. The ircamTTS (unit-selection) and ircamHTS (HMM-based) systems have been developed in collaboration with Christophe Veaux and Pierre Lanchantin in the context of expressive speech synthesis.

André Dussolier | Jonathan Roullier

ircamTTS: story-telling speech synthesis of the first paragraph of the Little Red Riding Hood fairy tale with the voices of French professional actors André Dussolier and Jonathan Roullier.
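
To make the unit-selection idea concrete, here is a minimal sketch of how units could be chosen from a database by dynamic programming. It is not the ircamTTS implementation: the `Unit` fields, the two pitch-based cost functions, and the `join_weight` parameter are all assumptions made for illustration. Each target position gets a set of candidate units, and the search keeps the sequence that minimizes the sum of target costs (how well a unit fits what is asked for) and join costs (how smoothly two consecutive units connect).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A candidate speech unit (all fields are illustrative assumptions)."""
    label: str    # phoneme label, e.g. "a"
    pitch: float  # mean pitch of the unit, in Hz
    uid: int      # hypothetical index into the recordings


def target_cost(target_pitch, unit):
    # Assumed cost: distance between the requested pitch and the unit's pitch.
    return abs(target_pitch - unit.pitch)


def join_cost(left, right):
    # Assumed cost: pitch discontinuity between two consecutive units.
    return abs(left.pitch - right.pitch)


def select_units(targets, database, join_weight=1.0):
    """Return the lowest-cost unit sequence for a list of (label, pitch) targets."""
    candidates = [[u for u in database if u.label == label] for label, _ in targets]
    if any(not c for c in candidates):
        raise ValueError("no candidate unit for at least one target label")

    # best[i][j] = (cumulative cost, index of the best predecessor)
    best = [[(target_cost(targets[0][1], u), None) for u in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for unit in candidates[i]:
            cost, back = min(
                (best[i - 1][k][0] + join_weight * join_cost(prev, unit), k)
                for k, prev in enumerate(candidates[i - 1])
            )
            row.append((cost + target_cost(targets[i][1], unit), back))
        best.append(row)

    # Backtrack from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    return path[::-1]
```

Calling `select_units([("a", 195.0), ("b", 205.0)], database)` returns one `Unit` per target. A real system would use richer features than pitch alone (spectral envelope, duration, linguistic context) and would then concatenate the selected waveforms.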

Creative Speech

Here I present some creative applications of text-to-speech synthesis. These range from the manipulation of speech units (phonemes, syllables, words) in concatenative sound synthesis to the creation of pseudo-languages, i.e. artificial voices synthesized from a nonsense text. Creative applications of text-to-speech synthesis have been used in various productions, e.g. HyperMusic: Prologue (Hector Parra/Lisa Randall, 2008-2009) and Luna Park (Georges Aperghis, 2011). These applications have been carried out in close collaboration with Pierre Lanchantin and Christophe Veaux.

phonemes | pseudo-language | pseudo-poem

ircamTTS: synthesis of pseudo-languages.
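
As an illustration of how a nonsense text might be produced before being sent to a synthesizer, the sketch below chains syllables at random. It is only one possible approach and not necessarily the one used in these productions: the first-order syllable chain, the function names, and the `syllables_per_word` parameter are assumptions.

```python
import random


def build_syllable_chain(syllables):
    """Map each syllable to the list of syllables observed right after it."""
    chain = {}
    for current, following in zip(syllables, syllables[1:]):
        chain.setdefault(current, []).append(following)
    return chain


def generate_pseudo_text(syllables, n_words, syllables_per_word=(1, 3), seed=None):
    """Generate n_words nonsense words by walking the syllable chain."""
    rng = random.Random(seed)
    chain = build_syllable_chain(syllables)
    current = rng.choice(syllables)
    words = []
    for _ in range(n_words):
        length = rng.randint(*syllables_per_word)
        word = []
        for _ in range(length):
            word.append(current)
            # Follow an observed transition, or restart anywhere if there is none.
            current = rng.choice(chain.get(current) or syllables)
        words.append("".join(word))
    return " ".join(words)
```

For example, `generate_pseudo_text(["lu", "na", "pa", "rk", "mo", "ri"], 5, seed=0)` yields five invented words built only from the given syllables. Because transitions follow the order of the input list, feeding it the syllables of a real text keeps some of that language's sound patterns while removing the meaning.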

Speech Synthesis in Various Speaking Styles

Each speaker has their own speaking style, which is part of their identity. A speaker also continuously adapts this style to specific communication situations, which are constrained by conventions shared among speakers. In collaboration with Pierre Lanchantin, I have developed a system that synthesizes speech in various speaking styles, based on average modelling of multiple speakers.

original speech samples
ircamHTS: speech synthesis of average speaking style
Discrete/Continuous Modelling of Speaking Style in HMM-based Speech Synthesis: Design and Evaluation, N. Obin, P. Lanchantin, A. Lacheret and X. Rodet, Interspeech 2011, Florence, Italy, 2011.
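
The notion of an average model over multiple speakers can be illustrated with a deliberately simplified sketch. It pools per-speaker Gaussian statistics of a single feature (say, log-F0 for one speaking style) into one shared Gaussian. This is an assumption-laden toy, not the HMM-based training used in ircamHTS; the function name and the `(n_frames, mean, variance)` input format are hypothetical.

```python
def average_style_model(speaker_stats):
    """Pool per-speaker Gaussians into one average Gaussian.

    speaker_stats: list of (n_frames, mean, variance) tuples, one per speaker,
    where variance is the population variance of that speaker's frames.
    Returns the (mean, variance) of all frames taken together.
    """
    total = sum(n for n, _, _ in speaker_stats)
    if total == 0:
        raise ValueError("no frames to average")
    mean = sum(n * m for n, m, _ in speaker_stats) / total
    # Pool second moments so that between-speaker spread is kept.
    second_moment = sum(n * (v + m * m) for n, m, v in speaker_stats) / total
    return mean, second_moment - mean * mean
```

For instance, `average_style_model([(100, 5.0, 0.04), (300, 5.4, 0.09)])` weights each speaker by the amount of data available. The pooled variance is larger than either speaker's own variance, since it also reflects how much the speakers differ from one another.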

Exploiting Alternatives for Expressive Speech Synthesis

Humans are not robots. In particular, humans have a large variety of strategies for pronouncing a sentence, and can easily vary their speech prosody from one utterance to the next. One of my current research directions is to model the variety in a speaker's prosody, so that synthesized speech can vary its prosody and sound more natural and expressive.

alternative #1 | alternative #2 | alternative #3
speech synthesis with variants
Making Sense of Variations: Introducing Alternatives in Speech Synthesis, N. Obin, C. Veaux, P. Lanchantin, Speech Prosody, Shanghai, China, 2012.