From Pierre Lanchantin


Voice Conversion

Goal of this subjective test

The goal of this subjective test is to evaluate different methods for converting a source voice into a target voice. The speech utterances used in this test are in French, but you can participate even if you are not a French speaker. The conversion transforms only the timbre qualities of one voice so that it resembles another; the prosodic characteristics are not modified.

In this evaluation we test the conversion of the voice of a French speaker into the voices of two different speakers with different accents (Hispanic and French Canadian). For each conversion there are two tests: one on the proximity of the converted voice to the target, and one on the quality of the conversion.

It should take you between 5 and 10 minutes to complete the test.

By completing this short questionnaire you are contributing to research on voice conversion carried out by the Analysis Synthesis team at IRCAM.

Thanks in advance!

Pierre


First voice conversion

We want to evaluate the conversion from a Voice A to another Voice B with a Hispanic accent.

Here you can find example utterances of the two speakers' voices:

A

B

Please listen to them in order to become familiar with their different timbre qualities.

Now, for each of the following files, indicate whether it is perceived as closer to Voice A or to Voice B.

A Perceived as voice A
<- Perceived as closer to voice A
0 Perceived as between voice A and voice B
-> Perceived as closer to voice B
B Perceived as voice B
File A <- 0 -> B

We now ask you to listen to pairs of short utterances and, for each pair, decide which of the two utterances sounds more natural in terms of sound quality, i.e., which presents less sound degradation.

1. For each row of the table, listen carefully to File 1 and File 2. Both sounds correspond to the same source-target conversion, but were processed using slightly different methods. The differences require careful listening, so please use headphones if you can.

2. Then give a preference score according to the following grading scale:

Much better +3
Better +2
Slightly better +1
About the same 0
File 1 +3 +2 +1 0 +1 +2 +3 File 2

Second voice conversion

We are now going to test the conversion from the same voice A to another voice B, this time with a French Canadian accent. Here you can find example utterances of the two different voices:

A

B

As with the first conversion, for each of the following files, indicate whether it is perceived as closer to Voice A or to Voice B.

A Perceived as voice A
<- Perceived as closer to voice A
0 Perceived as between voice A and voice B
-> Perceived as closer to voice B
B Perceived as voice B
File A <- 0 -> B

Now, as with the first speaker, we ask you to listen to pairs of short utterances and, for each pair, decide which of the two utterances sounds more natural in terms of sound quality, i.e., which presents less sound degradation.

1. For each row of the table, listen carefully to File 1 and File 2.

2. Then give a preference score according to the following grading scale:

Much better +3
Better +2
Slightly better +1
About the same 0
File 1 +3 +2 +1 0 +1 +2 +3 File 2

A few more questions:

Comments

Please verify that you have given an answer to every question, then press this button.

All recordings are Ircam's property.

Thanks to Gilles Degottex for the php script.

Retrieved from http://recherche.ircam.fr/anasyn/lanchant/index.php/Main/Here2
Page last modified on May 05, 2010, at 04:59 PM