Listening test

Evaluation of expressivity and singing style modeling for singing voice synthesis
(No longer active) Contact: luc.ardaillon@ircam.fr

Thank you for your time! Please read the following information carefully, even if you are used to taking such tests.

Objectives of this test

This listening test aims to evaluate the expressivity and the modeling of singing styles in singing voice synthesis, based on variations of pitch and phoneme durations.
We do not want to assess the overall quality of the synthesis.
However, the sounds below may contain artefacts that are unrelated to this modeling (e.g. unnatural timbre, unexpected noise, ...).
Please try to ignore these artefacts and focus only on the pitch variations (attacks, transitions between notes, vibrato, ...) and phoneme durations.

Two tests are presented below. The first one aims to evaluate the expressivity (or liveliness, musicality) of singing voice synthesis using various settings.
The second one aims to evaluate the modeling of the singing style.
Detailed explanations of the evaluation procedure are given below for each test.

General recommendations

If there is a technical problem with a sound (e.g. it does not play), select Prob. Please note that it may take a few seconds for all sounds to load.
Do the test in a quiet place.
Be sure to use headphones or earphones.
Verify that the sound level is loud enough to hear the sound details properly.
Please take the time to listen!
There is no need to look for a repetition or an order among the sounds. There is none; their order is randomized.
Please try to do the whole test, but if you don't have time to answer all questions, please send your partial answers anyway. They will still be useful!

Test I

For each pair of recordings below (one pair per line), select one button according to which of the two interpretations you find more expressive.
By the term "expressive", we mean an interpretation that sounds lively, with musical intentions, as opposed to a more mechanical or static interpretation.

If the left sound is much better than the right one, select the leftmost button (+3)
If the left sound is better than the right one, select the second button (+2)
If the left sound is slightly better than the right one, select the third button (+1)
If you do not hear any difference or you have no preference, select the middle button (0)

... and likewise in the other direction when the right sound is better.

There is no "correct" answer. It is only about your subjective preference.

Pair	File1	+3	+2	+1	0	+1	+2	+3	File2	Prob
1
2
3
4
5
6
7
8
9
10

Test II

In this second test, you are asked to judge which of two syntheses has a singing style more similar to a "Target style" taken from an original recording.
For each line, first listen to the leftmost sound, in the column "Target style", to get an idea of the main characteristics of this style.
Then listen to the two other sounds, in the columns "File 1" and "File 2", and choose a button (as in Test I) according to whether you think the target singing style is more similar to that of File 1 or File 2.
Please try to focus mainly on differences in the pitch variations (attacks, transitions between notes, vibrato, ...) and phoneme durations (other features are not modeled here).