VaSAB: THE VARIABLE SIZE ADAPTIVE INFORMATION BOTTLENECK FOR DISENTANGLEMENT ON SPEECH AND SINGING VOICE

Frederik Bous, Axel Roebel
UMR9912 STMS | IRCAM - CNRS - Sorbonne Université - Ministère de la Culture | Paris, France

This website contains supplementary material for our paper "VaSAB: the variable size adaptive information bottleneck for disentanglement on speech and singing voice", currently under review for ICASSP 2024.

Samples for Singing Voice

Below are samples from the test set of the singing dataset. Click on a sample to open a summary page for that sample, containing transpositions from different models. None of the voices below were included in the training sets.

VocalSet female 3

13-2-78, 13-2-79, 13-2-81, 13-2-82, 13-2-83, 13-2-84, 13-2-85, 13-2-86

VocalSet male 5

13-15-80, 13-15-81, 13-15-82, 13-15-83, 13-15-84, 13-15-85, 13-15-86, 13-15-87

Samples for Speech

Below are samples from the test set of our speech dataset. Click on a sample to open a summary page for that sample, containing transpositions from different models. None of the voices below were included in the training sets.

VCTK Female p361

11-205-1, 11-205-2, 11-205-3, 11-205-4, 11-205-5, 11-205-6, 11-205-7, 11-205-8

VCTK Female p362

11-207-1, 11-207-2, 11-207-3, 11-207-4, 11-207-5, 11-207-6, 11-207-7, 11-207-8

VCTK Male p374

11-213-1, 11-213-2, 11-213-3, 11-213-4, 11-213-5, 11-213-6, 11-213-7, 11-213-8

VCTK Male p376

11-215-1, 11-215-2, 11-215-3, 11-215-4, 11-215-5, 11-215-6, 11-215-7, 11-215-8