VaSAB: THE VARIABLE SIZE ADAPTIVE INFORMATION BOTTLENECK FOR DISENTANGLEMENT ON SPEECH AND SINGING VOICE

Frederik Bous, Axel Roebel
UMR9912 STMS | IRCAM - CNRS - Sorbonne Université - Ministère de la Culture | Paris, France

13-15-84

[Sound example grid. Columns (model): Input, WORLD, Nosi, Rasi, Hivo, Ravo. Rows: 1600, 800, 0, -800, -1600.]

Legend

Input Input file
WORLD Transposition using the WORLD vocoder
Nosi Baseline model using a classical bottleneck based on dimensionality reduction. Here nb = 3. Trained only on singing.
Hivo VaSAB bottleneck using hierarchical dropout. Trained on both speech and singing.
Ravo VaSAB bottleneck using random (classical) dropout. Trained on both speech and singing.
Rasi VaSAB bottleneck using random (classical) dropout. Trained only on singing.
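
The legend contrasts random (classical) dropout with hierarchical dropout in the bottleneck but does not define either on this page. As a rough illustration only, the sketch below assumes that hierarchical dropout follows a nested scheme: a cut-off index is drawn at random and every bottleneck channel from that index onward is dropped, so low-index channels are kept most often. That reading is an assumption, not something stated here. The function names, the drop probability and the example code vector are all hypothetical, and the usual rescaling of kept values in classical dropout is left out for brevity.

```python
import random


def random_dropout_mask(n_b, p, rng):
    """Classical dropout: drop each of the n_b channels independently with probability p."""
    return [0 if rng.random() < p else 1 for _ in range(n_b)]


def hierarchical_dropout_mask(n_b, rng):
    """Assumed nested scheme: keep channels 0..k-1 and drop channels k..n_b-1."""
    k = rng.randint(1, n_b)  # cut-off, inclusive bounds, so at least one channel is kept
    return [1 if i < k else 0 for i in range(n_b)]


def apply_mask(code, mask):
    """Zero out the dropped channels of a bottleneck code vector."""
    return [c * m for c, m in zip(code, mask)]


rng = random.Random(0)          # fixed seed so repeated runs give the same masks
code = [0.7, -1.2, 0.4, 2.0]    # hypothetical bottleneck activations for one frame

print("random      :", apply_mask(code, random_dropout_mask(len(code), 0.5, rng)))
print("hierarchical:", apply_mask(code, hierarchical_dropout_mask(len(code), rng)))
```

Under this assumption a random mask can remove any subset of channels, whereas a hierarchical mask always keeps a contiguous prefix. This would let the effective bottleneck size vary from sample to sample during training.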

The sounds used on this page are under Copyright © 2021 Ircam, Institut de recherche et coordination acoustique/musique.