Automatic Generation of Visual Summary/Audio Summary from Signal Analysis
1. General description
Development of algorithms that automatically extract the structure of a piece of music
from its audio signal.
This structure is then used to generate an audio summary
(a short musical segment summarizing the different contents of a piece).
2. Proposed method
2.1. Feature extraction
Descriptors are first extracted at each instant of the piece of music (description of the timbre,
of the evolution of the timbre, and of the pitch).
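As a rough illustration of frame-wise descriptor extraction, the sketch below cuts a signal into overlapping frames and computes one descriptor pair per frame. The spectral centroid and its frame-to-frame difference are stand-ins chosen for brevity (a crude proxy for timbre and its evolution); they are assumptions, not the descriptors used in this project, and all function names and parameters are hypothetical.

```python
import cmath
import math

def frames(signal, size, hop):
    """Split a signal into overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (fine for a sketch, slow for real audio)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz (assumed timbre proxy)."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0
    n = len(frame)
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total

def describe(signal, sample_rate, size=64, hop=32):
    """Return one (centroid, delta_centroid) pair per frame."""
    centroids = [spectral_centroid(f, sample_rate)
                 for f in frames(signal, size, hop)]
    # First frame has no predecessor, so its delta is set to 0.
    deltas = [0.0] + [b - a for a, b in zip(centroids, centroids[1:])]
    return list(zip(centroids, deltas))

# Usage: a 1 kHz sine sampled at 8 kHz.
sine = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(256)]
features = describe(sine, 8000)
```

A real system would use an FFT and richer descriptors; the point here is only the "one descriptor vector per instant" layout that the later steps operate on.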
2.2. Structure estimation
The estimation of the structure of the piece is based on a measure of repetition:
- of sequences over time: sequences (of non-similar elements) repeated over time
- of states over time: segments (of similar elements) repeated over time
a) Sequence representation
The music audio signal is considered as repetitions of sequences of events.
- Lines are derived from the so-called similarity matrix using a 2D structuring filter.
- Detected lines are then analyzed in order to detect sequence repetitions in the audio track.
Illustrations of the results on the title "Love me do" from the artist "The Beatles" are shown in Figure 1.
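The sequence approach can be sketched as follows, under stated assumptions: the similarity matrix uses cosine similarity between frame descriptors, and a plain average along each diagonal stands in for the 2D structuring filter, whose exact form is not given here. A high filtered value at (i, j) means the frames starting at i recur starting at j. All names and thresholds are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_matrix(features):
    """Frame-by-frame similarity of the whole track with itself."""
    return [[cosine(u, v) for v in features] for u in features]

def diagonal_filter(sim, length):
    """Average `length` consecutive cells along each diagonal.

    Assumed stand-in for the 2D structuring filter: it reinforces
    diagonal lines and attenuates isolated similar frames.
    """
    n = len(sim)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n - length + 1):
        for j in range(n - length + 1):
            out[i][j] = sum(sim[i + k][j + k] for k in range(length)) / length
    return out

def repeated_sequences(features, length=3, threshold=0.99):
    """Return (i, j) pairs where `length` frames starting at i recur at j."""
    filtered = diagonal_filter(similarity_matrix(features), length)
    n = len(features)
    return [(i, j)
            for i in range(n)
            for j in range(i + length, n)  # skip overlapping matches
            if filtered[i][j] >= threshold]

# Usage: a toy track "abcabc" with one-hot descriptors per event.
symbols = {"a": [1, 0, 0], "b": [0, 1, 0], "c": [0, 0, 1]}
track = [symbols[s] for s in "abcabc"]
pairs = repeated_sequences(track)
```

Because the events inside a sequence are not similar to each other, only the off-diagonal line reveals the repetition, which is what distinguishes this representation from the state representation.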
b) State representation
The music audio signal is considered as a succession of states, where each state represents (roughly) similar information found
in the different parts of the music. The states are found using a two-pass algorithm based on:
- segmentation and
- an unsupervised learning algorithm (k-means + hidden Markov model).
Illustrations of the results on the title "Oh so quiet" from the artist "Bjork" are shown in Figure 2.
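One possible reading of the two-pass idea is sketched below. The first pass cuts the descriptor stream where consecutive frames differ strongly and keeps one mean vector per group of similar segments ("potential states"); the second pass runs k-means initialised with those means. How the two passes are actually coupled is an assumption, the hidden Markov model pass is omitted, and all names and thresholds are hypothetical.

```python
import math

def dist(u, v):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def mean(vectors):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def segment(features, threshold):
    """Pass 1: cut wherever consecutive frames differ by more than `threshold`."""
    segments, start = [], 0
    for t in range(1, len(features)):
        if dist(features[t - 1], features[t]) > threshold:
            segments.append((start, t))
            start = t
    segments.append((start, len(features)))
    return segments

def potential_states(features, segments, threshold):
    """Keep one mean vector per group of mutually similar segments."""
    states = []
    for start, end in segments:
        m = mean(features[start:end])
        if all(dist(m, s) > threshold for s in states):
            states.append(m)
    return states

def kmeans(features, centroids, iterations=10):
    """Pass 2: standard k-means, initialised with the potential states."""
    labels = []
    for _ in range(iterations):
        labels = [min(range(len(centroids)), key=lambda k: dist(f, centroids[k]))
                  for f in features]
        for k in range(len(centroids)):
            members = [f for f, l in zip(features, labels) if l == k]
            if members:
                centroids[k] = mean(members)
    return labels

def estimate_states(features, threshold=0.5):
    """Return one state label per frame (HMM smoothing not included)."""
    segments = segment(features, threshold)
    centroids = potential_states(features, segments, threshold)
    return kmeans(features, centroids)

# Usage: a toy 1-D descriptor stream with two alternating states.
stream = [[0.0], [0.1], [0.0], [1.0], [1.1], [0.05], [0.0]]
labels = estimate_states(stream)
```

Initialising k-means from the segmentation avoids having to fix the number of states in advance; a hidden Markov model would then add temporal constraints on top of these frame labels.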
2.4. Audio summary generation
The signal is represented as a succession of sequences/states, e.g. AABABCAAB.
Which sequences/states should be used for the summary?
- a unique example of each sequence/state
- a reproduction of the sequence/state succession
- the most important sequence/class (A)
  - in terms of number of occurrences
  - in terms of total duration
- audio examples of state transitions
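The selection strategies listed above can be illustrated on the label string AABABCAAB. This is a minimal sketch: the function names are hypothetical and the durations in the usage example are invented purely to show that the two notions of "most important" can disagree.

```python
from collections import Counter

def first_examples(labels):
    """Index of the first occurrence of each sequence/state, in order."""
    seen, picks = set(), []
    for i, label in enumerate(labels):
        if label not in seen:
            seen.add(label)
            picks.append(i)
    return picks

def most_frequent(labels):
    """Most important label in terms of number of occurrences."""
    return Counter(labels).most_common(1)[0][0]

def longest_total(labels, durations):
    """Most important label in terms of total duration."""
    totals = Counter()
    for label, d in zip(labels, durations):
        totals[label] += d
    return max(totals, key=totals.get)

def transitions(labels):
    """Distinct changes of label, in order of first appearance."""
    seen, out = set(), []
    for a, b in zip(labels, labels[1:]):
        if a != b and (a, b) not in seen:
            seen.add((a, b))
            out.append((a, b))
    return out

# Usage on the example succession; durations (seconds) are made up.
labels = "AABABCAAB"
durations = [10, 10, 20, 10, 20, 70, 10, 10, 20]
examples = first_examples(labels)
by_count = most_frequent(labels)
by_time = longest_total(labels, durations)
changes = transitions(labels)
```

With these invented durations, A wins by number of occurrences while C wins by total duration, which is why the two criteria are listed separately.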
A short fragment of the audio signal is taken for each chosen state.
Continuity between fragments: overlap-add + tempo/beat alignment
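A minimal sketch of how fragments might be joined: consecutive fragments are overlapped and linearly cross-faded (a simple form of overlap-add), and fragment boundaries can be snapped to the nearest beat before cutting. The linear fade shape, the function names and the assumption that beat times are already known are all illustrative choices, not details taken from this project.

```python
def snap_to_beat(time, beats):
    """Move a cut point to the nearest beat time (beats assumed known)."""
    return min(beats, key=lambda b: abs(b - time))

def crossfade_concat(fragments, overlap):
    """Concatenate fragments, cross-fading `overlap` samples at each joint."""
    out = list(fragments[0])
    for frag in fragments[1:]:
        n = min(overlap, len(out), len(frag))
        base = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # fade-in weight of the incoming fragment
            out[base + i] = (1 - w) * out[base + i] + w * frag[i]
        out.extend(frag[n:])
    return out

# Usage: two constant fragments joined with a 2-sample overlap.
summary = crossfade_concat([[1.0] * 4, [0.0] * 4], overlap=2)
cut = snap_to_beat(1.3, [0.0, 0.5, 1.0, 1.5])
```

Cutting on beats keeps the rhythmic pulse continuous across joints, while the cross-fade avoids clicks at the sample level.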
2.5. Examples
Figure 1. Sequence approach on the title "Love me do" from the artist "The Beatles"
Figure 2. State approach on the title "Oh so quiet" from the artist "Bjork"
3. Related publications
- Geoffroy Peeters, Amaury La Burthe, Xavier Rodet. Toward Automatic Music Audio Summary Generation from Signal Analysis.
ISMIR 2002, Paris, France, October 2002
- Geoffroy Peeters, Xavier Rodet. Deriving Musical Structures from Signal Analysis for Audio Summary Generation: "Sequence" and "State" approach.
CMMR03, Montpellier, France, May 26-27, 2003
- Geoffroy Peeters, Xavier Rodet. Music Structure Discovering Using Dynamic Features for Audio Summary Generation: "Sequence" and "State" approach.
CBMI03, Rennes, France, September 22-24, 2003
- Geoffroy Peeters, Xavier Rodet. Signal-based Music Structure Discovery for Audio Summary Generation.
ICMC03, Singapore, September 29 - October 4, 2003
- Patent