Geoffroy Peeters

Project leader - Audio and Music Indexing Research Projects - IRCAM R&D


Sound classification

1. General description

Automatic sound classification
Development of a general classification system comprising:

Evaluation for the particular case of classifying musical instrument samples

2. Proposed method

Figure 1. Overall schema of sound classification module

The system, developed within the CUIDADO project, must allow the automatic classification of sounds from predefined classes, as well as the online learning of these classes (as defined by the user). To this end, a complete training/evaluation system was created:

2.1. Feature extraction:

Figure 2. Feature extraction

For each sound, a large set of sound descriptors is extracted, including

The evolution over time of each feature is then modeled by a temporal modeling module.

Each sound is then represented by a feature vector.
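
As an illustration of how a per-frame descriptor trajectory might be collapsed into a fixed-length feature vector, here is a minimal sketch. The choice of mean, standard deviation and linear slope as the temporal model is an assumption made for this example, as are the descriptor names and values; the page does not specify which temporal models the system uses.

```python
from statistics import fmean, pstdev

def temporal_model(frames):
    """Summarize one descriptor trajectory as [mean, std, slope].

    This (mean, std, slope) model is an assumption for illustration only.
    """
    n = len(frames)
    mean = fmean(frames)
    std = pstdev(frames)
    # Least-squares slope of the descriptor value against the frame index.
    t_mean = (n - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(n))
    if denom == 0:
        slope = 0.0
    else:
        slope = sum((t - t_mean) * (x - mean) for t, x in enumerate(frames)) / denom
    return [mean, std, slope]

def feature_vector(descriptors):
    """Concatenate the temporal models of all descriptors of one sound."""
    vec = []
    for name in sorted(descriptors):  # fixed order so vectors are comparable
        vec.extend(temporal_model(descriptors[name]))
    return vec

# Hypothetical descriptor trajectories for one sound (4 analysis frames).
sound = {
    "spectral_centroid": [1000.0, 1100.0, 1200.0, 1300.0],
    "rms": [0.5, 0.5, 0.5, 0.5],
}
vec = feature_vector(sound)  # 2 descriptors x 3 statistics = 6 values
```

Whatever temporal model is used, every sound ends up with a vector of the same length, which is what the learning stage requires.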

2.2. Learning

Feature selection: Inertia Ratio Maximization with Feature Space Projection (IRMFSP)

Figure 3. Feature selection

In order to determine the most appropriate features to describe a specific taxonomy, a "feature selection" module is used.

  1. This module selects the best features according to the ratio of the between-class inertia to the total inertia. The larger this ratio, the more discriminant the feature.
  2. The whole feature space is then projected onto the first selected feature (the one with the largest ratio) and the process is repeated to select the next feature.
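
The two steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the CUIDADO implementation: the projection step is interpreted here as removing from each remaining feature its component along the selected feature (a Gram-Schmidt step), and the function names, the `eps` threshold and the toy data are all hypothetical.

```python
def inertia_ratio(values, labels, eps=1e-12):
    """Between-class inertia divided by total inertia for one feature."""
    n = len(values)
    mean = sum(values) / n
    total = sum((v - mean) ** 2 for v in values) / n
    if total < eps:  # (nearly) constant feature; eps is an assumed threshold
        return 0.0
    between = 0.0
    for c in set(labels):
        members = [v for v, l in zip(values, labels) if l == c]
        class_mean = sum(members) / len(members)
        between += len(members) / n * (class_mean - mean) ** 2
    return between / total

def irmfsp(columns, labels, n_select):
    """Select feature indices; `columns` holds one list of values per feature."""
    cols = [list(c) for c in columns]
    remaining = list(range(len(cols)))
    selected = []
    while remaining and len(selected) < n_select:
        # Step 1: pick the feature with the largest inertia ratio.
        best = max(remaining, key=lambda j: inertia_ratio(cols[j], labels))
        selected.append(best)
        remaining.remove(best)
        # Step 2: remove from the other features their component along it.
        norm = sum(v * v for v in cols[best]) ** 0.5
        if norm == 0.0:
            continue
        g = [v / norm for v in cols[best]]
        for j in remaining:
            dot = sum(a * b for a, b in zip(cols[j], g))
            cols[j] = [a - dot * b for a, b in zip(cols[j], g)]
    return selected

# Toy data: feature 1 is an exact multiple of feature 0 (redundant),
# feature 2 is less discriminant but carries independent information.
labels = [0, 0, 1, 1]
columns = [
    [0.0, 1.0, 10.0, 11.0],
    [0.0, 2.0, 20.0, 22.0],
    [1.0, 3.0, 2.0, 5.0],
]
selected = irmfsp(columns, labels, 2)
```

In this toy case the redundant copy collapses to (numerically) zero after the projection, so the second pick is the independent feature rather than the duplicate. This is the reason for projecting before each new selection.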

Feature transform:

Before class modeling,

  1. a Box-Cox transform is applied in order to maximize the Gaussianity of each feature
  2. a Linear Discriminant Analysis (LDA) is applied to the feature space in order to maximize class separation
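
To make the first of these two steps concrete, here is a minimal sketch of a Box-Cox transform whose parameter is chosen by a grid search over the standard profile log-likelihood. The page does not say how the parameter is chosen, so the grid search and its range are assumptions, and the function names are hypothetical. Only the Box-Cox step is sketched; the LDA step is not shown.

```python
import math

def boxcox(x, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in x]
    return [(v ** lam - 1.0) / lam for v in x]

def boxcox_loglik(x, lam):
    """Profile log-likelihood of lambda under a Gaussian model.

    Requires strictly positive, non-constant values.
    """
    n = len(x)
    y = boxcox(x, lam)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in x)

def best_lambda(x, grid=None):
    """Pick the lambda on the grid with the highest log-likelihood.

    The default grid (-2.0 to 2.0 in steps of 0.1) is an assumed choice.
    """
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]
    return max(grid, key=lambda lam: boxcox_loglik(x, lam))

# Hypothetical feature values whose logarithms are symmetric around zero,
# so a log transform (lambda = 0) is expected to suit them best.
feature = [math.exp(z) for z in (-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5)]
lam = best_lambda(feature)
transformed = boxcox(feature, lam)
```

Each feature would get its own lambda, estimated on the training set and reused unchanged when new sounds are classified.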

Class modeling:

Different kinds of classifiers are compared

2.3. Evaluation:

Figure 4. Instrument sounds taxonomy

Six different databases have been used for the evaluation, for a total of 4163 sounds, covering 27 instrument classes. Three different class taxonomies have been used:


Figure 5. Confusion matrix
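
For readers unfamiliar with how a confusion matrix and per-class recognition rates are derived from classifier output, here is a minimal sketch. The instrument names and labels below are hypothetical and are not results from the evaluation described here.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def recognition_rates(matrix):
    """Per-class rate: correctly classified sounds / sounds of that class."""
    rates = []
    for i, row in enumerate(matrix):
        total = sum(row)
        rates.append(row[i] / total if total else 0.0)
    return rates

# Hypothetical ground truth and classifier output for 8 test sounds.
classes = ["flute", "oboe", "violin"]
truth = ["flute", "flute", "oboe", "oboe", "violin", "violin", "violin", "oboe"]
predicted = ["flute", "oboe", "oboe", "oboe", "violin", "violin", "flute", "oboe"]

matrix = confusion_matrix(truth, predicted, classes)
rates = recognition_rates(matrix)
```

Off-diagonal cells show which classes are confused with each other, which the per-class rate alone does not reveal.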

The best results are obtained using the

Recognition rates for 27 instruments:

Related publications