Project leader - Audio and Music Indexing Research Projects - IRCAM R&D
1. Generic description
ircamsummary is a C++ program and library that performs:
- the automatic generation of music audio summaries.
It uses various strategies: selecting the most representative extract (in terms of content repetition and content position), or concatenating the most representative parts synchronously with the downbeats.
The user can also set the duration of the summary (from 10s to 30s).
- the estimation of the structure of music files in terms of repetition of parts (such as verse, chorus, bridge ... but without explicit labeling of the parts).
For this, ircamsummary extracts the timbral, harmonic and rhythmic content of a music file over time and analyzes content repetition using two strategies: sequence repetition and state repetition. The audio summary can be parameterized by type (a continuous summary, or a summary obtained by concatenating the most informative parts) and by duration. The structure estimation can be parameterized by the number of parts and by the type of part (sequence or state).
ircamsummary is based on the algorithms described in
- G. Peeters, A. Laburthe, and X. Rodet. Toward automatic music audio summary generation from signal analysis. In Proc. of ISMIR (International Society for Music Information Retrieval), pages 94-100, Paris, France, 2002.
- G. Peeters. Sequence representation of music structure using higher-order similarity matrix and maximum-likelihood approach. In Proc. of ISMIR (International Society for Music Information Retrieval), Vienna, Austria, 2007.
- F. Kaiser and G. Peeters. A simple fusion method of state and sequence segmentation for music structure discovery. In Proc. of ISMIR (International Society for Music Information Retrieval), Curitiba, PR, Brazil, November 2013.
ircamsummary is implemented in C++.
It is available for Linux, Mac-OS and Windows.
It is available either as a command-line executable (see documentation below) or as a library (in this case the API is adapted to the needs of the integrator).
4. Documentation of ircamsummary command-line version
Current version: 1.7.1
4.1 ircamsummary usage for summary generation
ircamsummary takes as input the path to an audio file (the audio file must be in an uncompressed format, for example: .wav or .aiff) and outputs the generated summary as an audio file. For example,
ircamsummary -i /Users/jeremy/audio.wav -o /net/data/results/audio_summary.wav
will analyze the content of the audio file "/Users/jeremy/audio.wav" and write the audio summary to "/net/data/results/audio_summary.wav".
By default, ircamsummary creates a 30s summary using only one extract of the music track.
If you want a specific summary duration, you can use the flag "--summary_length". For example "--summary_length 15" will provide a 15s summary.
If you want more than one extract, you can use the flag "--summary_nb_extract". For example "--summary_nb_extract 3" will use 3 extracts for the summary.
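As an illustration, the flags above can be assembled programmatically. The following Python sketch is not part of ircamsummary: the helper name build_summary_command is hypothetical, and it assumes that "--summary_length" and "--summary_nb_extract" may be combined in one call, which is not stated explicitly above. The 10s to 30s check in the helper mirrors the range given in the generic description; whether the executable itself rejects other values is not documented here.

```python
from typing import List, Optional


def build_summary_command(
    input_path: str,
    output_path: str,
    length_s: Optional[int] = None,
    nb_extract: Optional[int] = None,
    executable: str = "ircamsummary",
) -> List[str]:
    """Build the argument list for a summary-generation call (hypothetical helper)."""
    cmd = [executable, "-i", input_path, "-o", output_path]
    if length_s is not None:
        # Assumption: enforce the 10s to 30s range from the generic description.
        if not 10 <= length_s <= 30:
            raise ValueError("summary length must be between 10 and 30 seconds")
        cmd += ["--summary_length", str(length_s)]
    if nb_extract is not None:
        if nb_extract < 1:
            raise ValueError("at least one extract is required")
        cmd += ["--summary_nb_extract", str(nb_extract)]
    return cmd


cmd = build_summary_command(
    "/Users/jeremy/audio.wav",
    "/net/data/results/audio_summary.wav",
    length_s=15,
    nb_extract=3,
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where the ircamsummary executable is on the PATH.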
4.2 ircamsummary usage for music structure estimation
ircamsummary takes as input the path to an audio file (the audio file must be in an uncompressed format, for example: .wav or .aiff) and outputs the results in an XML file. For example,
ircamsummary -i /Users/jeremy/audio.wav --xml_struct /net/data/results/audio_structure.xml
will analyze the content of the audio file "/Users/jeremy/audio.wav" and write the results (the estimated structure) to "/net/data/results/audio_structure.xml".
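For a collection of files, the same call can be prepared in a loop. The Python sketch below is an illustration only: the helper name build_structure_commands and the "<name>_structure.xml" output naming are assumptions, not part of ircamsummary. It skips files whose extension is not one of the two uncompressed formats named above (.wav, .aiff).

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Extensions named in the documentation as uncompressed input formats.
UNCOMPRESSED_SUFFIXES = {".wav", ".aiff"}


def build_structure_commands(
    audio_paths: Iterable[str],
    out_dir: str,
    executable: str = "ircamsummary",
) -> List[List[str]]:
    """Build one structure-estimation command per usable file (hypothetical helper)."""
    commands = []
    for path in audio_paths:
        p = PurePosixPath(path)
        if p.suffix.lower() not in UNCOMPRESSED_SUFFIXES:
            continue  # compressed or unknown format: not accepted as input
        # Assumed naming scheme: <input name>_structure.xml in out_dir.
        xml_path = PurePosixPath(out_dir) / f"{p.stem}_structure.xml"
        commands.append([executable, "-i", str(p), "--xml_struct", str(xml_path)])
    return commands


commands = build_structure_commands(
    ["/Users/jeremy/audio.wav", "/Users/jeremy/song.mp3"],
    "/net/data/results",
)
for command in commands:
    print(" ".join(command))
```

Each command list can then be executed with subprocess.run(command, check=True) where the executable is installed.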
XML format description: