
3.5 The Software-Environment at IRCAM

This section gives a brief overview of the software systems for sound analysis, synthesis, and processing developed at IRCAM that are related to spectral envelopes. First, the systems that will use spectral envelopes are briefly described. Then, the systems on which the spectral envelope library is based are presented. Chapter 8 later explains how each system could benefit from spectral envelope handling; most of the programs will make use of the spectral envelope library developed in this project.

ADDITIVE: The ADDITIVE program [Rod97b] performs the additive analysis and resynthesis described in section 2.2. It analyses a sound file according to a model of a sum of harmonic sinusoids (harmonic partials), whose frequencies are integer multiples of the fundamental frequency f₀. The (time-varying) fundamental frequency must be known as precisely as possible in order to recognise the harmonic partials. Therefore, the first analysis step is pitch estimation, after which the parameters of the partials (number, frequency, amplitude and phase) are estimated and written to a parameter file called format file. The number parameter groups the resulting partials into tracks or partial trajectories. The synthesis stage takes a partial parameter file as input and computes a synthetic signal which is close to the original signal. Subtracting this resynthesised sinusoidal signal from the original leaves the residual part of the signal, i.e. everything that cannot be represented by harmonic sinusoids. That this subtraction works illustrates the high accuracy of the additive analysis method used.
HMM: The HMM program [DGR93] uses a more general approach to additive analysis. The underlying model is no longer restricted to harmonic sinusoids, but incorporates inharmonic sinusoids (at fractional multiples of the fundamental frequency) as well. Partial tracking is done by a purely combinatorial Hidden Markov Model, using the Viterbi algorithm. A partial trajectory is considered as a sequence of peaks in time which satisfies continuity constraints on the slopes of the parameters. The method even allows the frequency lines of partial trajectories to cross.
XTRAJ: The program XTRAJ, a descendant of XGRAPH, graphically displays partial parameter files generated by ADDITIVE or HMM: it plots the partial trajectories in the time-frequency plane, with the amplitude of the partials coded by colour (see figure 2.34).
CHANT: The CHANT project [RPB84,RPB85] was originally intended for the analysis and synthesis of the singing voice, but was quickly expanded to cover general sound synthesis by rule. It is based on the FOF model of synthesis (see section 4.5), a flexible and fast time-domain additive synthesis method. Today, CHANT is implemented in the CHANT-library [Vir97], which is controlled by DIPHONE (see below).
DIPHONE: DIPHONE [RL97] is a graphical sound composition environment which controls additive synthesis and CHANT. It runs on Apple Macintosh and is expandable by plugins (each synthesis method is in fact a plugin). The central concept of the program is that of concatenating diphones: a diphone is a segment of a parametric description of sound. When diphones are combined into sequences, the overlapping parts between them are interpolated, which allows, for example, morphing between completely different sounds.
FTS / MAX / JMAX: FTS (Faster Than Sound) [Puc91b] is IRCAM's real-time signal processing system, controlled by the graphical programming environment MAX [Puc91a]. It was first developed to run on the IRCAM Signal Processing Workstation (ISPW), a custom-built DSP card that plugs into NeXT workstations. It has since been ported to run natively (i.e. without special signal processing hardware) on SGI workstations and on the Linux operating system [DDPZ94]. The new, improved user interface jMax is based on Java. FTS is a modular, extensible system, which allows for the setup of any sound synthesis and signal processing algorithm.
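
As a minimal illustration of the harmonic model and the residual described under ADDITIVE above, the following Python sketch resynthesises a sum of harmonic sinusoids and subtracts it from a toy signal. It assumes a constant fundamental frequency and constant amplitudes and phases, which the real program does not; the function names synth_harmonics and residual are hypothetical and are not part of ADDITIVE or PM.

```python
import math

def synth_harmonics(f0, amps, phases, sample_rate, n_samples):
    """Sum of harmonic sinusoids: partial k has frequency k * f0."""
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        s = 0.0
        for k, (a, ph) in enumerate(zip(amps, phases), start=1):
            s += a * math.cos(2.0 * math.pi * k * f0 * t + ph)
        out.append(s)
    return out

def residual(original, resynth):
    """Sample-wise difference: what the harmonic model does not capture."""
    return [x - y for x, y in zip(original, resynth)]

# Toy data (all values are illustrative assumptions).
sr, n, f0 = 8000, 400, 100.0
harm = synth_harmonics(f0, [1.0, 0.5], [0.0, 0.3], sr, n)

# An inharmonic component at 2.5 * f0 that the harmonic model cannot represent.
extra = [0.01 * math.cos(2.0 * math.pi * 250.0 * i / sr) for i in range(n)]
original = [h + e for h, e in zip(harm, extra)]

# With perfectly known parameters, the residual is the inharmonic component.
res = residual(original, harm)
```

In this idealised case the partial parameters are known exactly, so the residual equals the inharmonic component up to rounding. In a real analysis the parameters are estimates, and any estimation error also ends up in the residual.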

[Figure: XTRAJ display (pics/xtraj.gif)]

The following software systems are used by the spectral envelope library:

UDI: The Universal DSP Interface [WRD92] is a portable library of digital signal processing routines. It provides the commonly needed vector and signal processing operations, which run on a general-purpose computer as well as on fast specialized DSP hardware, if present.
PM: PM [Gar94] is a library for additive analysis, transformation, and synthesis. It is the basis for the ADDITIVE program. To the spectral envelope library, it provides functions and data abstractions to handle and manipulate sets of additive partial parameters, time-frames of sets of partials, and break-point functions.
STTOOLS: The STTOOLS library handles the input, output, and conversion of sound files. Both standard file formats like AIFF (Audio Interchange File Format) and IRCAM's sf sound file format can be used. This library is also the basis for the command-line tools to convert, query, and play sound files.
SDIF: SDIF (Sound Description Interchange Format) [Vir98] is a file format developed by IRCAM and CNMAT (Center for New Music and Audio Technologies, Berkeley) that standardizes and unifies the various representations of sound data that go beyond a simple time-domain representation as a sampled signal. SDIF is an open, extensible, frame-based format. It can combine multiple time-tagged frames of data of different types, and is optimized both for archiving and for streaming. At IRCAM, it is already used for CHANT and additive synthesis. An SDIF library exists which offers functions to read and write data and to define new data types. This project added the definition of new data types for spectral envelopes, which are described in detail in section 7.5.
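
The idea of combining time-tagged frames of different types in one stream, as described for SDIF above, can be sketched in Python as follows. This is only a conceptual illustration: it does not reproduce the SDIF binary layout or the API of the SDIF library, and the Frame class, the type labels "partials" and "envelope", and the function interleave are all hypothetical.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Frame:
    time: float                        # time tag in seconds (only field compared)
    kind: str = field(compare=False)   # hypothetical type label, not an SDIF signature
    data: list = field(compare=False)  # payload; its layout depends on the type

def interleave(*streams):
    """Merge time-sorted frame streams into one time-ordered stream."""
    return list(heapq.merge(*streams))

# Two illustrative streams: additive partials and spectral envelopes.
partials = [Frame(0.00, "partials", [[1, 440.0, 0.5]]),
            Frame(0.02, "partials", [[1, 441.0, 0.5]])]
envelopes = [Frame(0.01, "envelope", [0.9, 0.4, 0.1]),
             Frame(0.03, "envelope", [0.8, 0.5, 0.1])]

merged = interleave(partials, envelopes)
```

Because every frame carries its own time tag and type label, a reader of the merged stream can process the frames in time order and skip the types it does not understand, which is what makes such a format suitable for both archiving and streaming.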



Diemo Schwarz
1998-09-07