Equipe Analyse/Synthèse
Statistical Modeling of Sound Aperiodicities
Shlomo Dubnov & Xavier Rodet
to appear in
ICMC97, Thessaloniki, Greece, September 1997
Abstract
Acoustic musical instruments that are considered to produce a
well-defined pitch emit waveforms that are never exactly periodic. The
aperiodicities presumably originate in some poorly understood
fundamental mechanism of sound production. This effect, which on time
scales shorter than 100 or 200 ms is beyond the control of the player,
is expected to be typical of the particular instrument, or perhaps of
the instrument family. Several methods for investigating
aperiodicities in the waveform of musical sounds have recently
appeared in the literature: examination of waveform variations
between consecutive periods, Fourier transforms of sonagrams that
reveal the presence of subharmonic modulations, and correlograms that
correlate in time the outputs of auditory models.
In earlier work we showed that a particular aspect of the coherence
of fluctuations is strongly related to non-linear properties of the
time series model of the signal. These properties are measured by
Higher Order Statistics (HOS), or polyspectra, and were shown to be
important for the characterisation of musical instruments in the
sustained portion of the sound. It should be noted that this
particular statistical property of coherence/incoherence cannot
easily be revealed by the other analysis methods.
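As an illustration of what a third-order statistic measures, the sketch below estimates the bicoherence (a normalised bispectrum) at a single pair of frequency bins. It is a generic textbook-style estimator, not necessarily the one used by the authors; the function name `bicoherence_at`, the Hann window and the frame-averaging scheme are assumptions made for this example.

```python
import numpy as np

def bicoherence_at(x, frame_len, k1, k2):
    """Estimate bicoherence at FFT bins (k1, k2) by averaging over frames.

    Hypothetical helper for illustration. Values near 1 indicate that the
    phase of bin k1+k2 is locked to the sum of the phases of bins k1 and k2;
    values near 0 indicate no such phase coupling.
    """
    x = np.asarray(x, dtype=float)
    n_frames = len(x) // frame_len
    # Split the signal into non-overlapping frames (an assumed, simple choice).
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    X = np.fft.rfft(frames * np.hanning(frame_len), axis=1)
    a, b, c = X[:, k1], X[:, k2], X[:, k1 + k2]
    # Averaged triple product (bispectrum estimate) and its normalisation.
    num = np.abs(np.mean(a * b * np.conj(c)))
    den = np.sqrt(np.mean(np.abs(a * b) ** 2) * np.mean(np.abs(c) ** 2))
    return num / den
```

Two signals with identical long-term power spectra can give very different values here, which is the sense in which such statistics expose coherence that second-order analysis misses.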
The purpose of this work is to extend this research, both
theoretically and practically, combining our HOS results with the
other aforementioned methods. Specifically, our goal is to define a
statistical model for fluctuations of the sound parameters in the
sustained portion of the sound, which could be incorporated into
existing analysis/synthesis methods, such as the additive method.
A comparison of the HOS properties of real signals with those of
signals re-synthesized by the additive analysis/synthesis method shows
that HOS are preserved. This supports the notion that HOS are related
to phase jitter of the actual harmonic partials, and also suggests
that sinusoidal signal models are appropriate for modeling this jitter.
By using a mechanism of random frequency modulations (jitter), applied
to the various harmonic partials either independently or with
correlation, synthetic signals with desired HOS properties can be
generated. Gradually increasing the amplitude and bandwidth of the
jitter takes the sound from perfect pitch to noise along two routes.
In the coherent case, it increases the perceived random pitch
fluctuations of a single sound. In the incoherent case, by contrast,
it is perceived as an increasing amount of added noise, while a sense
of more or less stable pitch is maintained. The difference between the
two sounds, although it cannot be observed in a long-term spectral
analysis, is clearly revealed by the different decay of HOS as a
function of increasing jitter parameters.
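The jitter mechanism described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' synthesis procedure: the function name `jittered_harmonics`, its parameters, the 1/k partial amplitudes and the brick-wall low-pass Gaussian noise used as the jitter source are all hypothetical choices.

```python
import numpy as np

def jittered_harmonics(f0=220.0, n_partials=8, dur=1.0, sr=16000,
                       jitter_depth=0.01, jitter_bw=30.0,
                       coherent=True, seed=0):
    """Sum of harmonic partials with random frequency jitter (illustrative).

    jitter_depth: RMS relative frequency deviation (assumed parameterisation).
    jitter_bw:    cutoff in Hz of the low-pass noise driving the jitter.
    coherent:     True  -> one jitter source shared by all partials,
                  False -> an independent jitter source per partial.
    """
    rng = np.random.default_rng(seed)
    n = int(dur * sr)
    n_sources = 1 if coherent else n_partials
    noise = rng.standard_normal((n_sources, n))
    # Band-limit the noise with a brick-wall low-pass in the frequency domain.
    spec = np.fft.rfft(noise, axis=1)
    freqs = np.fft.rfftfreq(n, 1.0 / sr)
    spec[:, freqs > jitter_bw] = 0.0
    spec[:, 0] = 0.0  # remove DC so the mean frequency stays at k * f0
    lp = np.fft.irfft(spec, n=n, axis=1)
    lp /= lp.std(axis=1, keepdims=True)
    dev = jitter_depth * lp                       # relative deviation
    k = np.arange(1, n_partials + 1)[:, None]     # partial indices, column
    inst_freq = k * f0 * (1.0 + dev)              # broadcasts to (n_partials, n)
    phase = 2.0 * np.pi * np.cumsum(inst_freq, axis=1) / sr
    return ((1.0 / k) * np.cos(phase)).sum(axis=0)
```

Calling it with `coherent=True` and `coherent=False` at the same `jitter_depth` and `jitter_bw` gives the two routes discussed in the text; raising both parameters moves each signal further from a strictly periodic tone.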
A detailed look at the waveform of a coherent signal shows that the
effect of jitter is equivalent to a local time scaling, which
stretches or contracts the shape of the original waveform. The HOS are
not affected by this jitter and remain constant. In contrast, with
independent jitters the waveform shape varies in time and the phase
relations are not preserved locally. This causes HOS to decay at a
rate proportional to the bandwidth of the jitter.
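The two cases can be written out for a simple harmonic model. The notation here (a_k, theta_k, phi, omega_0, Psi) is introduced only for this illustration and is not taken from the paper; it also assumes that coherent jitter scales the phase deviation of partial k by its index k.

```latex
% Unjittered harmonic signal
s(t) = \sum_{k=1}^{K} a_k \cos\bigl(k\omega_0 t + \theta_k\bigr)

% Coherent jitter: a single phase deviation \phi(t), scaled by the index k
x_c(t) = \sum_{k=1}^{K} a_k \cos\bigl(k\omega_0 t + k\phi(t) + \theta_k\bigr)
       = s\bigl(t + \phi(t)/\omega_0\bigr)

% Phase relation among partials k, l and k+l
\Psi_c(t) = \bigl[(k+l)\phi(t) + \theta_{k+l}\bigr]
          - \bigl[k\phi(t) + \theta_k\bigr]
          - \bigl[l\phi(t) + \theta_l\bigr]
          = \theta_{k+l} - \theta_k - \theta_l

% Independent jitter: each partial has its own deviation \phi_k(t)
x_i(t) = \sum_{k=1}^{K} a_k \cos\bigl(k\omega_0 t + \phi_k(t) + \theta_k\bigr)

\Psi_i(t) = \phi_{k+l}(t) - \phi_k(t) - \phi_l(t)
          + \theta_{k+l} - \theta_k - \theta_l
```

In the coherent case the signal is a time-warped copy of s(t) and the phase relation Psi_c is constant, so statistics that depend on it are unchanged. In the independent case Psi_i(t) fluctuates randomly, so time averages of exp(j Psi_i) shrink as the jitter grows.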
This observation suggests that differences and similarities between
successive portions of a signal might be better represented by
scalograms in the coherent case, whereas a spectral representation is
better suited to the non-coherent case. The matching can also be
viewed as a search for the best correlation between two consecutive
segments of sound: in the non-coherent case only the spectral
amplitudes are matched, whereas in the coherent case the match is in
both magnitude and phase, over a certain range of possible scale
differences. The second method, which requires scaling invariance,
seems closer to the auditory modeling approaches.
We have studied several applications of the jitter model. Let us
mention a few:
- modeling of realistic jitter for additive and source-filter
("Chant") methods.
- reducing the number of short-time analysis frames in the
sustained portion of the sound, by separating the jitter from
other spectral information.
- morphing between sounds by their jitter properties.
- investigating the behaviour of jitter for different modes of
playing and "expressivity" control.
Results of real and synthetic sound analyses will be detailed in the
paper. Examples of sound synthesis will be demonstrated in the
presentation.