7 Tools
7.1 Evaluation
7.1.1 How to evaluate the quality of a score-follower?
--- Subjectivity vs. Objectivity
A subjective or qualitative evaluation of a score follower checks that the
important performance events are recognised with a latency that respects the
intention of the composer. The acceptable latency therefore depends on the
effect (sound synthesis or transformation) that is triggered by the event. The
evaluation can be made independent of the piece by assuming the hardest case,
i.e. that all notes have to be recognised immediately. The method is to listen
to the click output at each recognised event while watching the currently
recognised note in the sequence editor, verifying that it is correct. This
automatically brings the human perceptual thresholds for the detection of
synchronous events into the evaluation process.
Some form of subjective evaluation is definitely needed in the concert
situation, to give immediate feedback on whether the follower is following, and
before the concert, to catch set-up errors.
An objective or quantitative evaluation, i.e. knowing down to the millisecond
when each performance event was recognised, may be overkill for the actual use
of score following, but it is helpful for debugging and comparing score
following algorithms, for quantitative proof of improvements, for automatic
batch testing, for gathering statistics on large corpora of test data, and so
on.
7.1.2 Evaluation Framework
We chose to implement the evaluation outside of the score following objects,
instead of instrumenting the code inside the score follower. This
black box testing approach has the advantages that it is possible
to test other followers or old versions of the score following algorithm, to
run two followers in parallel (in a kind of suivi shoot-out), and to evaluate
both Midi and audio following without changing either object (which
also keeps the code cleaner).
However, the opposite glass box testing approach of adding
evaluation code to the follower makes it possible to inspect its internal state
(which is not comparable with that of other score following algorithms!) in
order to debug and optimise the algorithm.
Objective evaluation needs reference data that provides the correct alignment
of the score with the performance. In our case this means a reference
cue track with the cues (sometimes also called labels) at the points in
time where they should be output by the follower. For a performance given as a
Midi file, the reference is the performance itself. For a performance given as
an audio file, the reference is an alignment of the audio with the cue track.
Midified instruments are a good way to obtain such performance/reference
pairs, because the recorded data are perfectly synchronous.
The reference cues r are then compared to the cues s output by the score
follower. The offset d is defined as the time lapse between the output time
t_r of a reference cue and the output time t_s of the corresponding follower
cue:

    d = t_r - t_s        (4)

Cues whose absolute offset exceeds a certain threshold (e.g. 100 ms), and cues
that have not been output by the follower at all, are counted as errors. The
values characterising the quality of a score follower are then:
- the percentage of non-error cues,
- the average offset for non-error cues, which, if different from zero,
  indicates a systematic latency,
- the standard deviation of the offset for non-error cues, which shows the
  imprecision or spread of the follower, and
- the average absolute offset of non-error cues, which shows the global
  precision.
There are other aspects of the quality of a score follower that are not
expressed by these values, e.g. the number of cues detected more than once
when the follower zigzags back to an already detected cue.
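
As an illustration, the quality values listed above can be computed along the
following lines. This is only a minimal sketch and not the code of the
evaluation object: the function name evaluate_cues, the data layout (a
dictionary of reference times and a list of follower outputs), the use of the
first output of a repeated cue, and the population form of the standard
deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev


def evaluate_cues(ref_times, out_events, threshold_ms=100.0):
    """Summarise follower quality from reference and output cue times.

    ref_times:  dict mapping cue number -> reference time t_r in ms
    out_events: list of (cue number, time t_s in ms) in output order
    """
    first_out = {}
    repeated = 0
    for cue, t_s in out_events:
        if cue in first_out:
            repeated += 1          # cue output again, e.g. after zigzagging
        else:
            first_out[cue] = t_s   # assumption: the first output counts

    # Offsets d = t_r - t_s of the non-error cues only.
    offsets = []
    for cue, t_r in ref_times.items():
        if cue not in first_out:
            continue               # never output: error
        d = t_r - first_out[cue]
        if abs(d) <= threshold_ms:
            offsets.append(d)      # otherwise the offset is too large: error

    if not offsets:
        return {"percent_ok": 0.0, "repeated": repeated}
    return {
        "percent_ok": 100.0 * len(offsets) / len(ref_times),
        "mean_offset": mean(offsets),
        "std_offset": pstdev(offsets),   # population form (assumption)
        "mean_abs_offset": mean([abs(d) for d in offsets]),
        "repeated": repeated,
    }
```

Counting repeated cues separately keeps the zigzagging behaviour visible
without letting it distort the offset statistics.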
7.1.3 Evaluation Object
To perform evaluation in jMax, the suivieval object has been
developed. It takes as input the note and cue outputs of the score follower,
the note and cue outputs of the reference performance, and the same control
messages as the score follower (to synchronise with its parameters). While
running, it outputs the above-mentioned values as running statistics, giving a
quick view of the behaviour and quality of the tested follower. On
reception of the stop message, the final values are output and four text files
are written (with a user-settable base name): two contain the protocol of all
the events at the score and reference inputs, and two match files detail the
offset for each cue, one human-readable with comments, the other readable as a
Matlab matrix for further analysis.
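
How the running statistics are maintained is not described here. One common
way to update mean and standard deviation incrementally, without storing all
offsets, is Welford's algorithm. The sketch below uses it purely as an
illustration. The class name RunningStats and its attributes are hypothetical,
and the population form of the standard deviation is an assumption.

```python
import math


class RunningStats:
    """Incremental offset statistics (Welford's algorithm)."""

    def __init__(self):
        self.n = 0            # number of offsets seen so far
        self.mean = 0.0       # running mean offset
        self._m2 = 0.0        # running sum of squared deviations
        self._abs_sum = 0.0   # running sum of absolute offsets

    def add(self, d):
        """Update the statistics with one new offset d (in ms)."""
        self.n += 1
        delta = d - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (d - self.mean)
        self._abs_sum += abs(d)

    @property
    def std(self):
        """Population standard deviation of the offsets seen so far."""
        return math.sqrt(self._m2 / self.n) if self.n else 0.0

    @property
    def mean_abs(self):
        """Mean absolute offset seen so far."""
        return self._abs_sum / self.n if self.n else 0.0
```

Each new non-error offset is passed to add(), after which mean, std and
mean_abs can be read at any time during the run.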
7.2 Cue Maker
Cues are integer events on a separate track in a jMax sequence
object. They are associated with the notes in the score that are to trigger
sound synthesis or transformation in a performance with score following.
The object suivimakecue was developed to facilitate the generation
of cue events for using or testing the score follower. Previously, cues were
either set by hand, or one cue per note was set by playing the score
through a special patch that generates cue numbers incrementally, and
re-recording these cues.
The object suivimakecue can do the same in one bang by
adding the cues to the sequence editor directly. Additionally, the
max_diff quantisation window is taken into account, such that only one
cue is generated for chords, and none for pauses shorter than max_diff
between almost legato notes.
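
The chord rule can be sketched as follows. This is an illustration under
stated assumptions, not the actual suivimakecue code: the function name
make_cues and its arguments are hypothetical, note onsets are given in
milliseconds, and the window is measured from the first note of each chord.
The rule for short pauses is left out, since the representation of pauses is
not described here.

```python
def make_cues(onsets_ms, max_diff_ms, first_cue=1):
    """Return (cue number, onset in ms) pairs, one cue per chord.

    Notes whose onsets lie within max_diff_ms of the first note of the
    current chord are treated as part of that chord and get no cue.
    """
    cues = []
    chord_start = None
    cue = first_cue
    for onset in sorted(onsets_ms):
        if chord_start is None or onset - chord_start > max_diff_ms:
            cues.append((cue, onset))   # new chord or single note
            cue += 1
            chord_start = onset
    return cues
```

For example, with a window of 30 ms, the onsets 0, 5 and 12 ms form one chord
with a single cue, and a note at 500 ms receives the next cue.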
7.3 Cue Mapper
Andrew Gerzso remarked that, while preparing a piece with score following,
last-minute changes to the score often occur, e.g. notes are inserted or
deleted. To avoid having to renumber all the cues and change all the patches
triggered by them, the object suivimapper has been developed, which maps a
range of input cues (from suivimidi or suiviaudio) to a cue given by a map.
This object can also be used to increase the robustness of the follower, by
mapping a range of cues of a fast sequence of notes to their first cue number.
Thus, even if some notes are not recognised, as soon as any note is, the right
cue is output.
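
The mapping idea can be sketched as follows. The function name map_cue, the
format of the map (a list of inclusive ranges with a target cue) and the
pass-through of unmapped cues are assumptions made for this illustration, not
the actual interface of suivimapper.

```python
def map_cue(cue, ranges):
    """Map an input cue number to an output cue number.

    ranges: list of (first, last, target) tuples. Any input cue with
    first <= cue <= last is mapped to target.
    """
    for first, last, target in ranges:
        if first <= cue <= last:
            return target
    return cue   # assumption: unmapped cues pass through unchanged


# A fast run of notes (cues 10 to 15) is mapped to its first cue, so that
# recognising any note of the run triggers cue 10.
ranges = [(10, 15, 10), (16, 16, 20)]
```

With this map, map_cue(12, ranges) gives 10, whichever note of the run was
recognised, and the patches triggered by cue 10 need not change when notes are
inserted into or deleted from the run.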
7.4 Model Dump
From the report of Vincent Goudard [Gou02]:
Developing training algorithms for the model requires tables of data, so that
a statistical analysis can be carried out on as many values as possible. Since
the "research" version of these algorithms is developed in MATLAB, it was
useful to design objects that convert the data used in jMax into a form
readable by MATLAB. The library of objects for score following (package
suivi) has thus been extended with two new objects intended for
training the model: suiviaudiofeat and suivimakeref.
The object suivimakeref is also useful for training the model. From a jMax
sequence, suivimakeref writes data describing the created model to a text
file.
7.5 Audio Feature Extractor
The object suiviaudiofeat is a signal processing object that is useful
for training the model. From an audio performance
received at its input, suiviaudiofeat computes, and outputs in real time on
its outlets, the various audio features used by suiviaudio
to calculate the observations.
The data output in real time can be written to files in
SDIF format using the object writesdif of the jMax package
sdif developed by Diemo Schwarz, in order to be analysed by a
MATLAB program (see section 5).
Note that suiviaudiofeat computes the audio features using the algorithms of
suiviaudio and with the same parameters for sampling rate, window size and
hop size. A change of these parameters in suiviaudio results in an
equivalent change in the data output by the object suiviaudiofeat.