
8   Future Work

This section presents the concrete next steps in the score following project and directions for future research and development. It concludes in section 8.7 with a calendar of upcoming score following events, as known today.

Three conferences about score following are planned: two internal Ircam seminars, on January 15, 2003 (scientific point of view) and January 22, 2003 (musical point of view), and a pedagogical conference for the general public on February 24, 2003, with Diemo Schwarz, Philippe Manoury, and Serge LeMouton.

8.1   Spoken Voice Following

A possible project on following the spoken voice, similar to [LCB99a, LCB+99c], could start in 2003. It was suggested by Gilles Grand for a theatre piece by Olivier Cadiot. In any case, spoken voice following will also help singing voice following, because phonetic recognition automatically takes care of the remaining problems of the singing voice (fricatives and repeated notes).

8.2   DEA ATIAM Internships

Two internships for the DEA ATIAM were offered and presented to the students, and have already met with great interest. Their announcements were:

Real-time spoken voice following

Application and extension of the HMM (hidden Markov model) techniques currently used for score following (solo instruments, singing voice) to spoken voice following, with a view to an application in a theatre production:

HMM (hidden Markov model) training for score and speech following

This internship comprises research and development work on the training part of score following, a project used in a number of works, e.g. by Philippe Manoury and Pierre Boulez.

  1. Development of a real-time routine that adapts score following to a specific performance (the performer, acoustic conditions, etc.).

  2. Development of an offline method for training the HMMs on a database of audio recordings, to improve the robustness of real-time following.

8.3   Port to Max/MSP

The port of the suivi package to Max/MSP, with a minimal interface, is scheduled for June 2003. At the same time, it will be decided what kind of graphical interface should be developed, what to base it on (an internal Midi editor, communication with external programs such as OpenMusic or commercial score editors), and how to integrate it into Max/MSP.

Serge LeMouton kindly offered his support for development and testing under Max/MSP.

8.4   Redesign of the Software Architecture

For the port to Max/MSP and the addition of another module for voice following, a redesign becomes indispensable in order to disentangle the dependencies on the run-time environment from the implementation of the model and from the different modules of score following (audio, Midi, voice).

To illustrate this need, the following diagrams show the software architecture of the suivi package and its evolution. They use the UML class diagram notation; however, as the package is written in C, there are no actual classes, only data structures that can be regarded as such. In reality, the architecture is much less clear than depicted here, because many functions are not clearly associated with a structure, so that too many dependencies exist. In the diagrams, the classes of the suivi package have a white background, and external classes (jMax classes) have a grey background.

The old architecture shown in figure 3 had completely separate audio and Midi code, with a duplication of the model in the structures netlev_t and evthmm_t, and of the dependencies on the sequence editor.



Figure 3: UML diagram of the software architecture before restructuring


The current architecture in figure 4 unifies the model implementation in netlev_t and the HMM state class evthmm_t, and centralises the dependencies on the sequence editor.



Figure 4: UML diagram of the software architecture of the current releases


The proposed new architecture in figure 5 puts almost all the code unique to the run-time system into the class suiviobject, so that porting to Max/MSP will be easier. The subclassing of the HMM state class will completely disentangle audio from Midi following and allow easy extension for voice following. The dependencies on the sequence editor will be replaced by two interfaces: score-in for the input to the score parser, and gui for the output, or more generally the user interface. Until now, both happened to be realised by the sequence editor, which will at first continue to be used through an adaptor class (formerly suiviref_t) for backwards compatibility. However, this separation makes it easier to change either of them: for spoken voice following, for instance, the score input would rather be just a file with phonetic text or a message box, and the output a text display with a cursor.
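
As a rough illustration of the two interfaces, the following C sketch models score-in and gui as small tables of function pointers, with a phonetic text input and a text cursor display as the spoken voice example. All type and function names (score_in_t, gui_t, text_score_t, text_display_t, walk_score) are hypothetical and are not taken from the suivi sources; the loop simply steps through the score, where a real follower would advance according to the HMM decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* All names in this sketch are hypothetical, not part of the suivi package. */

/* score-in interface: delivers score events one at a time to the parser. */
typedef struct score_in {
    /* Copies the next event label into buf; returns 1, or 0 at end of score. */
    int (*next_event)(void *self, char *buf, size_t size);
    void *self;
} score_in_t;

/* gui interface: shows the current score position to the user. */
typedef struct gui {
    void (*show_position)(void *self, int index, const char *label);
    void *self;
} gui_t;

/* Example score input: space-separated phonetic text held in memory. */
typedef struct text_score {
    const char *cursor;
} text_score_t;

int text_next_event(void *self, char *buf, size_t size)
{
    text_score_t *ts = self;
    size_t n = 0;

    assert(size > 0);
    while (*ts->cursor == ' ')            /* skip separators */
        ts->cursor++;
    if (*ts->cursor == '\0')
        return 0;                         /* end of score */
    while (*ts->cursor != '\0' && *ts->cursor != ' ') {
        if (n + 1 < size)                 /* truncate labels that are too long */
            buf[n++] = *ts->cursor;
        ts->cursor++;
    }
    buf[n] = '\0';
    return 1;
}

/* Example output: a text display that remembers and prints the cursor. */
typedef struct text_display {
    int last_index;
    char last_label[32];
} text_display_t;

void text_show_position(void *self, int index, const char *label)
{
    text_display_t *td = self;

    td->last_index = index;
    strncpy(td->last_label, label, sizeof td->last_label - 1);
    td->last_label[sizeof td->last_label - 1] = '\0';
    printf("> %d: %s\n", index, label);
}

/* Steps through the whole score and reports each event to the gui.
   Returns the number of events seen. */
int walk_score(score_in_t *in, gui_t *out)
{
    char label[32];
    int count = 0;

    while (in->next_event(in->self, label, sizeof label)) {
        out->show_position(out->self, count, label);
        count++;
    }
    return count;
}
```

Under these assumptions, the sequence editor adaptor would be one more pair of implementations of the same two tables, so the follower itself would not need to know which input or display is attached.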



Figure 5: UML diagram of the proposed new software architecture
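
Since the package is written in C, the subclassing of the HMM state class proposed above would have to be emulated. One common way to do so, shown here as a minimal sketch and not as the actual suivi design, is to embed a base structure as the first member of each specialised structure and to dispatch through a function pointer. The names hmm_state_t, midi_state_t, audio_state_t and most_likely_state are hypothetical, and the probability values are placeholders.

```c
#include <assert.h>
#include <stddef.h>

/* All names in this sketch are hypothetical, not part of the suivi package. */

/* Base "class": fields common to every HMM state plus one virtual method. */
typedef struct hmm_state hmm_state_t;
struct hmm_state {
    int score_index;   /* index of the score event this state belongs to */
    /* Likelihood of one observation given this state. */
    double (*obs_prob)(const hmm_state_t *state, const void *observation);
};

/* "Subclass" for Midi following: the base is the first member, so a pointer
   to the base can be converted back to a pointer to the subclass. */
typedef struct midi_state {
    hmm_state_t base;
    int expected_pitch;
} midi_state_t;

typedef struct midi_obs {
    int pitch;
} midi_obs_t;

double midi_obs_prob(const hmm_state_t *state, const void *observation)
{
    const midi_state_t *ms = (const midi_state_t *)state;
    const midi_obs_t *obs = observation;

    /* Placeholder values: high if the played pitch matches the score. */
    return obs->pitch == ms->expected_pitch ? 0.9 : 0.1;
}

/* "Subclass" for audio following with a single toy feature. */
typedef struct audio_state {
    hmm_state_t base;
    double energy_threshold;
} audio_state_t;

typedef struct audio_obs {
    double energy;
} audio_obs_t;

double audio_obs_prob(const hmm_state_t *state, const void *observation)
{
    const audio_state_t *as = (const audio_state_t *)state;
    const audio_obs_t *obs = observation;

    /* Placeholder values: high if the frame energy reaches the threshold. */
    return obs->energy >= as->energy_threshold ? 0.8 : 0.2;
}

/* Model code sees only the base type. Returns the index of the state with
   the highest observation likelihood (the first one in case of a tie). */
int most_likely_state(hmm_state_t *const *states, int n, const void *observation)
{
    int best = -1;
    double best_p = -1.0;
    int i;

    assert(states != NULL && n > 0);
    for (i = 0; i < n; i++) {
        double p = states[i]->obs_prob(states[i], observation);
        if (p > best_p) {
            best_p = p;
            best = i;
        }
    }
    return best;
}
```

With this layout, a voice following state would be a third structure with its own obs_prob function, and the model code would not need to change.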


For the more distant future, a totally modular score following architecture could be envisioned, a sort of score following toolbox. It would provide independent input analysis Max objects (such as an object audioanalysis that would calculate the features from the sound), an object suivimodel that computes the HMM, and a statistics and training object. Figure 6 shows a sketch of what a combined audio and Midi score following patch could look like. This modularity would easily allow multimodal input following, the addition of new features, and the like.



Figure 6: Pseudo-patch of a totally modular score follower.


8.5   Define the "List of Things to Follow"

When this list is complete, we can proceed to the definition/specification of a Midi or higher level representation. For the moment, the list includes:

8.6   Definition of the Score Representation

The definition of the imported score representation is essential for the extension of score following to other domains and outside of Ircam. The constraints are multiple:

The formats that could be subject to closer scrutiny are:

  Score editor formats: Finale, Sibelius, Guido [HHRK98]
  Mark-up languages: MusicML, Wedelmusic XML Format,
  Frameworks and others: Common Practice Music Notation (CPNview), Allegro, MIDI

Midi, despite its restrictions, does indeed fulfill all these constraints: it can encode everything we want to follow, e.g. using special Midi channels, controllers, or text events. It can be exported from every score editor and fine-tuned in the sequence editor. We therefore stay with Midi for the time being, but research is needed on a higher-level representation that fits well into the composer's and musical assistant's workflow.
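
To make the text event option concrete, the following C sketch encodes an annotation as a Standard MIDI File text meta event, which consists of the bytes FF 01, a length, and the text. The function name smf_text_event and the annotation "cue 12" are hypothetical; nothing here is prescribed by the suivi package. The sketch is limited to texts of at most 127 bytes, so that the length fits in a single byte, and it omits the delta time that precedes every event in a track.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the suivi package.
   Writes a Standard MIDI File text meta event (FF 01 <length> <text>)
   into out. Returns the number of bytes written, or 0 if the text is
   longer than 127 bytes or does not fit into cap bytes. */
size_t smf_text_event(unsigned char *out, size_t cap, const char *text)
{
    size_t len;

    assert(out != NULL && text != NULL);
    len = strlen(text);
    if (len > 127 || cap < len + 3)
        return 0;
    out[0] = 0xFF;                  /* meta event marker */
    out[1] = 0x01;                  /* meta event type: text */
    out[2] = (unsigned char)len;    /* single-byte length */
    memcpy(out + 3, text, len);     /* the text is stored without a terminator */
    return len + 3;
}
```

A score parser could then recognise such events by their FF 01 prefix and interpret the text as a following instruction, while ordinary sequencers would treat it as a comment.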

8.7   Calendar of Upcoming Events and Deadlines



2002
December 11-12: flute tests/recordings for Jupiter

2003
Wednesday, January 15: internal seminar on score following, scientific point of view
January 16-26: Nicola Orio is in Paris
Wednesday, January 22: internal seminar on score following, musical point of view and discussion
Thursday, January 23: Midi piano tests for Pluton with Andy Russo
Monday, February 24: pedagogical conference on score following
Mid-February to mid-May: ATIAM internships
March: Forum release of suivi for jMax-4
June: Gilles Grand piece with spoken voice following?
October: Forum release of suivi for Max/MSP
October: Manoury opera

