In this text we present an integrated environment for music
composition and performance. Our system combines a high-level, garbage-collected Scheme interpreter, a rich time model, a real-time
synthesizer metaphor, and distributed input and output. We discuss a
theoretical model as well as the implementation details.
The bulk of musical applications still approach music composition and
sound synthesis as two distinct domains within the realm of computer
music. This separation may be traditional, since music has always been a two-step process of composition and performance. However, when musicologists identify two distinct tendencies in the music of the second half of the twentieth century - the first preoccupied with new techniques of music writing, the second with the organization of sound itself - it might be due to the lack of tools to bridge the two fields. Many composers have argued for breaking this pattern and have sought to extend musical thought to all levels of a music piece, from form to sound. Already in 1917 Edgar Varèse wrote:
I dream of instruments obedient to thought - and which, supported by a
flowering of undreamed-of timbres, will lend themselves to any
combination I choose to impose and will submit to the exigencies of my
inner rhythm. ([Var72], cited in [MMR74]).
Essential in the design of a music system is the choice of a computer
language and the specification of a time model. The choice of a
language is important for several reasons. Firstly, the language must
allow an efficient implementation of signal processing techniques.
Since we want to realize a real-time synthesizer, sound synthesis must
be performed within predictable delays. Secondly, the environment must
be easily extendible. We provide a basic set of synthesis modules and
control functions. Expert users will likely want to extend the
environment with new modules. Thirdly, the user must be able to
``program'' the environment. The definition of synthesis networks,
musical structures, and the system's behavior in an interactive set-up
escapes any trivial description. The only way to communicate such
complex information is through the use of a language.
We have chosen to develop the environment in the Java programming
language. In addition, we embed a Scheme interpreter into the system
for user interaction. This high-level programming environment
allows the user to extend the set of primitives of the environment in
ways that were not foreseen by the programmer. Since the Scheme
interpreter is implemented on top of the Java platform, one single
object system and one single memory strategy are used in the environment. This promotes a transparent use of functional objects
throughout the system. In particular, functional objects are used
extensively to describe complex behaviors and relations. Events, for
example, carry high-level functional descriptions of their side
effects and trigger the application of composition algorithms.
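To give a concrete, if simplified, impression of this idea, the sketch below shows how an event might carry a functional object describing its side effect. The names Event and Action are hypothetical and do not correspond to the actual classes of our environment; in practice such behaviors are typically written as Scheme closures by the user.

\begin{verbatim}
// Illustrative only: Event and Action are hypothetical names, not the
// actual classes of the environment described in this thesis.
interface Action {
    void perform(double time);          // side effect executed when the event fires
}

class Event {
    private final double onset;         // onset time of the event
    private final Action action;        // functional description of its side effect

    Event(double onset, Action action) {
        this.onset = onset;
        this.action = action;
    }

    void fire() {
        action.perform(onset);          // apply the attached behavior
    }
}

class EventExample {
    public static void main(String[] args) {
        // The action could equally well wrap a Scheme closure defined by the user.
        Event e = new Event(0.5, time -> System.out.println("note on at " + time));
        e.fire();
    }
}
\end{verbatim}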
The choice of Java as programming language raises a number of technical questions. In particular, it requires running a garbage collector concurrently with a real-time task in a multi-threaded environment. In this thesis we examine whether hard real-time performance is possible at all in such an environment. We impose the constraint that the synthesizer thread cannot allocate memory. This is not a severe restriction, since synthesis techniques rarely allocate new objects. Under the conditions specified in the text, the problem of the garbage collector can be reduced to that of concurrent tasks, which is a well-documented problem in real-time system design.
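The following sketch illustrates, under the above constraint, what an allocation-free synthesis module might look like: all buffers are created during setup, and the method called from the real-time thread only fills them. The class and method names are purely illustrative and are not the environment's actual API.

\begin{verbatim}
// Minimal sketch of the allocation-free discipline imposed on the
// synthesizer thread; names are illustrative, not the environment's API.
class SineVoice {
    private final float[] buffer;       // output buffer, allocated once at setup time
    private double phase;
    private final double increment;

    SineVoice(int bufferSize, double frequency, double sampleRate) {
        this.buffer = new float[bufferSize];   // allocation happens here, before real-time starts
        this.increment = 2.0 * Math.PI * frequency / sampleRate;
    }

    // Called from the real-time thread: no 'new', no boxing, no string building.
    float[] process() {
        for (int i = 0; i < buffer.length; i++) {
            buffer[i] = (float) Math.sin(phase);
            phase += increment;
        }
        return buffer;
    }
}
\end{verbatim}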
Time is undoubtedly the most important dimension in music. A rich time
model is therefore essential in composition environments. On the one hand, most composition environments treat only discrete time, even though most sound synthesis parameters vary continuously in time; we thus need continuous time functions to control the sound synthesis. On the other hand, most synthesis programs offer few tools to organize music pieces. Our environment integrates a continuous time model into a hierarchical description of the musical structure. This
layered, continuous time model allows the expression of non-trivial
time dependencies.
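As a rough, hypothetical illustration of such a layered time model, the sketch below combines a continuous control function with a layer that maps local time onto parent time, so that stretching a section also stretches the control functions it contains. The interfaces shown are not the actual classes of our environment; they only convey the underlying idea.

\begin{verbatim}
// Hedged sketch of a continuous control function and a layered time model;
// the interfaces below are illustrative, not the system's API.
interface ControlFunction {
    double valueAt(double time);        // continuous value, e.g. an amplitude or a cutoff frequency
}

// A node in the hierarchy maps the local time of its children onto the
// time of its parent.
class TimeLayer {
    private final double offset;        // onset of this layer within its parent
    private final double stretch;       // local-to-parent time scaling

    TimeLayer(double offset, double stretch) {
        this.offset = offset;
        this.stretch = stretch;
    }

    double toParentTime(double localTime) {
        return offset + stretch * localTime;
    }

    // Wrap a control function so it is evaluated in this layer's local time.
    ControlFunction localize(ControlFunction f) {
        return parentTime -> f.valueAt((parentTime - offset) / stretch);
    }
}
\end{verbatim}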
In our attempt to integrate composition and synthesis we realized that
most synthesis systems have a very narrow notion of time. If we want
to allow real-time interaction with compositional structures and
ensure that the system responds consistently, we must make the system aware of the complex time relationships in the composition. In addition, static structures cannot respond to user input. To extend the composition process to more dynamic pieces, we have to include behavior and causal relations in the final score.
With the proposed framework we give the composer a set of tools for
the control of sound synthesis. Since both discrete elements and
continuous control functions are organized in a single hierarchical
structure, compositions become complex organizations that control the
sound synthesis. Control functions can be generated by the same
compositional process that generates the musical form. As such, the
definition of the timbre becomes an integral part of the composition.
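The following hypothetical sketch illustrates this point: a single generating procedure produces both the discrete onsets of a phrase and the breakpoints of a continuous envelope, so that form and timbre arise from the same algorithm. The names and the algorithm itself are invented purely for the purpose of illustration.

\begin{verbatim}
import java.util.ArrayList;
import java.util.List;

// Hedged sketch: one compositional process producing both discrete events
// (note onsets) and a continuous control function (a breakpoint envelope
// built from the same data). All names are illustrative.
class FormAndTimbre {
    static void generate(int count, List<Double> onsets, List<double[]> envelope) {
        double time = 0.0;
        for (int i = 0; i < count; i++) {
            onsets.add(time);                                   // discrete element of the form
            envelope.add(new double[] { time, 1.0 / (i + 1) }); // breakpoint (time, amplitude) for the timbre
            time += 0.5 + 0.1 * i;                              // the same algorithm shapes both levels
        }
    }

    public static void main(String[] args) {
        List<Double> onsets = new ArrayList<>();
        List<double[]> envelope = new ArrayList<>();
        generate(5, onsets, envelope);
        System.out.println("onsets: " + onsets + ", breakpoints: " + envelope.size());
    }
}
\end{verbatim}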
The proposed system offers increased flexibility to integrate composition, synthesis, and interactivity. We claim that our system overcomes the persistent barrier between ``music writing'' and
``sound sculpting.'' Our environment allows the composer to travel
seamlessly between different layers of abstraction, from the
``sub-symbolic,'' concrete level of sound synthesis to the symbolic,
abstract level of musical form. Decisions on any level of the
composition may influence the organization on any other level. During
runtime, high-level compositional processes can interact with
low-level synthesis processes. Conversely, low-level synthesis
processes may guide compositional algorithms. In addition, user events
may cause a complete reorganization of the composition during the
performance. We think that a rigorous implementation of these ideas
can lead to new inspiration in musical writing.
Although this work started from a reflection on music systems, some of
the concepts discussed in this text can be applied more generally to
all time-based media. In particular, animation systems could use a
layered representation of time to their advantage. Interactive
graphics systems could use an embedded Scheme interpreter and rich
composition structures to offer more intelligent interactions than
chaining together pre-recorded sequences. Where possible in this text, we abstract away from the musical origin of our research and consider
time-based media in general.
Our work is more practical than theoretical; application design has
been our major concern. We regard our project as an integration of
existing ideas. Our research project touches many fields including
garbage collection techniques, real-time scheduling, distributed
systems, dynamic languages, embedded interpreters, sound synthesis,
interval algebra, temporal constraint networks, and algorithmic composition. Notions from each of these fields are required to realize our
project: an integrated environment for music composition and
synthesis.
Overview of the Thesis
This thesis is organized in two parts. In the first part we give an overview of the state of the art. We start with a discussion of programming techniques and operating system designs relevant to the remainder of the text (Chapter 1). In particular, this overview will help us analyze the real-time performance of the proposed environment. We then give an introduction to the field of computer music (Chapter 2). We dedicate a separate chapter to the representation of time and structure in composition environments (Chapter 3).
Part two presents our work. First, we discuss the foundations of the architecture: the choice of Java as development language, the use of an embedded Scheme interpreter, the basic concepts, and the formal model that serves as the basis of our system architecture (Chapter 4). Implementation issues and basic classes are discussed in the following chapter (Chapter 5). We then describe how the proposed environment represents time and structure (Chapter 6). Finally, we examine the real-time performance of the environment (Chapter 7).