In this text we present an integrated environment for music
composition and performance. Our system combines a high-level, garbage-collected Scheme interpreter, a rich time model, a real-time
synthesizer metaphor, and distributed input and output. We discuss a
theoretical model as well as the implementation details.
The bulk of musical applications still approach music composition and
sound synthesis as two distinct domains within the realm of computer
music. This separation may be traditional, since music has always been a two-step process of composition and performance. However, when musicologists identify two distinct tendencies in the music of the second half of the twentieth century - the first preoccupied with new techniques of music writing, the second with the organization of sound itself - it might be due to the lack of tools to bridge the two fields. Many composers have argued for breaking this pattern and have sought to extend musical thought to all levels of a music piece, from form to sound. Already in 1917 Edgar Varèse wrote:
I dream of instruments obedient to thought - and which, supported by a
flowering of undreamed-of timbres, will lend themselves to any
combination I choose to impose and will submit to the exigencies of my
inner rhythm. ([Var72], cited in [MMR74]).
Essential in the design of a music system is the choice of a computer
language and the specification of a time model. The choice of a
language is important for several reasons. Firstly, the language must
allow an efficient implementation of signal processing techniques.
Since we want to realize a real-time synthesizer, sound synthesis must
be performed within predictable delays. Secondly, the environment must
be easily extendible. We provide a basic set of synthesis modules and
control functions. Expert users will likely want to extend the
environment with new modules. Thirdly, the user must be able to
``program'' the environment. The definition of synthesis networks,
musical structures, and the system's behavior in an interactive set-up
escapes any trivial description. The only way to communicate such
complex information is through the use of a language.
We have chosen to develop the environment in the Java programming
language. In addition, we embed a Scheme interpreter into the system
for user interaction. This high-level programming environment
allows the user to extend the set of primitives of the environment in
ways that were not foreseen by the programmer. Since the Scheme
interpreter is implemented on top of the Java platform, one single
object system and one single memory strategy are used in the environment. This promotes a transparent use of functional objects
throughout the system. In particular, functional objects are used
extensively to describe complex behaviors and relations. Events, for
example, carry high-level functional descriptions of their side
effects and trigger the application of composition algorithms.
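To give a concrete, if simplified, impression of this idea, the sketch below shows how an event might carry a functional object describing its side effect. The names Event and Action are hypothetical and do not correspond to the actual classes of our environment; in practice such behaviors are typically written as Scheme closures by the user.

\begin{verbatim}
// Illustrative only: Event and Action are hypothetical names, not the
// actual classes of the environment described in this thesis.
interface Action {
    void perform(double time);          // side effect executed when the event fires
}

class Event {
    private final double onset;         // onset time of the event
    private final Action action;        // functional description of its side effect

    Event(double onset, Action action) {
        this.onset = onset;
        this.action = action;
    }

    void fire() {
        action.perform(onset);          // apply the attached behavior
    }
}

class EventExample {
    public static void main(String[] args) {
        // The action could equally well wrap a Scheme closure defined by the user.
        Event e = new Event(0.5, time -> System.out.println("note on at " + time));
        e.fire();
    }
}
\end{verbatim}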
The choice of Java as programming language raises a number of technical questions. In particular, it requires running a garbage collector concurrently with a real-time task in a multi-threaded environment. In this thesis we examine whether hard real-time performance is possible at all in such an environment. We impose the constraint that the synthesizer thread cannot allocate memory. This is not a severe restriction, since synthesis techniques rarely allocate new objects. Under the conditions specified in the text, the problem of the garbage collector can be reduced to that of concurrent tasks, which is a well-documented problem in real-time system design.
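The following sketch illustrates, under the above constraint, what an allocation-free synthesis module might look like: all buffers are created during setup, and the method called from the real-time thread only fills them. The class and method names are purely illustrative and are not the environment's actual API.

\begin{verbatim}
// Minimal sketch of the allocation-free discipline imposed on the
// synthesizer thread; names are illustrative, not the environment's API.
class SineVoice {
    private final float[] buffer;       // output buffer, allocated once at setup time
    private double phase;
    private final double increment;

    SineVoice(int bufferSize, double frequency, double sampleRate) {
        this.buffer = new float[bufferSize];   // allocation happens here, before real-time starts
        this.increment = 2.0 * Math.PI * frequency / sampleRate;
    }

    // Called from the real-time thread: no 'new', no boxing, no string building.
    float[] process() {
        for (int i = 0; i < buffer.length; i++) {
            buffer[i] = (float) Math.sin(phase);
            phase += increment;
        }
        return buffer;
    }
}
\end{verbatim}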
Time is undoubtedly the most important dimension in music. A rich time
model is therefore essential in composition environments. On the one hand, most composition environments treat only discrete time, even though most sound synthesis parameters vary continuously in time; we thus need continuous time functions to control the sound synthesis. On the other hand, most synthesis programs offer few tools to organize music pieces. Our environment integrates a continuous time model into a hierarchical description of the musical structure. This
layered, continuous time model allows the expression of non-trivial
time dependencies.
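As a rough, hypothetical illustration of such a layered time model, the sketch below combines a continuous control function with a layer that maps local time onto parent time, so that stretching a section also stretches the control functions it contains. The interfaces shown are not the actual classes of our environment; they only convey the underlying idea.

\begin{verbatim}
// Hedged sketch of a continuous control function and a layered time model;
// the interfaces below are illustrative, not the system's API.
interface ControlFunction {
    double valueAt(double time);        // continuous value, e.g. an amplitude or a cutoff frequency
}

// A node in the hierarchy maps the local time of its children onto the
// time of its parent.
class TimeLayer {
    private final double offset;        // onset of this layer within its parent
    private final double stretch;       // local-to-parent time scaling

    TimeLayer(double offset, double stretch) {
        this.offset = offset;
        this.stretch = stretch;
    }

    double toParentTime(double localTime) {
        return offset + stretch * localTime;
    }

    // Wrap a control function so it is evaluated in this layer's local time.
    ControlFunction localize(ControlFunction f) {
        return parentTime -> f.valueAt((parentTime - offset) / stretch);
    }
}
\end{verbatim}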
In our attempt to integrate composition and synthesis we realized that
most synthesis systems have a very narrow notion of time. If we want
to allow real-time interaction with compositional structures and
ensure that the system responds consistently, we must make the system aware of the complex time relationships in the composition. In addition, static structures cannot respond to user input. To extend the composition process to more dynamic pieces, we have to include behavior and causal relations in the final score.
With the proposed framework we give the composer a set of tools for
the control of sound synthesis. Since both discrete elements and
continuous control functions are organized in a single hierarchical
structure, compositions become complex organizations that control the
sound synthesis. Control functions can be generated by the same
compositional process that generates the musical form. As such, the
definition of the timbre becomes an integral part of the composition.
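The following hypothetical sketch illustrates this point: a single generating procedure produces both the discrete onsets of a phrase and the breakpoints of a continuous envelope, so that form and timbre arise from the same algorithm. The names and the algorithm itself are invented purely for the purpose of illustration.

\begin{verbatim}
import java.util.ArrayList;
import java.util.List;

// Hedged sketch: one compositional process producing both discrete events
// (note onsets) and a continuous control function (a breakpoint envelope
// built from the same data). All names are illustrative.
class FormAndTimbre {
    static void generate(int count, List<Double> onsets, List<double[]> envelope) {
        double time = 0.0;
        for (int i = 0; i < count; i++) {
            onsets.add(time);                                   // discrete element of the form
            envelope.add(new double[] { time, 1.0 / (i + 1) }); // breakpoint (time, amplitude) for the timbre
            time += 0.5 + 0.1 * i;                              // the same algorithm shapes both levels
        }
    }

    public static void main(String[] args) {
        List<Double> onsets = new ArrayList<>();
        List<double[]> envelope = new ArrayList<>();
        generate(5, onsets, envelope);
        System.out.println("onsets: " + onsets + ", breakpoints: " + envelope.size());
    }
}
\end{verbatim}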
The proposed system offers increased flexibility to integrate composition, synthesis, and interactivity. We claim that our system overcomes the persistent barrier between ``music writing'' and
``sound sculpting.'' Our environment allows the composer to travel
seamlessly between different layers of abstraction, from the
``sub-symbolic,'' concrete level of sound synthesis to the symbolic,
abstract level of musical form. Decisions on any level of the
composition may influence the organization on any other level. During
runtime, high-level compositional processes can interact with
low-level synthesis processes. Conversely, low-level synthesis
processes may guide compositional algorithms. In addition, user events
may cause a complete reorganization of the composition during the
performance. We think that a rigorous implementation of these ideas
can lead to new inspiration in musical writing.
Although this work started from a reflection on music systems, some of
the concepts discussed in this text can be applied more generally to
all time-based media. In particular, animation systems could use a
layered representation of time to their advantage. Interactive
graphics systems could use an embedded Scheme interpreter and rich
composition structures to offer more intelligent interactions than
chaining together pre-recorded sequences. Where possible in this text, we abstract away from the musical origin of our research and consider
time-based media in general.
Our work is more practical than theoretical; application design has
been our major concern. We regard our project as an integration of
existing ideas. Our research project touches many fields including
garbage collection techniques, real-time scheduling, distributed
systems, dynamic languages, embedded interpreters, sound synthesis,
interval algebra, temporal constraint networks, and algorithmic composition. Notions from each of these fields are required to realize our
project: an integrated environment for music composition and
synthesis.
Overview of the Thesis
This thesis is organized in two parts. In the first part we give an overview of the state of the art. We start with a discussion of programming techniques and operating system designs relevant to the remainder of the text (Chapter 1). In particular, this overview will help us analyze the real-time performance of the proposed environment. We then give an introduction to the field of computer music (Chapter 2). We dedicate a separate chapter to the representation of time and structure in composition environments (Chapter 3).
Part two presents our work. First, we discuss the foundations of the architecture: the choice of Java as development language, the use of an embedded Scheme interpreter, the basic concepts, and the formal model that serves as the basis of our system architecture (Chapter 4). Implementation issues and basic classes are discussed in the following chapter (Chapter 5). We then describe how the proposed environment represents time and structure (Chapter 6). Finally, we examine the real-time performance of the environment (Chapter 7).