MISC | asp:www [ASP99] |
Key | ASP |
Title | Anthropic Signal Processing Group, Oregon Graduate Institute of Science and Technology |
Howpublished | WWW page |
Year | 1999 |
url | http://ece.ogi.edu/asp |
pub-url | http://ece.ogi.edu/asp/publicat.html |
Note | http://ece.ogi.edu/asp |
INPROC. | asp:plp85 [HHW85] |
Author | |
Title | Perceptually based linear predictive analysis of speech |
Booktitle | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing |
Year | 1985 |
Pages | 509--512 |
INPROC. | nlp:tsdproc213-218 [Her98] |
Title | Data-Driven Speech Analysis For ASR |
Pages | 213--218 |
Author | |
Booktitle | Proceedings of the First Workshop on Text, Speech, Dialogue --- TSD'98 |
Year | 1998 |
Editor | |
Address | Brno, Czech Republic |
Month | September |
Publisher | Masaryk University Press |
MISC | att:www [ATT99] |
Key | ATT |
Title | AT&T Labs -- Research |
Howpublished | WWW page |
Year | 1999 |
url | http://www.research.att.com/projects/tts/ |
Note | http://www.research.att.com/projects/tts/ |
INPROC. | att:nextgen99 [BCS+99] |
Author | |
Title | The AT&T Next-Gen TTS System |
Booktitle | Joint Meeting of ASA, EAA, and DAGA |
Address | Berlin, Germany |
Month | March |
Year | 1999 |
Note | www [ATT99] |
Abstract | The new AT&T Text-To-Speech (TTS) system for general U.S. English text is based on best-choice components of the AT&T Flextalk TTS, the Festival System from the University of Edinburgh, and ATR's CHATR system. From Flextalk, it employs text normalization, letter-to-sound, and prosody generation. Festival provides a flexible and modular architecture for easy experimentation and competitive evaluation of different algorithms or modules. In addition, we adopted CHATR's unit selection algorithms and modified them in an attempt to guarantee high intelligibility under all circumstances. Finally, we have added our own Harmonic plus Noise Model (HNM) backend for synthesizing the output speech. Most decisions made during the research and development phase of this system were based on formal subjective evaluations. We feel that the new system goes a long way toward delivering on the long-standing promise of truly natural-sounding, as well as highly intelligible, synthesis. |
INPROC. | att:diph-select98 [BCS98] |
Author | |
Title | Diphone Synthesis using Unit Selection |
Booktitle | The 3rd ESCA/COCOSDA Workshop on Speech Synthesis |
Address | Jenolan Caves, Australia |
Month | November |
Year | 1998 |
Note | www [ATT99] |
Remarks | Summary: CHATR unit selection (using phone units) extended to diphones. Open synthesis backend: PSOLA, HNM, wave concat. Uses standard Festival. Careful listening test examining influence on quality of synthesis/unit type/pruning. Base for Next-Gen TTS [BCS+99]? |
Abstract | This paper describes an experimental AT&T concatenative synthesis system using unit selection, for which the basic synthesis units are diphones. The synthesizer may use any of the data from a large database of utterances. Since there are in general multiple instances of each concatenative unit, the system performs dynamic unit selection. Selection among candidates is done dynamically at synthesis, in a manner that is based on and extends unit selection implemented in the CHATR synthesis system [1][4]. Selected units may be either phones or diphones, and they can be synthesized by a variety of methods, including PSOLA [5], HNM [11], and simple unit concatenation. The AT&T system, with CHATR unit selection, was implemented within the framework of the Festival Speech Synthesis System [2]. The voice database amounted to approximately one and one-half hours of speech and was constructed from read text taken from three sources. The first source was a portion of the 1989 Wall Street Journal material from the Penn Treebank Project, so that the most frequent diphones were well represented. Complete diphone coverage was assured by the second text, which was designed for diphone databases [12]. A third set of data consisted of recorded prompts for telephone service applications. Subjective formal listening tests were conducted to compare speech quality for several options that exist in the AT&T synthesizer, including synthesis methods and choices of fundamental units. These tests showed that unit selection techniques can be successfully applied to diphone synthesis. |
INPROC. | att:HNM98 [Sty98a] |
Author | |
Title | Concatenative Speech Synthesis using a Harmonic plus Noise Model |
Booktitle | The 3rd ESCA/COCOSDA Workshop on Speech Synthesis |
Address | Jenolan Caves, Australia |
Month | November |
Year | 1998 |
Note | www [ATT99] |
Abstract | This paper describes the application of the Harmonic plus Noise Model, HNM, for concatenative Text-to-Speech (TTS) synthesis. In the context of HNM, speech signals are represented as a time-varying harmonic component plus a modulated noise component. The decomposition of the speech signal into these two components allows for more natural-sounding modifications (e.g., source and filter modifications) of the signal. The parametric representation of speech using HNM provides a straightforward way of smoothing discontinuities of acoustic units around concatenation points. Formal listening tests have shown that HNM provides high-quality speech synthesis while outperforming other models for synthesis (e.g., TD-PSOLA) in intelligibility, naturalness and pleasantness. |
INPROC. | att:ph98 [Sty98b] |
Author | |
Title | Removing Phase Mismatches in Concatenative Speech Synthesis |
Booktitle | The 3rd ESCA/COCOSDA Workshop on Speech Synthesis |
Address | Jenolan Caves, Australia |
Month | November |
Year | 1998 |
Note | www [ATT99] |
Abstract | Concatenation of acoustic units is widely used in most of the currently available text-to-speech systems. While this approach leads to higher intelligibility and naturalness than synthesis-by-rule, it has to cope with the issues of concatenating acoustic units that have been recorded in a different order. One important issue in concatenation is that of synchronization of speech frames or, in other words, inter-frame coherence. This paper presents a novel method for synchronization of signals with applications to speech synthesis. The method is based on the notion of center of gravity applied to speech signals. It is an off-line approach as this can be done during analysis with no computational burden on synthesis. The method has been tested with the Harmonic plus Noise Model, HNM, on many large speech databases. The resulting synthetic speech is free of phase mismatch (inter-frame incoherence) problems. |
INPROC. | att:Yang98 [YS98] |
Author | |
Title | Real Time Voice Alteration Based on Linear Prediction |
Year | 1998 |
Booktitle | Proc. ICSLP98 |
Note | www [ATT99] |
INPROC. | att:Syrdal98 [SCS98] |
Author | |
Title | Exploration of Acoustic Correlates in Speaker Selection for Concatenative Synthesis |
Year | 1998 |
Booktitle | Proc. ICSLP98 |
Note | www [ATT99] |
INPROC. | att:Ostermann98 [OBFW98] |
Author | |
Title | Integration Of Talking Heads And Text-To-Speech Synthesizers For Visual TTS |
Year | 1998 |
Booktitle | Proc. ICSLP98 |
Note | www [ATT99] |
INPROC. | att:paperSYN98 [SSG+98] |
Author | |
Title | TD-PSOLA versus Harmonic Plus Noise Model in Diphone Based Speech Synthesis |
Year | 1998 |
Booktitle | Proc. ICASSP98 |
Pages | 273--276 |
Note | www [ATT99] |
Abstract | In an effort to select a speech representation for our next generation concatenative text-to-speech synthesizer, the use of two candidates is investigated: TD-PSOLA and the Harmonic plus Noise Model, HNM. A formal listening test has been conducted and the two candidates have been rated regarding intelligibility, naturalness and pleasantness. Ability for database compression and computational load is also discussed. The results show that HNM consistently outperforms TD-PSOLA in all the above features except for computational load. HNM allows for high-quality speech synthesis without smoothing problems at the segmental boundaries and without buzziness or other oddities observed with TD-PSOLA. |
INPROC. | cnmat:sdif98 [WCF+98] |
Author | |
Title | New Applications of the Sound Description Interchange Format |
Booktitle | Proceedings of the International Computer Music Conference |
Year | 1998 |
INPROC. | cnmat:sdif98-short [W+98] |
Author | |
Title | New Applications of the Sound Description Interchange Format |
Booktitle | Proc. ICMC |
Year | 1998 |
INPROC. | cnmat:sdif99 [WCF+99b] |
Author | |
Title | Audio Applications of the Sound Description Interchange Format Standard |
Booktitle | AES 107th convention preprint |
Year | 1999 |
INPROC. | cnmat:sdif99-short [WCF+99a] |
Author | |
Title | Audio Applications of the Sound Description Interchange Format Standard |
Booktitle | AES 107th convention |
Year | 1999 |
INPROC. | cnmat:sdif99-sshort [W+99] |
Author | |
Title | Audio Applications of the Sound Description Interchange Format Standard |
Booktitle | AES 107th convention |
Year | 1999 |
INPROC. | cnmat:sdif-mpeg4 [WS99b] |
Author | |
Title | Cross-Coding SDIF into MPEG-4 Structured Audio |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Year | 1999 |
Address | Beijing |
Month | October |
url | http://cnmat.CNMAT.Berkeley.EDU/ICMC1999/papers/saol+sdif/icmc99-saol+sdif.html |
abstract-url | http://cnmat.CNMAT.Berkeley.EDU/ICMC1999/abstracts/sdif+mpeg4.html |
bib-url | http://cnmat.CNMAT.Berkeley.EDU/ICMC1999 |
Abstract | With the completion of the MPEG-4 international standard in October 1998, considerable industry and academic resources will be devoted to building implementations of the MPEG-4 Structured Audio tools. Among these tools is the Structured Audio Orchestra Language (``SAOL''), a general-purpose sound processing and synthesis language. The standardization of MPEG-4 and SAOL is an important development for the computer music community, because compositions written in SAOL will be able to be synthesized by any compliant MPEG-4 decoder. At the same time, the sound analysis and synthesis community has developed and embraced the Sound Description Interface Format (``SDIF''), a general-purpose framework for representing various high-level sound descriptions such as sum-of-sinusoids, noise bands, time-domain samples, and formants. Many tools for composing and manipulating sound in the SDIF format have been created. Composers, sound designers, and analysis/synthesis researchers can benefit from the combined strengths of MPEG-4 and SDIF by using the MPEG-4 Structured Audio decoder as an SDIF synthesizer. This allows the use of sophisticated SDIF tools to create musical works, while leveraging the anticipated wide penetration of MPEG-4 playback devices. Cross-coding SDIF into the Structured Audio format is an example of ``Generalized Audio Coding,'' a new paradigm in which an MPEG-4 Structured Audio decoder is used to flexibly understand and play sound stored in any format. We cross-code SDIF into Structured Audio by writing a SAOL instrument for each type of SDIF sound representation and a translator that maps SDIF data into a Structured Audio score. Rather than use many notes to represent the frames of SDIF data, we use the ``streaming wavetable'' functions of SAOL to create instruments that dynamically interpret spectral, sinusoidal, or other constantly changing data. 
These SAOL instruments retrieve SDIF data from streaming wavetables via custom unit generators that can be reused to build SAOL synthesizers for other SDIF sound representations. We demonstrate the construction of several different SDIF object types within the Structured Audio framework; the resulting bitstreams are very compact and follow the MPEG-4 specification exactly. Any conforming MPEG-4 decoder can play them back and produce the sound desired by the composer. Our paper will discuss in depth the features of SAOL that make these sorts of instruments possible. By building a link between the MPEG-4 community and the SDIF community, our work contributes to both: The MPEG-4 community benefits by receiving support for synthesis from a large and extensible collection of sound descriptions, each with unique properties of data compression and mutability. The SDIF community gets a stable SDIF synthesis platform that is likely to be supported on a variety of inexpensive, high performance hardware platforms. MPEG-4 also provides the potential to integrate SDIF with other formats, e.g., streaming SDIF data synchronized with video and compressed speech. Finally, each standardization effort benefits from an expanded user base: SDIF users become MPEG-4 users without giving up their familiar tools, while MPEG-4 users outside the small community of sound analysis/synthesis researchers can discover SDIF and the high-level sound descriptions it supports. We have made the cross-coding tools and SDIF object instruments freely available to the computer music community in order to promote the continuing interoperability of these important specifications. |
INPROC. | cnmat:sdif-mpeg4-short [WS99a] |
Author | |
Title | Cross-Coding SDIF into MPEG-4 Structured Audio |
Booktitle | Proc. ICMC |
Year | 1999 |
Address | Beijing |
url | http://cnmat.CNMAT.Berkeley.EDU/ICMC1999/papers/saol+sdif/icmc99-saol+sdif.html |
abstract-url | http://cnmat.CNMAT.Berkeley.EDU/ICMC1999/abstracts/sdif+mpeg4.html |
bib-url | http://cnmat.CNMAT.Berkeley.EDU/ICMC1999 |
INPROC. | cnmat:sdif-msp [WDK+99b] |
Author | |
Title | Supporting the Sound Description Interchange Format in the Max/MSP Environment |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Year | 1999 |
Address | Beijing |
Month | October |
url | http://cnmat.CNMAT.Berkeley.EDU/ICMC1999/papers/msp+sdif/ICMC99-MSP+SDIF-short.html |
abstract-url | http://cnmat.CNMAT.Berkeley.EDU/ICMC1999/abstracts/sdif+msp.html |
bib-url | http://www.ircam.fr/equipes/repmus/RMPapers/ |
Abstract | The Sound Description Interchange Format (``SDIF'') is an extensible, general-purpose framework for representing high-level sound descriptions such as sum-of-sinusoids, noise bands, time-domain samples, and formants, and is used in many interesting sound analysis and synthesis applications. SDIF data consists of time-tagged ``frames,'' each containing one or more 2D ``matrices''. For example, in an SDIF file representing additive synthesis data, the matrix rows represent individual sinusoids and the columns represent parameters such as frequency, amplitude, and phase. Because of Max/MSP's many attractive features for developing real-time computer music applications, it makes a fine environment for developing applications that manipulate SDIF data. These features include active support and development, a large library of primitive computational objects, and a rich history and repertoire. Unfortunately, Max/MSP's limited language of data structures does not support the structure required by SDIF. Although it is straightforward to extend Max/MSP with an object to read SDIF, there is no Max/MSP data type that could be used to output SDIF data to the rest of a Max/MSP application. We circumvent these problems with a novel technique to manipulate SDIF data within Max/MSP. We have created an object called ``SDIF-buffer'' that represents a collection of SDIF data in memory, analogous to MSP's ``buffer '' object that represents audio samples in memory. This allows SDIF data to be represented with C data structures. Max/MSP has objects that provide various control structures to read data from a ``buffer '' and output signals or events usable by other Max/MSP objects. Similarly, we have created a variety of ``SDIF selector'' objects that select a piece of SDIF data from an SDIF-buffer and shoehorn it into a standard Max/MSP data type. The simplest SDIF selector outputs the main matrix from the SDIF frame whose time tag is closest to a given input time. 
Arguments specify which columns should be output and whether each row should appear as an individual list or all the rows should be concatenated into a single list. More sophisticated SDIF selectors hide the discrete time sampling of SDIF frames, using interpolation along the time axis to synthesize SDIF data. This provides the abstraction of continuous time, with a virtual SDIF frame corresponding to any point along the time axis. We provide linear and a variety of polynomial interpolators. This abstraction of continuously-sampled SDIF data gives rise to sophisticated ways of moving through the time axis of an SDIF-buffer. We introduce the notion of a ``time machine'', a control structure for controlling position in an SDIF time axis in real time, and demonstrate time machines with musically useful features. ``SDIF mutator'' objects have been created that can manipulate data in an SDIF-buffer in response to Max messages. This allows us to write real-time sound analysis software to generate an SDIF model of an audio signal. We implement control structures such as transposition, filtering, and inharmonicity as normal Max/MSP patches that mutate a ``working'' SDIF-buffer; these are cascaded when they share the same SDIF-buffer. These control structures communicate via symbolic references to SDIF-buffers represented as normal Max messages. This system also supports network streaming of SDIF data. As research continues towards more efficient and musically interesting streaming protocols, Max/MSP interfaces will be implemented in C as SDIF mutators that access a given SDIF buffer via a struct definition in the exposed SDIF-buffer header file. One promising approach is to begin transmission with a low-resolution representation and then fill it in with increasing detail. Time machines communicate with streaming interfaces via Max messages to request or predict ranges of time that will need to be available in the near future. |
INPROC. | cnmat:sdif-msp-short [WDK+99a] |
Author | |
Title | Supporting the Sound Description Interchange Format in the Max/MSP Environment |
Booktitle | Proc. ICMC |
Year | 1999 |
Address | Beijing |
url | http://cnmat.CNMAT.Berkeley.EDU/ICMC1999/papers/msp+sdif/ICMC99-MSP+SDIF-short.html |
abstract-url | http://cnmat.CNMAT.Berkeley.EDU/ICMC1999/abstracts/sdif+msp.html |
bib-url | http://cnmat.CNMAT.Berkeley.EDU/ICMC1999 |
INPROC. | cnmat:sdif-srl [WCF+00b] |
Author | |
Title | An XML-based SDIF Stream Relationships Language |
Booktitle | Proceedings of the International Computer Music Conference |
Year | 2000 |
Address | Berlin |
abstract-url | http://cnmat.CNMAT.Berkeley.EDU/ICMC2000/abstracts/xml-sdif |
bib-url | http://cnmat.CNMAT.Berkeley.EDU/ICMC2000/ |
INPROC. | cnmat:sdif-srl-short [WCF+00a] |
Author | |
Title | An XML-based SDIF Stream Relationships Language |
Booktitle | Proc. ICMC |
Year | 2000 |
Address | Berlin |
abstract-url | http://cnmat.CNMAT.Berkeley.EDU/ICMC2000/abstracts/xml-sdif |
bib-url | http://cnmat.CNMAT.Berkeley.EDU/ICMC2000/ |
INPROC. | cnmat:osw2000-short [CFW00] |
Author | |
Title | An Open Architecture for Real-time Music Software |
Booktitle | Proc. ICMC |
Year | 2000 |
Address | Berlin |
MISC | cslu:www [CSLU99] |
Key | CSLU |
Title | CSLU Speech Synthesis Research Group, Oregon Graduate Institute of Science and Technology |
Howpublished | WWW page |
Year | 1999 |
url | http://cslu.cse.ogi.edu/tts |
pub-url | http://cslu.cse.ogi.edu/tts/publications |
Note | http://cslu.cse.ogi.edu/tts |
ARTICLE | cslu:ieeetsap98 [KMS98] |
Author | |
Title | Audio coding using variable-depth multistage quantization |
Journal | IEEE Transactions on Speech and Audio Processing |
Volume | 6 |
Year | 1998 |
Note | www [CSLU99] |
INPROC. | cslu:esca98mm [MCW98] |
Author | |
Title | Generalization and Discrimination in tree-structured unit selection |
Booktitle | Proceedings of the 3rd ESCA/COCOSDA International Speech Synthesis Workshop |
Month | November |
Year | 1998 |
Note | www [CSLU99] |
Remarks | Great overview of several unit selection methods, comprehensive bibliography: origin of unit selection? [Sag88]. Festival unit selection [HB96, BC95]. classification and regression trees [BFOS84a]. clustering and decision trees [BT97b, WCIS93, Nak94]. Mahalanobis distance [Don96]. decision trees for: speech recognition [NGY97], speech synthesis [HAea96]. data-driven direct mapping with ANN [KCG96, TR]. distance measures for: coding [QBC88], ASR [NSRK85, HJ88], in general [GS97], concatenative speech synthesis [HC98, WM98]. PLP: [HM94]. Linear regression and correlation, Fisher transform: [Edw93]. Tree pruning: [CM98]. Masking effects: [Moo89]. |
Abstract | Concatenative ``selection-based'' synthesis from large databases has emerged as a viable framework for TTS waveform generation. Unit selection algorithms attempt to predict the appropriateness of a particular database speech segment using only linguistic features output by text analysis and prosody prediction components of a synthesizer. All of these algorithms have in common a training or ``learning'' phase in which parameters are trained to select appropriate waveform segments for a given feature vector input. One approach to this step is to partition available data into clusters that can be indexed by linguistic features available at runtime. This method relies critically on two important principles: discrimination of fine phonetic details using a perceptually-motivated distance measure in training and generalization to unseen cases in selection. In this paper, we describe efforts to systematically investigate and improve these parts of the process. |
INPROC. | cslu:esca98kain [KM98a] |
Author | |
Title | Personalizing a speech synthesizer by voice adaptation |
Booktitle | Proceedings of the 3rd ESCA/COCOSDA International Speech Synthesis Workshop |
Month | November |
Year | 1998 |
Pages | 225--230 |
Note | www [CSLU99] |
Abstract | A voice adaptation system enables users to quickly create new voices for a text-to-speech system, allowing for the personalization of the synthesis output. The system adapts to the pitch and spectrum of the target speaker, using a probabilistic, locally linear conversion function based on a Gaussian Mixture Model. Numerical and perceptual evaluations reveal insights into the correlation between adaptation quality and the amount of training data, the number of free parameters. A new joint density estimation algorithm is compared to a previous approach. Numerical errors are studied on the basis of broad phonetic categories. A data augmentation method for training data with incomplete phonetic coverage is investigated and found to maintain high speech quality while partially adapting to the target voice. |
INPROC. | cslu:icslp98cronk [CM98] |
Author | |
Title | Optimized Stopping Criteria for Tree-Based Unit Selection in Concatenative Synthesis |
Oldtitle | Optimization of stopping criteria for tree-structured unit selection |
Booktitle | Proc. of International Conference on Spoken Language Processing |
Volume | 5 |
Month | November |
Year | 1998 |
Pages | 1951--1955 |
Note | www [CSLU99] |
Remarks | Summary: Method for growing optimal clustering tree (CART, as in [BFOS84a]). Not stopping with thresholds, but growing the tree completely (until no splittable clusters are left), and then pruning by recombining clusters by a greedy algorithm. Gives evaluation measure V-fold cross validation for tree quality. Clusters represent units with equivalent target cost. A best split of a cluster maximizes the decrease in data impurity (lower within-cluster variance of acoustic features). N.B.: Clustering of units is not classification, as the classes are not known in advance, and the method is unsupervised! Weighting in distortion measure using Mahalanobis distance as the inverse of the variance. References: [BC95], [BT97b], [BFOS84a], [Don96], [Fuk90] (CART tree evaluation criterion), [NGY97], [Nak94], [WCIS93]. |
INPROC. | cslu:icslp98kain [KM98b] |
Author | |
Title | Text-to-speech voice adaptation from sparse training data |
Booktitle | Proc. of International Conference on Spoken Language Processing |
Month | November |
Year | 1998 |
Pages | 2847--2850 |
Note | www [CSLU99] |
INPROC. | cslu:icslp98-paper [WM98] |
Author | |
Title | A Perceptual Evaluation of Distance Measures for Concatenative Speech Synthesis |
Booktitle | Proc. of International Conference on Spoken Language Processing |
Month | November |
Year | 1998 |
Note | www [CSLU99] |
Abstract | In concatenative synthesis, new utterances are created by concatenating segments (units) of recorded speech. When the segments are extracted from a large speech corpus, a key issue is to select segments that will sound natural in a given phonetic context. Distance measures are often used for this task. However, little is known about the perceptual relevance of these measures. More insight into the relationship between computed distances and perceptual differences is needed to develop accurate unit selection algorithms, and to improve the quality of the resulting computer speech. In this paper, we develop a perceptual test to measure subtle phonetic differences between speech units. We use the perceptual data to evaluate several popular distance measures. The results show that distance measures that use frequency warping perform better than those that do not, and minimal extra advantage is gained by using weighted distances or delta features. |
INPROC. | cslu:cslutoolkit [SCdV+98] |
Author | |
Title | Universal Speech Tools: the CSLU Toolkit |
Booktitle | Proc. of International Conference on Spoken Language Processing |
Month | November |
Year | 1998 |
Note | www [CSLU99] |
INCOLL. | cslu:german98 [MKC+98] |
Author | |
Title | Rapid Prototyping of a German TTS System |
Booktitle | Tech. Rep. CSE-98-015 |
Publisher | Department of Computer Science, Oregon Graduate Institute of Science and Technology |
Address | Portland, OR |
Month | September |
Year | 1998 |
Note | www [CSLU99] |
INPROC. | cslu:icassp98mm [MMLV98] |
Author | |
Title | Efficient Analysis/Synthesis of Percussion Musical Instrument Sounds Using an All-Pole Model |
Booktitle | Proceedings of the International Conference on Acoustics, Speech, and Signal Processing |
Volume | 6 |
Publisher | Speech |
Month | May |
Year | 1998 |
Pages | 3589--3592 |
Note | www [CSLU99] |
Abstract | It is well-known that an impulse-excited, all-pole filter is capable of representing many physical phenomena, including the oscillatory modes of percussion musical instruments like woodblocks, xylophones, or chimes. In contrast to the more common application of all-pole models to speech, however, practical problems arise in music synthesis due to the location of poles very close to the unit circle. The objective of this work was to develop algorithms to find excitation and filter parameters for synthesis of percussion instrument sounds using only an inexpensive all-pole filter chip (TI TSP50C1x). The paper describes analysis methods for dealing with pole locations near the unit circle, as well as a general method for modeling the transient attack characteristics of a particular sound while independently controlling the amplitudes of each oscillatory mode. |
INPROC. | cslu:icassp98kain [KM98c] |
Author | |
Title | Spectral Voice Conversion for Text-to-Speech Synthesis |
Year | 1998 |
Booktitle | Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'98) |
Pages | 285--288 |
Note | www [CSLU99] |
Abstract | A new voice conversion algorithm that modifies a source speaker's speech to sound as if produced by a target speaker is presented. It is applied to a residual-excited LPC text-to-speech diphone synthesizer. Spectral parameters are mapped using a locally linear transformation based on Gaussian mixture models whose parameters are trained by joint density estimation. The LPC residuals are adjusted to match the target speaker's average pitch. To study effects of the amount of training on performance, data sets of varying sizes are created by automatically selecting subsets of all available diphones by a vector quantization method. In an objective evaluation, the proposed method is found to perform more reliably for small training sets than a previous approach. In perceptual tests, it was shown that nearly optimal spectral conversion performance was achieved, even with a small amount of training data. However, speech quality improved with an increase in training set size. |
INCOLL. | cslu:ogireslpc97 [MCWK97] |
Author | |
Title | OGIresLPC: Diphone synthesizer using residual-excited linear prediction |
Booktitle | Tech. Rep. CSE-97-007 |
Publisher | Department of Computer Science, Oregon Graduate Institute of Science and Technology |
Month | September |
Year | 1997 |
Address | Portland, OR |
Note | www [CSLU99] |
INPROC. | cslu:aes97 [MJLO+97a] |
Author | |
Title | Concatenation-based MIDI-to-singing voice synthesis |
Booktitle | 103rd Meeting of the Audio Engineering Society |
Publisher | New York |
Year | 1997 |
Note | www [CSLU99] |
Abstract | In this paper, we propose a system for synthesizing the human singing voice and the musical subtleties that accompany it. The system, Lyricos, employs a concatenation-based text-to-speech method to synthesize arbitrary lyrics in a given language. Using information contained in a regular MIDI file, the system chooses units, represented as sinusoidal waveform model parameters, from an inventory of data collected from a professional singer, and concatenates these to form arbitrary lyrical phrases. Standard MIDI messages control parameters for the addition of vibrato, spectral tilt, and dynamic musical expression, resulting in a very natural-sounding singing voice. |
ARTICLE | cslu:trsap97 [MC97] |
Author | |
Title | Sinusoidal modeling and modification of unvoiced speech |
Journal | IEEE Transactions on Speech and Audio Processing |
Volume | 5 |
Month | November |
Year | 1997 |
Pages | 557--560 |
Number | 6 |
Note | www [CSLU99] |
Abstract | Although sinusoidal models have been shown to be useful for time-scale and pitch modification of voiced speech, objectionable artifacts often arise when such models are applied to unvoiced speech. This correspondence presents a sinusoidal model-based speech modification algorithm that preserves the natural character of unvoiced speech sounds after pitch and time-scale modification, eliminating commonly-encountered artifacts. This advance is accomplished via a perceptually-motivated modulation of the sinusoidal component phases that mitigates artifacts in the reconstructed signal after time-scale and pitch modification. |
INPROC. | cslu:icassp97 [MJLO+97b] |
Author | |
Title | A Singing Voice Synthesis System Based on Sinusoidal Modeling |
Year | 1997 |
Booktitle | Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'97) |
Pages | 435--438 |
Note | www [CSLU99] |
Abstract | Although sinusoidal models have been demonstrated to be capable of high-quality musical instrument synthesis, speech modification, and speech synthesis, little exploration of the application of these models to the synthesis of singing voice has been undertaken. In this paper, we propose a system framework similar to that employed in concatenation-based text-to-speech synthesizers, and describe its extension to the synthesis of singing voice. The power and flexibility of the sinusoidal model used in the waveform synthesis portion of the system enables high-quality, computationally-efficient synthesis and the incorporation of musical qualities such as vibrato and spectral tilt variation. Modeling of segmental phonetic characteristics is achieved by employing a ``unit selection'' procedure that selects sinusoidally-modeled segments from an inventory of singing voice data collected from a human vocalist. The system, called Lyricos, is capable of synthesizing very natural-sounding singing that maintains the characteristics and perceived identity of the analyzed vocalist. |
INPROC. | cslu:icassp96 [MC96] |
Address | Atlanta, USA |
Author | |
Booktitle | Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'96) |
Title | Speech Concatenation and Synthesis Using an Overlap--Add Sinusoidal Model |
Year | 1996 |
Volume | 1 |
Pages | 361--364 |
Note | www [CSLU99] |
Abstract | In this paper, an algorithm for the concatenation of speech signal segments taken from disjoint utterances is presented. The algorithm is based on the Analysis-by-Synthesis/Overlap-Add (ABS/OLA) sinusoidal model, which is capable of performing high quality pitch- and time-scale modification of both speech and music signals. With the incorporation of concatenation and smoothing techniques, the model is capable of smoothing the transitions between separately-analyzed speech segments by matching the time- and frequency-domain characteristics of the signals at their boundaries. The application of these techniques in a text-to-speech system based on concatenation of diphone sinusoidal models is also presented. |
INPROC. | cslu:jasa95 [MC95] |
Author | |
Title | Speech synthesis based on an overlap-add sinusoidal model |
Booktitle | J. of the Acoustical Society of America |
Volume | 97 |
Publisher | Pt. 2 |
Month | May |
Year | 1995 |
Pages | 3246 |
Number | 5 |
Note | www [CSLU99] |
MISC | cstr:www [CSTR99] |
Key | CSTR |
Title | Centre for Speech Technology Research, University of Edinburgh |
Howpublished | WWW page |
Year | 1999 |
url | http://www.cstr.ed.ac.uk/ |
pub-url | http://www.cstr.ed.ac.uk/projects/festival/papers.html |
Note | http://www.cstr.ed.ac.uk/ |
INPROC. | cstr:unitsel96 [HB96] |
Author | |
Title | Unit Selection in a Concatenative Speech Synthesis System using a Large Speech Database |
Booktitle | Proc. ICASSP '96 |
Address | Atlanta, GA |
Month | May |
Year | 1996 |
Pages | 373--376 |
Note | www [CSTR99] Electronic version: cstr/Black1996a.s.* |
Remarks | cited in [MCW98] |
Abstract | One approach to the generation of natural-sounding synthesized speech waveforms is to select and concatenate units from a large speech database. Units (in the current work, phonemes) are selected to produce a natural realisation of a target phoneme sequence predicted from text which is annotated with prosodic and phonetic context information. We propose that the units in a synthesis database can be considered as a state transition network in which the state occupancy cost is the distance between a database unit and a target, and the transition cost is an estimate of the quality of concatenation of two consecutive units. This framework has many similarities to HMM-based speech recognition. A pruned Viterbi search is used to select the best units for synthesis from the database. This approach to waveform synthesis permits training from natural speech: two methods for training from speech are presented, which provide weights that produce more natural speech than can be obtained by hand-tuning. |
INPROC. | cstr:unitsel97 [BT97b] |
Author | |
Title | Automatically Clustering Similar Units for Unit Selection in Speech Synthesis |
Booktitle | Proc. Eurospeech '97 |
Address | Rhodes, Greece |
Month | September |
Year | 1997 |
Pages | 601--604 |
Note | www [CSTR99] Electronic version: cstr/Black1997b.* |
Remarks | cited in [MCW98]: clustering and decision trees |
Abstract | This paper describes a new method for synthesizing speech by concatenating sub-word units from a database of labelled speech. A large unit inventory is created by automatically clustering units of the same phone class based on their phonetic and prosodic context. The appropriate cluster is then selected for a target unit, offering a small set of candidate units. An optimal path is found through the candidate units based on their distance from the cluster center and an acoustically based join cost. Details of the method and its justification are presented. The results of experiments using two different databases are given, optimising various parameters within the system. A comparison with other existing selection-based synthesis techniques is also given, showing the advantages this method has over existing ones. The method is implemented within a full text-to-speech system offering efficient, natural-sounding speech synthesis. |
INPROC. | cstr:eursp95 [BC95] |
Author | |
Title | Optimising selection of units from speech databases for concatenative synthesis |
Booktitle | Proc. Eurospeech '95 |
Volume | 1 |
Address | Madrid, Spain |
Month | September |
Year | 1995 |
Pages | 581--584 |
Remarks | Summary: Detailed description of unit selection model, used features and context, concatenation join point optimisation. Description of weight optimising procedure: Euclidean cepstral distance (very limited first attempt) on real-speech test sentences. Unit selection as used in CHATR. cited in [MCW98] |
INPROC. | cstr:ssml97 [STTI97] |
Author | |
Title | A Markup Language for Text-To-Speech Synthesis |
Booktitle | Proc. Eurospeech '97 |
Address | Rhodes, Greece |
Month | September |
Year | 1997 |
Pages | 1747--1750 |
Note | www [CSTR99] Electronic version: cstr/Sproat1997a.* |
Abstract | Text-to-speech synthesizers must process text, and therefore require some knowledge of text structure. While many TTS systems allow for user control by means of ad hoc `escape sequences', there remains to date no adequate and generally agreed upon system-independent standard for marking up text for the purposes of synthesis. The present paper is a collaborative effort between two speech groups aimed at producing such a standard, in the form of an SGML-based markup language that we call STML --- Spoken Text Markup Language. The primary purpose of this paper is not to present STML as a fait accompli, but rather to interest other TTS research groups to collaborate and contribute to the development of this standard. |
TECHREP. | cstr:festival97 [BT97a] |
Author | |
Title | The Festival Speech Synthesis System: System Documentation (1.1.1) |
Institution | Human Communication Research Centre |
Type | Technical Report |
Number | HCRC/TR-83 |
Month | January |
Year | 1997 |
Pages | 154 |
Note | www [CSTR99] |
url | http://www.cstr.ed.ac.uk/projects/festival/manual-1.1.1/festival-1.1.1.ps.gz |
Remarks | new version [BTC98] |
TECHREP. | cstr:festival98 [BTC98] |
Author | |
Title | The Festival Speech Synthesis System: System Documentation (1.3.1) |
Institution | Human Communication Research Centre |
Type | Technical Report |
Number | HCRC/TR-83 |
Month | December |
Year | 1998 |
Pages | 202 |
Note | www [CSTR99] |
url | http://www.cstr.ed.ac.uk/projects/festival/manual-1.3.1/festival_toc.html |
Remarks | updated version of [BT97a], new utterance structure as in [Tay99], multiple synthesizers |
TECHREP. | cstr:festivalarch98 [Tay99] |
Author | |
Title | The Festival Speech Architecture |
Type | Web Page |
Year | 1999 |
Note | www [CSTR99] |
url | http://www.cstr.ed.ac.uk/projects/festival/arch.html |
Abstract | This is a short document describing the way we represent speech and linguistic structures in Festival. There are three main types of structure. |
INPROC. | Campbell_FactAffe_EURO97 [CYDH97] |
Author | |
Title | Factors Affecting Perceived Quality and Intelligibility in the CHATR Concatenative Speech Synthesiser |
Booktitle | Proc. Eurospeech '97 |
Address | Rhodes, Greece |
Month | September |
Year | 1997 |
Pages | 2635--2638 |
Remarks | TO BE FOUND |
ARTICLE | Campbell_CHATR [Cam96] |
Author | |
Title | CHATR: A High-Definition Speech Re-Sequencing System |
Journal | Acoustical Society of America and Acoustical Society of Japan, Third Joint Meeting |
Address | Honolulu, HI |
Month | December |
Year | 1996 |
Remarks | TO BE FOUND |
BOOK | softeng [GJM91] |
Author | |
Title | Fundamentals of Software Engineering |
Publisher | Prentice--Hall |
Address | Englewood Cliffs, NJ |
Year | 1991 |
BOOK | boehm [Boe89] |
Author | |
Title | Software risk management |
Publisher | IEEE Computer Society Press |
Address | Washington |
Year | 1989 |
BOOK | Szyperski98 [Szy98] |
Key | Szyperski |
Author | |
Title | Component Software: Beyond Object-Oriented Programming |
Publisher | ACM Press and Addison-Wesley |
Year | 1998 |
Address | New York, NY |
Annotate | An excellent overview of component-based programming. Many references. |
BOOK | booch [Boo94] |
Author | |
Title | Object-Oriented Analysis and Design with Applications |
Edition | 2nd |
Publisher | Benjamin--Cummings |
Address | Redwood City, Calif. |
Year | 1994 |
BOOK | omt [RBP+91] |
Author | |
Title | Object-Oriented Modeling and Design |
Publisher | Prentice--Hall |
Address | Englewood Cliffs, NJ |
Year | 1991 |
BOOK | ivar [Jac95b] |
Author | |
Title | Object-Oriented Software Engineering: a Use Case driven Approach |
Publisher | Addison--Wesley |
Address | Wokingham, England |
Year | 1995 |
UNPUBLISHED | uml-www [Sof97] |
Key | Rational |
Author | |
Title | Unified Modeling Language, version 1.1 |
Month | September |
Year | 1997 |
Note | Online documentation |
BOOK | DuCharme99 [DuC99] |
Author | |
Title | XML: the annotated specification |
Publisher | Prentice-Hall PTR |
Address | Upper Saddle River, NJ 07458, USA |
Pages | xix + 339 |
Year | 1999 |
Isbn | 0-13-082676-6 |
Series | The Charles F. Goldfarb series on open information management |
Keywords | XML (Document markup language); Database management. |
MISC | XML [Cov00] |
Key | XML |
Title | The XML Cover Pages |
Author | |
Publisher | OASIS, Organization for the Advancement of Structured Information Standards |
Howpublished | WWW page |
Year | 2000 |
url | http://www.oasis-open.org/cover/xml.html |
Note | http://www.oasis-open.org/cover/xml.html |
Abstract | Extensible Markup Language (XML) is descriptively identified as "an extremely simple dialect [or 'subset'] of SGML" the goal of which "is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML," for which reason "XML has been designed for ease of implementation, and for interoperability with both SGML and HTML." |
Remarks | Interesting links (among a wealth of introductory as well as detailed information): XML Metadata Interchange Format (XMI) - Object Management Group (OMG) http://www.oasis-open.org/cover/xmi.html. The design of the XML Metadata Interchange Format (XMI) represents an extremely important initiative. It has a goal of unifying XML and related W3C specifications with several object/component modeling standards, as well as with STEP schemas, and more. Particularly, it would "combine the benefits of the web-based XML standard for defining, validating, and sharing document formats on the web with the benefits of the object-oriented Unified Modeling Language (UML), a specification of the Object Management Group (OMG) that provides application developers a common language for specifying, visualizing, constructing, and documenting distributed objects and business models." Extensible User Interface Language (XUL) http://www.oasis-open.org/cover/xul.html "XUL stands for 'extensible user interface language'. It is an XML-based language for describing the contents of windows and dialogs. XUL has language constructs for all of the typical dialog controls, as well as for widgets like toolbars, trees, progress bars, and menus." User Interface Markup Language (UIML) http://www.oasis-open.org/cover/uiml.html The User Interface Markup Language (UIML) "allows designers to describe the user interface in generic terms, and then use a style description to map the interface to various operating systems (OSs) and appliances. Thus, the universality of UIML makes it possible to describe a rich set of interfaces and reduces the work in porting the user interface to another platform (e.g., from a graphical windowing system to a hand-held appliance) to changing the style description." See the separate document. XML Application Environments, Development Toolkits, Conversion http://www.oasis-open.org/cover/publicSW.htm\#xmlTestbed XML Testbed. 
An XML application environment written in Java. From Steve Withall. ..."uses an XML configuration file to define the (Swing-based) user interface; includes its own non-validating XML parser (though it can use any SAX parser instead), a nascent XSL engine (to the old submission standard - just in time to be out of date), and a few other odds and ends." http://www.w3.org/XML/1998/08withall/ http://www.w3.org/XML/1998/08withall/xt-beta-1-980816.zip http://www.w3.org/XML/1998/08withall/MontrealSlides/XXXIntroduction.html |
ARTICLE | Abrams:1999:UAI [APB+99] |
Author | |
Title | UIML: an appliance-independent XML user interface language |
Journal | Computer Networks (Amsterdam, Netherlands: 1999) |
Volume | 31 |
Number | 11--16 |
Pages | 1695--1708 |
Day | 17 |
Month | May |
Year | 1999 |
Coden | ???? |
Issn | 1389-1286 |
Bibdate | Fri Sep 24 19:43:29 MDT 1999 |
url | http://www.elsevier.com/cas/tree/store/comnet/sub/1999/31/11-16/2170.pdf |
Remarks | TO BE FOUND |
BOOK | Chauvet:1999:CTC [Cha99] |
Author | |
Title | Composants et transactions: COMMTS, CorbaOTS, JavaEJB, XML |
Publisher | Eyrolles: Informatiques magazine |
Address | Paris, France |
Pages | v + 274 |
Year | 1999 |
Isbn | 2-212-09075-7 |
Lccn | ???? |
Bibdate | Tue Sep 21 10:27:35 MDT 1999 |
Series | Collection dirigée par Guy Hervier |
Alttitle | Composants et transactions: Corba/OTS, EJB/JTS, COM/MTS: comprendre l'architecture des serveurs d'applications |
Annote | Cover title: ``Composants et transactions: Corba/OTS, comprendre l'architecture des serveurs d'applications''. Bibliography: pp. 267-269. |
Keywords | Object-oriented design (computing); component object models; JavaBeans. |
Remarks | TO BE FOUND |
MISC | anasyn:www [AS00] |
Key | AS |
Title | Analysis--Synthesis Team / Équipe Analyse--Synthèse, IRCAM---Centre Georges Pompidou |
Howpublished | WWW page |
Year | 2000 |
url | http://www.ircam.fr/anasyn/ |
pub-url | http://www.ircam.fr/anasyn/listePublications/index.html |
Note | http://www.ircam.fr/anasyn/ |
MISC | anasyn:oldwww [AS99] |
Key | AS |
Title | Analysis--Synthesis Team / Équipe Analyse--Synthèse, IRCAM---Centre Georges Pompidou |
Howpublished | WWW page |
Year | 1999 |
url | http://www.ircam.fr/equipes/analyse-synthese/ |
pub-url | http://www.ircam.fr/equipes/analyse-synthese/listePublications/index.html |
Note | http://www.ircam.fr/equipes/analyse-synthese/ |
INPROC. | PEET981 [Pee98] |
Author | |
Title | Analyse-Synthèse des sons musicaux par la méthode PSOLA |
Year | 1998 |
Address | Agelonde (France) |
Month | May |
INPROC. | PEET983 [PR98] |
Author | |
Title | Sinusoidal versus Non-Sinusoidal Signal Characterisation |
Year | 1998 |
Address | Barcelona |
Month | November |
Annote | (Workshop on Digital Audio Effects) |
INPROC. | PEET991 [PR99b] |
Author | |
Title | SINOLA: A New Analysis/Synthesis Method using Spectrum Peak Shape Distortion, Phase and Reassigned Spectrum |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Year | 1999 |
Address | Beijing |
Month | October |
INPROC. | PEET992 [PR99a] |
Author | |
Title | Non-Stationary Analysis/Synthesis using Spectrum Peak Shape Distortion, Phase and Reassigned Spectrum |
Year | 1999 |
Address | Orlando |
Month | November |
INPROC. | OM97 [AAFH97] |
Author | |
Title | An Object Oriented Visual Environment For Musical Composition |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Year | 1997 |
Address | Thessaloniki, Greece |
url | http://www.ircam.fr/equipes/repmus/RMPapers/Assayag97/index.html |
bib-url | http://www.ircam.fr/equipes/repmus/RMPapers/ |
INPROC. | OM98 [AADR98] |
Author | |
Title | Objects, Time and Constraints in OpenMusic |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Year | 1998 |
Address | Ann Arbor, Michigan |
Month | October |
url | http://www.ircam.fr/equipes/repmus/RMPapers/ICMC98a/OMICMC98.html |
bib-url | http://www.ircam.fr/equipes/repmus/RMPapers/ |
ARTICLE | OM99 [ARL+99b] |
Author | |
Title | Computer Assisted Composition at Ircam: PatchWork & OpenMusic |
Journal | Computer Music Journal |
Year | 1999 |
Volume | 23 |
Number | 3 |
url | http://www.ircam.fr/equipes/repmus/RMPapers/CMJ98/index.html |
bib-url | http://www.ircam.fr/equipes/repmus/RMPapers |
ARTICLE | OM99-short [ARL+99a] |
Author | |
Title | Computer Assisted Composition at Ircam: PatchWork & OpenMusic |
Journal | Computer Music Journal |
Month | Fall |
Year | 1999 |
Volume | 23 |
Number | 3 |
url | http://www.ircam.fr/equipes/repmus/RMPapers/CMJ98/index.html |
bib-url | http://www.ircam.fr/equipes/repmus/RMPapers |
INPROC. | OM2000 [AAS00c] |
Author | |
Title | High Level Musical Control of Sound Synthesis in OpenMusic |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Year | 2000 |
Address | Berlin |
Month | August |
INPROC. | OM2000-short [AAS00a] |
Author | |
Title | High Level Musical Control of Sound Synthesis in OpenMusic |
Booktitle | Proc. ICMC |
Address | Berlin |
Year | 2000 |
INPROC. | OM2000-sshort [AAS00b] |
Author | |
Title | High Level Musical Control of Sound Synthesis in OpenMusic |
Booktitle | Proc. ICMC |
Year | 2000 |
INPROC. | sdif-ext2000 [SW00] |
Author | |
Title | Extensions and Applications of the SDIF Sound Description Interchange Format |
Booktitle | Proceedings of the International Computer Music Conference |
Month | August |
Year | 2000 |
Address | Berlin |
BOOK | moore89 [Moo89] |
Author | |
Title | An Introduction to the Psychology of Hearing |
Publisher | Academic Press Limited |
Edition | 3rd |
Year | 1989 |
Remarks | cited in [MCW98]: masking effects |
INPROC. | psy:susini97 [SMW97] |
Author | |
Title | Caractérisation perceptive des bruits de véhicules |
Booktitle | Actes du 4ème Congrès Français d'Acoustique |
Publisher | Société Française d'Acoustique |
Month | April |
Year | 1997 |
Address | Marseille |
INPROC. | psy:faure97 [FM97] |
Author | |
Title | Comparaison de profils sémantiques et de l'espace perceptif de timbres musicaux |
Booktitle | Actes du 4ème Congrès Français d'Acoustique |
Publisher | Société Française d'Acoustique |
Month | April |
Year | 1997 |
Address | Marseille |
url | http://mediatheque.ircam.fr/articles/textes/Faure97a/ |
Remarks | Mapping of semantic profiles (letting subjects choose descriptive words for timbre) to perceptual dimensions. Some references: Faure96, Grey77, Krimphoff94, Krumhansl89, McAdams95, Tversky77 |
Abstract | The purpose of this study is to compare semantic profiles and perceptual dimensions of musical timbre. In a previous experiment, we extracted the 23 most often used verbal attributes from spontaneous verbalizations describing similarities and differences between pairs of timbres, and we tried to compare their use with the relative positions of timbres along each perceptual dimension. In this experiment, we used a VAME paradigm to test these verbal attributes more quantitatively. 12 synthetic sounds were presented and rated on each of the 23 unipolar semantic scales. Several distances (either Euclidean or from Tversky's model of similarity) between timbres were then calculated, and the MDS semantic models obtained were compared to the perceptual one. The structure of the semantic and perceptual models differed considerably, and the correlations with the semantic scales led us to prefer a two-dimensional model without specificities, derived from a distance directly obtained from Tversky's model. |
INPROC. | beauchamp95 [BHM95] |
Author | |
Title | Musical Sounds, Data Reduction, and Perceptual Control Parameters |
Booktitle | Program for SMPC95, Society for Music Perception and Cognition |
Publisher | Center for New Music and Audio Technologies (CNMAT) |
Address | Univ. Calif. Berkeley |
Pages | 8--9 |
Year | 1995 |
bib-url | http://cmp-rs.music.uiuc.edu/people/beauchamp/publist.html |
Remarks | TO BE FOUND! |
ARTICLE | beauchamp98 [Bea98] |
Author | |
Title | Methods for measurement and manipulation of timbral physical correlates |
Journal | J. Acoust. Soc. Am. |
Year | 1998 |
Volume | 103 |
Part | Pt. 2 |
Pages | 2966 |
Number | 5 |
bib-url | http://cmp-rs.music.uiuc.edu/people/beauchamp/publist.html |
Remarks | TO BE FOUND! |
ARTICLE | horner98 [YH] |
Author | |
Title | Hybrid Sampling-Wavetable Synthesis with Genetic Algorithms |
Journal | Journal of the Audio Engineering Society |
Volume | 45 |
Pages | 316--330 |
Number | 5 |
bib-url | http://www.cs.ust.hk/faculty/horner/subpage/pubs.html |
journal-url | http://www.aes.org/journal/toc/may97.html |
Remarks | TO BE FOUND! High-quality sort-of-concatenative instrument synthesis? |
Abstract | A combination of hybrid sampling and wavetable synthesis for matching acoustic instruments is demonstrated using genetic algorithm optimization. Tone sampling is used for the critical attack portion and wavetable synthesis is used to match the more gradually changing sustain and decay. A hybrid sampling wavetable performs a smooth crossfade transition. This method has been used to synthesize piano, harp, glockenspiel, and temple block tones. |
ARTICLE | horner96 [CH] |
Author | |
Title | Group Synthesis with Genetic Algorithms |
Journal | Journal of the Audio Engineering Society |
Volume | 44 |
Number | 3 |
Pages | 130--147 |
bib-url | http://www.cs.ust.hk/faculty/horner/subpage/pubs.html |
journal-url | http://www.aes.org/journal/toc/march.html |
Abstract | Musical sounds can be efficiently synthesized using an automatic genetic algorithm to decompose musical instrument tones into group synthesis parameters. By separating the data into individual matrices, a high degree of data compression with low computational cost is achieved. |
INPROC. | chandra98 [Cha98] |
Author | |
Title | Compositional experiments with concatenating distinct waveform periods while changing their structural properties |
Booktitle | SEAMUS'98 |
Publisher | School of Music, University of Illinois |
Address | Urbana, IL |
Month | April |
Year | 1998 |
url | http://cmp-rs.music.uiuc.edu/people/arunc/miranda/seamus98/index.htm |
ps-url | http://cmp-rs.music.uiuc.edu/people/arunc/miranda/seamus98/pre.ps |
Note | Available online |
Abstract | wigout is a sound-synthesis program, written in C and running under Unix and 32-bit Intel systems. The premise of the program is to allow the composer to compose the waveform with which she composes. Thus, sound is not a building-block with which one composes, but the subject matter of composition. The composer defines a waveform state, consisting of an arbitrary number of segments. Each segment is similar to (but not identical with) 1) a sine wave; 2) a square wave; 3) a triangle wave; or 4) a sawtooth wave. The composer stipulates the duration for which the sound is to last, and then the waveform state (which is on the order of a few milliseconds long) is iterated until the desired duration is reached. Upon each iteration, each segment changes itself by a specified amount. The resulting sound is the result of many independent changes in the waveform's segments. Up till now, five compositions have been written using wigout, for tape alone, and for tape and performers. |
ARTICLE | beauchamp96 [BH] |
Author | |
Title | Piecewise Linear Approximation of Additive Synthesis Envelopes: A Comparison of Various Methods |
Journal | Computer Music Journal |
Volume | 20 |
Pages | 72--95 |
Number | 2 |
bib-url | http://cmp-rs.music.uiuc.edu/people/beauchamp/publist.html |
ARTICLE | wakefield96 [PW96] |
Author | |
Title | A High Resolution Time--Frequency Representation for Musical Instrument Signals |
Journal | J. Acoust. Soc. Am. |
Volume | 99 |
Number | 4 |
Pages | 2382--2396 |
Year | 1996 |
INPROC. | wakefield98 [Wak98a] |
Author | |
Title | Time--Pitch Representations: Acoustic Signal Processing and Auditory Representations |
Booktitle | Proceedings of the IEEE Intl. Symp. on Time--Frequency/Time--Scale |
Year | 1998 |
Address | Pittsburgh |
INPROC. | wakefield98-short [Wak98b] |
Author | |
Title | Time--Pitch Representations: Acoustic Signal Processing and Auditory Representations |
Booktitle | Proc. IEEE Intl. Symp. Time--Frequency/Time--Scale |
Year | 1998 |
Address | Pittsburgh |
INPROC. | loris2000a [FHC00d] |
Author | |
Title | Transient Preservation under Transformation in an Additive Sound Model |
Booktitle | Proceedings of the International Computer Music Conference |
Address | Berlin |
Year | 2000 |
INPROC. | loris2000a-short [FHC00b] |
Author | |
Title | Transient Preservation under Transformation in an Additive Sound Model |
Booktitle | Proc. ICMC |
Address | Berlin |
Year | 2000 |
INPROC. | loris2000b [FHC00c] |
Author | |
Title | A New Algorithm for Bandwidth Association in Bandwidth-Enhanced Additive Sound Modeling |
Booktitle | Proc. ICMC |
Address | Berlin |
Year | 2000 |
INPROC. | loris2000b-short [FHC00a] |
Author | |
Title | A New Algorithm for Bandwidth Association in Bandwidth-Enhanced Additive Sound Modeling |
Booktitle | Proc. ICMC |
Address | Berlin |
Year | 2000 |
INPROC. | sms97 [SBHL97b] |
Author | |
Title | Integrating Complementary Spectral Models in the Design of a Musical Synthesizer |
Booktitle | Proceedings of the International Computer Music Conference |
Year | 1997 |
Address | Thessaloniki |
INPROC. | sms97-short [SBHL97c] |
Author | |
Title | Integrating Complementary Spectral Models in the Design of a Musical Synthesizer |
Booktitle | Proc. ICMC |
Year | 1997 |
Address | Thessaloniki |
ARTICLE | sms90 [SS90] |
Author | |
Title | Spectral Modeling Synthesis: a Sound Analysis/Synthesis System Based on a Deterministic plus Stochastic Decomposition |
Journal | Computer Music Journal |
Year | 1990 |
Volume | 14 |
Number | 4 |
Pages | 12--24 |
ARTICLE | beauchamp93 [Bea93a] |
Author | |
Title | Unix Workstation Software for Analysis, Graphics, Modification, and Synthesis of Musical Sounds |
Journal | Proceedings of the Audio Engineering Society |
Year | 1993 |
INPROC. | beauchamp93-short [Bea93b] |
Author | |
Title | Unix Workstation Software for Analysis, Graphics, Modification, and Synthesis of Musical Sounds |
Booktitle | Proc. AES |
Year | 1993 |
BOOK | speechsyn96 [vSHOS96] |
Editor | |
Title | Progress in Speech Synthesis |
Publisher | Springer-Verlag |
Address | New York |
Year | 1996 |
Isbn | 0-387-94701-9 |
amazon-url | http://www.amazon.de/exec/obidos/ASIN/0387947019 |
Remarks | van Santen Author Links: http://www.bell-labs.com/project/tts/BOOK.html, Springer Heidelberg: http://www.springer.de/cgi-bin/search-book.pl?isbn=0-387-94701-9, Springer New-York: http://www.springer-ny.com/catalog/np/may96np/DATA/0-387-94701-9.html |
ARTICLE | psola92 [VMT92] |
Key | synthesis |
Author | |
Title | Voice transformation using PSOLA technique |
Journal | Speech Communication |
Year | 1992 |
Month | June |
Volume | 11 |
Number | 2-3 |
Pages | 189--194 |
BOOK | chomsky68sound [CH68] |
Author | |
Title | The Sound Pattern of English |
Publisher | Harper & Row |
Address | New York, NY |
Year | 1968 |
ARTICLE | bailly1991 [BLS91] |
Author | |
Title | Formant trajectories as audible gestures: an alternative for speech synthesis. |
Journal | Journal of Phonetics |
Year | 1991 |
Volume | 19 |
Pages | 9--23 |
INPROC. | soong88 [SR88] |
Author | |
Title | On the use of Instantaneous and Transitional Spectral Information in Speaker Recognition |
Booktitle | IEEE Transactions on Acoustics, Speech and Signal Processing |
Volume | 36 |
Year | 1988 |
Pages | 871--879 |
Keywords | derivative of cepstrum |
Remarks | cited in [MD97a] |
INPROC. | griffin88 [GL88] |
Author | |
Title | Multiband Excitation Vocoder |
Booktitle | IEEE Transactions on Acoustics, Speech and Signal Processing |
Volume | 36 |
Year | 1988 |
Pages | 1223--1235 |
Keywords | robust cepstrum by sinusoidal weighting |
Remarks | cited in [MD97a] |
INPROC. | allessandro95 [dM95] |
Author | |
Title | Automatic pitch contour stylization using a model of tonal perception |
Booktitle | Computer Speech and Language |
Year | 1995 |
Pages | 257--288 |
Keywords | perceptual stylization, based on a model of tonal perception |
Remarks | cited in [MD97a] |
INPROC. | traber92 [Tra92] |
Author | |
Title | F0 Generation with a Database of Natural F0 Patterns and with a Neural Network |
Booktitle | Talking Machines: Theories, Models, and Designs |
Editor | |
Publisher | North Holland |
Year | 1992 |
Pages | 287--304 |
Remarks | cited in [MD97a]: machine learning techniques: multilayer perceptrons |
INPROC. | sagisaka92 [SK92] |
Author | |
Title | Optimization of Intonation Control Using Statistical F0 Resetting Characteristics |
Booktitle | Proceedings of the International Conference on Acoustics, Speech, and Signal Processing |
Volume | 2 |
Year | 1992 |
Pages | 49--52 |
Remarks | cited in [MD97a]: machine learning techniques: linear regression |
INPROC. | hirschberg91 [Hir91] |
Author | |
Title | Using Text Analysis to Predict Intonational Boundaries |
Booktitle | Proceedings of Eurospeech |
Location | Genova |
Year | 1991 |
Pages | 1275--1278 |
INPROC. | moebius93 [MPH93] |
Author | |
Title | Analysis and Synthesis of German F0 Contours by Means of Fujisaki's Model |
Booktitle | Speech Communication |
Volume | 13 |
Year | 1993 |
Pages | 53--61 |
INPROC. | sagisaka88 [Sag88] |
Author | |
Title | Speech synthesis by rule using an optimal selection of non-uniform synthesis units |
Booktitle | Proc. of the Int'l Conf. on Acoustics, Speech, and Signal Processing |
Year | 1988 |
Pages | 679 |
Remarks | (origin of unit selection?), cited in [MCW98]: since the late 1980's, selection-based concatenative synthesis from large databases has received increased interest as a potential improvement upon fixed diphone inventories. TO BE FOUND |
INPROC. | wang93 [WCIS93] |
Author | |
Title | Tree-based unit selection for English speech synthesis |
Booktitle | Proc. of the Int'l Conf. on Acoustics, Speech, and Signal Processing |
Year | 1993 |
Pages | 191--194 |
Remarks | cited in [MCW98, CM98]: clustering and decision trees. TO BE FOUND |
INPROC. | nakajima94 [Nak94] |
Author | |
Title | Automatic synthesis unit generation for English speech synthesis based on multi-layered context oriented clustering |
Booktitle | Speech Communication |
Volume | 14 |
Month | September |
Year | 1994 |
Pages | 313 |
Remarks | cited in [MCW98, CM98]: clustering and decision trees. TO BE FOUND |
PHDTHESIS | donovan96 [Don96] |
Author | |
Title | Trainable Speech Synthesis |
Type | PhD thesis |
School | Cambridge University |
Year | 1996 |
Remarks | cited in [MCW98]: Mahalanobis distance |
INPROC. | huang96 [HAea96] |
Author | |
Title | Whistler: A trainable text-to-speech system |
Booktitle | Proc. of the Int'l Conf. on Spoken Language Processing |
Year | 1996 |
Pages | 2387--2390 |
Remarks | cited in [MCW98]: decision trees for speech synthesis |
INPROC. | karaali96 [KCG96] |
Author | |
Title | Speech Synthesis with Neural Networks |
Booktitle | Proc. of World Congress on Neural Networks |
Month | September |
Year | 1996 |
Pages | 45--50 |
Remarks | cited in [MCW98]: data driven direct mapping with NN |
INPROC. | tuerk93 [TR] |
Author | |
Title | Speech synthesis using artificial neural networks trained on cepstral coefficients |
Booktitle | Proc. EUROSPEECH |
Pages | 1713--1716 |
Remarks | cited in [MCW98]: data driven direct mapping with NN |
BOOK | quackenbush88 [QBC88] |
Author | |
Title | Objective Measures of Speech Quality |
Publisher | Prentice-Hall |
Address | Englewood Cliffs, NJ |
Year | 1988 |
Remarks | cited in [MCW98]: distance measures for coding |
INPROC. | nocerino85 [NSRK85] |
Author | |
Title | Comparative study of several distortion measures for speech recognition |
Booktitle | Speech Communication |
Volume | 4 |
Year | 1985 |
Pages | 317--331 |
Remarks | cited in [MCW98]: distance measures for ASR |
INPROC. | asp:icassp88 [HJ88] |
Author | |
Title | Optimization of perceptually-based ASR front-end |
Booktitle | Proceedings of the International Conference on Acoustics, Speech, and Signal Processing |
Year | 1988 |
Pages | 219 |
Remarks | cited in [MCW98]: distance measures for ASR |
INPROC. | ghitza97 [GS97] |
Author | |
Title | On the perceptual distance between two speech segments |
Booktitle | Journal of the Acoustical Society of America |
Year | 1997 |
Volume | 101 |
Pages | 522--529 |
Number | 1 |
Remarks | cited in [MCW98]: distance measures in general |
INPROC. | hansen98 [HC98] |
Author | |
Title | An auditory-based distortion measure with application to concatenative speech synthesis |
Booktitle | IEEE Trans. on Speech and Audio Processing |
Volume | 6 |
Month | September |
Year | 1998 |
Pages | 489--495 |
Remarks | cited in [MCW98]: distance measures for concatenative speech synthesis |
INPROC. | asp:itsa94 [HM94] |
Author | |
Title | RASTA processing of speech |
Booktitle | IEEE Transactions on Speech and Audio Processing |
Volume | 2 |
Month | October |
Year | 1994 |
Pages | 587--589 |
Remarks | cited in [MCW98] |
BOOK | edwards93 [Edw93] |
Author | |
Title | An Introduction to Linear Regression and Correlation |
Publisher | W. H. Freeman and Co |
Address | San Francisco |
Year | 1993 |
Remarks | cited in [MCW98]: Fisher transform |
INPROC. | Ding_OptiUnit_EURO97 [DC97] |
Author | |
Title | Optimising Unit Selection with Voice Source and Formants in the CHATR Speech Synthesis System |
Booktitle | Proc. Eurospeech '97 |
Address | Rhodes, Greece |
Month | September |
Year | 1997 |
Pages | 537--540 |
Remarks | TO BE FOUND! |
MASTER. | diemo98 [Sch98c] |
Author | |
Title | Spectral Envelopes in Sound Analysis and Synthesis |
Type | Diplomarbeit Nr. 1622 |
School | Universität Stuttgart, Fakultät Informatik |
Address | Stuttgart, Germany |
Month | June |
Year | 1998 |
url | http://www.ircam.fr/anasyn/schwarz/da/ |
official-url | http://www.informatik.uni-stuttgart.de/cgi-bin/ncstrl_rep_view.pl?/inf/ftp/pub/library/medoc.ustuttgart_fi/DIP-1622/DIP-1622.bib |
Abstract | In this project, Spectral Envelopes in Sound Analysis and Synthesis, various methods for estimation, representation, file storage, manipulation, and application of spectral envelopes to sound synthesis were evaluated, improved, and implemented. A prototyping and testing environment was developed, and a function library to handle spectral envelopes was designed and implemented. For the estimation of spectral envelopes, after defining the requirements, the methods LPC, cepstrum, and discrete cepstrum were examined, and also improvements of the discrete cepstrum method (regularization, stochastic (or probabilistic) smoothing, logarithmic frequency scaling, and adding control points). An evaluation with a large corpus of sound data showed the feasibility of discrete cepstrum spectral envelope estimation. After defining the requirements for the representation of spectral envelopes, filter coefficients, spectral representation, break-point functions, splines, formant representation, and high resolution matching pursuit were examined. A combined spectral representation with indication of the regions of formants (called fuzzy formants) was defined to allow for integration of spectral envelopes with precise formant descriptions. For file storage, new data types were defined for the Sound Description Interchange Format (SDIF) standard. Methods for manipulation were examined, especially interpolation between spectral envelopes, and between spectral envelopes and formants, and other manipulations, based on primitive operations on spectral envelopes. For sound synthesis, application of spectral envelopes to additive synthesis, and time-domain or frequency-domain filtering have been examined. For prototyping and testing of the algorithms, a spectral envelope viewing program was developed. Finally, the spectral envelope library, offering complete functionality of spectral envelope handling, was developed according to the principles of software engineering. |
MASTER. | diemo98-short [Sch98a] |
Author | |
Title | Spectral Envelopes in Sound Analysis and Synthesis |
Type | Diplomarbeit Nr. 1622 |
School | Universität Stuttgart, Fakultät Informatik |
Address | Stuttgart, Germany |
Year | 1998 |
MASTER. | diemo98-sshort [Sch98b] |
Author | |
Title | Spectral Envelopes in Sound Analysis and Synthesis |
Type | Diplomarbeit |
School | Universität Stuttgart, Informatik |
Year | 1998 |
BOOK | bookbeauchamp [Bea00] |
Editor | |
Title | The Sound of Music |
Publisher | Springer |
Address | New York |
Year | 2000 |
INBOOK | bookbeauchamp-specenv [RSb] |
Author | |
Title | Spectral Envelopes and Additive+Residual Analysis-Synthesis |
Note | In J. Beauchamp, ed. The Sound of Music. Springer, New York, to be published 2000 |
INBOOK | bookbeauchamp-specenv-short [RSa] |
Author | |
Title | Spectral Envelopes and Additive+Residual Analysis-Synthesis |
Note | In J. Beauchamp, ed. The Sound of Music. Springer, N.Y., to be published |
INPROC. | holmes83 [Hol83a] |
Author | |
Title | Formant synthesizers: Cascade or Parallel |
Booktitle | Speech Communication |
Year | 1983 |
Volume | 2 |
Pages | 251--273 |
INPROC. | holmes83-short [Hol83b] |
Author | |
Title | Formant synthesizers: Cascade or Parallel |
Booktitle | Speech Communication |
Volume | 2 |
Year | 1983 |
BOOK | hamming77 [Ham77b] |
Author | |
Title | Digital Filters |
Publisher | Prentice--Hall |
Series | Signal Processing Series |
Address | Englewood Cliffs |
Year | 1977 |
BOOK | hamming77-short [Ham77a] |
Author | |
Title | Digital Filters |
Publisher | Prentice--Hall |
Series | Signal Processing Series |
Year | 1977 |
INPROC. | fft-2 [FRD93a] |
Author | |
Title | Performance, Synthesis and Control of Additive Synthesis on a Desktop Computer Using FFT-1 |
Booktitle | Proceedings of the 19th International Computer Music Conference |
Address | Waseda University Center for Scholarly Information |
Year | 1993 |
Publisher | International Computer Music Association |
url | http://cnmat.CNMAT.Berkeley.EDU/~adrian/FFT-1/FFT-1_ICMC93.html |
INPROC. | fft-2-short [FRD93b] |
Author | |
Title | Performance, Synthesis and Control of Additive Synthesis on a Desktop Computer Using FFT-1 |
Booktitle | Proc. ICMC |
Year | 1993 |
INPROC. | fft-3 [SBHL97d] |
Author | |
Title | Integrating complementary spectral models in the design of a musical synthesizer |
Booktitle | Proceedings of the International Computer Music Conference |
Year | 1997 |
url | http://www.iua.upf.es/~xserra/articles/spectral-models/ |
INPROC. | fft-3-short [SBHL97a] |
Author | |
Title | Integrating Complementary Spectral Models in the Design of a Musical Synthesizer |
Booktitle | Proc. ICMC |
Year | 1997 |
PHDTHESIS | marine-thesis [Oud98b] |
Author | |
Title | Étude du modèle ``sinusoïdes et bruit'' pour le traitement de la parole. Estimation robuste de l'enveloppe spectrale |
Type | Thèse |
School | Ecole Nationale Supérieure des Télécommunications |
Address | Paris, France |
Month | November |
Year | 1998 |
PHDTHESIS | marine-thesis-short [Oud98a] |
Author | |
Title | Étude du modèle sinusoïdes et bruit pour le traitement de la parole. Estimation robuste de l'enveloppe spectrale |
Type | Thèse |
School | ENST |
Address | Paris |
Year | 1998 |
INPROC. | jmax99 [DCMS99] |
Author | |
Title | jMax Recent Developments |
Booktitle | Proceedings of the International Computer Music Conference |
Year | 1999 |
INPROC. | jmax99-short [DDMS99] |
Author | |
Title | jMax Recent Developments |
Booktitle | Proc. ICMC |
Year | 1999 |
INPROC. | jmax2000 [DSBO00b] |
Author | |
Title | The jMax Environment: An Overview of New Features |
Booktitle | Proceedings of the International Computer Music Conference |
Address | Berlin |
Year | 2000 |
INPROC. | jmax2000-short [DSBO00a] |
Author | |
Title | The jMax Environment: An Overview of New Features |
Booktitle | Proc. ICMC |
Address | Berlin |
Year | 2000 |
INPROC. | lemur95 [FHH95a] |
Author | |
Title | Lemur -- A Tool for Timbre Manipulation |
Booktitle | Proceedings of the International Computer Music Conference |
Pages | 158--161 |
Address | Banff |
Month | September |
Year | 1995 |
INPROC. | lemur95-short [FHH95b] |
Author | |
Title | Lemur -- A Tool for Timbre Manipulation |
Booktitle | Proc. ICMC |
Year | 1995 |
INPROC. | HRMP [GBM+96] |
Author | |
Title | Analysis of Sound Signals with High Resolution Matching Pursuit |
Booktitle | Proceedings of the IEEE Time--Frequency and Time--Scale Workshop (TFTS) |
Year | 1996 |
Note | www [AS00] |
url | http://www.ircam.fr/anasyn/listePublications/articlesRodet/TFTS96/tfts96.ps.gz |
INPROC. | HRMP2 [GDR+96] |
Author | |
Title | Sound Signal Decomposition using a High Resolution Matching Pursuit |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Location | Clear Water Bay, Hong-Kong |
Month | August |
Year | 1996 |
Note | www [AS00] |
abstract-url | http://www.ircam.fr/anasyn/listePublications/articlesRodet/ICMC96HRMP/abstract.txt |
url | http://www.ircam.fr/anasyn/listePublications/articlesRodet/ICMC96HRMP/ICMC96HRMP.ps.gz |
ARTICLE | fof [Rod84b] |
Author | |
Title | Time-Domain Formant-Wave-Function Synthesis |
Journal | Computer Music Journal |
Volume | 8 |
Number | 3 |
Month | Fall |
Year | 1984 |
Pages | 9--14 |
Note | reprinted from [Sim80] |
ARTICLE | fof-short [Rod84a] |
Author | |
Title | Time-Domain Formant-Wave-Function Synthesis |
Journal | Computer Music Journal |
Month | Fall |
Year | 1984 |
BOOK | fof2 [Sim80] |
Editor | |
Title | Spoken Language Generation and Understanding |
Publisher | D. Reidel Publishing Company |
Address | Dordrecht, Holland |
Year | 1980 |
ARTICLE | chant [RPB84b] |
Author | |
Title | The Chant--Project: From the Synthesis of the Singing Voice to Synthesis in General |
Journal | Computer Music Journal |
Volume | 8 |
Number | 3 |
Month | Fall |
Year | 1984 |
Pages | 15--31 |
ARTICLE | chant-short [RPB84a] |
Author | |
Title | The Chant--Project: From the Synthesis of the Singing Voice to Synthesis in General |
Journal | Computer Music Journal |
Month | Fall |
Year | 1984 |
ARTICLE | chant2 [RPB85] |
Author | |
Title | CHANT: de la synthèse de la voix chantée à la synthèse en général |
Journal | Rapports de recherche IRCAM |
Address | Paris |
Year | 1985 |
Note | Available online |
MANUAL | chant-manual [Vir97] |
Author | |
Title | La Librairie CHANT: Manuel d'utilisation des fonctions en C |
Month | April |
Year | 1997 |
Note | Available online |
INPROC. | dcep1 [GR90] |
Author | |
Title | An Improved Cepstral Method for Deconvolution of Source--Filter Systems with Discrete Spectra: Application to Musical Sound Signals |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Address | Glasgow |
Month | September |
Year | 1990 |
Notes | dcep with cloud, some pictures, middle (3 pages) |
INPROC. | dcep2 [GR91b] |
Author | |
Title | Generalized Discrete Cepstral Analysis for Deconvolution of Source--Filter Systems with Discrete Spectra |
Booktitle | IEEE Workshop on Applications of Signal Processing to Audio and Acoustics |
Address | New Paltz, New York |
Month | October |
Year | 1991 |
Notes | dcep with cloud, no pictures, short (2 pages) |
INPROC. | dcep3 [GR91c] |
Author | |
Title | Generalized Functional Approximation for Source--Filter System Modeling |
Booktitle | Proc. Eurospeech |
Address | Geneva |
Year | 1991 |
Pages | 1085--1088 |
Notes | power spectrum modeling, all pole, dcep with cloud, log frequency, many pictures |
INPROC. | dcep3-short [GR91a] |
Author | |
Title | Generalized Functional Approximation for Source--Filter System Modeling |
Booktitle | Proc. Eurospeech |
Year | 1991 |
INPROC. | marine1 [OCM97] |
Author | |
Title | Robust Estimation of the Spectral Envelope for ``Harmonics+Noise'' Models |
Booktitle | IEEE Workshop on Speech coding |
Address | Pocono Manor |
Month | September |
Year | 1997 |
INPROC. | marine97 [COM97] |
Author | |
Title | Spectral Envelope Estimation using a Penalized Likelihood Criterion |
Booktitle | IEEE ASSP Workshop on App. of Sig. Proc. to Audio and Acoust. |
Address | Mohonk |
Month | October |
Year | 1997 |
ARTICLE | dcep-reg [CM96] |
Author | |
Title | Regularization Techniques for Discrete Cepstrum Estimation |
Journal | IEEE Signal Processing Letters |
Volume | 3 |
Number | 4 |
Pages | 100--102 |
Month | April |
Year | 1996 |
INPROC. | xspect [RFL96] |
Author | |
Title | Xspect: a New Motif Signal Visualisation, Analysis and Editing Program |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Location | Hong Kong |
Month | August |
Year | 1996 |
Note | Available online |
MANUAL | xspect-manual [RF96] |
Author | |
Title | XSPECT: Introduction |
Month | January |
Year | 1996 |
Note | Available online |
INPROC. | hmm [DGR93a] |
Author | |
Title | Tracking of Partials for Additive Sound Synthesis Using Hidden Markov Models |
Note | Abstract available online |
Pages | 225--228 |
Booktitle | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) |
Year | 1993 |
Month | April |
INPROC. | hmm-short [DGR93b] |
Author | |
Title | Tracking of Partials for Additive Sound Synthesis Using Hidden Markov Models |
Pages | 225--228 |
Booktitle | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) |
Year | 1993 |
INPROC. | additive [Rod97b] |
Author | |
Title | Musical Sound Signals Analysis/Synthesis: Sinusoidal+Residual and Elementary Waveform Models |
Booktitle | Proceedings of the IEEE Time--Frequency and Time--Scale Workshop (TFTS) |
Month | August |
Year | 1997 |
Note | Abstract and PostScript available online: www.ircam.fr/anasyn/listePublications/articlesRodet/TFTS97/TFTS97.ps.gz |
INPROC. | additive-short [Rod97a] |
Author | |
Title | Musical Sound Signals Analysis/Synthesis: Sinusoidal+Residual and Elementary Waveform Models |
Booktitle | Proc. IEEE Time--Frequency/Time--Scale Workshop |
Year | 1997 |
MANUAL | additive-manual [Rod97c] |
Author | |
Title | The Additive Analysis--Synthesis Package |
Year | 1997 |
Note | Available online |
INPROC. | diphones [RL97b] |
Author | |
Title | The Diphone Program: New Features, new Synthesis Methods and Experience of Musical Use |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Month | September |
Year | 1997 |
Address | Thessaloniki, Greece |
Note | Abstract and PostScript available online: www.ircam.fr/anasyn/listePublications/articlesRodet/ICMC97/ICMC97Diphone.ps.gz |
INPROC. | diphones-nourl [RL97c] |
Author | |
Title | The Diphone Program: New Features, new Synthesis Methods and Experience of Musical Use |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Month | September |
Year | 1997 |
Address | Thessaloniki, Greece |
abstract-url | http://www.ircam.fr/anasyn/listePublications/articlesRodet/ICMC97/ICMC97DiphoneAbstract.html |
postscript-url | http://www.ircam.fr/anasyn/listePublications/articlesRodet/ICMC97/ICMC97Diphone.ps.gz |
INPROC. | diphones-short [RL97a] |
Author | |
Title | The Diphone Program: New Features, new Synthesis Methods and Experience of Musical Use |
Booktitle | Proc. ICMC |
Address | Thessaloniki |
Year | 1997 |
INPROC. | fft-1 [RD92] |
Author | |
Title | A new additive synthesis method using inverse Fourier transform and spectral envelopes |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Month | October |
Year | 1992 |
MANUAL | sdif-manual [Vir98] |
Author | |
Title | Sound Description Interchange Format (SDIF) |
Month | January |
Year | 1998 |
Note | Available online |
INPROC. | fts [DDPZ94] |
Author | |
Title | The IRCAM ``Real-Time Platform'': Evolution and Perspectives |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Location | Aarhus, Denmark |
Year | 1994 |
Note | Available online |
ARTICLE | fts-basics [Puc91b] |
Author | |
Title | FTS: A Real-Time Monitor for Multiprocessor Music Synthesis |
Journal | Computer Music Journal |
Volume | 15 |
Number | 3 |
Pages | 58--67 |
Month | Winter |
Year | 1991 |
Note | Available online |
ARTICLE | max [Puc91a] |
Author | |
Title | Combining Event and Signal Processing in the MAX Graphical Programming Environment |
Journal | Computer Music Journal |
Volume | 15 |
Number | 3 |
Pages | 68--77 |
Month | Winter |
Year | 1991 |
Note | Available online |
INPROC. | specenv-rod [RDP87b] |
Author | |
Title | Speech Analysis and Synthesis Methods Based on Spectral Envelopes and Voiced/Unvoiced Functions |
Booktitle | European Conference on Speech Tech. |
Location | Edinburgh |
Month | September |
Year | 1987 |
INPROC. | specenv-rod-short [RDP87a] |
Author | |
Title | Speech Analysis and Synthesis Methods Based on Spectral Envelopes and Voiced/Unvoiced Functions |
Booktitle | European Conf. on Speech Tech. |
Location | Edinburgh |
Year | 1987 |
INPROC. | control [FRD92b] |
Author | |
Title | Synthesis and Control of Hundreds of Sinusoidal Partials on a Desktop Computer without Custom Hardware |
Booktitle | ICSPAT |
Location | San José |
Year | 1992 |
Note | Available online |
Notes | fft-1, fm, se better than BPF |
INPROC. | control-short [FRD92a] |
Author | |
Title | Synthesis and Control of Hundreds of Sinusoidal Partials on a Desktop Computer without Custom Hardware |
Booktitle | ICSPAT |
Year | 1992 |
INPROC. | newposs [RDG95] |
Author | |
Title | New Possibilities in Sound Analysis and Synthesis |
Booktitle | ISMA |
Location | Dourdan |
Year | 1995 |
Note | Available online (PostScript) |
Notes | fft-1 + se, phys. models, ana/syn overview, farinelli |
INPROC. | farinelli [DGR94] |
Author | |
Title | A Virtual Castrato (!?) |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Location | Aarhus, Denmark |
Year | 1994 |
Note | Available online |
MANUAL | udi [WRD92] |
Author | |
Title | UDI 2.1---A Unified DSP Interface |
Year | 1992 |
Note | Available online |
MANUAL | pm [Gar94] |
Author | |
Title | Pm: A library for additive analysis/transformation/synthesis |
Month | July |
Year | 1994 |
Note | Available online |
INPROC. | escher [WSR98] |
Author | |
Title | ESCHER---Modeling and Performing composed Instruments in real-time |
Booktitle | IEEE Systems, Man, and Cybernetics Conference |
Location | San Diego |
Month | October |
Year | 1998 |
Note | To be published |
BOOK | nat [Hen98] |
Author | |
Title | Synthèse de la voix chantée par règles |
Month | July |
Year | 1998 |
Publisher | IRCAM |
Address | Paris, France |
Note | Rapport de stage D.E.A. Acoustique, Traitement de Signal et Informatique Appliqués à la Musique |
MISC | z [Mel97] |
Author | |
Title | The Z--Transform |
Note | Online tutorial |
Year | 1997 |
BOOK | dsp [OS75] |
Author | |
Title | Digital Signal Processing |
Year | 1975 |
Publisher | Prentice--Hall |
INBOOK | dspapp [Opp78] |
Editor | |
Chapter | Digital Processing of Speech |
Title | Applications of Digital Signal Processing |
Pages | 117--168 |
Year | 1978 |
Publisher | Prentice--Hall |
BOOK | dsp-intro [RH91] |
Author | |
Title | Signals and Systems for Speech and Hearing |
Year | 1991 |
Publisher | Academic Press |
Address | London |
BOOK | roads [Roa96] |
Author | |
Title | The Computer Music Tutorial |
Year | 1996 |
Publisher | MIT Press |
BOOK | grey80 [MG80] |
Author | |
Title | Linear Prediction of Speech |
Publisher | Springer |
Year | 1980 |
INPROC. | toeplitz [MP82] |
Author | |
Title | Efficient Solution of a Toeplitz--plus--Hankel Coefficient Matrix System of Equations |
Booktitle | IEEE TASSP |
Volume | 30 |
Pages | 40--44 |
Month | February |
Year | 1982 |
BOOK | psycho [Zwi82] |
Author | |
Title | Psychoakustik |
Year | 1982 |
Publisher | Springer |
INPROC. | splinelpc [TAW97] |
Author | |
Title | Enhanced Modeling of Discrete Spectral Amplitudes |
Booktitle | IEEE Workshop on Speech coding |
Address | Pocono Manor |
Month | September |
Year | 1997 |
INCOLL. | ICS94 [vS94] |
Author | |
Title | Peak-insensitive non-parametric spectrum estimation |
Booktitle | Journal of Time Series Analysis |
Year | 1994 |
Volume | 15 |
Number | 4 |
Pages | 429--452 |
ARTICLE | additive-idea [RM69] |
Author | |
Title | Analysis of musical-instrument tones |
Journal | Physics Today |
Volume | 22 |
Number | 2 |
Pages | 23--30 |
Month | February |
Year | 1969 |
INPROC. | splines [UAE93] |
Author | |
Title | B--Spline Signal Processing: Part I---Theory |
Volume | 41 |
Optnumber | 2 |
Pages | 821--833 |
Booktitle | IEEE Transactions on Signal Processing |
Year | 1993 |
MISC | speechana [Rob98] |
Author | |
Title | Speech Analysis |
Note | Online tutorial |
Year | 1998 |
ARTICLE | MultiscaleEdges [MZ92] |
Author | |
Title | Characterization of Signals from Multiscale Edges |
Journal | IEEE Trans. Pattern Anal. Machine Intell. |
Year | 1992 |
Volume | 40 |
Number | 7 |
Pages | 2464--2482 |
Month | July |
ARTICLE | Ridges [DEG+92] |
Author | |
Title | Asymptotic Wavelet and Gabor Analysis: Extraction of Instantaneous Frequency |
Year | 1992 |
Volume | 38 |
Number | 2 |
Pages | 644--664 |
Month | March |
ARTICLE | Ridges2 [GKM96] |
Author | |
Title | Characterization of Acoustic Signals Through Continuous Linear Time--Frequency Representations |
Year | 1996 |
Volume | 84 |
Number | 4 |
Pages | 561--585 |
Month | April |
BOOK | mallat [Mal97] |
Author | |
Title | A Wavelet Tour of Signal Processing |
Publisher | AP Professional |
Address | London |
Year | 1997 |
BOOK | chan [Cha95] |
Author | |
Title | Wavelet Basics |
Publisher | Kluwer Academic Publ. |
Address | Boston |
Year | 1995 |
BOOK | wavelets [Hub97] |
Author | |
Title | The World According to Wavelets: The Story of a Mathematical Technique in the Making |
Publisher | A K Peters Ltd |
Year | 1997 |
INBOOK | IBspline [AE] |
Author | |
Title | Wavelet analysis and its applications |
Chapter | Polynomial Spline and Wavelets |
Publisher | ??? |
Year | ??? |
Volume | 2 |
BOOK | instrument-character [vH54] |
Author | |
Title | On the Sensations of Tone as a Physiological Basis for the Theory of Music |
Publisher | Dover |
Address | New York |
Year | 1954 |
Note | Original title: [vH13] |
BOOK | helmholtz [vH13] |
Author | |
Title | Die Lehre von den Tonempfindungen: als physiologische Grundlage für die Theorie der Musik |
Publisher | Vieweg |
Address | Braunschweig |
Edition | 6th |
Year | 1913 |
BOOK | helmholtz-reprint [vH83] |
Author | |
Title | Die Lehre von den Tonempfindungen: als physiologische Grundlage für die Theorie der Musik |
Publisher | Georg Olms Verlag |
Address | Hildesheim |
Year | 1983 |
BOOK | clark-yallop [CY96] |
Author | |
Title | An Introduction to Phonetics and Phonology |
Publisher | Blackwell |
Address | Oxford |
Year | 1996 |
ARTICLE | prosody-tilt [Dog95] |
Author | |
Title | Phonetic Correlates of Word Stress |
Journal | AIMS Phonetik (Working Papers of the Department of Natural Language Processing) |
Volume | 2 |
Number | 2 |
Publisher | Institut für Maschinelle Sprachverarbeitung |
Location | Stuttgart, Germany |
Address | Stuttgart, Germany |
Year | 1995 |
Note | Contents available online |
BOOK | jackson1 [Jac95a] |
Author | |
Title | Software requirements & specifications: a lexicon of practice, principles, and prejudices |
Publisher | Addison--Wesley |
Address | Wokingham |
Year | 1995 |
BOOK | jackson2 [Jac83] |
Author | |
Title | System development |
Publisher | Prentice--Hall Intern. |
Address | Englewood Cliffs |
Year | 1983 |
Series | Prentice--Hall International series in computer science |
BOOK | nagl [Nag90] |
Author | |
Title | Softwaretechnik: methodisches Programmieren im Großen |
Publisher | Springer |
Address | Berlin |
Year | 1990 |
Series | Springer compass |
BOOK | sommerville [Som85] |
Author | |
Title | Software engineering |
Edition | 2nd |
Publisher | Addison--Wesley |
Address | Wokingham |
Year | 1985 |
Series | International computer science series |
BOOK | iau [Utt93] |
Author | |
Title | Lecture Notes in Object-Oriented Software Engineering |
Publisher | University of Kent at Canterbury |
Address | Canterbury, UK |
Year | 1993 |
ARTICLE | battiti94 [Bat94] |
Author | |
Title | Using the mutual information for selecting features in supervised neural net learning |
Journal | IEEE Transactions on Neural Networks |
Volume | 5 |
Number | 4 |
Pages | 537--550 |
Year | 1994 |
url | http://rtm.science.unitn.it/~battiti/battiti-publications.html |
BOOK | cart84 [BFOS84a] |
Author | |
Title | Classification and Regression Trees |
Publisher | Wadsworth and Brooks |
Address | Monterey, CA |
Year | 1984 |
Note | new edition [B+84]? |
Remarks | cited in [MCW98, CM98, BT97b] for CART, clustering, and decision trees |
BOOK | cart84-2 [BFOS84b] |
Author | |
Title | Classification and Regression Trees |
Year | 1984 |
Publisher | Wadsworth Publishing Company |
Address | Belmont, California, U.S.A. |
Series | Statistics/Probability Series |
Isbn-hard | 0534980538 (hardcover) |
Isbn-soft | 0534980546 (softcover) |
BOOK | cart93 [B+84] |
Author | |
Title | Classification and Regression Trees |
Publisher | Chapman & Hall |
Address | New York |
Year | 1984 |
Pages | 358 |
Note | new edition of [BFOS84a]? |
Isbn | 0-412-04841-8 |
url | http://www.crcpress.com/catalog/C4841.htm |
amazon-url | http://www.amazon.de/exec/obidos/ASIN/0412048418 |
Price | $44.95, DM 83.26, EUR 42.57 |
Remarks | TO BE FOUND |
ARTICLE | dubnov95 [DTC] |
Author | |
Title | Hearing Beyond the Spectrum |
Journal | Journal of New Music Research |
Volume | 24 |
Number | 4 |
pub-url | http://www.swets.nl/jnmr/vol24_4.html#dubnov24.4 |
Remarks | features: harmonicity, phase coherence, chorus. bispectral information. acoustic distortion (distance) measure (``concept of statistical divergence which is used for measuring the `similarity' between signals'', ``similarity classes with a good correspondence to the human acoustic perception'', ``generalization of acoustic distortion measure''). TO BE FOUND |
Abstract | In this work we focus on the problem of acoustic signals modeling and analysis, with particular interest in models that can capture the timbre of musical sounds. Traditional methods usually relate to several ``dimensions'' which represent the spectral properties of the signal and their change in time. Here we confine ourselves to the stationary portion of the sound signal, the analysis of which is generalized by incorporating polyspectral techniques. We suggest that by looking at the higher order statistics of the signal we obtain additional information not present in the standard autocorrelation or its Fourier related power-spectra. It is shown that over the bispectral plane several acoustically meaningful measures could be devised, which are sensitive to properties such as harmonicity and phase coherence among the harmonics. Effects such as reverberation and chorusing are demonstrated to be clearly detected by the above measures. In the second part of the paper we perform an information theoretic analysis of the spectral and bispectral planes. We introduce the concept of statistical divergence which is used for measuring the ``similarity'' between signals. A comparative matrix is presented which shows the similarity measure between several instruments based on spectral and bispectral information. The instruments group into similarity classes with a good correspondence to the human acoustic perception. The last part of the paper is devoted to acoustical modelling of the above phenomena. We suggest a simple model which accounts for some of the polyspectral aspects of musical sound discussed above. One of the main results of our work is generalization of acoustic distortion measure based on our model and which takes into account higher order statistical properties of the signal. |
INPROC. | dubnov97 [DR97] |
Author | |
Title | Statistical Modeling of Sound Aperiodicities |
Booktitle | Proceedings of the International Computer Music Conference (ICMC) |
Month | September |
Year | 1997 |
Address | Tessaloniki, Greece |
url | http://www.ircam.fr/equipes/analyse-synthese/listePublications/articlesDubnov |
PHDTHESIS | rochebois97 [Roc97] |
Author | |
Title | Méthodes d'analyse synthèse et représentations optimales des sons musicaux basées sur la réduction de données spectrales |
Month | December |
Year | 1997 |
School | Université Paris XI |
url | http://www.ief.u-psud.fr/~thierry/these/ |
Remarks | Principal components analysis of harmonic partials, gives sub-spaces as linear combinations of partials, i.e. timbral components. |
Abstract | The analysis and synthesis of sounds, and of musical sounds in particular, has already been the subject of much research. For the most part, this research has pursued two objectives: studying musical sounds and synthesizing them. These two objectives are entirely compatible and complementary. The subject of this thesis is a method for the analysis and synthesis of musical sounds based on the reduction of spectral data. Such a method yields a representation of musical sounds that is optimal in the sense of variance. This representation is both a powerful tool for the study of musical timbre and the basis of an efficient form of synthesis. |
BOOK | fukunaga90 [Fuk90] |
Author | |
Title | Introduction to Statistical Pattern Recognition |
Publisher | Academic Press |
Edition | 2 |
Year | 1990 |
Remarks | cited in [CM98] for CART tree evaluation criterion. TO BE FOUND |
INPROC. | nock97 [NGY97] |
Author | |
Title | A Comparative Study of Methods for Phonetic Decision-Tree State Clustering |
Booktitle | Proc. Eurospeech '97 |
Volume | 1 |
Address | Rhodes, Greece |
Month | September |
Year | 1997 |
Pages | 111--114 |
Remarks | cited in [MCW98] for decision trees for speech recognition, [CM98] for CART tree evaluation criterion. TO BE FOUND |
MISC | tcts:www [TCTS99] |
Key | TCTS |
Title | TCTS (Circuit Theory and Signal Processing) Lab, Faculté Polytechnique de Mons |
Howpublished | WWW page |
Year | 1999 |
url | http://tcts.fpms.ac.be |
group-url | http://tcts.fpms.ac.be/synthesis/synthesis.html |
pub-url | http://tcts.fpms.ac.be/publications.html |
Note | http://tcts.fpms.ac.be |
INPROC. | tcts:euspico98 [DMD98] |
Author | |
Title | Comparison of two different alignment systems: speech synthesis vs. hybrid HMM/ANN |
Booktitle | Proc. European Conference on Signal Processing (EUSIPCO'98) |
Address | Greece |
Year | 1998 |
Pages | 1161--1164 |
Note | www [TCTS99], same content as [MDD98] (but less references) |
url | http://tcts.fpms.ac.be/publications/papers/1998/eusipco98_odfmtd.zip |
Abstract | In this paper we compared two different methods for phonetically labeling a French database. The first one is based on the temporal alignment of the speech signal on a high quality synthetic speech pattern and the second one uses a hybrid HMM/ANN system. Both systems have been evaluated on French read utterances from a single speaker never seen in the training stage of the HMM/ANN system and manually segmented. This study outlines the advantages and drawbacks of both methods. The high-quality speech synthesis system has the great advantage that no training stage (hence no labeled database) is needed, while the classical HMM/ANN system easily allows multiple phonetic transcriptions (phonetic lattice). We deduce a method for the automatic constitution of large phonetically and prosodically labeled speech databases based on using the synthetic speech segmentation tool in order to bootstrap the training process of our hybrid HMM/ANN system. The importance of such segmentation tools will be a key point for the development of improved speech synthesis and recognition systems. All the experiments reported in this article related to the hybrid HMM/ANN system have been realized with the STRUT [3] software. |
INPROC. | tcts:tsd98 [DMP+98] |
Title | EULER: Multi-Lingual Text-to-Speech Project |
Pages | 27--32 |
Author | |
Booktitle | Proceedings of the First Workshop on Text, Speech, Dialogue --- TSD'98 |
Year | 1998 |
Editor | |
Address | Brno, Czech Republic |
Month | September |
Publisher | Masaryk University Press |
Note | www [TCTS99]. Electronic version: tcts/tsd98tdfmvppmmbarag.ps.* |
Remarks | modularity |
Abstract | Text-to-speech systems require simultaneously an abstract linguistic analysis, an acoustic linguistic analysis and a final digital processing stage. The aim of the project presented in this paper is to obtain a set of text-to-speech synthesizers for as many voices, languages and dialects as possible, free of use for non-commercial and non-military applications. This project is an extension of the MBROLA project. MBROLA is a speech synthesizer that is freely distributed for non-commercial purposes. A multi-lingual speech segmentation and prosody transplantation tool called MBROLIGN has also been developed and freely distributed. Other labs have also recently distributed for free important tools for speech synthesis, like Festival from the University of Edinburgh or the MULTEXT project of the Université de Provence. The purpose of this paper is to present the EULER project, which will try to integrate all these results, to Eastern European potential partners, so as to increase the dissemination of the important results of the MBROLA and MBROLIGN projects and stimulate East/West collaboration on TTS synthesis. |
INPROC. | tcts:icslp98-fmodtd [MDD98] |
Author | |
Title | Phonetic Alignment: Speech Synthesis Based vs. Hybrid HMM/ANN
Booktitle | Proc. International Conference on Speech and Language Processing |
Address | Sydney, Australia
Year | 1998 |
Pages | 1571--1574 |
Note | www [TCTS99], same content as [DMD98] (with more references) |
url | http://tcts.fpms.ac.be/publications/papers/1998/icslp98_fmodtd.zip |
Abstract | In this paper we compare two different methods for phonetically labeling a speech database. The first approach is based on the alignment of the speech signal on a high-quality synthetic speech pattern, and the second uses a hybrid HMM/ANN system. Both systems have been evaluated on French read utterances from a speaker never seen in the training stage of the HMM/ANN system and manually segmented. This study outlines the advantages and drawbacks of both methods. The high-quality speech synthesis system has the great advantage that no training stage is needed, while the classical HMM/ANN system easily allows multiple phonetic transcriptions. We deduce a method for the automatic constitution of phonetically labeled speech databases, based on using the synthetic speech segmentation tool to bootstrap the training process of our hybrid HMM/ANN system. Such segmentation tools will be a key point for the development of improved speech synthesis and recognition systems. |
INPROC. | tcts:iscas97 [MD97a] |
Author | |
Title | Speech Synthesis for Text-To-Speech Alignment and Prosodic Feature Extraction |
Booktitle | Proc. ISCAS 97 |
Address | Hong-Kong |
Year | 1997 |
Pages | 2637--2640 |
Note | www [TCTS99] |
url | http://tcts.fpms.ac.be/publications/papers/1997/iscas97_fmtd.zip |
Remarks | Recent developments in prosody generation have highlighted the potential interest of machine learning techniques such as multilayer perceptrons [Tra92], linear regression techniques [SK92], classification and regression trees [Hir91], or statistical techniques [MPH93], based on the automatic analysis of large prosodically labeled corpora. Only the segmental features of the reference signal are used in alignment. Assumption: the segmental and suprasegmental features are approximately uncorrelated. Keep only the perceptually relevant F0 cues, perceptual stylization, based on a model of tonal perception [alessandro95]. Robust cepstrum by sinusoidal weighting [GL88]. Derivative of cepstrum [SR88]. |
Abstract | The aim of this paper is to present a new and promising approach to the text-to-speech alignment problem. For this purpose, an original idea is developed: a high-quality digital speech synthesizer is used to create a reference speech pattern for the alignment process. The system has been used and tested to extract the prosodic features of read French utterances. The results show a segmentation error rate of about 8%. This system will be a powerful tool for the automatic creation of large prosodically labeled databases and for research on automatic prosody generation. |
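The alignment idea above matches a natural utterance against a synthetic reference pattern. The generic machinery behind such matching is dynamic time warping; below is a plain DTW sketch in Python/NumPy (the function name and the squared-Euclidean frame distance are our assumptions, not the paper's exact procedure):

```python
import numpy as np

def dtw_align(ref, obs):
    """Dynamic time warping between a reference feature sequence (e.g. frames of
    a synthetic utterance) and an observed one; returns the warping path as
    (ref_index, obs_index) pairs."""
    n, m = len(ref), len(obs)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Squared-Euclidean local distance between frames
            d = float(np.sum((ref[i - 1] - obs[j - 1]) ** 2))
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        if i == 1 and j == 1:
            break
        _, i, j = min((D[i - 1, j - 1], i - 1, j - 1),
                      (D[i - 1, j], i - 1, j),
                      (D[i, j - 1], i, j - 1))
    return path[::-1]
```

Phone boundaries known in the synthetic reference can then be projected onto the natural signal through the returned path.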
INPROC. | tcts:eurosp97 [SDS97] |
Author | |
Title | Diphone Concatenation Using a Harmonic Plus Noise Model of Speech |
Booktitle | Proc. Eurospeech '97 |
Address | Rhodes, Greece |
Month | September |
Year | 1997 |
Pages | 613--616 |
Note | www [TCTS99]; electronic version: tcts/hnmconc.ps.*
Remarks | Important! HNM (Marine) basis paper, pitch synchronous. Diphone smoothing in region of quasi-stationarity. Additive better for concatenation than PSOLA. References: [DG96] (non pitch-synchronous hybrid harmonic/stochastic synthesis, real-time generation of signals from spectral representation), [SLM95] (phase treatment, modifications), [Mac96] (non pitch synchronous harmonic modeling). |
Abstract | In this paper we present a high-quality text-to-speech system using diphones. The system is based on a Harmonic plus Noise Model (HNM) representation of the speech signal. HNM is a pitch-synchronous analysis-synthesis system, but it does not require the pitch marks that PSOLA-based methods need. HNM assumes the speech signal to be composed of a periodic part and a stochastic part. As a result, different prosody and spectral envelope modification methods can be applied to each part, yielding more natural-sounding synthetic speech. The fully parametric representation of speech using HNM also provides a straightforward way of smoothing diphone boundaries. Informal listening tests, using natural prosody, have shown that the synthetic speech quality is close to the quality of the original sentences, without smoothing problems and without the buzziness or other oddities observed with other speech representations used for TTS. |
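The HNM abstract above rests on splitting each frame into a periodic part (harmonics of f0) plus a residual noise part. The following is a toy least-squares illustration of that decomposition (the function name, the fixed-length framing and the harmonic-count heuristic are ours; the actual HNM of [SDS97] is pitch-synchronous and considerably more elaborate):

```python
import numpy as np

def harmonic_noise_split(frame, f0, sr, n_harm=None):
    """Least-squares fit of sinusoids at multiples of f0 to one frame;
    the residual is taken as the noise part. Toy illustration only."""
    n = len(frame)
    t = np.arange(n) / sr
    if n_harm is None:
        n_harm = int((sr / 2) // f0)  # harmonics up to Nyquist
    # Design matrix: one cosine/sine pair per harmonic
    cols = []
    for k in range(1, n_harm + 1):
        cols.append(np.cos(2 * np.pi * k * f0 * t))
        cols.append(np.sin(2 * np.pi * k * f0 * t))
    A = np.stack(cols, axis=1)
    coef, *_ = np.linalg.lstsq(A, frame, rcond=None)
    harmonic = A @ coef
    noise = frame - harmonic
    return harmonic, noise
```

For a purely harmonic frame the residual is essentially zero; for voiced speech it captures the stochastic component that HNM models separately.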
INPROC. | tcts:speechcomm96 [DG96] |
Author | |
Title | On the use of a hybrid harmonic/stochastic model for TTS synthesis by concatenation
Booktitle | Speech Communication |
Number | 19 |
Pages | 119--143 |
Year | 1996 |
Remarks | Cited in [SDS97] for non pitch-synchronous hybrid harmonic/stochastic synthesis, real-time generation of signals from spectral representation. TO BE FOUND |
PHDTHESIS | macon-thesis96 [Mac96]
Author | |
Title | Speech Synthesis Based on Sinusoidal Modeling |
Type | Ph.D. Dissertation
School | Georgia Institute of Technology
Month | October |
Year | 1996 |
Remarks | Cited in [SDS97] for non pitch synchronous harmonic modeling. TO BE FOUND |
INPROC. | stylianou:eurospeech95 [SLM95] |
Author | |
Title | High Quality Speech Modification based on a Harmonic+Noise Model |
Booktitle | Proc. EUROSPEECH |
Year | 1995 |
Remarks | Cited in [SDS97] for phase treatment, modifications, maximum voice frequency. TO BE FOUND |
INPROC. | Malfrere_HighQual_EURO97 [MD97b] |
Author | |
Title | High Quality Speech Synthesis for Phonetic Speech Segmentation |
Booktitle | Proc. Eurospeech '97 |
Address | Rhodes, Greece |
Month | September |
Year | 1997 |
Pages | 2631--2634 |
INPROC. | Olivier_SimpAnd_EURO97 [vdVOPD+97] |
Author | |
Title | A Simple and Efficient Algorithm for the Compression of MBROLA Segment Databases |
Booktitle | Proc. Eurospeech '97 |
Address | Rhodes, Greece |
Month | September |
Year | 1997 |
Pages | 421--424 |
INPROC. | Dutoit_TheMbro_ICSLP96 [DPP+96] |
Author | |
Title | The MBROLA project: Towards a Set of High Quality Speech Synthesizers Free of Use for Non Commercial Purposes |
Booktitle | Proc. ICSLP '96 |
Address | Philadelphia, PA |
Month | October |
Year | 1996 |
Volume | 3 |
Pages | 1393--1396 |
INPROC. | Dutoit_HighQual_ICASSP94 [Dut94] |
Author | |
Title | High Quality Text-to-Speech Synthesis: a Comparison of four Candidate Algorithms |
Booktitle | Proc. ICASSP '94 |
Address | Adelaide, Australia
Month | April |
Year | 1994 |
Pages | I--565--I--568 |
MISC | MPEG7:www [MPEG99] |
Key | MPEG |
Title | MPEG-7 ``Multimedia Content Description Interface'' Documentation |
Howpublished | WWW page |
Year | 1999 |
url | http://www.darmstadt.gmd.de/mobile/MPEG7 |
Note | http://www.darmstadt.gmd.de/mobile/MPEG7 |
Abstract | More and more audio-visual information is available in digital form, in various places around the world. Along with the information appear people who want to use it. Before anyone can use information, however, it has to be located first. At the same time, the increasing availability of potentially interesting material makes this search harder. The question of finding content is not restricted to database retrieval applications; similar questions exist in other areas as well. For instance, there is an increasing number of (digital) broadcast channels available, and this makes it harder to select the broadcast channel (radio or TV) that is potentially interesting. In October 1996, MPEG (Moving Picture Experts Group) started a new work item to provide a solution to the pressing problem of generally recognised descriptions for audio-visual content, which extend the limited capabilities of the proprietary solutions for identifying content that exist today. The new member of the MPEG family is called ``Multimedia Content Description Interface'', or in short MPEG-7. The associated pages presented in the navigation tool shall provide you with the necessary information to learn more about MPEG-7. As MPEG in general is a dynamic and fast-moving standardisation body, some documents and related information may be outdated quickly. We will make every effort to keep up with the MPEG pace; however, keep in mind that the Web pages may not always contain the newest information. |
MISC | MPEG7:audio-faq [Lin98] |
Author | |
Title | MPEG-7 Audio FAQ |
Howpublished | WWW page |
Year | 1998 |
url | http://www.meta-labs.com/mpeg-7/MPEG-7-aud-FAQ.shtml |
parent-url | http://www.meta-labs.com/mpeg-7-aud/ |
Note | moved to [TPMAS98] |
Abstract | The following is an unofficial FAQ for MPEG-7 Audio issues. It is not a complete document, and is intended to act as a supplement to the FAQ found in the MPEG-7 Context & Objectives document, N2326. |
Remarks | What are the specific functionalities foreseen for MPEG-7 audio? Although still an expanding list, we can envision indexing music, sound effects, and spoken-word content in the audio-only arena. MPEG-7 will enable query-by-example, such as query-by-humming. In addition, audio tools play a large role in typical audio-visual content, in terms of indexing film soundtracks and the like. If someone wants to manage a large amount of audio content, whether selling it, managing it internally, or making it openly available to the world, MPEG-7 is potentially the solution. What are the foreseen elements of MPEG-7? MPEG-7 work is currently seen as being in three parts: Descriptors (D's), Description Schemes (DS's), and a Description Definition Language (DDL). Each is equally crucial to the entire MPEG-7 effort. Descriptors are the representations of low-level features, the fundamental qualities of audiovisual content, which may range from statistical models of signal amplitude, to fundamental frequency of a signal, to an estimate of the number of sources present in a signal, to spectral tilt, to emotional content, to an explicit sound-effect model, to any number of concrete or abstract features. This is where the most involvement from the signal processing community is foreseen. Note that not all of the descriptors need to be automatically extracted; the essential part of the standard is to establish a normalized representation and interpretation of the Descriptor. We are actively seeking input on what additional potential Descriptors would be useful. Description Schemes are structured combinations of Descriptors. This structure may be used to annotate a document, to directly express the structure of a document, or to create combinations of features which form a richer expression of a higher-level concept. For example, a radio segment DS may note the recording date, the broadcast date, the producer, the talent, and include pointers to a transcript. 
A classical music DS may encode the musical structures (and allow for exceptions) of a sonata form. Various spectral and temporal Descriptors may be combined to form a DS appropriate for describing timbre or short sound effects. Any suggestions on other applications of DS's to audio material are very welcome. The Description Definition Language is to be the mechanism which allows a great degree of flexibility to be included in MPEG-7. Not all documents will fit into a prescribed structure. There are fields (e.g. biomedical imagery) which would find the MPEG-7 framework very useful, but which lie outside of MPEG's scope. A solution provider may have a better method for combining MPEG-7 Descriptors than a normative description scheme. The DDL is to address all of these situations. While MPEG-4 seeks a unique and faithful reproduction of material, MPEG-7 foregoes some precision for the sake of identifying the "essential" features of the material (although many different representations of the same material are possible). What distinguishes it most from other material? What makes it similar? |
MISC | MPEG:audio-faq [TPMAS98] |
Author | |
Title | MPEG Audio FAQ Version 9 |
Howpublished | WWW page |
Year | 1998 |
Month | October |
Address | Atlantic City |
url | http://www.tnt.uni-hannover.de/project/mpeg/audio/faq |
Note | International Organisation for Standardisation, Organisation Internationale de Normalisation, Coding of Moving Pictures and Audio, ISO/IEC JTC1/SC29/WG11, N2431, http://www.tnt.uni-hannover.de/project/mpeg/audio/faq |
PHDTHESIS | levine:thesis [Lev98] |
Author | |
Title | Audio Representations for Data Compression and Compressed Domain Processing |
Type | Ph.D. Dissertation |
School | Department of Electrical Engineering, CCRMA, Stanford University |
Month | December |
Year | 1998 |
url | http://www-ccrma.stanford.edu/~scottl/thesis.html |
Note | http://www-ccrma.stanford.edu/~scottl/thesis.html |
Abstract | In the world of digital audio processing, one usually has the choice of performing modifications on the raw audio signal or data-compressing the audio signal. But performing modifications on a data-compressed audio signal has proved difficult in the past. This thesis provides a new representation of audio signals that allows for both very low bit rate audio data compression and high-quality compressed-domain processing and modifications. In this context, the processing possibilities are time-scale and pitch-scale modifications. This new audio representation segments the audio into separate sinusoidal, transient, and noise signals. During detected attack-transient regions, the audio is modeled by well-established transform coding techniques. During the remaining non-transient regions of the input, the audio is modeled by a mixture of multiresolution sinusoidal modeling and noise modeling. Careful phase-locking techniques at the time boundaries between the sines and transients allow for seamless transitions between representations. By separating the audio into three individual representations, each can be efficiently and perceptually quantized. |
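Levine's representation hinges on first locating attack-transient regions and switching models there. As a crude stand-in for the idea, a frame-energy jump detector can flag candidate transients (threshold, frame sizes and names are our assumptions, not the thesis's detector):

```python
import numpy as np

def detect_transients(x, frame=256, hop=128, ratio=4.0):
    """Flag frames whose energy jumps by more than `ratio` over the previous
    frame. A crude stand-in for a transient detector; real systems use
    multi-band onset measures."""
    energies = []
    for start in range(0, len(x) - frame + 1, hop):
        energies.append(float(np.sum(x[start:start + frame] ** 2)))
    flags = [False]  # the first frame has no predecessor to compare against
    for prev, cur in zip(energies, energies[1:]):
        flags.append(cur > ratio * prev and cur > 1e-8)
    return flags, energies
```

Frames flagged True would be handed to the transform coder; the rest go to the sinusoidal-plus-noise model.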
MISC | plunderphonics [Osw99] |
Author | |
Title | Plunderphonics |
Howpublished | WWW page |
Year | 1999 |
url | http://www.interlog.com/~vacuvox/ |
Note | http://www.6q.com, esp. [Osw93] |
MISC | plexure [Osw93] |
Author | |
Title | Plexure |
Howpublished | CD |
Year | 1993 |
url | http://www.interlog.com/xdiscography.html#plexure |
Note | http://www.interlog.com/~vacuvox/xdiscography.html#plexure |
Abstract | Published by Disk Union Japan (on CD only), it should be in stores but is often hard to find or expensive. It is currently available from WFMU, who also provide a short sample (193K). Plundered are over a thousand pop stars from the past 10 years. Rather than crediting each individual artist or group as he did in the original plunderphonic release, Oswald chose instead to reference morphed artists of his own creation (Bonnie Ratt, etc.). It starts with rap millisyllables and progresses through the material according to tempo (which has an interesting relationship with genre). Oswald used several mechanisms to generate the plunderphonemes that make up this encyclopaedic popologue. This is the most formidable of the plunderphonics projects to date. |
MISC | thelongestandmostharmlessentry [vdVdlLvdV48] |
Author | |
Title | The Longest Bibliographic Reference |
Year | 1848 |
Remarks | This is here so that the longest bibliography reference is this one, [vdVdlLvdV48], and not something with an et. al. symbol, because this confuses tth, the tex to html translator, too much. |
MISC | berio91 [Ber91] |
Author | |
Title | Circles; Sequenza I, III, V |
Howpublished | Mediathèque CD00008601 |
Year | 1991 |
url | http://mediatheque.ircam.fr/cgi-bin/archives?AFFICHAGE=long\&ID=CD00008601 |
Note | Cathy Berberian (Stimme), Francis Pierre (Harfe), Jean-Pierre Drouet, Jean-Claude Casadesus (Schlagzeug), Aurèle Nicolet (Flöte), Vinko Globokar (Posaune) |
INPROC. | baudoin:eurospeech:97 [BCC97] |
Author | |
Title | Quantization of spectral sequences using variable length spectral segments for speech coding at very low bit rate |
Booktitle | Proc. EUROSPEECH 97 |
Address | Rhodes, Greece |
Month | September |
Year | 1997 |
Pages | 1295--1298 |
abstract-url | http://www.wcl2.ee.upatras.gr/eurtad.html#link1295 |
Abstract | This paper deals with the coding of spectral envelope parameters for very low bit rate speech coding (below 500 bps). In order to obtain sufficient intelligibility, segmental techniques are necessary; variable-dimension vector quantization is one of these. We propose a new interpretation of already published research from Chou-Lookabaugh [2] and Cernocky-Baudoin-Chollet [4,6] on the quantization of variable-length sequences of spectral vectors, named respectively Variable to Variable length Vector Quantization (VVVQ) and Multigram Quantization (MGQ). This interpretation gives a meaning to the Lagrange multiplier used in the optimization criterion of the VVVQ, and should allow new developments such as, for example, new models of the probability density of the source. We have also studied the influence of the limitation of the delay introduced by the method; a maximal delay of 400 ms was found to be generally sufficient. Finally, we propose the introduction of long sequences into the segmental codebook by linear interpolation of shorter ones. |
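The VVVQ/multigram approach quantizes variable-length runs of spectral vectors against a segmental codebook, with a Lagrange multiplier trading distortion against rate. A toy dynamic-programming sketch of that segmentation (uniform per-segment rate cost `lam` and squared-error distortion; all names are ours, not the paper's):

```python
import numpy as np

def segment_quantize(seq, codebook, lam=0.1):
    """Match a vector sequence against a codebook of variable-length segments,
    minimizing total distortion + lam per segment used (lam plays the role of
    the Lagrange multiplier on rate). Returns chosen segment indices and cost."""
    T = len(seq)
    cost = [float("inf")] * (T + 1)
    cost[0] = 0.0
    back = [None] * (T + 1)
    for t in range(1, T + 1):
        for idx, seg in enumerate(codebook):
            L = len(seg)
            if L <= t:
                # Squared error of this segment against seq[t-L .. t-1]
                d = sum(float(np.sum((seq[t - L + i] - seg[i]) ** 2))
                        for i in range(L))
                c = cost[t - L] + d + lam
                if c < cost[t]:
                    cost[t] = c
                    back[t] = (idx, L)
    # Backtrack the chosen segmentation
    path, t = [], T
    while t > 0:
        idx, L = back[t]
        path.append(idx)
        t -= L
    return path[::-1], cost[T]
```

Longer codebook segments are preferred whenever their distortion saving outweighs the per-segment rate penalty, which is exactly the trade-off the multiplier controls.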
INPROC. | Stylianou_DecoOf_ICSLP96 [Sty96] |
Author | |
Title | Decomposition of Speech Signals into a Deterministic and a Stochastic Part |
Booktitle | Proc. ICSLP '96 |
Address | Philadelphia, PA |
Month | October |
Year | 1996 |
Volume | 2 |
Pages | 1213--1216 |
|
|
This document was translated from LaTeX by HEVEA.