
14   Misc

MISC MPEG7:www [MPEG99]
Key: MPEG
Title: MPEG-7 "Multimedia Content Description Interface" Documentation
Howpublished: WWW page
Year: 1999
url: http://www.darmstadt.gmd.de/mobile/MPEG7
Note: http://www.darmstadt.gmd.de/mobile/MPEG7
Abstract: More and more audio-visual information is available in digital form, in various places around the world, and with it come people who want to use it. Before any information can be used, however, it must first be located. At the same time, the increasing availability of potentially interesting material makes this search harder. The question of finding content is not restricted to database retrieval applications; similar questions arise in other areas. For instance, a growing number of (digital) broadcast channels is available, which makes it harder to select the broadcast channel (radio or TV) that is potentially interesting.

In October 1996, MPEG (Moving Picture Experts Group) started a new work item to address the urgent need for generally recognised descriptions of audio-visual content, extending the limited capabilities of today's proprietary solutions for identifying content. The new member of the MPEG family is called "Multimedia Content Description Interface", or MPEG-7 for short.

The associated pages presented in the navigation tool provide the information you need to learn more about MPEG-7. As MPEG in general is a dynamic and fast-moving standardisation body, some documents and related information may become outdated quickly. We will make every effort to keep up with the MPEG pace; however, keep in mind that the web pages may not always contain the newest information.


MISC MPEG7:audio-faq [Lin98]
Author: Adam Lindsay
Title: MPEG-7 Audio FAQ
Howpublished: WWW page
Year: 1998
url: http://www.meta-labs.com/mpeg-7/MPEG-7-aud-FAQ.shtml
parent-url: http://www.meta-labs.com/mpeg-7-aud/
Note: moved to [TPMAS98]
Abstract: The following is an unofficial FAQ for MPEG-7 Audio issues. It is not a complete document, and is intended to act as a supplement to the FAQ found in the MPEG-7 Context & Objectives document, N2326.
Remarks: What specific functionalities are foreseen for MPEG-7 audio?
Although still an expanding list, we can envision indexing music, sound effects, and spoken-word content in the audio-only arena. MPEG-7 will enable query-by-example such as query-by-humming. In addition, audio tools play a large role in typical audio-visual content in terms of indexing film soundtracks and the like. If someone wants to manage a large amount of audio content, whether selling it, managing it internally, or making it openly available to the world, MPEG-7 is potentially the solution.

What are the foreseen elements of MPEG-7?
MPEG-7 work is currently seen as being in three parts: Descriptors (D's), Description Schemes (DS's), and a Description Definition Language (DDL). Each is equally crucial to the entire MPEG-7 effort.

Descriptors are the representations of low-level features, the fundamental qualities of audiovisual content which may range from statistical models of signal amplitude, to fundamental frequency of a signal, to an estimate of the number of sources present in a signal, to spectral tilt, to emotional content, to an explicit sound-effect model, to any number of concrete or abstract features. This is the place where the most involvement from the signal processing community is foreseen. Note that not all of the descriptors need to be automatically extracted; the essential part of the standard is to establish a normalized representation and interpretation of the Descriptor. We are actively seeking input on what additional potential Descriptors would be useful.

Description Schemes are structured combinations of Descriptors. This structure may be used to annotate a document, to directly express the structure of a document, or to create combinations of features which form a richer expression of a higher-level concept. For example, a radio segment DS may note the recording date, the broadcast date, the producer, the talent, and include pointers to a transcript. A classical music DS may encode the musical structures (and allow for exceptions) of a Sonata form. Various spectral and temporal Descriptors may be combined to form a DS appropriate for describing timbre or short sound effects. Any suggestions on other applications of DS's to Audio material are very welcome.

The Description Definition Language is to be the mechanism which allows a great degree of flexibility to be included in MPEG-7. Not all documents will fit into a prescribed structure. There are fields (e.g. biomedical imagery) which would find the MPEG-7 framework very useful, but which lie outside of MPEG's scope. A solution provider may have a better method for combining MPEG-7 Descriptors than a normative description scheme. The DDL is to address all of these situations.
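
As a rough illustration of how Descriptors and Description Schemes relate, the following Python sketch models a Descriptor as a named value and a Description Scheme as a structured combination of Descriptors and nested schemes. The class names, field names, and sample values are assumptions made for this example; they are not taken from the MPEG-7 documents, and the DDL itself is not modelled.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Descriptor:
    """One feature of the content: a name plus its value (hypothetical layout)."""
    name: str
    value: Any


@dataclass
class DescriptionScheme:
    """A structured combination of Descriptors and nested schemes (hypothetical layout)."""
    name: str
    descriptors: List[Descriptor] = field(default_factory=list)
    children: List["DescriptionScheme"] = field(default_factory=list)

    def find(self, name: str) -> List[Descriptor]:
        """Collect every Descriptor with this name, searching nested schemes too."""
        found = [d for d in self.descriptors if d.name == name]
        for child in self.children:
            found.extend(child.find(name))
        return found


# Illustrative radio segment; every value below is an invented placeholder.
radio_segment = DescriptionScheme(
    name="RadioSegment",
    descriptors=[
        Descriptor("recording_date", "1998-10-01"),
        Descriptor("broadcast_date", "1998-10-08"),
        Descriptor("producer", "placeholder producer"),
        Descriptor("transcript_pointer", "placeholder-transcript-reference"),
    ],
    children=[
        # Low-level signal features grouped into a nested scheme.
        DescriptionScheme(
            name="Timbre",
            descriptors=[
                Descriptor("spectral_tilt", -6.0),
                Descriptor("fundamental_frequency_hz", 220.0),
            ],
        )
    ],
)
```

Calling radio_segment.find("spectral_tilt") walks the nested Timbre scheme, which is one way a query could reach a low-level feature through a higher-level description.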

While MPEG-4 seeks to have a unique and faithful reproduction of material, MPEG-7 forgoes some precision for the sake of identifying the "essential" features of the material (although many different representations of the same material are possible). What distinguishes it most from other material? What makes it similar?


MISC MPEG:audio-faq [TPMAS98]
Author: D. Thom, H. Purnhagen, the MPEG Audio Subgroup
Title: MPEG Audio FAQ Version 9
Howpublished: WWW page
Year: 1998
Month: October
Address: Atlantic City
url: http://www.tnt.uni-hannover.de/project/mpeg/audio/faq
Note: International Organisation for Standardisation, Organisation Internationale de Normalisation, Coding of Moving Pictures and Audio, ISO/IEC JTC1/SC29/WG11, N2431, http://www.tnt.uni-hannover.de/project/mpeg/audio/faq


PHDTHESIS levine:thesis [Lev98]
Author: Scott N. Levine
Title: Audio Representations for Data Compression and Compressed Domain Processing
Type: Ph.D. Dissertation
School: Department of Electrical Engineering, CCRMA, Stanford University
Month: December
Year: 1998
url: http://www-ccrma.stanford.edu/~scottl/thesis.html
Note: http://www-ccrma.stanford.edu/~scottl/thesis.html
Abstract: In the world of digital audio processing, one usually has the choice of performing modifications on the raw audio signal, or data-compressing the audio signal. But performing modifications on a data-compressed audio signal has proved difficult in the past. This thesis provides a new representation of audio signals that allows for both very low bit rate audio data compression and high-quality compressed-domain processing and modifications. In this context, the processing possibilities are time-scale and pitch-scale modifications. This new audio representation segments the audio into separate sinusoidal, transient, and noise signals. In regions identified as attack transients, the audio is modeled by well-established transform coding techniques. In the remaining non-transient regions of the input, the audio is modeled by a mixture of multiresolution sinusoidal modeling and noise modeling. Careful phase-locking techniques at the time boundaries between the sines and transients allow for seamless transitions between representations. By separating the audio into three individual representations, each can be efficiently and perceptually quantized.


MISC plunderphonics [Osw99]
Author: John Oswald
Title: Plunderphonics
Howpublished: WWW page
Year: 1999
url: http://www.interlog.com/~vacuvox/
Note: http://www.6q.com, esp. [Osw93]


MISC plexure [Osw93]
Author: John Oswald
Title: Plexure
Howpublished: CD
Year: 1993
url: http://www.interlog.com/xdiscography.html#plexure
Note: http://www.interlog.com/~vacuvox/xdiscography.html#plexure
Abstract: Published by Disk Union Japan (on CD only), it should be in stores but is often hard to find or expensive. It is currently available from WFMU, who also provide a short sample (193K). Plundered are over a thousand pop stars from the past 10 years. Rather than crediting each individual artist or group as he did in the original plunderphonic release, Oswald chose instead to reference morphed artists of his own creation (Bonnie Ratt, etc.). It starts with rapmillisylables and progresses through the material according to tempo (which has an interesting relationship with genre). Oswald used several mechanisms to generate the plunderphonemes that make up this encyclopaedic popologue. This is the most formidable of the plunderphonics projects to date.


MISC thelongestandmostharmlessentry [vdVdlLvdV48]
Author: Van van der Van, Dee de la La, Don von der Von
Title: The Longest Bibliographic Reference
Year: 1848
Remarks: This entry exists so that the longest bibliography reference is this one, [vdVdlLvdV48], and not one with an "et al." symbol, because that confuses tth, the TeX to HTML translator, too much.


