Introduction to Common Music

[cmlogo.gif]
Heinrich Taube
University of Illinois
taube@uiuc.edu


Overview

Common Music is an object-oriented music composition environment. It treats musical composition as an experimental process involving the description of sound and its higher-order structural relationships. Common Music provides an extensive "compositional toolbox" and a public interface through which a composer may modify and extend the system. Sound is rendered on demand, by traversing high-level structure, producing sound descriptions, and transforming these into a variety of control protocols for sound synthesis and display. Realization targets currently include MIDI, Csound, Common Lisp Music, Music Kit, Cmix, Cmusic, M4C, RT, Mix, and Common Music Notation.

Common Music is implemented in Common Lisp and CLOS and runs on a variety of computers, including Macintosh, SGI, NeXT, SUN, and i386. All ports of Common Music provide a terminal-based music composition editor called Stella. A graphical interface called Capella runs on the Macintosh. Source code and binary images are freely available at several sites on the internet.

The Compositional Model

Common Music views music composition as an activity with three broad areas of interest. At one level, a composer is concerned with the development of musical ideas, expressed as structural relationships between sounds. At another level, the composer is concerned not only with the ideas themselves but with how they are to be realized, or rendered, in the real world. Although compositional ideas and their realization are conceptually bound together, there are advantages to modularizing these activities in software design. The chief benefit is that the same high-level structure may be applicable in a variety of contexts and participate in different renderings, which adds to the expressive power and generality of the overall model. Once a sound is rendered it is usually made audible, or previewed, so there is also an aspect of performance to experimental composition. In Common Music a performance can "feed back" into the model either implicitly, by causing a composer to make changes and formulate new ideas, or explicitly, as real-time control over concurrent compositional processes.

[model.gif]

Figure 1: Music composition can be viewed as an activity with three broad areas of concentration: structural relations, realization and performance.

System Architecture

Common Music supports three different interaction modes in parallel, and composers are free to choose whichever mode is most appropriate to a given situation. The most basic way to interact with the system is through its functional interface, via Lisp code. Although evaluating Lisp expressions is very useful for algorithmic control, it is not the most efficient way to interact with many of the features in the system.

The second way to work with Common Music is through its command interpreter, a program that translates textual input (commands) into appropriate internal actions. Command interpreters are sometimes referred to as "listeners" or "shells". Common Music's command interpreter is called Stella. Stella connects to Common Music much as a UNIX shell connects to its operating system: text is entered by the user, the shell interprets the text, causing actions to be performed, results (if any) are printed to the terminal, and the interpreter waits for more input. Stella is text-based for three reasons: many common interactions and editing functions can be most succinctly and easily expressed using text; a text-based interface can run on any computer regardless of its graphic capabilities; and text is a convenient "exchange format" between applications that may be input from sources other than the keyboard, for example from script files stored on the computer's disk or from concurrent programs via an interprocess communication channel or a "clipboard".

Although command interpreters permit powerful editing expressions to be formulated, one major deficiency of this model is that pure text displays do not permit the inner workings of a complex system to be easily visualized or understood. The third mode of interaction with Common Music addresses this problem by providing a graphical interface, called Capella. Capella runs only on the Macintosh. Because graphical interfaces are not "portable" across computers (or even Lisp implementations), Capella has been designed to complement, not replace, the two other modes of interaction supported by the system.

[model.gif]

Figure 2: There are three parallel interaction modes possible in Common Music: procedural (Lisp), textual (Stella) and gestural (Capella).
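The command-interpreter mode can be illustrated with a minimal sketch of the read-dispatch loop idea: textual commands are parsed, mapped to handler functions, and executed. This is a generic model in Python, not Stella's actual command set; the command names below are hypothetical.

```python
# A minimal model of a command interpreter: each line of textual input
# is split into a command name and arguments, and the name selects a
# handler that performs the action. Command names here are illustrative.

def make_interpreter(commands):
    """Return a function that interprets one line of textual input."""
    def interpret(line):
        name, *args = line.split()
        handler = commands.get(name)
        if handler is None:
            return f"unknown command: {name}"
        return handler(*args)
    return interpret

# Two hypothetical commands standing in for editing actions.
interp = make_interpreter({
    "echo": lambda *args: " ".join(args),
    "add":  lambda a, b: str(int(a) + int(b)),
})
```

In an interpreter like Stella, the loop would read a line, call such a dispatch function, print the result, and wait for more input.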

Musical Structure

Composition systems such as DMix, MODE or Common Music generally draw a distinction between structure representing a unit of sound and structure representing a collection, or aggregation, of sound. Individual sound units are often referred to as events -- objects whose attributes correspond to sound rendering control values. Common Music provides a number of fully formed event classes and, more importantly, the ability for a composer to redefine these classes or to add new sound events as needed. Extending the system is particularly useful when working with synthesis languages such as CLM or Csound, since these systems allow a composer to create new synthesis instruments (patches) with arbitrary control values. In the case of CLM, a new sound event description is automatically defined for Common Music whenever a CLM instrument definition is compiled or loaded.
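The event idea can be sketched in a few lines. This is a Python model of the concept, not Common Music's CLOS definitions: an event is an object whose attributes are rendering control values, and a new event class for a hypothetical synthesis patch is obtained by subclassing and adding the patch's control values. All class and slot names below are illustrative.

```python
# A model of the "event" concept: attributes correspond to sound
# rendering control values, and composers extend the system by
# subclassing. Names are hypothetical, not CM's actual classes.

from dataclasses import dataclass

@dataclass
class Event:
    time: float = 0.0
    duration: float = 1.0

@dataclass
class FMNote(Event):
    # Hypothetical control values for an FM synthesis patch.
    frequency: float = 440.0
    amplitude: float = 0.5
    fm_index: float = 1.0

note = FMNote(time=2.0, frequency=220.0)
```

In CLM the analogous class definition is generated automatically when an instrument is compiled or loaded; here it is written out by hand.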

Although there is much less agreement between systems on what actually constitutes "aggregation", any composition system or program must support some sort of higher-order aggregate structure, if only the most basic form of "sequences" or "chunks" of sound events. Despite the lack of uniformity, aggregation serves two common purposes: it provides the building blocks for defining higher-order compositional structure above the time-line of performance events, and it provides the composer with different "production modes" for accessing time-lined events during performance. The manner in which a collection produces performance events depends on its class, or type. Before briefly presenting the individual types of collections found in Common Music, it is useful to note what they share in common: whatever its type, each collection produces a single time-line of events from its substructure.

Common Music currently defines six types of collections. Because these are declaratively defined it is easy for users to specialize existing classes, or to add new types as needed.
Thread
A collection that represents sequential aggregation. A single time-line of events is produced by processing substructure in sequential, depth-first order.
Merge
A collection that represents parallel aggregation, or multiple time-lines. A single time-line of events is produced by processing substructure in a scheduling queue.
Heap
A collection that represents random grouping. A single time-line of events is produced by randomly "shuffling" substructure and then processing it in sequential order as a thread.
Algorithm
A collection that represents programmatic description. Instead of maintaining explicit substructure, a single time-line of events is produced by calling a user-specified program to create new events (or to side-effect a single event that is returned multiple times).
Network
A collection that represents user-defined ordering. A single time-line of events is produced by calling a chooser to traverse substructure. The chooser may be expressed as a pattern object or a function.
Layout
A "lightweight" collection that references arbitrary chunks of existing structure.
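The way three of these collection types produce a single time-line can be sketched as follows. This is a language-neutral model in Python, not Common Music's implementation: events are modeled as (time, name) pairs, a Thread concatenates substructure depth-first, a Merge interleaves parallel time-lines through a scheduling queue, and a Heap shuffles before processing sequentially.

```python
# Models of three collection traversals, each producing a single
# time-line of (time, name) events. Function names are illustrative.

import heapq
import random

def thread(parts):
    """Sequential aggregation: process substructure depth-first, in order."""
    out = []
    for part in parts:
        out.extend(part)
    return out

def merge(parts):
    """Parallel aggregation: interleave time-lines via a scheduling queue."""
    return list(heapq.merge(*parts, key=lambda ev: ev[0]))

def heap(parts, rng=random.Random(1)):
    """Random grouping: shuffle substructure, then process as a thread."""
    shuffled = list(parts)
    rng.shuffle(shuffled)
    return thread(shuffled)
```

An Algorithm collection would replace the explicit substructure with a generating function, and a Network would let a pattern object or function choose which part to traverse next.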


[model.gif]

Figure 3: A graph depicting some relationships between different events and collections defined in the system. These relationships can be easily expanded or redefined by the composer. Classes such as clm-note and csound-note represent "base classes" for composers working with new synthesis patches in these languages. Subclasses will automatically understand how to produce realizations for several different types of event streams.

Realization

High-level compositional models are by definition "abstractions" of more efficient representations of sound in the computer. This means that at least one interpretation -- and possibly a series of them -- must occur in order to render sound from such a model. As a simple example of this process, think of a sequencer that represents aggregate structure as tracks of MIDI notes with duration. In order to produce sound, a track must be traversed and the notes transformed to produce a "lower-level" representation: the actual on/off MIDI bytes sent to the MIDI driver. In Common Music, this interpretation process is called "realization". Despite the fact that a realization might entail several levels of interpretation, current hardware and compilers make the performance overhead of even complex aggregate models almost negligible.
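The sequencer example can be made concrete. This sketch (in Python, with illustrative function and field names) shows one level of interpretation: each note-with-duration in a track is expanded into the pair of note-on/note-off channel messages a MIDI driver actually consumes, then the messages are put in time order. The 0x90/0x80 status bytes are the standard MIDI channel voice message codes.

```python
# One level of "realization": notes with durations become the
# lower-level on/off message pairs sent to a MIDI driver.

NOTE_ON, NOTE_OFF = 0x90, 0x80   # standard MIDI status bytes

def realize_track(track, channel=0):
    """track: list of (time, key, velocity, duration) tuples.
    Returns (time, status, key, velocity) messages in time order."""
    messages = []
    for time, key, velocity, duration in track:
        messages.append((time, NOTE_ON | channel, key, velocity))
        messages.append((time + duration, NOTE_OFF | channel, key, 0))
    messages.sort(key=lambda m: m[0])   # the driver wants time order
    return messages
```

A full realization in Common Music may chain several such interpretation steps, from aggregate structure down to driver-level bytes.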

Realization in Common Music can occur in one of two possible time modes: run time and real time. In run-time mode, realized events receive their proper "performance time stamp" but the performance clock runs as fast as possible. In real-time mode, realized events are stamped at their appropriate real-world clock time.
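The distinction between the two modes can be sketched in a few lines. This is an illustrative model, not Common Music's scheduler interface: in both modes every event gets the same logical time stamp, but only in real-time mode does the process actually wait between events, so the stamps coincide with wall-clock time.

```python
# A model of the two time modes: run time advances a logical clock as
# fast as possible; real time actually waits out each delta, so stamps
# line up with the real-world clock. Names are illustrative.

import time

def stamp_events(events, real_time=False):
    """events: list of (delta, payload) pairs; returns (stamp, payload)."""
    clock, out = 0.0, []
    for delta, payload in events:
        if real_time:
            time.sleep(delta)   # real-time mode: wait out the delta
        clock += delta          # run-time mode: just advance the clock
        out.append((clock, payload))
    return out
```

Run-time mode is what makes rendering to a file (a Csound score, say) effectively instantaneous regardless of the piece's duration.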

Event Streams

In Common Music, the results of a realization depend as much upon their context as they do upon the type of events realized during the process. This context is called an "event stream" in Common Music. Identical data can produce radically different results based solely on what type of event stream is used during the realization process. Making results depend on both structure and context makes the compositional model itself more flexible and general. There are currently more than a dozen types of event streams in Common Music, representing everything from PostScript display to sound files to MIDI ports. All of these streams are controlled by a simple, generic protocol whose four operators (methods) are specialized to take appropriate actions according to the combination of object and stream they are passed.
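The context dependence of realization can be sketched as a dispatch on the (event, stream) combination. This Python model is not Common Music's actual four-operator protocol; the single `write` operator and the class names below are illustrative stand-ins showing how identical data yields different results in different stream contexts.

```python
# A model of the event-stream idea: the same data realized on two
# different stream types produces two different results. Operator and
# class names are hypothetical.

class Note:
    def __init__(self, key):
        self.key = key

class MidiStream:
    def write(self, note):
        return ("midi-key", note.key)

class PostScriptStream:
    def write(self, note):
        return ("notehead-at", note.key)

def realize(events, stream):
    """Realize identical events against whichever stream is supplied."""
    return [stream.write(ev) for ev in events]
```

In CLOS the same effect is obtained by specializing generic functions on both the event class and the stream class, so adding a new stream type requires no change to existing event definitions.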

[model.gif]

Figure 4: A graph depicting some relationships between different event streams defined in the system.

Visualization

Much of the recent implementation work in Common Music has been concerned with the development of graphical displays to make Common Music as easy as possible to understand. This work involves both the model itself and its output. Figure 5 shows an Output Browser, a graphical tool that supports compositional sketching. Music sketching is by its very nature experimental, typically involving temporary formulations, alternative testing and successive refinement. Common Music provides a "lightweight" collection called a Layout to represent and facilitate the sketching process.

A layout differs from other collections in several respects, the most important of which is that its substructure is accessed indirectly through "references". A reference is a "handle" that can map to one or more objects: it may be just the name or position of a single object, or an arbitrary, non-adjacent grouping of objects.

The Output Browser, developed by Tobias Kunze, supports the creation and manipulation of layouts. The interface permits mixtures of sequential and parallel references to be graphically manipulated and then performed on selected event streams. Inside the browser, Stream and Layout panes permit the user to create, load, save and select layouts and performance streams from menus. Once a layout and stream have been selected, the layout references are graphically depicted inside a pane as tiled boxes linked along both horizontal and vertical dimensions. A layout is altered by dragging its reference blocks horizontally or vertically; a block "snaps" to its closest horizontal and vertical neighbors after it has been released. All editing actions -- the creation or duplication of references, moving, selecting, and changing reference definitions -- can also be performed using standard command key combinations.
Layout processing moves from left to right across the pane: references within a single column are scheduled relative to one another, while horizontal movement across columns represents "sectional" (non-overlapping) units. Within a column, time shifts relative to other objects in the same column may be specified using the @