Information theory.

A theory which seeks to describe, by means of mathematical equations, the properties and behaviour of systems for storing, processing and transmitting information. Here the term ‘information’ is interpreted broadly as covering not only messages transmitted via the familiar communications media (radio, television, telephone and computer networks), but also the signals (aural, visual and other sensory stimuli) by means of which individuals perceive their immediate environment or communicate with others.

One of the fundamental tenets of information theory is that information can be quantified and measured in terms of the number of bits (binary digits: 0, 1) required to store or transmit a message. Information is understood as the selection by a message source of one particular message from a set of all possible messages. Since the simplest choice (that between two equally likely alternatives) can be represented by a single bit, the quantity of information represented by this choice is taken as the fundamental unit. Similarly, two bits represent a choice among four equally likely alternatives (each with probability 1/4), and three bits a choice among eight (each with probability 1/8); in general, a choice made with probability p represents log2(1/p) bits of information.

If the alternatives are not all equally likely, then each message in the set has its own probability of occurrence. The average information content of the message set, or the average number of bits required to represent a message from the set, is called the entropy of the set (H). It is found by summing, over all messages, the information content log2(1/p) of each message multiplied by its probability of occurrence p. If the entropy of a message source is H bits per message, then every binary encoding of the source requires an average of at least H bits per message; conversely, it is always possible to find binary encodings of the source which use arbitrarily close to H bits per message. Entropy can thus be interpreted as the average number of bits per message required by the most efficient binary encoding of the source.

Suppose, for example, we have a source which selects randomly from a list of eight musical phrases m1–m8 with associated probabilities p1–p8. If all eight phrases have an equal probability of 1/8 then H = 3 bits, its maximum possible value; if, however, four phrases each have a probability of 1/16 while the other four have probabilities of 1/2, 1/4, 0 and 0, then H = 2 bits. The difference is a result of the unequal probabilities and reveals some of the statistical structure of the source; since it carries only 2/3 of the maximum information, the source is said to exhibit 33% redundancy. A source of this kind, which selects messages according to fixed probabilities, is an example of a Stochastic process, and all information sources in information theory are modelled in this way. In fact a music source is not a simple stochastic process in which successive selections are independent: it is better modelled as a Markov chain, in which the probability of a future musical event depends explicitly on the occurrence of previous events.
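
The figures in this example can be verified with a short calculation; the sketch below is written in Python purely for illustration, and the function name entropy is our own:

    from math import log2

    def entropy(probs):
        # Shannon entropy in bits: the sum of p * log2(1/p) over all
        # messages; messages of probability 0 contribute nothing.
        return sum(p * log2(1 / p) for p in probs if p > 0)

    # Eight equally likely phrases: H = log2(8) = 3 bits, the maximum.
    print(entropy([1/8] * 8))        # 3.0

    # Four phrases at 1/16, the others at 1/2, 1/4, 0 and 0: H = 2 bits,
    # i.e. 2/3 of the maximum, hence 33% redundancy.
    skewed = [1/16, 1/16, 1/16, 1/16, 1/2, 1/4, 0, 0]
    print(entropy(skewed))           # 2.0
    print(1 - entropy(skewed) / 3)   # 0.333... (the redundancy)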

From 1956 a spate of publications appeared applying information theory to many aspects of music analysis and the aesthetics of music. In particular, information theory has been used to determine the relative information rates or entropy profiles of different samples of music in attempts to analyse content, style or perception objectively. Youngblood (1958) made considerable use of the concept of redundancy in defining the notion of style, whilst Moles (1958) addressed the broader issues of aesthetic perception, including what he termed ‘sonic material’. Meyer (1956) came close to information theory in viewing styles as culturally conditioned systems of expectations which are continually aroused, fulfilled or frustrated, thereby engendering musical meaning. Revising this view to include information theory explicitly led Meyer (1957) to a three-stage model for the evolution of musical meaning, and to the notion of ‘designed entropy’, which measures a composer’s intentional deviation from a stylistic norm. Both Meyer (1957) and Moles (1956) addressed the modulating or distorting effect of noise on musical information, and Meyer (1961) examined the situation in which music is frequently reheard.

Krahenbuehl and Coons (1958) devised quantitative indexes of articulateness and hierarchy which essentially measure the degree of coherence of a sequence of musical events in terms of their unity and diversity. Hiller and Bean (1966) analysed four sonata expositions statistically and derived a variety of ‘contours of information fluctuation’ from which they were able to draw conclusions regarding the composers’ styles and make useful comparisons between the sonatas. Böker-Heil (1971, 1972, 1977) applied statistical and related data analysis techniques to 12-note rows from works by Berg and Schoenberg. His stylistic analyses of madrigals by Palestrina, Rore and Marenzio used three-dimensional graphical profiles of statistically determined functions to define and differentiate stylistic features of works. He later employed a novel tripartite computer simulation model to analyse folksong melodies from the southern Tyrol.

More recently, important work has been done by Conklin, Witten and others (1988, 1992, 1994, 1995) on the entropies of Bach chorale melodies as measured by both human and computational models of music prediction. Human subjects were asked to predict the next pitch in each chorale melody, a gambling technique being used to quantify their degree of confidence in each prediction. The computational approach involved inductively learning rules for generating the musical sequences. Good agreement between the human and computational estimates was obtained, both for the entropy profiles and for the average pitch entropies. For the latter, values of between 1.5 and 2.1 bits per musical event were found, corresponding to average probabilities of between 35% and 23% per musical event respectively.
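
The correspondence between the two sets of figures follows from the definition of information content: an entropy of H bits per event is equivalent to an average event probability of 2^-H. A one-line check (again in Python, purely for illustration):

    print(2 ** -1.5)   # 0.3535..., i.e. about 35%
    print(2 ** -2.1)   # 0.2333..., i.e. about 23%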

The combinatorial complexity of two classes of algorithm for musical similarity and melodic recognition has been analysed in detail and compared quantitatively by Overill (1993). The computational problems associated with approximate string-matching techniques for music analysis and musical information retrieval are considered by Crawford, Iliopoulos and Raman (1998).
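
As an indication of the kind of computation involved, the following sketch applies a generic dynamic-programming edit distance to melodic interval sequences. It is an illustrative example only, not the specific algorithms analysed in the papers cited; the function names and the use of MIDI pitch numbers are our own assumptions:

    def edit_distance(a, b):
        # Levenshtein distance between two sequences, computed by
        # dynamic programming over a (len(a)+1) x (len(b)+1) table.
        m, n = len(a), len(b)
        d = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            d[i][0] = i
        for j in range(n + 1):
            d[0][j] = j
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                cost = 0 if a[i - 1] == b[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,         # deletion
                              d[i][j - 1] + 1,         # insertion
                              d[i - 1][j - 1] + cost)  # match/substitution
        return d[m][n]

    def intervals(pitches):
        # Successive pitch differences in semitones, which makes the
        # comparison transposition-invariant.
        return [q - p for p, q in zip(pitches, pitches[1:])]

    theme   = [60, 62, 64, 65, 67]   # C D E F G (MIDI pitch numbers)
    variant = [62, 64, 66, 67, 69]   # the same contour, a tone higher
    print(edit_distance(intervals(theme), intervals(variant)))   # 0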

See also Analysis, §II, 5; Computers and music, §II; and Psychology of music.

BIBLIOGRAPHY

L.B. Meyer: Emotion and Meaning in Music (Chicago, 1956)

A.A. Moles: ‘Informationstheorie der Musik’, Nachrichtentechnische Fachberichte, iii (1956), 47–55

R.C. Pinkerton: ‘Information Theory and Melody’, Scientific American, cxciv/2 (February 1956), 77–86

L.B. Meyer: ‘Meaning in Music and Information Theory’, Journal of Aesthetics and Art Criticism, xv (1957), 412–24

E. Coons and D. Krahenbuehl: ‘Information as a Measure of Structure in Music’, JMT, ii (1958), 127–61

W. Fucks: Mathematische Analyse der Formalstruktur von Musik (Cologne and Opladen, 1958)

D. Krahenbuehl and E. Coons: ‘Information as a Measure of Experience in Music’, Journal of Aesthetics and Art Criticism, xvii (1958), 510–22

A.A. Moles: Théorie de l’information et perception esthétique (Paris, 1958; Eng. trans., 1966)

J.E. Youngblood: ‘Style as Information’, JMT, ii (1958), 24–35

W. Meyer-Eppler: Grundlagen und Anwendungen der Informationstheorie (Berlin, 1959)

C. Bean: Information Theory Applied to the Analysis of a Particular Formal Process in Tonal Music (diss., U. of Illinois, 1961)

L.B. Meyer: ‘On Rehearing Music’, JAMS, xiv (1961), 257–67

J.E. Cohen: ‘Information Theory and Music’, Behavioral Science, vii (1962), 137–63

W. Meyer-Eppler: ‘Informationstheoretische Probleme der musikalischen Kommunikation’, Die Reihe, viii (1962), 7–10; Eng. trans. in Die Reihe, viii (1968), 7–10

L.A. Hiller: Informationstheorie und Computermusik (Mainz, 1964)

L.A. Hiller: ‘Informationstheorie und Musik’, Darmstädter Beiträge zur neuen Musik, viii (1964), 7–34

B. Reimer: ‘Information Theory and the Analysis of Musical Meaning’, Council for Research in Music Education Bulletin, ii (1964), 14–22

F. Winckel: ‘Die informationstheoretische Analyse der musikalischen Strukturen’, Mf, xvii (1964), 1–14

L.A. Hiller and C. Bean: ‘Information Theory Analysis of Four Sonata Expositions’, JMT, x (1966), 96–137

L.A. Hiller and R. Fuller: ‘Structure and Information in Webern’s Symphonie, Op.21’, JMT, xi (1967), 60–115

J. Brincker: ‘Statistical Analysis of Music: an Application of Information Theory’, STMf, lii (1970), 53–7

B. Vermazen: ‘Information Theory and Musical Value’, Journal of Aesthetics and Art Criticism, xxix (1970–71), 367–70

N. Böker-Heil: ‘DODEK: eine Computer-Demonstration’, Zeitschrift für Musiktheorie, ii (1971), 2–14

N. Böker-Heil: ‘Ein algebraisches Modell des dur-moll-tonalen Systems’, Kongress für Musiktheorie I: Stuttgart 1971, 64–104

N. Böker-Heil: ‘Musikalische Stilanalyse und Computer: einige grundsätzliche Erwägungen’, IMSCR XI: Copenhagen 1972, i, 45–50

N. Böker-Heil: ‘Der Zustand polyphoner Strukturen: ein Beispiel automatischer Stilbeschreibung’, IMSCR XI: Copenhagen 1972, i, 108–20

N. Böker-Heil: ‘Computer-Simulation eines musikalischen Verstehensprozesses’, IMSCR XII: Berkeley 1977, 324–9

L. Knopoff and W. Hutchinson: ‘Entropy as a Measure of Style: the Influence of Sample Length’, JMT, xxvii (1983), 75–97

D. Conklin and J.G. Cleary: ‘Modelling and Generating Music using Multiple Viewpoints’, 1st Workshop on Artificial Intelligence and Music: St Paul 1988, 125–37

L.C. Manzara, I.H. Witten and M. James: ‘On the Entropy of Music: an Experiment with Bach Chorale Melodies’, Leonardo Music Journal, ii (1992), 81–8

R.E. Overill: ‘On the Combinatorial Complexity of Fuzzy Pattern Matching in Music Analysis’, Computers and the Humanities, xxvii (1993), 105–10

I.H. Witten, L.C. Manzara and D. Conklin: ‘Comparing Human and Computational Models of Music Prediction’, Computer Music Journal, xviii (1994), 70–80

D. Conklin and I.H. Witten: ‘Multiple Viewpoint Systems for Music Prediction’, Journal of New Music Research, xxiv (1995), 51–73

T. Crawford, C.S. Iliopoulos and R. Raman: ‘String Matching Techniques for Musical Similarity and Melodic Recognition’, Computing in Musicology, xi (1998), 73–100

RICHARD E. OVERILL