Expression and communication in musical performance

E. F. Clarke



The issue of communication in music is regarded with some suspicion by music analysts and music theorists for at least two reasons. One is the desire to avoid the pitfalls of a naive and dogmatically intentionalist approach to musical communication in which the composer is the 'sender', the listeners are the 'receivers', and the work itself is the 'channel'. The drawbacks of this perspective, exemplified by Cooke (1959), are its rigidly prescriptive tendencies, and its dependence either on loose and unreliable biographical information to establish what the composer 'really meant' in a work, or on apparently arbitrary or flimsily supported pronouncements by different commentators as to the intended meaning of a work, all of which tend to deflect attention away from the music itself. A second reason has been the influence of a brand of structuralism which has regarded the work as an object, whose author or composer is irrelevant - a position adopted by Roland Barthes, among others, and encapsulated in the title of his essay "The death of the author" (1977). Because this perspective encourages analysts to explore whatever meanings may be found in a work, without regard for their possible origins in the author's own mind, it precludes the possibility of regarding a work as a vehicle which conveys meaning from one human 'sender' to other human 'receivers': the absence of a source removes one of the necessary terms in the communicative chain. The attraction of this approach is that analysts can concentrate on the music itself without having to consider its origins. The approach also highlights the distinction between communication and meaning: the meaning of an object, event or relation is a property that any observer is entitled to pick up or construct for him or herself, without implying any reciprocity between the creator and the receiver of meaning. 
Communication, on the other hand, requires that the meaning recovered by the receiver be the same as that of the sender - that meaning be shared by the two parties, the extent of this reciprocity being a measure of the success of communication. It is important to recognise that this carries with it no assumptions of conscious awareness or intention: a person's sweating and trembling may communicate anxiety to an observer very effectively and entirely veridically whether or not the sender wishes to communicate this state of mind, or is even aware of having done so.

While communication is a somewhat fraught topic in the context of music theory and analysis, it is rather less hedged around with problems when musical performance is considered. Rather than bringing with it the kind of conceptual difficulties already touched upon here, it becomes an essential term through which to understand the function of performance and the behaviour of performers: performers are primarily engaged in attempting to communicate the structure of the music to an audience. In order to investigate the communicative function of expression in musical performance, I will initially decompose performance communication into the separate activities of the performer and listener before returning to a consideration of the two together.

Expression in Musical Performance

The term "expression" is used in a variety of ways in discourse about music, but I shall use it here to refer to those continuously variable parameters of a performance that are used by a player to convey an interpretation of the music. Instruments differ in both the number and type of expressive parameters available to a performer: for the piano, for example, modifications of timing, dynamic and articulation are the only independently variable parameters available. For some instruments, such as the harpsichord and organ, the range of expressive options is even more restricted, while for others, such as the voice or violin, there is a greater range of parameters that the performer can manipulate. In every case, however, expression is conceived as small-scale continuous variations of these parameters in a manner that is either not explicitly indicated in the score (if one exists), or is indicated only in vague terms. A simple definition of expression in performance is therefore a pattern of systematic departures from the indications of the score - though we shall see later that this definition is inadequate in certain important respects.

Empirical studies of performance expression (e.g. Clarke, 1988; Gabrielsson, 1988; Repp, in press; Shaffer, 1981; Todd, 1985), which have been almost entirely restricted to piano playing for purely technical reasons, have identified a number of recurring characteristics which point to a particular model of the origin and control of expression. The critical evidence is that performance expression can be extremely stable over repeated performances that may sometimes span a number of years (e.g. Clynes and Walker, 1982), is found even in sight-read performances (Shaffer, 1981), and can be spontaneously changed by a performer at a moment's notice (e.g. Clarke, 1985). These observations mean that expression cannot possibly be understood as a pattern of changes in timing, dynamic and articulation that is simply learned, remembered and applied to a piece each time it is played, but must be regarded as being generated from the performer's understanding of the musical structure. Any other model imposes excessive memory demands on a performer, and is unable to cope with the mixture of stability and flexibility that has already been mentioned. The stability of performances over time is due to the stability of a performer's mental representation of the musical structure; the existence of expression in sight-read performances is the consequence of a performer forming a representation of the music as s/he reads and parses it; and spontaneous changes in expression are the inevitable consequence of the multivalence of musical structures.

In principle every aspect of musical structure contributes to the specification of an expressive profile for a piece, but a number of authors have shown that grouping structure (or phrase structure) is particularly salient. Todd (1985; 1989) has produced a strikingly effective formal model which takes the hierarchical grouping structure of the music as its input and gives a pattern of rubato as its output on the basis of an appealingly simple rule. The resulting rubato profiles compare well with the profiles of real performances by professional players. A number of other studies have also shown rule-like correspondences between various aspects of musical structure and expression (e.g. Gabrielsson, 1987; Shaffer and Todd, 1987; Sloboda, 1983; Sundberg, 1988).
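The idea of a mapping from hierarchical grouping structure to a rubato profile can be made concrete with a toy sketch. This is not Todd's model: the quadratic slowing towards the end of each group, the weights, the base duration and all function and parameter names are assumptions introduced here purely for illustration.

```python
def rubato_profile(n_notes, groups, base_ms=300.0):
    """Toy mapping from grouping structure to note durations.

    groups is a list of (start, end, weight) spans over note indices
    (end exclusive), one span per group at any level of the hierarchy.
    Each group lengthens its notes progressively towards its final note;
    the quadratic shape and the weights are illustrative assumptions.
    """
    durations = [base_ms] * n_notes
    for start, end, weight in groups:
        length = end - start
        for i in range(start, end):
            # Position within the group: 0.0 at its first note, 1.0 at its last.
            pos = (i - start) / (length - 1) if length > 1 else 1.0
            durations[i] += weight * pos ** 2
    return durations


# Hypothetical eight-note phrase made of two four-note sub-phrases.
grouping = [(0, 8, 60.0), (0, 4, 30.0), (4, 8, 30.0)]
profile = rubato_profile(8, grouping)
print([round(d, 1) for d in profile])
```

Because the contributions of the different levels are summed, the slowing at the end of the whole phrase is larger than the slowing at the internal sub-phrase boundary, which is the kind of correspondence between grouping depth and rubato that the text describes.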

If the origin and control of expression is based on a representation of musical structure, then it is appropriate to view expressive features as the signs of that structural representation. Empirical studies and introspection indicate that we should regard this semiotic relation in two ways: as the inevitable and insuppressible consequence of a particular representational structure; and as a conscious and voluntary attempt on the part of the performer to make audible an otherwise abstract interpretation of structure. Evidence for the unconscious and insuppressible quality of expression comes from attempts by performers to play without expression: Seashore (1938/1967) showed that while the degree of expression is reduced under these circumstances, it is never eliminated, and that it retains the same general pattern that is observed under normal circumstances. Similarly, Sloboda (1983) showed that the same melody presented to pianists in two different metrical notations was played with different patterns of expression even though the players were not asked to try to distinguish the two metrical versions, and indeed did not notice that the two pieces were in all other respects identical. For both the deadpan performances and the metrically distinguished pair, performance expression was clearly related to basic structural features of the material (such as phrase structure and metre) and can thus be seen as the consequence of the performers' spontaneous parsing of the musical structure. In this sense, therefore, the expressive properties of the performances are symptomatic of the performers' representations of the structure of the music.

Nonetheless it is also clear that performers consciously shape the expression in their performances in order to achieve particular structural and stylistic results. Apart from overcoming purely technical problems this is the function of rehearsal, involving changes in the degree to which an expressive parameter is used within a performance; changes in which of a number of expressive options is used to project a feature of the music (for instance, using articulation rather than dynamics to highlight some feature of a phrase); and changes in the performer's structural understanding. All of these can be thought of as processes which emphasise the inherent or spontaneously emergent expressive properties discussed above, or which superimpose a different pattern upon them. In the case of a piece with a highly indeterminate or ambiguous structure, for instance, there may be very little expressive patterning that arises directly out of spontaneous structural parsing.

A systematic view of the way in which structural information gives rise to an expressive patterning is to propose a set of rules each of which takes a structural description of the music as its input, and gives a sequence of expressive transformations as its output. Although the expressive properties of skilled performances can be extraordinarily subtle, this does not require that the expressive rules themselves be very complex or numerous, since the musical structures which constitute their input are themselves multiply interpretable and multi-dimensional. It is this structural indeterminacy which makes the whole expressive system so rich and variable, since it is quite possible that no two structural interpretations (either by different performers, or the same performer on different occasions) will be identical, thus ensuring that the output of even a very simple collection of expressive rules will be quite diverse. I have proposed elsewhere (Clarke, 1985) that as few as nine such rules may be sufficient to account for a great deal of the structurally derived expressive features of piano playing, covering timing, dynamics and articulation, and Sundberg (1988) proposes a similarly small number. The important point here is that the observed subtlety of performance need not be directly written into the rules of expression themselves as long as the structures which form their input are rich and complex. This means that the problem of giving a convincing formal account of musical expression is very largely the problem of developing a satisfactory approach to the representation of musical structure.

However, it would be a mistake to imagine that structure is the sole determinant of expression. A wide range of other factors including the possibilities of the instrument, the acoustics of the performing environment, the nature of the audience, the mood and intentions of the performer, and even the performance ideology s/he espouses will contribute to the result, sometimes to the detriment of structure (as in the case of an indulgent and egocentric performer), but ideally in conformity with the dictates of structure. This is an issue to which I will return in the next section.

Perceiving Expression

In essence, listeners carry out precisely the same operations in relation to expression as does the performer - but in reverse. We can think of the listener's initial task as being the separation of expressive modifications from the underlying structure of the music (particularly in the case of timing information where rubato must be disentangled from rhythmic structure) so as to make sense of what the expression is trying to convey. Empirical work has shown that for timing we are very sensitive to expressive changes: in the context of a perfectly metronomic performance with notes of between 100 and 400 msec duration, listeners will reliably detect changes in length of a single note somewhere in the sequence of no more than 20 to 30 msec, increasing to 40 to 50 msec with more realistic sequences that already contain an element of rubato (Clarke, 1989). Psychoacoustical studies (see Moore, 1982) suggest that our acuity in the detection of small changes in pitch and loudness is at least comparable to our sensitivity to timing, indicating that we have perceptual access to a tremendous wealth of expressive information in performance. As mentioned above, timing presents a particular perceptual challenge to the listener, due to the need to separate rhythmic structure from rubato. Listeners are remarkably successful at making rhythmic sense of sequences that can contain considerable fluctuations in tempo. This ability seems to result from the operation of categorical perception, whereby a set of discrete structural values is recovered from a continuously variable data stream. In rhythm, the categories appear to be the small whole number proportions on which our notational system is based - though there is some evidence that perceptual categorisation for rhythm may be even simpler than this, employing only a distinction between even or uneven subdivisions of the beat, coupled with the appropriate metrical framework (see Clarke, 1987a).
The effect of categorical perception is to convert a stream of raw timing data into two distinct components: a set of discrete categories which constitutes the rhythmic structure of the music; and an associated collection of continuously variable timing values, representing the extent to which a given duration deviates from its ideal category value (i.e. the "remainder" after the categorization process), which is interpreted as expressive modification. For example the sequence of three durations (in msec.) 450 - 200 - 680 might be interpreted in an appropriate metrical context (e.g. 6/8) as the sequence of proportions 2 : 1 : 3 (corresponding perhaps to quarter note - eighth note - dotted quarter note) with 'expressive' emphasis on the first and last notes. In a different metrical context (e.g. 4/4) the same sequence might be interpreted as a 3 : 1 : 4 sequence (dotted quarter note - eighth note - half note) with 'expressive' emphasis on the second and last notes, demonstrating the interdependencies between metre, rhythmic category, and expressive deviation as perceptual dimensions.
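The arithmetic behind this example can be written out as a minimal sketch. It assumes that the metrical context supplies both the category proportions and the duration of one eighth-note subdivision; the subdivision values used below (200 msec for the 6/8 reading, 150 msec for the 4/4 reading) are not given in the text and are chosen here only because they reproduce the pattern of emphasis described. The function name is hypothetical.

```python
def split_structure_and_expression(durations_ms, proportions, unit_ms):
    """Return the expressive remainder (in msec) for each note.

    durations_ms: performed note durations.
    proportions:  rhythmic category of each note, in subdivision units.
    unit_ms:      assumed duration of one subdivision (an assumption
                  standing in for the tempo implied by the metrical context).
    """
    if len(durations_ms) != len(proportions):
        raise ValueError("each duration needs exactly one category")
    # Remainder = performed duration minus the ideal category duration.
    return [d - p * unit_ms for d, p in zip(durations_ms, proportions)]


performed = [450, 200, 680]
in_six_eight = split_structure_and_expression(performed, [2, 1, 3], unit_ms=200)
in_four_four = split_structure_and_expression(performed, [3, 1, 4], unit_ms=150)
print(in_six_eight)  # [50, 0, 80]: lengthening on the first and last notes
print(in_four_four)  # [0, 50, 80]: lengthening on the second and last notes
```

The same three raw durations yield different remainders under the two readings, so what counts as expressive lengthening depends on the categories that the metre makes available.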

The example also shows that the perception of expression is structure-dependent, just as its production is. The structural context may not only determine the distribution of perceived expression in a sequence, but may also affect whether a timing change is heard as expressive or as an error. In a recent study investigating the effect of structure on expression in imitating rubato (Clarke, in preparation; see also Clarke & Baker-Short, 1987), keyboard players were asked to try to reproduce as accurately as possible heard performances of four melodies which were either exactly as a previous performer had played them, or were distortions of the performance. The distortions were of two types: an inversion of the expressive timing profile, such that positive timing deviations became negative and vice versa; or a translation of the timing profile, such that the timing deviation of each note was shifted along the sequence by one of two different time values. Both types of transformation have the effect of leaving the form of the timing profile unchanged, while dislocating its relation to the structure of the melody. The principal result was that the unmodified performances were reproduced with greater accuracy and stability than the transformed versions, demonstrating that an identical timing profile, or its mirror image, can become unreproducible when its relation to the underlying musical structure is disrupted. A subsequent perceptual experiment with a group of music students who were played the transformed and untransformed performances of the melodies showed that their judgements of the quality of the performances followed the ability of the group of performers to reproduce the different versions: the unchanged performances were considered best, and the inversions worst, with the translations in between. 
However, it is interesting that the performers were not entirely unable to grasp the modified versions: their attempts to reproduce these strange-sounding performances clearly showed partial success in reproducing the aberrant timing profile, and there was evidence that for at least some versions of the melodies their attempts became more like the target over three successive attempts. This suggests that they formed some kind of direct acoustical representation of the performance to guide their own attempts, even if the pattern of rubato appeared to have no structural logic to it. The stability of such a representation remains to be investigated, but it seems likely that it is rather short-lived and comparatively easily disrupted. At a phenomenological level, it is striking that although all versions of each of the melodies used in the experiment contained the same total amount of timing deviation, its perverse distribution in the transformed sequences had the effect of making these 'expressive' features sound like unintended errors, or hesitations and uncertainties, on the part of the performer. With further research of this kind it may be possible to specify more clearly the structural constraints that can make the same few milliseconds of lengthening or shortening sound like a mistake in one case and an acceptable rubato in another.
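The two transformations used in the imitation study can be sketched as operations on a list of per-note timing deviations. The sketch below is illustrative only: the deviation values are invented, the translation is modelled as a shift by a whole number of note positions with wrap-around (the study shifted the profile by time values, and how the ends of the sequence were handled is not stated here), and the function names are hypothetical.

```python
def invert(deviations):
    """Mirror the profile: lengthenings become shortenings and vice versa."""
    return [-d for d in deviations]


def translate(deviations, shift):
    """Shift the profile along the note sequence by `shift` positions.

    Wrapping the profile around at the end of the sequence is an
    assumption made to keep the sketch simple.
    """
    if not deviations:
        return []
    shift %= len(deviations)
    return deviations[-shift:] + deviations[:-shift]


def total_deviation(deviations):
    """Total amount of timing deviation, ignoring direction."""
    return sum(abs(d) for d in deviations)


# Invented deviations (msec) from the nominal note durations of a melody.
original = [40, -10, 0, 25, 60]
inverted = invert(original)
shifted = translate(original, 2)
print(inverted)  # [-40, 10, 0, -25, -60]
print(shifted)   # [25, 60, 40, -10, 0]
print(total_deviation(original), total_deviation(inverted), total_deviation(shifted))
```

Both operations preserve the total amount of deviation and the shape of the profile (or its mirror image) while changing which notes the deviations fall on, which is what dislocates the profile from the structure of the melody.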

In considering the relationship between structure and expression, it is important to realize that a performer is usually not merely trying to convey the most obvious and basic structural framework of a piece, since this is often quite clear to a listener from even a fairly inexpert parsing of the surface structure of the music. Only under the peculiar conditions of a melodic dictation exercise, or in a piece with an interesting and complex metrical structure would we expect a performer to hammer out the metrical framework of a piece of music. Essentially this is a principle of redundancy reduction: it is unnecessary (and irritating to an audience) to emphasise information that is already available from another source (the pitch and rhythmic structure of the music itself). Empirical support for this is provided by Seashore (1938/1967) who showed that listeners attributed a greater number of expressive features (in this case perceived stress on metrical accents) to a performer than was actually measured to be the case in the performance, as a result of their unconscious processing of the musical structure. He summarizes this observation in a rhetorical question: 'Can it be that objective emphasis by the player, either by strength or duration of the note, is comparatively secondary in value to the compositional emphasis which the musical listener "feels into" the measure subjectively ?' (p. 244)

Where 'expert' audiences are involved this principle becomes particularly important in trying to understand the idiosyncrasies of performance expression, and paradoxically it can lead to two apparently opposite outcomes. On the one hand the expertise of the audience means that a performer can rely on even the most subtle gestures of performance being picked up and understood, and this may therefore lead to a performance with apparently understated characteristics. The converse of this can be heard in the rather gushing and overly demonstrative rubato of popular recordings of the nineteenth century piano repertoire that are aimed at the 'easy listening' market. On the other hand, the stylistic knowledge of an audience can allow a performer to be more risky and even wayward in interpretation, in the knowledge that the audience will have sufficient understanding of the musical structure to be able to tolerate and enjoy the transformations to which it is subjected in performance. One musical genre in which this kind of treatment is common is jazz: because the repertoire consists very largely of 'standards' with which the audience is expected to be familiar, performers can devote their attentions to finding interesting and unusual ways to interpret the music, rather than simply making sure that the musical structure is faithfully conveyed. The dramatic rubato and inflections of pitch and timbre audible in recordings of a singer like Billie Holiday make clear how much this style thrives on, and requires, idiosyncratic and extreme expressive treatment. The simplicity of the musical structure in many of the standards provides a firm anchor that allows the performer to pull the performance around in a striking manner, and a performance that aimed simply to emphasise the metrical or phrase structure would clearly be absurdly inappropriate.
Just as this relatively fixed repertoire of music has led to a strong performer cult within jazz, so also can the same phenomenon be observed in classical music: when 'great performers' give gala performances, they invariably play well-known music from the standard repertoire, allowing the attention of the audience to be turned away from the music and onto the expressive activities of the performer.

An assessment of the originality or expressivity of a performer can only be made if there is some neutral baseline against which to make the comparison. The most obvious baseline, and one which has the appeal of theoretical simplicity, is a perfectly metronomic or deadpan performance. This is essentially a return to the definition of expression given towards the beginning of this paper, where expression was characterised as continuously variable departures from the indications of the score. This is, however, too neat. The problem is that the baseline should be perceived as neutral, and this is certainly not true of metronomic performances, which sound unnaturally devoid of variability. Strict metricality in performance can actually be used as an expressive device, as styles as diverse as disco music and moto perpetuo illustrate. The true baseline against which to assess expressivity would be a performance with perfectly normative rubato (and the equivalent in all other relevant expressive parameters), the effect of which should be to make the performance sound blandly predictable. Because it is not easy to establish exactly what pattern of rubato, dynamics and articulation would achieve this, and since any such normative performance would almost certainly be different for different listeners, most empirical work on expressive performance retains the metronomic baseline as its standard, since this at least provides a fixed reference point. In fact most empirical studies of performance expression can be seen as an attempt to establish the general characteristics of a normative performance. There is still some way to go, however, before a reasonable picture of this normative pattern that can be adapted to different styles of music and different performance practices is established.
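The difference between the two candidate baselines can be illustrated with a small sketch. All of the numbers are invented: the 'normative' profile below is a hypothetical stand-in for a pattern that, as the text notes, has not yet been established empirically, and the function name is an assumption.

```python
def deviations(performed_ms, baseline_ms):
    """Per-note departure of a performance from a chosen baseline (msec)."""
    if len(performed_ms) != len(baseline_ms):
        raise ValueError("performance and baseline must have the same length")
    return [p - b for p, b in zip(performed_ms, baseline_ms)]


# Four notes of nominally equal length; values are purely illustrative.
metronomic_baseline = [300, 300, 300, 300]
normative_baseline = [300, 290, 310, 340]  # hypothetical normative rubato

strict_performance = [300, 300, 300, 300]

print(deviations(strict_performance, metronomic_baseline))  # [0, 0, 0, 0]
print(deviations(strict_performance, normative_baseline))   # [0, 10, -10, -40]
```

Measured against the metronomic baseline the strict performance shows no expression at all, whereas against the hypothetical normative baseline it departs from expectation on three of the four notes, which is one way of reading the observation that strict metricality can itself function as an expressive device.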

Although I have once again concentrated initially on structure as an element to be retrieved from performance and as a vital factor in making sense of a performance, it would be a mistake to imply that structure is the only kind of information conveyed by expression. Listeners also pick up information about a variety of attributes of the performer and his/her instrument, including his/her state of mind, technical competence, and even the difficulty (technical or physical) of the music - though this is something that many instrumental traditions are at pains to conceal. The unaccompanied violin and 'cello music of J.S. Bach is a striking example of this, where a significant element of expression in the music comes from the sense of physical and technical effort and the need to overcome the inherent constraints of the instrument in producing polyphony on essentially monophonic instruments. The Chaconne from the D minor violin Partita, for example, contains numerous places where a change in instrumental technique and an associated change in the sense of physical tension reinforces a structural feature. At the other extreme, information is conveyed symbolically through the performance conventions that a player uses, consciously or unconsciously. The conventions of a particular performance practice not only convey specific expressive information within the confines of that style, but as a whole convey an ideological message - such as the 'authenticity' of the performance. It is because these different kinds of semiotic relation coexist and develop in parallel that expression in musical performance is so rich and multidimensional, and can be interpreted in such different ways by different listeners.

Expression and Twentieth Century Music

A number of developments in the music of this century have made the relationship between structure and expression, and even the definition of performance expression, considerably more problematic than for the music that I have so far considered. The first of these is the dramatic change in European music around the turn of the century that resulted in the abandonment of tonality by many composers, and the loss of the relatively homogeneous style that characterised the eighteenth and nineteenth centuries. A consequence of this is the increased difficulty of understanding the structure of such music, not only in strictly analytical terms, but also for the listeners and performers of the music. A symptom of the difficulties for performers is the enormous increase in the number of expressive markings in the scores of early twentieth century composers (for example the composers of the Second Viennese School), as if the composers were no longer confident that their performers would be able to identify the significant features of the music, or be relied upon to treat them expressively in an appropriate way. In short this represents a breakdown of the undeclared contract between composers and performers (and their audience) in which performers could be assumed both to have sufficient insight into the musical structure and also sufficient unconscious understanding of an appropriate performance practice to be able to interpret the music without explicit instruction from the composer.

The stylistic changes in European music at the turn of the century did not only affect pitch structure, and in certain respects the corresponding developments in rhythmic structure have had a greater impact on expression and communication. The reason for this is that metrical structure forms a vital framework around which expressive timing is organised for both performers and listeners. Without the sense of a regular framework of beats and the accompanying principle of rhythmic structures based on small integer multiples of a basic underlying pulse, it is almost impossible for listeners to pick up the sense of continuously variable tempo on which rubato depends, or for performers to make use of it with precision and control (see Clarke, 1987b for further discussion). One of the characteristics of music in the early part of this century is a progressive increase in the complexity of its metrical structure, resulting in an effective abandonment of metre in a significant amount of music from around the time of the second world war. With the addition of an increasing interest in the compositional control of dynamics and articulation, listeners faced the prospect of being unable to distinguish structure from expression in this music, and performers struggled to find ways to interpret it without simply being a slave to the dictates of the score. One of the consequences of these radical changes has been the need to establish a new performance practice to deal with the technical and aesthetic demands made on performers. An indication of the gradual establishment of such a performance practice is the existence of individuals and ensembles who specialize in the performance of twentieth century music - not only because they have a mastery of its technical requirements, but also because they have an understanding of how to interpret this music.

The final developments in twentieth century music that I want to consider here are the most radical from the point of view of performance expression: the use of non-categorical, indeterminate and graphical notation, and the arrival of electroacoustic music. While notational innovations and electroacoustic music do not present identical issues for performance, they are in many ways closely related. In both cases the distinction between categories of musical structure and continuously variable departures from (or modifications of) those categories is blurred or eliminated. The notational developments mentioned have meant that rhythm and pitch structure have lost the character of consisting of discrete values, and that a progressively more improvisational component has been introduced into the music. Structure in the music of composers such as Morton Feldman, Earle Brown and John Cage combines elements of precise specification with elements that are entirely determined by the performer at the time of performance in a way that makes the distinction between structure and interpretation (and thus expression) difficult to maintain. In the case of electroacoustic music this process is taken one step further: in 'classic' electroacoustic music of the kind that exists on tape alone, or as a file in a microcomputer, the distinction between the piece and its performance completely disappears since the performance consists of little more than activating a fixed sound source.

While these observations are not intended as an attack on musical development in the twentieth century, there are nonetheless consequences for listeners that should be recognised. One is that if it becomes more difficult, or impossible, to distinguish an expressive component in performances of this music, then the performances may be perceived as 'cold' or 'inhuman' - particularly in the context of the obsession with performers as individuals that currently prevails in our musical culture. It is not difficult for the 'coldness' that is perceived in the performance to become associated with the music itself, and with a sense of alienation on the part of the audience, however unjustified these attributions may be in reality. Part of the failure of twentieth century music to be accepted by a wider public is, I believe, attributable to this problem. A second consequence of the reduction or absence of perceived expression in performance also contributes to the resistance to twentieth century music in a way that is perhaps more direct. As I have tried to indicate already, expression has an important part to play in articulating and communicating structure to listeners, and if the impact of expression is attenuated or eliminated it may be considerably more difficult to make perceptual sense of the music. A failure to perceive any expressive markers in the performance may exacerbate the intrinsic difficulties for listeners in picking up and making sense of the structure of contemporary music - which is obviously a prerequisite to accepting and enjoying it.

There have been a number of responses to this situation by different groups of composers. One is the acceptance and even the deliberate cultivation of this coldness, abstraction and objectivity - a response that is most obviously associated with the music of Darmstadt in the fifties. A second response has been to turn to a more gestural and mimetic style in an effort to reinject a sense of the involvement of human performers in musical production. Music such as that of Berio's Sequenzas is of this type, incorporating an element of theatre that allies them with music theatre proper, where the gestural and mimetic character of the performance is more literally apparent. In a similar vein, it is striking that recent electroacoustic music has increasingly incorporated an interaction between electroacoustic media and live performers - not only because this allows for different compositional possibilities, but also because it reintroduces an element of performance that is absent from the purer versions of the genre. A third response has been to reintroduce those structural properties of music whose attenuation or absence I have focussed upon, but to employ them to rather different ends. The music of so-called minimalist composers such as Steve Reich, Philip Glass and Michael Nyman returns to the discrete categorical frameworks of pitch and metre in a manner that, though offering the possibilities for expressive deviation, exploits the paradoxically expressive potential of an almost mechanical precision and regularity that was remarked upon earlier.

The main arguments of this paper can be summarised as follows. Communication in performance can be regarded as the conscious and unconscious processes by which musical structure is encoded into the variables of expression, and correspondingly decoded by listeners. The process contains unconscious components in the construction by a performer of a representation of the music, and in the way that this representation is activated in the physical actions of musical production; and it contains conscious components in the active search by a performer for interesting and enlightening ways to explore and project that structure. Listeners are essentially engaged in the reverse process, decoding and interpreting the structure of the music and the intentions of the performer from the continuously variable parameters of performance. A variety of features is conveyed by expression, ranging from quite concrete facts about the instrument, state of mind and expertise of the performer, through more abstract structural features of the music, to the ideological allegiances of the performer in relation to performance conventions. According to this view, performance communication depends on departures from expected patterns of continuation established both within an individual performance and also with reference to more stable external norms. It is, however, exceedingly difficult to specify the limits within which this distinction between norm and acceptable deviation operates. Furthermore, a number of difficulties for contemporary music derive from the uncertainties surrounding our understanding of the basic structural organization of this music, the identification of appropriate performance practices, and the increasingly fuzzy boundary between a canonical representation of the music (equivalent to the score) and its transformation in performance.


