RHYTHM and TEMPO
The task of those who study rhythm is a difficult one, because a precise, generally accepted definition of rhythm does not exist. This difficulty derives from the fact that rhythm refers to a complex reality in which several variables are fused. Our aim will be to distinguish these variables successively. However, since this work is devoted to music, it is necessary to emphasize that the problem has been complicated by music theorists who have often chosen, due to their personal aesthetic preferences, to recognize only one of the several aspects of rhythm.
Can etymology help us? Rhythm comes from the Greek words ῥυθμός (rhythm) and ῥέω (to flow). However, as Benveniste (1951) has shown, the semantic connection between rhythm and flow does not occur through the intervention of the regular movement of waves, as was often believed: In Greek one never uses rheo and rhythmos when referring to the sea. Rhythmos appears as one of the key words in Ionian philosophy, generally meaning "form," but an improvised, momentary, and modifiable form. Rhythmos literally signifies a "particular way of flowing." Plato essentially applied this term to bodily movements, which, like musical sounds, may be described in terms of numbers. He wrote in The Banquet: "The system is the result of rapidity and of slowness, at first opposed, then harmonized." In The Laws he arrived at the fundamental definition that rhythm is "the order in the movement." We will adopt this definition, which, even in its generality, conveys different aspects of rhythm.
However, an essential distinction asserts itself. Rhythm is the ordered characteristic of succession. This order may be conceived or perceived. We speak of the rhythm of the days and of the nights, of the seasons, and of rapid or of very slow physical phenomena (such as that of light frequencies or of the planets, respectively). If by direct or by indirect observation, we ascertain the successive phases of these phenomena, in none of these cases do we directly perceive the order, that is to say, the succession of the phases itself. The rhythm is thereby inferred from a mental construction.
However, there exist cases in which there is, properly speaking, the perception of rhythm, such as in dance, song, music, and poetry. We then find precisely the connection that Plato made between order and human movement: All of the rhythms that we perceive are rhythms which originally resulted from human activity. The first psychologists of the nineteenth century felt this relationship. Mach (1865) placed motor activity at the center of our experience of rhythm, and Vierordt (1868), several years later, began to record rhythmic movements.
Let us, nevertheless, insist on the fact that rhythm is a perceptual quality specifically linked to certain successions, a Gestaltqualität, according to von Ehrenfels's definition. At this point, in order to clarify the rest of our discussion, it is necessary to specify the characteristic traits of this rhythmic perception.
Most generally, we say that there is rhythm when we can predict on the basis of what is perceived, or, in other words, when we can anticipate what will follow. In this guise, we return to the idea of order found in Plato and in the most modern definitions, such as Martin's (1972): "Inherent in the rhythmic concept is that the perception of early events in a sequence generates expectancies concerning later events in real time (p. 503)."
This characteristic appears in its true form if we compare rhythm with arrhythmia. All sequences of random stimulations will be considered arrhythmic (see Section III,C,1). Nevertheless, one can more or less anticipate what is to follow, and from this the difficulties arise. At one extreme, we have the isochronous repetition of the same stimulus: the pulse, the march, the tick-tock of a clock. This repetition can be that of a pattern of stimuli having analogous structures, as in a waltz or an alexandrine. At the other extreme, we have a succession of relatively different patterns, as in free verse or in certain modern music.
The anticipation can only be temporal; that is to say, linked to the organization within the duration. But what is organized in this way? At this point, there is often a misunderstanding. Is rhythm born out of a series of stimuli whose temporal characteristics are fundamental, as one could be led to believe by a description of a poetic sequence in terms of breves and longs, or by the reading of a musical score where each note is of a precise length? Or is rhythm born out of the ordering of the temporal intervals among the elements marked by a difference in intensity, in pitch, or in timbre? These two propositions are opposed by all theorists of rhythm. The problem is, without doubt, as old as is music. Plato in The Republic already made fun of a critic of his era: "I vaguely remember," he wrote, "that he spoke of anapaestic verse. . . ; I don't know how he arranged them and established the equality of the up beat and of the fall by a combination of long and breve. . . ." The problem remains: Is rhythm the arrangement of durable elements, or is it the succession of more or less intense elements, the upbeat and the fall, the arsis and the thesis of the Greeks being the most simple example? We will see that both forms of organization exist, one type of relation prevailing over the other. Moreover, they are most often linked and interdependent. Rhythm is the perception of an order.
One of the perceptual aspects of rhythmic organization is tempo. It can be lively or slow. It corresponds to the number of perceived elements per unit time, or to the absolute duration of the different values of the durations. Evidently, one thereby passes from a definition based on frequency to a definition based on duration. We will use both of them. The possibility of rhythmic perception depends on tempo, because the organization of succession into perceptible patterns is largely determined by the law of proximity. When the tempo slows down too much, the rhythm and also the melody disappear.
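The two definitions of tempo, one based on frequency and one based on duration, are reciprocal, and it can help to see the conversion written out. The following is a minimal sketch; the function names are illustrative assumptions and do not come from the text.

```python
MS_PER_MINUTE = 60_000.0


def interval_to_rate_per_minute(interval_ms: float) -> float:
    """Frequency definition: elements per minute for a given interval (msec)."""
    if interval_ms <= 0:
        raise ValueError("interval must be positive")
    return MS_PER_MINUTE / interval_ms


def rate_per_minute_to_interval(rate_per_minute: float) -> float:
    """Duration definition: interval (msec) between elements for a given rate."""
    if rate_per_minute <= 0:
        raise ValueError("rate must be positive")
    return MS_PER_MINUTE / rate_per_minute


if __name__ == "__main__":
    # Illustrative values only: a 600 msec interval and a rate of 72 per minute.
    print(interval_to_rate_per_minute(600))
    print(round(rate_per_minute_to_interval(72), 1))
```

Under this conversion an interval of 600 msec corresponds to 100 elements per minute, so the same cadence can be reported in either form.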
We have chosen to begin this chapter with the most simple perceptions and to end up with the most complex ones, which are evidently those of artistic rhythms. The reason is not because the simple explains the complex, but because simple configurations can be more easily analyzed (see also Fraisse, 1956, 1974).
II. RHYTHM AND SPONTANEOUS TEMPO
A. Spontaneous Rhythmic Movements
The most easily perceived rhythm is one that is produced by the simple repetition of the same stimulus at a constant frequency. In the rest of this article, we will call this a cadence. The simplest examples are the beating of a clock or of a metronome. But the most important fact is that these rhythms are characteristic of some very fundamental activities such as walking, swimming, and flying. Both animals and people move about with rhythmic movements characteristic of their species. The first rhythmic movement found in the human new-born is sucking, with periods that follow at intervals of from 600 to 1200 msec. This regularity is interrupted by spontaneous pauses, but sucking movements occur at a cadence that seems to be characteristic for each infant. Later on, walking appears. While one of the limbs supports the body weight, the other swings forward, before serving, in turn, as support. In the adult there is also a brief period (100 msec) of double support. The duration of the step is about 550 msec, and corresponds to a frequency of 110-112 per minute (Mishima, 1965).
This frequency depends a little on anthropometric differences between individuals, age, and environmental conditions. This spontaneous activity, which is similar to a reflex, is a fundamental element of human motor activity. It plays an important role in all of the rhythmic arts.
Spontaneous activities reveal that physiological settings exist in the human organism, and, more generally, in all living things, which are regulated by peripheral afferents and, above all, by nervous centers situated at different levels. The tempo of walking, for example, seems to be determined by the medulla. From these centers and from their activity, we have other manifestations, among which it is necessary to cite the heartbeat (with an average of 72 beats per minute) and the electrical oscillations of the cerebral cortex with frequencies varying from 1 to 3 per second for delta waves, from 14 to 30 for beta waves, and most characteristically, from 8 to 13 for alpha waves.
We still do not know where the different biological clocks that assure the regularity of these phenomena are located. However, one can think that an autorhythmicity is characteristic of certain nervous tissues. Even though Sherrington has shown that a nerve follows the rhythm of an excitation whatever its frequency, it is not the same when the excitation crosses a nervous center. The frequency of the response is then different from that of the excitation. Also, the myotatic reflex follows the cadence of a mechanical excitation up to a frequency of 4 to 5 per second. For values higher than this, a halving of the frequency of the response occurs (only every second shock is effective), and for even higher values, only every third shock is effective (Paillard, 1955). This fact, reminiscent of the general properties of oscillating circuits, is suggestive.
Among spontaneous movements, it is necessary to cite rocking, which clearly intervenes in games or in dances, but which is manifested from the most tender age on by the beating of the foot of the newborn lying on his back (average age of appearance 2.7 months). As soon as the child can remain seated, the rocking of the trunk appears (toward 6 months). This rocking of the trunk can be considered essentially as a movement of the head, of which one observes different modalities: rocking while on all fours (forward-backward), standing, or on the knees. In most children these movements are transitory, but in others they can last for months, sometimes until 2-3 years and even until 5 years of age. One encounters, moreover, other forms of rocking in the older child (movement of the legs, for example, while seated). Also, the use of the rocking-chair is not without a relation to this type of behavior. There is little precise information regarding the frequency of these rockings. They occur within the range of spontaneous tempos (0.5-2 per second), and this frequency depends on the muscular mass concerned.
It is necessary to note that rocking is related neither to vegetative functions (as is the heart) nor to relational functions (sucking, walking); it appears when the child is idle or at the moment of falling asleep. In the adult it also translates into an absence of voluntary control or a state of distraction. These movements seem to correspond to a regulation of nervous tension. The postural activity with its tonic effects then takes preponderance over the relational activity (Wallon, 1949). It appears above all when the possibility of communicating with the environment is reduced, as happens through the intervention of an adult or through illness. It is frequent in mental deficiency, neurosis, or dementia. In all of these cases, rocking seems to aim at the maintenance of a state of excitation, and it has a heavy affective connotation.
B. Spontaneous Tempo
The periodic activities that we have just mentioned have their own spontaneous tempo. Stern (1900) thus thought that a psychic tempo characteristic of voluntary activity exists. In order to determine this tempo, he proposed a simple motor activity: tapping a spontaneous tempo on a table. History has shown the fecundity of this proposed test, but has not confirmed the existence of a psychic tempo characteristic of all an individual's activities. There are only weak correlations between the different repetitive tests executed at a spontaneous tempo. Factorial analysis always reveals a plurality of factors (Allport and Vernon, 1933; Rimoldi, 1951).
Spontaneous tempo, also called personal tempo (Frischeisen-Köhler, 1933a) or mental tempo (Mishima, 1951-1952) and measured by the natural speed of tapping, is of great interest. The length of the interval between two taps varies, according to the authors, from 380 to 880 msec. One can assert that a duration of 600 msec is the most representative. All of the research underscores the great interindividual variability of this tempo (from 200 to 1400 msec, Fraisse, Pichot, & Clairouin, 1949). By contrast, individual variability is slight. One can verify this within a trial: the variability of intervals is from 3 to 5%, which is in the range of the differential threshold for durations of this type. Also, there is great reliability from one trial to another: the correlations are of the order of .75 to .95 (Harrel, 1937; Rimoldi, 1951). This reliability indicates that spontaneous tempo is characteristic of the individual, a statement reinforced by twin research. Differences in tempo between two identical twins (homozygous) are no larger than between two executions of spontaneous tempo by the same subject; however, the differences between two heterozygous twins are as great as between two individuals chosen at random (Frischeisen-Köhler, 1933a; Lehtovaara, Saarinen, & Järvinen, 1966).
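To make the notion of within-trial variability concrete, the sketch below computes the intervals between successive taps and their relative variability. It assumes that "variability" means the standard deviation of the intervals divided by their mean, which the text does not specify; the tap times are invented for illustration and are not experimental data.

```python
from statistics import mean, stdev


def inter_tap_intervals(tap_times_ms):
    """Intervals (msec) between successive taps."""
    return [later - earlier for earlier, later in zip(tap_times_ms, tap_times_ms[1:])]


def relative_variability(intervals_ms):
    """Sample standard deviation of the intervals as a fraction of their mean.

    Assumption: this is one plausible reading of 'variability of intervals';
    the source does not state which measure was used.
    """
    return stdev(intervals_ms) / mean(intervals_ms)


if __name__ == "__main__":
    # Hypothetical tap times around a 600 msec spontaneous tempo.
    taps = [0, 600, 1230, 1800, 2415, 3000]
    intervals = inter_tap_intervals(taps)
    print(intervals)
    print(f"{100 * relative_variability(intervals):.1f}%")
```

With these made-up taps the mean interval is 600 msec and the relative variability is a little under 4%, which falls in the 3 to 5% range mentioned above.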
Spontaneous tempo of the forefinger has a good correlation with that of the palm of the hand, with the swinging of the leg of a seated subject, and with the swinging of the arm when the subject is standing (Mishima, 1965).
It is necessary to distinguish between spontaneous tempo and preferred tempo. The latter corresponds to the speed of a succession of sounds or of lights that appears to be the most natural, that is to say, to a regular succession judged as being neither too slow nor too fast. Since the nineteenth century a number of German scientists have sought the interval which appeared to be neither too short nor too long. The most frequent determination has been about 600 msec. In this regard, Wundt examined the natural duration of associations between two perceptions, and he proposed a value of 720 msec.
Since Wallin (1911), the preferred tempo has most frequently been measured using a metronome. The results found are fairly close to 500 msec (Wallin, 1911; Frischeisen-Köhler, 1933b; Mishima, 1956). Possibly, this value is in part determined by the scale of tempos which the metronome offers. The preferred tempo of an individual is as constant as is spontaneous tempo, but the correlations between the two tests are not higher than .40 (Mishima, 1965).
It is striking that the rhythm of the heart, of walking, of spontaneous and of preferred tempo are of the same order of magnitude (intervals of from 500 to 700 msec). It has been tempting to study whether one of these rhythms serves in some way as a sort of pacemaker for the others. The rhythm of the heart, the most often invoked, is not correlated with spontaneous tempo (Tisserand & Guillot, 1949-1950). Moreover, it has been verified several times that an acceleration of the heartbeat does not correspond to an acceleration of spontaneous tempo. By contrast, one finds a noteworthy correlation between the rhythm of walking and of spontaneous tempo (.28,
C. Motor Induction and Synchronization
Spontaneous motor tempo and preferred tempo do not only have comparable frequencies, but observations and experiments show that they are also often associated. People fairly easily accompany with a motor act a regular series of sounds. This phenomenon spontaneously appears in certain children toward one year of age, sometimes even earlier. Parents are surprised to see their child sitting or standing, rocking in one way or another while listening to rhythmic music. From the age of 3-4 years on, the child is capable of accompanying, when requested, the beating of a metronome (Fraisse et al., 1949). This accompaniment tends to be a synchronization between the sound and a tap, that is to say, the stimulus and the response occur simultaneously.
This behavior is all the more remarkable, as it constitutes an exception in the field of our behaviors. As a general rule, our reactions succeed the stimuli. In synchronization the response is produced at the same time as the appearance of the stimulus. A similar behavior is possible only if the motor command is anticipated in regard to the moment when the stimulus is produced. More precisely, the signal for the response is not the sound stimulus but the temporal interval between successive signals. Synchronization is only possible when there is anticipation, that is, when the succession of signals is periodic. Thus, the most simple rhythm is evidently the isochronal production of identical stimuli. However, synchronization is also possible in cases of more complex rhythms. What is important is not the regularity but the anticipation. The subjects can, for example, synchronize their tapping with some series of accelerated or decelerated sounds, the interval between the successive sounds being modified by a fixed duration (10, 20, 50, 80, or 100 msec). Synchronization, in these cases, remains possible, but its precision diminishes with the gradient of acceleration or of deceleration (Ehrlich, 1958).
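The idea that the signal for the response is the interval between successive sounds, rather than the sound itself, can be illustrated with a deliberately simple predictor. The sketch below is an assumption made for illustration, not a model proposed in the text: it anticipates the next sound by repeating the last heard interval, and all function names are hypothetical.

```python
def onset_times(first_interval_ms, step_ms, n_sounds):
    """Onsets (msec) of a series whose successive intervals change by a fixed step.

    step_ms = 0 gives an isochronous series, a positive step a decelerating
    series, and a negative step an accelerating one.
    """
    times = [0.0]
    interval = first_interval_ms
    for _ in range(n_sounds - 1):
        if interval <= 0:
            raise ValueError("intervals must stay positive")
        times.append(times[-1] + interval)
        interval += step_ms
    return times


def predict_next_onset(heard_times):
    """Anticipate the next onset by repeating the last heard interval."""
    if len(heard_times) < 2:
        raise ValueError("at least two sounds are needed to form an interval")
    return heard_times[-1] + (heard_times[-1] - heard_times[-2])


def anticipation_errors(times):
    """Predicted minus actual onset for every sound from the third one on."""
    return [predict_next_onset(times[:i]) - times[i] for i in range(2, len(times))]


if __name__ == "__main__":
    print(anticipation_errors(onset_times(600, 0, 5)))   # isochronous series
    print(anticipation_errors(onset_times(600, 50, 4)))  # decelerating by 50 msec
```

With this naive predictor, an isochronous series can be anticipated without error from the third sound on, whereas a series whose intervals change by a fixed step yields an error equal to that step. This is at least consistent with the observation that precision diminishes as the gradient of acceleration or deceleration grows, although a real subject presumably does something more refined.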
The spontaneity of this behavior is attested to by its appearance early in life and also by the fact that the so-called evolved adult has to learn how to inhibit his involuntary movements of accompaniment to music. Experiments confirm these observations. When subjects were presented with a regular series of sounds and asked to tap for each sound, they spontaneously synchronized sound and tap. When asked not to synchronize but to respond after each sound, as in a reaction-time experiment, all of the subjects found this task difficult, the more so the higher the frequency of sound (Fraisse, 1966). The same difficulty arose when the subjects were asked to syncopate, that is, to interpolate the series of taps between the series of sounds. The subjects habitually succeeded only when the intervals between the sounds were longer than one second (Fraisse & Ehrlich, 1955). Conversely, it has been shown (Fraisse, 1966) that synchronization is established very rapidly, and that it is acquired from the third sound on. Let us add, in anticipation of what is to follow, that the synchronization of repetitive patterns is also realized from the third pattern on.
Not only is synchronization possible at the frequency of preferred tempo, but it is also possible in the whole range of frequencies of spontaneous tempos. More precisely, one observes that synchronization is most regular for intervals of 400 to 800 msec. If the frequencies are faster or slower, the separation between taps and sounds is more variable. For rapid cadences it is, above all, a perceptual problem: the interval between two sounds is not perceived exactly enough to permit precise synchronization (Michon, 1964). The subject oscillates between exaggerated anticipations and delays when the tap follows the sound as in a reaction time situation (Fraisse, 1966). In conclusion, synchronization is possible for intervals between the sounds of 200 to 1800 msec.
The synchronization that we have considered in its most elementary forms plays a fundamental role in music, not only in dance but also in all instances in which several musicians play together. The unity of their playing is possible only when they are capable of anticipation. One of the roles of the conductor of an orchestra is to furnish the signals that will result in synchronization between musicians.
D. Subjective Rhythmization
If one listens to identical sounds that follow each other at equal intervals, that is to say, a cadence, these sounds seem to be grouped by twos or by threes. Since nothing objectively suggests this grouping, this phenomenon has been termed subjective rhythmization. This expression, which appeared at the end of the nineteenth century (Meumann, 1894; Bolton, 1894), must today be considered inadequate, because all perceived rhythm is the result of an activity by the subject since, physically, there are only successions. The observations made using sound series were later confirmed by using visual series (Koffka, 1909). When one thus listens to a cadence, introspection reveals that grouping seems to correspond to the lengthening of one of the intervals. If one continues to be attentive, it seems that one of the elements of the group, the first in general, also appears to be more intense than the others.
In speaking of synchronization, it is necessary to specify what is synchronized with what. In effect, if one measures the temporal separation between a tap of the forefinger and the sound, one finds that the tap slightly anticipates the sound by about 30 msec. The subject does not perceive this error systematically. This was pointed out as early as 1902 by Miyake. Moreover, this error is greater if the sound is synchronized with the foot. The difference between hand and foot permits us to think that the subject's criterion for synchronization is the coincidence of the auditory and of the tactile-kinesthetic information at the cortical level. For this coincidence to be as precise as possible, the movement of tapping should slightly precede the sound in order to make allowance for the length of the transmission of peripheral information. This length is all the greater when the distance is longer (Fraisse, 1980, pp. 252-257).
These introspective notions were confirmed by authors who asked their subjects to accompany each of the sounds by a tap (Miyake, 1902; Miner, 1903; MacDougall, 1903; Temperley, 1963). The recording of the taps corresponded to introspective observation. There were temporal differentiations and corresponding accentuations. This phenomenon, today seemingly banal, which preceded the work of the Gestaltists, was considered extraordinary. Its significance remains important at the present time since it underscores the perceptual and spontaneous character of rhythmic grouping.
In order to understand this, let us take the example of the tick-tock of a clock. The sounds are linked together in groups of two. Let us suppose that one can slow down this tick-tock indefinitely. There comes a moment when the tick and the tock are no longer linked perceptually. They appear as independent events. This upper limit is also that where all melody disappears, and is substituted by isolated notes. The limit proposed by
On the motor level, we also find an optimum of about 600 msec for perceptual organization. This length is also that which is perceived with the greatest precision (Fraisse, 1963, p. 119; Michon, 1964).
The importance of all of these parameters will appear when we discuss more complex rhythms.
III. RHYTHMIC FORMS
A. Regular Groupings
As soon as a difference is introduced into an isochronous sequence of elements, this difference produces a grouping of the elements included between two repetitions of the difference. One then speaks of objective rhythmization. This difference can be a lengthening of a sound, an increase in its intensity, a change in pitch or in timbre, or simply a lengthening of an interval between two elements. This fact suggests two types of questions: (a) the possible durations of rhythmic groups and (b) the nature of the effects principally produced by modifications in intensity or in duration.
1. The Duration of Groups
If one accentuates or lengthens one sound out of two, three, or four, this produces the perception of a repetitive group of two, three, or four elements. We already know that the interval between the elements is important and that the perception of the rhythm, objective or subjective, disappears if the intervals are either too short or too long. By asking subjects to produce groups of three or four taps, we found that there was, on the average, an interval of 420 msec between the taps of groups of three, and of 370 msec between the taps of groups of four (Fraisse, 1956, p. 15).
Within these limits of succession one can perceive groups of from two to six sounds that correspond to the boundaries of our immediate memory or of our capacity of apprehension. In order to obtain a good rhythmization, it is necessary when the number of elements increases to increase the frequency of the successive sounds. MacDougall (1903), while employing a method of production, found that the longer the groups were the faster the frequencies of the sounds. Thus, a group of four is only 1.8 times as long as a group of two, while a group of six is 2.2 times as long. Everything happens as if the subject was trying to strengthen the unity of the group when the number of elements to be perceived is larger.
By employing the method of reproduction of auditory series (while preventing the subject from counting by a concomitant verbalization), we found that for an interval of 170 msec, 5.7 elements were accurately perceived (total duration 800 msec); for an interval of 630 msec, 5.4 elements (total duration 2770 msec); and for an interval of 1800 msec, 3.3 elements (total duration 4140 msec). Thus, there is an interaction between the number of elements and their frequency. The total length of possible groupings depends on both. However, more complex groupings of sounds can be perceived (such as those that we will study in Section III), if subunits analogous to those that are called "chunks" can be created. Thus, one can come to perceive about 25 sounds as a unity (Dietze, 1885; Fraisse and Fraisse, 1937) if they form five subgroups of five sounds following each other at a rapid frequency (180 msec). However, the total length of the groups, in this extreme case, cannot be more than 5 sec.
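MacDougall's ratios can be turned into an implied change of tempo with a line of arithmetic. The sketch below assumes that the length of a group is proportional to the number of its elements multiplied by the mean interval between them; this is an assumption made for illustration, since the text does not say how group length was measured, and the function name is hypothetical.

```python
def relative_interval(n_elements, length_ratio, reference_elements=2):
    """Mean interval of an n-element group relative to that of the reference group.

    Assumes group length = number of elements x mean interval, so that
    length_ratio = (n_elements * interval_n) / (reference_elements * interval_ref).
    """
    return length_ratio * reference_elements / n_elements


if __name__ == "__main__":
    for n, ratio in [(4, 1.8), (6, 2.2)]:
        print(n, round(relative_interval(n, ratio), 2))
```

At a constant tempo a group of four would be 2 times, and a group of six 3 times, as long as a group of two. Under the stated assumption, the reported ratios of 1.8 and 2.2 correspond to mean intervals of about 0.9 and 0.73 of those in the group of two, that is, to a progressively faster succession.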
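The relation between number of elements, interval, and total duration in these results can be checked with simple arithmetic. The sketch below assumes that the total duration runs from the first to the last element, so that n elements span n - 1 intervals, and it takes the shortest interval to be 170 msec, the value implied by the reported total of 800 msec; both points are assumptions, and the function name is hypothetical.

```python
def span_ms(n_elements, interval_ms):
    """Duration (msec) from the first to the last of n equally spaced elements."""
    return (n_elements - 1) * interval_ms


if __name__ == "__main__":
    # (interval in msec, elements accurately perceived, reported total in msec)
    reported = [(170, 5.7, 800), (630, 5.4, 2770), (1800, 3.3, 4140)]
    for interval, n, total in reported:
        print(interval, n, round(span_ms(n, interval)), total)

    # Five subgroups of five sounds at 180 msec, treated as 25 equally spaced sounds.
    print(span_ms(25, 180))
```

The computed spans (roughly 799, 2772, and 4140 msec) agree with the reported totals to within rounding, and the 25-sound case comes to 4320 msec, inside the 5 sec limit.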
This limit is found in the rhythmic arts. The slowest adagio in a 9/4 bar is no longer than 5 sec, and the longest lines of poetry have from 13 to 17 syllables, the time necessary to recite them being no longer than from 4 to 5 sec (Wallin, 1901). This length of from 4 to 5 sec is, however, an extreme limit that allows only unstable groupings. For the groups of sounds produced by subjects, MacDougall (1903) gives 3 sec as a practical limit. According to Sears (1902), the average length of a musical bar in religious hymns is 3.4 sec. According to Wallin (1901), the average duration of lines of poetry is 2.7 sec.
This duration limit corresponds to what has been called the psychological present. We know that we can perceive, relatively simultaneously, a series of successive events (for example, a telephone number or the elements of a sentence). This phenomenon is also called short-term storage or even precategorical acoustic storage (Crowder & Morton, 1969). We prefer, however, in the case of rhythm to speak of the psychological present. This term expresses well the organization of a sequence of events into a perceptual unity. It corresponds to our limit in organizing a succession. A similar unity introduces a perceptual discontinuity in the physical continuum into the psychological present. One should not repeat the mistake that James (1891) made when he thought that there was a continuous sliding of the present into the past. He cited as an example the recitation of the alphabet. If one's present is at moment t: C D E F G, at moment t + 1 it will be D E F G H, C having disappeared and having been replaced by H. This analysis is inexact. Language, as well as rhythm, shows that one group of stimuli succeeds another group.
Today, it is easier to accept as true the principle of the temporal Gestalt. At the beginning of the twentieth century, psychologists were very much preoccupied with associative links as the basis of the unification of rhythmic groups. Two hypotheses were dominant about 1900. One, introduced for the first time by
2. Factors in Grouping
Any differentiation in an isochronous series of identical elements serves as a basis for grouping. However, they do not all have the same effects as far as the organization of a temporal series is concerned. In general, a noteworthy lengthening of the duration of a sound or of the interval between two sounds determines the end of a group; this longer duration allows one to distinguish between two successive patterns. It imposes itself in subjective rhythmization. This lengthening creates a rupture between two groups. We call it a pause. Its duration is not random and cannot be assimilated to a gap or to a ground according to Rubin's terminology. In effect, the perception of rhythm is not only that of a grouping but also that of a linking of groups called Gestaltverkettung by Werner (1919) and Fugengestalten by Sander (1928). In his 1909 dissertation, Koffka noted the following striking behavior: If a subject was given three lights (a, b, c) and if he was asked to continue the rhythm, he not only reproduced the intervals between a and b and c but also linked the groups of lights as though the interval between the final c of the first group and the initial a of a second one had been proposed to him. Out of seven subjects five did not even see that there was a problem.
When asked to tap regularly in groups of three or of four, subjects spontaneously separated the groups by a pause that was from about 600 to 700 msec. With more complex patterns the duration of the pause was at least equal to that of the duration of the longest interval inside the pattern. Otherwise, there was a reorganization of the pattern, so that the longest interval played the role of the pause. However, the pause was never longer than 1800 msec; since, if such were the case, there would no longer be the perception of a chain of patterns but only the perception of isolated patterns. Wallin (1901) found that the pause at the end of a line of poetry was, on the average, 680 msec. Evidently, pauses in the strict sense of the word do not exist in a musical sequence; still, one exists at the end of each pattern in the form of a slight interval.
Most often, it has been assumed that the structuring of patterns was based on the accentuation of one or several elements. This accentuation already appears, as we have seen, in subjective rhythmization. It is important in music where the pause stricto sensu does not play a role. The accented element, when it determines the length of a group, also determines the nature of the grouping. The objective accent is situated most spontaneously at the beginning of the pattern. This fact has already appeared in subjective rhythmization.
A regular succession of a strong and of a weak sound of equal duration is perceived in 60% of the cases as a succession of trochees (strong-weak) and in 40% of the cases as a series of iambuses (weak-strong) (Fraisse, 1956, p. 95). Discussions have continued, and are continuing, concerning the relative role of accents and of pauses. In reality, there is an interaction between the two factors producing segregation. An important lengthening of a sound leads it to play the role of a pause. A slight lengthening of the duration of a sound makes it appear more intense and confers upon it the role of an accent. It then, most often, becomes the first element of a pattern. Reciprocally, the accenting of an element slightly modifies its duration or, if one prefers, the interval that follows it. Thus, while synchronizing the taps with a regular series of strong and weak sounds, we found, as did initially Miyake (1902), that the intervals between the taps depended on the perceived structure (Fraisse, 1956, pp. 95-96).
trochee 484-452 msec
iambus 432-520 msec
A general fact is observed: the most intense element is lengthened. But it is lengthened more when it terminates the pattern, as the effect of the properly so-called accent adds itself to the effect of the pause.
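The tap intervals reported for the trochee and the iambus can be compared directly. The sketch below assumes that, in each row, the first value is the interval following the first element of the pattern and the second value the interval following the last element; this reading is an assumption, and the data structure and function name are illustrative.

```python
# Intervals between taps (msec) as reported for the two perceived structures.
# Assumption: intervals are listed in the order of the elements they follow.
patterns = {
    "trochee": {"order": ("strong", "weak"), "intervals_ms": (484, 452)},
    "iambus": {"order": ("weak", "strong"), "intervals_ms": (432, 520)},
}


def lengthening_of_strong(pattern):
    """Interval after the strong element minus interval after the weak one (msec)."""
    by_element = dict(zip(pattern["order"], pattern["intervals_ms"]))
    return by_element["strong"] - by_element["weak"]


if __name__ == "__main__":
    for name, pattern in patterns.items():
        print(name, lengthening_of_strong(pattern), sum(pattern["intervals_ms"]))
```

On this reading the interval after the strong element exceeds the one after the weak element by 32 msec in the trochee and by 88 msec in the iambus, in line with the remark that the lengthening is greater when the intense element terminates the pattern.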
The most intense sound is spontaneously lengthened even by musicians (Stetson, 1905; Vos, 1973). In prosody when the structure is fundamentally in breves and longs, the accent is always placed on the long element. There is between the lengthening of the duration and the accent, a certain functional and perceptual equivalence. If the more intense sounds are perceived as being longer, the longer sounds are perceived as being more intense.
Evidently, by modifying the intervals between sounds of different intensities, one can make the trochees or the iambuses, for example, more frequent. This experiment was performed by Woodrow (1909) who was able to establish a point of indifference where the iambus had as many chances of being perceived as the trochee, the dactyl as the anapaest. It suffices to lengthen relatively the interval that follows the weak sound or that precedes the strong sound. Since grouping can be obtained by modifying durations or accents, and since there is an interaction between these two factors, simplistic conclusions can be discounted such as those which affirm that the perception of rhythm is based only on the perception of durations or of accents. But the roles of durations and of accents are not the same. The duration of the elements or of the intervals which separate them (in the rhythmic arts these are barely distinguishable) is always a precise quantity. Experiments show that in performances durations are less variable than are accents. By making subjects tap repetitive forms, Brown (1911) found that the relative variability of accents was of the order of 10 to 12% and those of durations of 3 to 5%. This result was confirmed by Schmidt (1939). One finds the same results in vocal performances.
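To make the notion of relative variability concrete, here is a small Python sketch. It assumes that relative variability means the coefficient of variation (standard deviation divided by the mean), which the studies cited may or may not have used, and the tapping values are hypothetical, chosen only to fall near the reported orders of magnitude.

```python
from statistics import mean, pstdev


def relative_variability(values):
    """Coefficient of variation in percent: standard deviation / mean * 100."""
    return 100 * pstdev(values) / mean(values)


# Hypothetical data for one element of a repeated pattern (not from the
# studies cited): its duration and the force of the accented tap.
durations_msec = [300, 310, 290, 305, 295]
tap_forces = [1.00, 1.15, 0.85, 1.10, 0.90]  # arbitrary units

print(f"durations: {relative_variability(durations_msec):.1f}%")
print(f"accents:   {relative_variability(tap_forces):.1f}%")
```

With these invented values the durations vary by roughly 2% and the accents by roughly 11%, the same kind of contrast as the one described above.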
One can then modify, in the repetition of patterns, the strength (but not the place) of some accents without modifying the nature of the perceived rhythm. One can also modify the duration of the elements, but to a lesser degree. Variations of about 6% do not in any way alter the nature of the rhythm. They are still acceptable at 12% but not beyond (Wallin, 1911). Modifications of accents can be very much more important, and artists use them a great deal.
We have reasoned until now as though accentuation signified an increase in intensity. However, accents, as we said, can be produced by a slight increase in duration. They can also be obtained by changes in pitch or in timbre. A change in pitch brings with it rhythmic segregation. However, according to the best study (Woodrow, 1911), the highest sound can be spontaneously placed at the beginning as well as at the end of a group. There is also an interaction of some sort between the intensities and the pitches of sounds. In a series of sounds the highest appears subjectively as the most intense, and vice versa.
If differences in duration, in intensity, and in pitch can organize rhythmic groups, the intensity of the accent has a specific role that we have already noticed in subjective rhythmization. The periodic repetition of accents more or less induces motor reactions that repeat themselves regularly and reinforce the salience of the perceived patterns.
We have described the modifications, which in a series of isochronous sounds, produce rhythmic structures (Fraisse, 1975). We have not explained them. We can state that these perceptual laws are identical to those pointed out by the Gestaltists, and in particular by Wertheimer (1922, 1923). The pause underscores the importance of proximity, the accent, that of repetition of identical elements, or of good continuation.
This comparison still does not explain anything, as Gestalt laws are themselves unexplained. However, current research on the perceptions of the new-born has at least shown the great precocity of these perceptual laws in the spatial domain (Vurpillot, Ruel, & Castrec, 1977). It has recently been demonstrated that the very young child is also sensitive to differences in rhythms. Demany, McKenzie, & Vurpillot (1977), by using a habituation paradigm with an operant response consisting of the fixation of a visual target, showed that the new-born child (71 ± 12 days) discriminates a series of isochronous sounds (duration 40 msec with intervals of 194 msec) from a series of patterns of four sounds separated by intervals of 194, 97, 194, 297 msec. They can also discriminate a pattern of the type 97, 291, 582 msec from another pattern 291, 97, 582 msec. These sequences are here described in an arbitrary manner, as we remain ignorant as to how the child groups sounds.
However, a child of this age does not perceive a difference in tempo between a series composed of sounds of 500 msec followed by intervals of 500 msec and another series composed of sounds of 1000 msec followed by intervals of 1000 msec (Clifton & Meyers, 1969). Does the technique of this experiment have any flaws? Were the chosen tempos too slow? One does not know at present, but Berg (1974) and then Leavitt, Brown, Morse, & Graham (1976) found that a simple change of tempo between two simple structures (series of sounds of 400 msec followed by an interval of 600 msec compared with a series of sounds of 800 msec followed by an interval of 1200 msec) was discriminated. According to Chang and Trehub (1977), children at 5 months are capable of discriminating groups of two from groups of four sounds (children of this age are also capable of discriminating between identical groups composed of different sounds). Allen, Walker, Symonds, and Marcell (1977) also found that children can at 7 months distinguish an isochronous succession from an iambic type of grouping.
Rhythmic grouping thus appears very early in life. In consequence, hypotheses that consider it as a voluntary activity, such as the pulse of attention or a motor accompaniment, are invalid. Furthermore, the law of proximity seems to be very primitive in time, as it is in space.
In order to summarize the effects of pause and of accent, we can say:
1. Any noteworthy lengthening of a sound or of an interval between sounds plays the role of a pause between two successive groups.
2. Any sound, qualitatively different from the others, especially as to its intensity, plays the role of an accent that begins the group. When these two principles act simultaneously, one can as well say that the pause preceded the accent (Vos, 1977) as the converse.
3. More intense sounds are perceived as relatively lengthened and longer sounds as relatively more intense.
B. Patterns in Time
We will now attempt to understand the laws underlying the organization of groups when several elements are different from each other. In other words, we are going to study patterns in time (Handel, 1974). One preliminary remark is important. In a complete series of sounds the first perceived pattern tends to impose its structure on the later patterns. It becomes a privileged form of grouping (Preusser, Garner, & Gottwald, 1970a), and this fact confirms the importance of predictability as the basis of rhythmic perception. In order to avoid this effect, it is necessary to use artifices so that no pattern imposes itself due to its initial position. One can, for example, increase the intensity or the frequency of the sounds little by little, use long presentations in order to allow reorganizations, present random series before ordered ones, etc.
1. What happens if in a potential pattern there are several intense sounds? We mixed loud sounds (L) of 100 dB with softer sounds (S) of 75 dB (Fraisse & Oléron, 1954) in patterns of four or five sounds. The sounds were brief and between them the intervals were equal (475 msec for four sounds, 380 msec for five sounds). The subjects had to listen to a long series and then consecutively reproduce three times the perceived patterns by tapping on a key which thereby enabled the measurement of the force of the tap. The subjects grouped, as much as possible, sounds of the same intensity, all of which resulted in the construction of runs. Thus, beginning with a sequence L S S L L S S L S S, etc., one does not perceive the pattern L S S L but the pattern L L S S and, less often, S S L L. The pattern includes the smallest possible number of runs and is at the same time the simplest.
2. What happens if the number of loud sounds is greater than that of soft sounds? The subjects in their reproduction can invert the relative force of the elements and reproduce, for example, the series L L L S in the form S S S L or L S S S. These inversions are reminiscent of those which one can perceive in figure-ground reversals of spatial forms and allow us to think that, in these cases, the differentiating element is the least frequent one.
We have found the same phenomena by using sounds of different pitches (for example, sounds of 1040 Hz in combination with sounds of 760 or 520 Hz) (Ehrlich, Oléron, & Fraisse, 1956). The subjects tend to regroup so that the sharpest sounds tend to begin the group. With the same technique of reproduction, we have also found that the least numerous elements, high or low, were accented.
This type of research was developed by Garner and his colleagues using longer series. The first study was by Royer and Garner (1966). The patterns had eight sounds of two different types (two buzzers) and were repeated until the subject was capable of finding a pattern and of tapping it on two different keys. The authors of this research were, above all, concerned with estimating the effect of response uncertainty, evaluated in bits, on the identification of a pattern in a series. The first observation they made was that the subjects did not proceed by trial and error; they only began to respond when they had identified a pattern, and at that moment, the pattern would be responded to in complete synchrony and with little difficulty. Thus, construction does not proceed element by element, but wholistically. As for the rest, their hypothesis was partially confirmed. The simple patterns were organized quickly and the complex ones more slowly, which confirms Garner's (1962) thesis according to which perceptually good patterns should have few alternatives. This research showed that the most often chosen organizations were those in which the number of changes was minimal, that is, where the sounds of the same quality grouped themselves to the maximum extent. Thus, the pattern X X X O X X O O was the most often perceived (31 times out of 128); however, the pattern X O X X O O X X was practically never perceived (once out of 128). As one also notices by this example, the longest run tended to begin the pattern. It could sometimes end the pattern, but it was practically never in the middle.
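The idea that preferred organizations minimize the number of changes can be illustrated with a short Python sketch. It simply enumerates every possible starting point of the repeating eight-element cycle and counts the runs in each; the function names are illustrative and this is not the scoring procedure used in the original study.

```python
def count_runs(pattern):
    """Number of runs (maximal blocks of identical symbols) in a linear pattern."""
    return 1 + sum(a != b for a, b in zip(pattern, pattern[1:]))


def rotations(cycle):
    """Every starting point of a repeating cycle, written as a linear pattern."""
    return [cycle[i:] + cycle[:i] for i in range(len(cycle))]


def fewest_run_organizations(cycle):
    """Rotations whose number of runs is minimal."""
    candidates = rotations(cycle)
    best = min(count_runs(p) for p in candidates)
    return [p for p in candidates if count_runs(p) == best]


cycle = "XXXOXXOO"
for pattern in rotations(cycle):
    print(pattern, count_runs(pattern))
print(fewest_run_organizations(cycle))
```

The frequently perceived organization XXXOXXOO has four runs, whereas XOXXOOXX, a starting point within the same cycle that splits a run, has five. Run counting alone does not choose among the four-run rotations; the position of the longest run, discussed in the text, does that.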
Later research (Royer & Garner, 1970; Preusser, Garner, & Gottwald, 1970a), most often using sounds presenting little difference in pitch, has confirmed this result. The longest run was placed at the beginning or, more frequently, at the end of the pattern. The solution evidently depended on the structure of the whole and on the relative length of the longest run. Thus, in the example mentioned above, the longest run more often began the pattern than finished it; another pattern X X X X O O X O was perceived only in 36 cases out of 128 whereas the pattern O O X O X X X X was perceived 61 times out of 128.
The place and the role of the run were the main determinants of grouping. Others could also play a role, all of which corresponded to making simplifications prevail. Thus, when possible, the subject chose a directional simplicity with run lengths either increasing or decreasing in regular order (Preusser, Garner, & Gottwald, 1970b). The most redundant and/or symmetric forms (for example, X O X O, or still, X X O O X O O O, where the first, third, and fifth elements were conspicuous) were more easily perceived than the pattern X X O X O X O O that did not have a simple structure nor a longer run than the others (Sturges & Martin, 1974).
Preusser et al. (1970b) and also Handel (1974) have analyzed these results in terms of figure-ground relations by claiming that one of the elements plays the role of figure, and the other that of ground. In particular, they rely on the fact that if one of the elements is replaced by an empty interval, the laws of organization are the same. In the case in which the longest run is at the beginning of a pattern, it plays the role of figure and the pattern obeys the run principle. When it is at the end, it plays the role of ground, and the authors then speak of the gap principle.
The distinction between figure and ground, however, does not appear to be relevant. In a rhythmic structure there is no ground. Even the empty intrapattern intervals are part of the structure. As for the interpattern intervals, even though they have a different status, they nevertheless form links between the successive subpatterns. In the types of structures used in this research, there is no pause, stricto sensu, between the patterns. We also think that when the longest run is at the end of a pattern, it plays the role of accent more than that of a gap. Reference to poetry or to music, moreover, helps us to understand that all of the elements that structure a succession play a role. There is, between all of them, a relation that is not that of all-or-none (figure-ground) but of a hierarchy of salience.
What is the influence of tempo on patterns in time? All of the presented results were obtained using frequencies of two to three per second. What is the result when one increases or decreases this frequency? A frequency of two to three per second appears as an optimum. For more rapid frequencies (eight per second), more time is needed in order to discover the pattern in the presented sequence. However, the structuring phenomenon still occurs as an "integrated, immediate, compelling, and passive" process. In contrast, at the lowest frequency, 0.8 per second (which still is not very low), the subject constructs the pattern that is learned little by little according to an "integrated, derived, intellectualized and active" process (Garner & Gottwald, 1968).
Garner and Gottwald have also found that, at the lowest frequency, the structuring of patterns was all the more difficult if they deviated more from patterns constructed according to the run principle. Preusser (1972) systematically stated the problem of the interaction between the frequency of the elements and the structuring of the patterns. With two sounds of 238 and 275 Hz, the frequency being rapid (four per second), the subjects tended to place the longer run at the end of the pattern, making it play the role of a gap according to Preusser, of an accent plus pause according to us. At the slowest frequency (one per second), the longest run tended to begin the pattern. Why, at the most rapid frequency, was the run at the end? This solution seems to be characteristic of perceived rhythm whereas the initial run would be more characteristic of constructed rhythm, if we use Garner and Gottwald's distinction. Moreover, Preusser used two criteria in order to detect a pattern. One was to reproduce the pattern on one or two keys. The other consisted of asking the subject to describe the perceived rhythm by means of symbols. The delay necessary to describe the pattern is at least twice that which is necessary to reproduce it. This fact, previously found by Oléron (1959), confirms the wholistic character of rhythmic perception and also the compatibility between perceived patterns and motor patterns. In order to describe rhythms, it is necessary to analyze their structure, but this analysis is not necessary in order to reproduce them.
C. Patterns of Time
Rhythm, understood as "order of movement" is evidently based on an order which is primarily temporal. Until now we have envisaged only the most simple temporal situation: the isochronous repetition of sounds. What are the more complex temporal situations that permit perception of rhythm, and following Handel's expression (1974), what are the characteristic patterns of time?
1. Rhythm and Arrhythmia
If rhythm is order, arrhythmia is disorder (i.e., it is, a priori, a sequence of continuous sounds where no temporal organization is perceptible). A computer can create this type of sequence. Can man? We asked subjects to produce an uninterrupted series of taps as irregularly as possible. We also asked them, in contrast, to produce patterns of five or six sounds having an internal structure of their choice, while trying to avoid reproducing known tunes (Fraisse, 1946-1947). While subjects found the task of producing a series of patterns easy, they nevertheless found it difficult to produce an irregular sequence. In order to study temporal structure, we have calculated the successive ratios between durations by computing the ratio of the shorter of the two intervals to the longer.
The first characteristic fact, in rhythm as well as in arrhythmia, is that a ratio of near-equality between two successive intervals predominates (40% of the ratios are less than 1.2). It is as though every sequence were based on a tendency to produce an interval equal to the preceding one, which is evidently the easiest and the most economical activity.
Rhythmic and arrhythmic sequences are constructed on the basis of this regularity. However, the way of breaking regularity is different in the two cases. In arrhythmia the higher the ratio the less frequent it is. The rupture with equality then happens by a lengthening (or by a decrease) of the preceding interval: small differences become numerous, large ones become rare. In rhythm, on the contrary, small differences are rare. When the subject has broken the regularity, he or she produces a new interval of a noticeable duration. The difference forms about a ratio of one to two.
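The ratio analysis described above can be sketched in a few lines of Python. The sketch assumes the ratio is expressed as the longer interval divided by the shorter, which is what reported values such as 1.2 imply, and the tapping sequence is hypothetical.

```python
def successive_ratios(intervals):
    """Ratio (>= 1) of the longer to the shorter of each pair of adjacent intervals."""
    return [max(a, b) / min(a, b) for a, b in zip(intervals, intervals[1:])]


def share_below(ratios, limit):
    """Proportion of ratios smaller than `limit`."""
    return sum(r < limit for r in ratios) / len(ratios)


# Hypothetical inter-tap intervals in msec (not taken from the experiment).
intervals = [300, 310, 620, 300, 290]
ratios = successive_ratios(intervals)
print([round(r, 2) for r in ratios])
print(share_below(ratios, 1.2))
```

In this invented rhythmic sequence the ratios are either close to 1 or close to 2, with nothing in between, which is the profile the text attributes to rhythm as opposed to arrhythmia.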
If one considers the absolute durations of the intervals, one finds the following results:
Intervals less than (msec)    Rhythm (%)    Arrhythmia (%)
400                           56.2          35.2
1000                          92            75.8
1800                          98            93.8
First, these numbers indicate that in order to perceive regularities or irregularities, we use few intervals larger than 1800 msec. These would break the succession of sounds into independent sequences. It is also necessary to note the high proportion of short intervals in the rhythmic patterns. Moreover, ratios of the order of one to two intervene most often only when the time that we call short (less than 400 msec) follows or precedes the time that we call long.
Fig. 1. Frequency of the ratios between successive intervals for rhythmic and arrhythmic sequences. 1 indicates equality of the intervals. Negative values indicate that the second interval is shorter than the first; positive values indicate that the second interval is longer than the first. Class interval equals .2 (i.e., class 1.2 includes ratios between -1.09 and +1.09). Only ratios inferior to 2.9 are represented here; they correspond to 85% of the ratios with rhythmic sequences and 86.19% with arrhythmic sequences (from Fraisse, 1946-47, 47-48, 11-21 by courtesy of Année Psychologique).
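Because the percentages in the table are cumulative, it can help to convert them into the share of intervals falling within each band. The following Python sketch does only that arithmetic; the dictionary and function names are illustrative.

```python
# Percent of intervals shorter than each bound (msec), from the table above.
CUMULATIVE = {
    "rhythm": {400: 56.2, 1000: 92.0, 1800: 98.0},
    "arrhythmia": {400: 35.2, 1000: 75.8, 1800: 93.8},
}


def band_shares(cumulative):
    """Convert cumulative percentages into the percentage falling in each band."""
    shares, lower, previous = {}, 0, 0.0
    for bound in sorted(cumulative):
        shares[(lower, bound)] = round(cumulative[bound] - previous, 1)
        lower, previous = bound, cumulative[bound]
    # Remaining intervals are longer than the last bound.
    shares[(lower, None)] = round(100.0 - previous, 1)
    return shares


for condition, table in CUMULATIVE.items():
    print(condition, band_shares(table))
```

Read this way, only 2% of the intervals in rhythmic patterns and 6.2% in arrhythmic sequences exceed 1800 msec.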
A complete analysis reveals that the relative equalization of durations in rhythmic patterns is not only produced between adjacent intervals. One thus finds patterns 680-260-630-280 (in msec) and patterns 280-300-850-290-850 (in msec) with equalization of the short times on the one hand and of the long times on the other hand. This phenomenon is found in patterns of three or four
[[ Here the word "time" is used as synonymous with duration or with interval until Section IV, where we use time according to common usage.]]
Here are some examples (average of 10 subjects): 210-480-490 msec; 470-190-430 msec; etc. (Fraisse, 1956). The hypothesis that we formulated above, that is, of a simple tendency to repeat equal intervals, is only partially exact since we find the phenomenon of equalization between nonadjacent intervals. Briefly, patterns are characterized by a composition of basically two sorts and only two sorts of time: short times of 200 to 300 msec and long times of 450 to 900 msec. If one looks not only at the averages but at the individual performances, one finds that between short times, adjacent or not, 84% of the ratios are less than 1.15 and 97% are less than 1.55. Between long times, 54% of the ratios are less than 1.15 and 94% are less than 1.55. The modal value of the ratios long times-short times is 2.4 of which 95% are less than 1.55. This ratio 1.55 seems to be the dividing point between two sorts of time.
If two durations belong to the same category, there is a tendency to equalize these durations. We prefer to say that there is assimilation since this equalization is not absolute. Among durations of differing categories, there is a sharp distinction. Assimilation and distinction bring us back to the classical perceptual laws which correspond to a principle of economy in perceptual organization (Fraisse, 1947).
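As an illustration of assimilation and distinction, the following Python sketch sorts the intervals of a pattern and opens a new category whenever one duration exceeds the previous one by the dividing ratio of 1.55. The splitting rule itself is an assumption made for this example, not a procedure described in the source.

```python
BOUNDARY = 1.55  # dividing ratio between the two sorts of time reported in the text


def categories(durations, boundary=BOUNDARY):
    """Group durations into classes; a new class starts when a sorted duration
    exceeds the previous one by a ratio of `boundary` or more."""
    ordered = sorted(durations)
    classes = [[ordered[0]]]
    for previous, current in zip(ordered, ordered[1:]):
        if current / previous >= boundary:
            classes.append([current])
        else:
            classes[-1].append(current)
    return classes


print(categories([680, 260, 630, 280]))
print(categories([280, 300, 850, 290, 850]))
```

Both example patterns quoted earlier fall into exactly two classes, one of short times and one of long times, whether or not the equalized intervals are adjacent.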
2. Temporal Rhythms as Structure
Our previous analyses already confirm that temporal intervals in rhythmic structures are interdependent. However, one can go further and show that the basic pattern described previously corresponds to "good form." Are we capable of producing or of reproducing any other patterns? One can demonstrate, in several ways, the salience of good form.
First, by a conflict between space and time: if one lays out before the subject 4, 5, or 6 targets at different distances while asking him to tap them successively as quickly as possible without stopping, he establishes a veritable rhythmic pattern of taps. This temporal pattern is simpler than the spatial pattern. Unequal spaces are gone through either in equal times or in very distinct times (ratios of two to three).
Good form then, is not only a spontaneous form but a dynamic organization that imposes itself in production or reproduction.
We can find, in very different contexts, examples of this type of structure based on the two durations only. When Samuel…. used the language transposed and based on the play of durations, the Morse code, composed of two durations called dots and dashes, was invented. Greco-Latin prosody was based on the opposition of two durations: breves and longs. In music, there is, at any given moment, a play of two notes that are in a ratio of one to two or one to three (semiquaver and quaver, quaver and crotchet, quaver and dotted crotchet). These two notes represent 85 to 95% of the movement (Fraisse, 1956, p. 107).
The first theorist of rhythm, Aristoxenus of Tarentum, distinguished two sorts of beats corresponding to the upbeat and the fall. One was the first beat upon which only one syllable or one note could fall; the other was worth two or three first beats. Aristoxenus claimed that only ratios corresponding to whole numbers are rational.
This generality, regarding the use of only two durations, corresponds, according to us, to a perceptual requirement revealed by psychophysics. Research done on information theory in order to measure channel capacity has shown that the channel is always limited by our ability to distinguish, in an absolute way, several levels of stimulation. This capacity, which is about five, varies with the nature of the sensation. In the case of duration, the studies by Hawkes (1961), Murphy (1966), and Bovet (1974) have shown that even trained subjects could differentiate only two or, at the most, three durations in the range of perceived durations (below 2 sec). If the durations were more numerous, confusion arose.
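Channel capacity is conventionally expressed in bits. As a rough illustration, and assuming equally likely categories identified without error, the information carried by one absolute judgment is the base-2 logarithm of the number of distinguishable levels. The function name below is illustrative.

```python
from math import log2


def bits_per_judgment(levels):
    """Information (bits) in one error-free choice among equally likely levels."""
    return log2(levels)


for levels in (2, 3, 5):
    print(levels, "levels:", round(bits_per_judgment(levels), 2), "bits")
```

On this simplified reading, two or three distinguishable durations correspond to between 1 and about 1.6 bits, against about 2.3 bits for five levels.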
However, these laws do not apply to the time interval between two patterns that we have called pause. Phenomenally, a pattern ends with the last element. But between one pattern and the next there is, as was revealed by subjective rhythmization, a pause that corresponds to the length of the last note in the case of music and that is an empty time in the case of taps.
Let us take one more step in the analysis of temporal patterns. When they are quite long, they often split up into several subunits. A pattern of six sound-taps is often decomposed into two subunits of 3 + 3, of 4 + 2, or of 2 + 2 + 2 as the case may be. In this case, the interval between two subunits has the characteristic of a pause: it is at least equal to the longest duration but it is not necessarily equal to it, while being more integrated with the pattern than with the pause, stricto sensu, between two patterns. This type of analysis explains, we think, certain groupings that intervene when models have eight or ten sounds, as in research such as Garner's.
If a subject taps a pattern at his spontaneous tempo and if he is asked to continue to tap the same pattern more quickly or more slowly, it is seen that the ratio long time-short time is maximal at the spontaneous tempo. When the tempo slows down a great deal, there is no longer a sharp distinction between long time and short time. At the limit the durations are almost equal. We have seen rhythm born from a rupture with regular movement; we see it disappear by a return to this movement.
The previous analyses were based on methods of production and of reproduction of fairly short patterns. Preusser (1972) has produced new data. He not only had patterns reproduced with two types of elements (see Section III,B), but he also constructed similar patterns that presented only one type of element by replacing the other by an empty temporal interval. Two organizational principles were obvious from this work: (1) the run principle: the longest run begins the pattern (for example, 3--1-) and (2) the gap principle: the longest interval terminates the pattern (for example, 1-3--).
If these two principles are compatible, as in 3-1--, the pattern is correctly identified in 90% of the cases. If they are incompatible, as in the first two examples cited, the gap principle is, on the average, the decisive factor in 68% of the cases and the run principle in 32% of the cases. When there are three runs, which we consider as three subunits, a third principle, which we have already detected in spontaneous rhythms, is added to the two earlier principles: The sequence of run lengths produces an upward progression (for example, 1-2-3-). This principle evidently gives rise to an organization differing from one starting with the longest run. Preusser has compared these results with those found in the case of sequences of eight elements composed of two sounds of a different nature. By comparing the results of the two studies he concludes that the gap principle plays a more important role when there is only one element. This confirms our previous conclusions. Of two elements, one is not figure and the other ground since the empty intervals have a stronger structuring effect than the element considered as the ground in their analyses.
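The interplay of the run and gap principles can be sketched in Python. The representation is an assumption made for this illustration: a repeating pattern is written as a list of (run, gap) pairs, each giving a number of consecutive sounds and the number of empty intervals that follow them, and the examples are read as "three sounds, two empty intervals, one sound, one empty interval" for the conflicting case.

```python
def rotations(cycle):
    """Every organization of a repeating cycle that starts at the head of a run."""
    return [cycle[i:] + cycle[:i] for i in range(len(cycle))]


def run_principle(cycle):
    """Organizations that begin with the longest run."""
    longest = max(run for run, _ in cycle)
    return [r for r in rotations(cycle) if r[0][0] == longest]


def gap_principle(cycle):
    """Organizations that end with the longest gap."""
    longest = max(gap for _, gap in cycle)
    return [r for r in rotations(cycle) if r[-1][1] == longest]


def compatible(cycle):
    """True when at least one organization satisfies both principles."""
    by_gap = gap_principle(cycle)
    return any(r in by_gap for r in run_principle(cycle))


conflicting = [(3, 2), (1, 1)]  # 3 sounds, 2 empty intervals, 1 sound, 1 empty interval
agreeing = [(3, 1), (1, 2)]     # 3 sounds, 1 empty interval, 1 sound, 2 empty intervals

print(run_principle(conflicting), gap_principle(conflicting), compatible(conflicting))
print(run_principle(agreeing), gap_principle(agreeing), compatible(agreeing))
```

In the conflicting cycle the two principles select different starting points; in the agreeing cycle they select the same one, which corresponds to the case identified correctly in 90% of the trials.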
Handel's research (1974) brought along a supplementary piece of information. He had the duration of sounds varied (ratio of 1 to 5), and he found that, most often, the short durations began the pattern and that the long durations ended it. This effect is all the more marked when the run of the short and/or of the long durations is longer.
More recent research by Vos (1977) produced comparable results obtained by another method. The subjects had to judge, for a sequence of two durations which were in a ratio of 1 to 4, whether it was an iambus or a trochee; for a series of three durations, whether it was a dactyl, an anapaest, or an amphibrach. He used three principles in order to explain the obtained results: (1) Tones that are separated by short intervals are perceptually grouped together; (2) the first tone of a perceptual group is a tone that is immediately preceded by a long interval (which is another example of the role of the pause, or of the gap principle); and (3) long tones are perceived as accented and short tones as nonaccented.
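The three principles can be combined into a small grouping routine, sketched here in Python. This is an illustration under stated assumptions, not Vos's procedure: each tone is given as (tone duration, following interval) in msec, the thresholds `long_interval` and `long_tone` are hypothetical parameters, and the sequence is treated as one cycle of an endless repetition.

```python
def group_tones(tones, long_interval, long_tone):
    """Split one cycle of a repeating tone sequence into perceptual groups.

    tones: list of (tone_msec, following_interval_msec).
    Returns a list of groups; each tone appears as (tone_msec, accented).
    """
    n = len(tones)
    # Principle 2: a group starts on a tone preceded by a long interval
    # (index -1 wraps around, since the sequence repeats).
    starts = [i for i in range(n) if tones[i - 1][1] >= long_interval]
    if not starts:
        # Principle 1: with only short intervals, all tones form one group.
        return [[(tone, tone >= long_tone) for tone, _ in tones]]
    groups = []
    for k, start in enumerate(starts):
        next_start = starts[(k + 1) % len(starts)]
        length = (next_start - start) % n or n
        members = [tones[(start + j) % n] for j in range(length)]
        # Principle 3: long tones are heard as accented.
        groups.append([(tone, tone >= long_tone) for tone, _ in members])
    return groups


# Hypothetical stimuli with tone durations in a ratio of 1 to 4.
print(group_tones([(400, 100), (100, 400)], long_interval=400, long_tone=400))
print(group_tones([(100, 100), (400, 400)], long_interval=400, long_tone=400))
```

The first sequence yields an accented tone followed by an unaccented one, a trochee-like group; the second yields the reverse, an iambus-like group.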
All authors are in agreement that complexity is important among the factors which intervene to produce greater or lesser salience of rhythmic patterns. This is difficult to evaluate. It seems that one can draw several conclusions from research done on Morse code signals: the relative difficulty in learning each signal can be considered to be an index of its complexity. If, using Plotkin's (1943) results, we divide the signals into three categories of 12 (easy, average, and difficult to learn), we can calculate three indices, keeping in mind that the number of dots and dashes varies from three to five elements: (a) the number of elements in a signal (N), (b) the number of signals having only one category of elements, dots or dashes (E), and (c) the number of signals in which there is an interleaving of elements (for example: - - . . -) in contrast to those in which there are only two runs (for example: - - - . .) (R).
[[ The digits represent the number of elements which follow each other; the hyphens, the intervals which are equal in duration to the elements.]]
One finds the following results:
                     N      E    R
Easy signals         2.9    8    0
Average signals      4.2    1    3
Difficult signals    3.9    0    6
Complexity increases a little with the length of the signals and, above all, with the multiplication of runs, as was stated by Preusser (1970; Preusser et al., 1972). Signals with only one type of element are always easy.
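The three indices are simple to compute for any set of signals written with dots and dashes. The Python sketch below assumes that N is the mean number of elements per signal in a category (the decimal values in the table suggest a mean) and uses an invented sample rather than Plotkin's lists.

```python
from itertools import groupby


def run_count(signal):
    """Number of runs of identical elements, e.g. '--..-' has three runs."""
    return sum(1 for _ in groupby(signal))


def indices(signals):
    """Return (N, E, R) for a list of signals written with '.' and '-'."""
    n = sum(len(s) for s in signals) / len(signals)  # mean number of elements
    e = sum(len(set(s)) == 1 for s in signals)       # only dots or only dashes
    r = sum(run_count(s) > 2 for s in signals)       # interleaved elements
    return n, e, r


# Hypothetical sample of signals (not one of the three categories in the table).
sample = ["...", "---", "--..-", "---.."]
print(indices(sample))
```

Here the two single-element signals count toward E, and only the signal with three runs counts toward R.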
Generally, one can say that, the more a temporal form is brief and simple, the easier it is to perceive. Vos (1973) attempted to calculate the indices of complexity by taking into account the indices mentioned above as well as the ratio between the length of subunits.
One can, moreover, more closely approach music by studying how syncopated auditory rhythms are perceived. Polyrhythms are defined as the simultaneous presentation of two pulse trains such that the rates are not integral multiples of each other (for example, three against four). Each pulse train is a series of regularly recurrent stimuli (Oshinsky & Handel, 1978). How will this ambiguous pattern be perceived, the criterion being the choice made by the subject asked to tap in synchrony with the pattern in question? Will he follow the pattern of three or that of four elements? The most remarkable result was that the subjects most often preferred to accompany the pattern of three rather than that of four elements but that this tendency was not the same for all tempos. In this research the pattern had a duration that varied from 0.96 to 2.4 sec. There was a reversal of the tendency for durations of 1.2 or of 1.6 sec depending on the pitch of the sounds. These tendencies were about the same when the two patterns of three and of four consisted of identical sounds or when they consisted of sounds of differing pitch.
Is it also necessary to underscore the fact that synchronizations are very rapidly established? The subjects began to tap in a stable way after about 3 sec, which proves that the two trains of stimuli were not analyzed. The majority of subjects, moreover, did not detect that there was an ambiguity in the polyrhythms.
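To visualize what a three-against-four polyrhythm contains, the following Python sketch lists the onsets of the two pulse trains within one pattern. The function names are illustrative, and the 1200-msec pattern duration is simply one value inside the 0.96 to 2.4 sec range mentioned above.

```python
from fractions import Fraction


def pulse_train(pulses, pattern_msec):
    """Onset times (msec) of `pulses` equally spaced events within one pattern."""
    step = Fraction(pattern_msec, pulses)
    return [step * k for k in range(pulses)]


def polyrhythm(a, b, pattern_msec):
    """Merged onsets of two simultaneous pulse trains, labeled by their source."""
    train_a = set(pulse_train(a, pattern_msec))
    train_b = set(pulse_train(b, pattern_msec))
    events = []
    for t in sorted(train_a | train_b):
        if t in train_a and t in train_b:
            label = "both"
        else:
            label = str(a) if t in train_a else str(b)
        events.append((float(t), label))
    return events


for onset, source in polyrhythm(3, 4, 1200):
    print(f"{onset:7.1f} msec  {source}")
```

With a 1200-msec pattern, the train of three recurs every 400 msec and the train of four every 300 msec; the two coincide only at the start of each pattern.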
IV. THE PERCEPTION OF MUSICAL RHYTHMS
The above analyses have permitted us to extract the laws characteristic of rhythm perception. However, the stimuli used were far from musical, since this research used only taps, identical sounds, or at best, two types of sound of different duration, intensity, or pitch. Musical rules, however, do not escape the fundamental laws that we have demonstrated. Without doubt, these laws do not explain music any more than gravity explains the art of architecture. But there is not an architect who ignores gravity any more than there is a musical rhythm that does not respect perceptual laws…
Music has a vocabulary that we have already evoked by distinguishing rhythm, which is the perception of a pattern, from meter, which allows the description of a musical composition. We will use this distinction as we consider the perceived rhythm and the meter used by the composer.
A musical composition is a synthesis of very different stimuli that are perceptually unified much as forms and colors are unified in a painting. We distinguish melody, harmony, timbre, and a rhythmic organization consisting of the succession of rhythmic patterns, at the same time identical to themselves and also varying continuously. The unity assures the characteristic of anticipation, which seems to us to be fundamental, and that Steedman (1977) finds, for example, when he tries to discover in a fugue by Bach the algorithms that allow one to give an account, if not of the rhythm, at least of its meter. What appears fundamental to him is the "principle of consistency" that corresponds to the fact that there is, with the passage of time, a constancy of predictable forms from the first bars on.
These patterns are composed of subunits that metrically correspond to times and, in performance and in perception, to a succession of beats. The metrics tell us that there are bars at two, three, four, and even nine times, but perceptually the bars at four times are often reduced to binary rhythm and the others to combinations of substructures. The longest bars have hardly more than nine times and are generally understood as a triple ternary rhythm. Reciprocally, the simplest bars can group themselves into periods, as do the lines of poetry into stanzas. A famous example is that of the scherzo of Beethoven's Ninth Symphony, written in 3/4 bars, in which Beethoven wrote ritmo di tre battute in order to indicate that it is necessary to group three bars into one rhythmic unity.
Table 1 gives the proportion of each note in each piece studied. It is immediately evident that the compositions are based on two notes that represent more than 80% of the notes used. They are in a ratio of 1:2, sometimes 1:3. The briefest among them is also the most frequent.
The duration of notes in our examples varies, taking into account the tempos, from 150 to 290 msec. These values are again found in performances recorded by Gabrielsson (1973). We found, in spontaneous rhythms, durations going from 180 to 280 msec, and we have already stated that the shortest time was also the most frequent. These comparisons are striking. The massive use of two different notes is explained, we said, by the difficulty in identifying more than two durations. However, let us not forget that composers use other, longer durations that have an important aesthetic role and which permit syncopations and pauses.
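The conversion from a tempo marking to a note duration in milliseconds is simple arithmetic, and a minimal sketch may help in checking figures of this kind. The function name and the example tempo of 120 beats per minute are assumptions made for illustration; they are not taken from the pieces studied.

```python
def note_duration_ms(beats_per_minute: float, beats: float) -> float:
    """Duration in msec of a note lasting `beats` beats at the given tempo."""
    if beats_per_minute <= 0:
        raise ValueError("tempo must be positive")
    # One beat lasts 60,000 msec divided by the number of beats per minute.
    return 60_000.0 / beats_per_minute * beats


# Hypothetical example: the beat is a crotchet played at 120 per minute.
quaver_ms = note_duration_ms(120, 0.5)    # 250.0 msec
crotchet_ms = note_duration_ms(120, 1.0)  # 500.0 msec
print(quaver_ms, crotchet_ms)
```

At this hypothetical tempo the shorter note falls inside the 150 to 290 msec range reported for the examples, and the longer note is exactly double it.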
The two basic notes, moreover, do not have the same perceptual status (Table 1). The brief note, or the interval that we have called short time, does not last. One has been able to speak of a point in time and, for the rapid succession of two sounds, of a collective perception (Schultze, 1908). The other note, on the contrary, which is double or triple the first one, corresponds to the perception of a duration. These notes vary between 300 and 900 msec, that is, the range of durations which appear as neither too brief nor too long and which are centered around an optimal duration of 600 msec, which is also that of spontaneous tempo.
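The two perceptual categories can be illustrated with a small sketch. The numerical ranges are those quoted in the text, but treating them as sharp boundaries is an assumption made here for illustration, as are the function names.

```python
# Ranges quoted in the text, in msec. Using them as hard cut-offs is an
# assumption of this sketch, not a claim about perception.
SHORT_TIME_MS = (150.0, 290.0)
LONG_TIME_MS = (300.0, 900.0)
OPTIMAL_MS = 600.0  # optimal duration, also that of spontaneous tempo


def perceptual_status(duration_ms: float) -> str:
    """Label a note duration according to the two ranges cited above."""
    if SHORT_TIME_MS[0] <= duration_ms <= SHORT_TIME_MS[1]:
        return "short time"
    if LONG_TIME_MS[0] <= duration_ms <= LONG_TIME_MS[1]:
        return "long time"
    return "outside the cited ranges"


def distance_from_optimum_ms(duration_ms: float) -> float:
    """Absolute distance from the 600 msec optimum."""
    return abs(duration_ms - OPTIMAL_MS)


# Hypothetical durations, chosen only to exercise each branch.
for d in (200.0, 600.0, 1200.0):
    print(d, perceptual_status(d), distance_from_optimum_ms(d))
```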
Comparisons with the results of our analysis also lead us to consider the ratio of 1.5 between two successive notes (or two intervals) that appeared to us as ambiguous from the perceptual viewpoint. It corresponds to the case of the dotted note. In scores, one rarely finds a succession of two notes in a ratio of 1.5, and musicians are acquainted with the difficulty of realizing such a succession…
Even in the case where a dotted quaver is paired with a semiquaver (ratio of 1:3), the ratio is distorted in the performances recorded by Gabrielsson (1973a); it is closer to 1:4 than 1:3. The ratio of 1:2 between the quavers and the semiquavers is often slightly increased, especially when there is syncopation. The ratio of equality between two notes of the same written value is not strictly respected, but the ratio between them does not reach 1.2, which remains within the limits found in the spontaneous production of tapped rhythms.
However, it is equally necessary to remember that, at the level of composition, and also of perception, notes are grouped into what in metrics one calls a time. Two ideas are fundamental in order to define a bar: the number of times and the duration of each. The bar, in principle, has a unity that provides the relative accentuations of each of the times, the first being, in classical music, the most accentuated. Two questions are important: What do musicians do? And what does the audience perceive?
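The comparison between written and performed ratios can be sketched as follows. The durations used are hypothetical values chosen only to illustrate the arithmetic; they are not measurements from Gabrielsson's recordings, and the function names are assumptions.

```python
def duration_ratio(a_ms: float, b_ms: float) -> float:
    """Ratio of the longer duration to the shorter one (always >= 1)."""
    longer, shorter = max(a_ms, b_ms), min(a_ms, b_ms)
    if shorter <= 0:
        raise ValueError("durations must be positive")
    return longer / shorter


def stays_below_limit(a_ms: float, b_ms: float, limit: float = 1.2) -> bool:
    """True if two nominally equal notes differ by a ratio below `limit`."""
    return duration_ratio(a_ms, b_ms) < limit


# Hypothetical performance of a pair written in a 1:3 ratio.
short_note_ms, long_note_ms = 100.0, 400.0
print(duration_ratio(long_note_ms, short_note_ms))  # 4.0, i.e. 1:4 rather than 1:3

# Hypothetical performance of two notes written as equal.
print(stays_below_limit(250.0, 280.0))  # True, since 280 / 250 = 1.12
```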
The accentuation of a note slightly lengthens its duration, a fact that we have already found in nonmusical rhythms. The difference in accentuation of diverse notes, in a dance rhythm, can vary from 10 dB in piano performances to 20 dB in percussion performances (Gabrielsson, 1973).
In general, the first time is accentuated. But how does the subject perceive the succession and grouping of times? Recent research by Vos (1978) provides a first answer. Using commercial recordings of Bach's preludes, subjects familiar with classical music but not particularly acquainted with the pieces chosen were asked to tap in synchrony with the beginning of each bar, that is, with the beginning of the perceived rhythmic pattern. The subjects did not tap in all cases on the first beat of the bar. Let us take the example of a 2/4 bar that lasted 1.75 sec. Forty percent of the subjects tapped in synchrony with the first beat; 45% tapped on the second beat of the measure, which was thus linked in a rhythmic pattern with the first beat of the following measure; 10% tapped each beat. For a 3/4 bar, 20% tapped each beat. Eighty percent tapped as if it were a bar with two beats, grouping two of the three beats, the last beat of a bar being grouped one time out of two with the first beat of the following bar. This result shows that arguments regarding the place of accent are irrelevant, since for the same musical performance some perceive, as one can predict, the accent on the first beat, whereas others do so on the last. In the same ternary sequence, a binary perception corresponds to an accent successively placed on the first and last beats. As a general rule, the intervals between the subjects' taps were shorter than the length of the bars. These varied between 1.75 sec and 4.8 sec in the examples studied, and the intervals between the taps varied from .8 sec to 2.4 sec. Vos wondered, without being able to answer, whether it was the melodic saliency that led to these cuts or whether it was the difficulty of storing too long a series in short-term memory.
Without ignoring the importance of melodic structure, we think that the generality of the observed phenomenon is particularly explained by limits in storage capacity and by the necessity of maintaining a perceived succession between successive taps. From this point of view, tempo plays a decisive role.
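The relation between bar length and tap interval can be made concrete with the figures quoted above. The sketch below assumes that the shortest tap interval was observed with the shortest bar and the longest with the longest bar; the text does not state this pairing, so the result is only indicative.

```python
# Figures quoted in the text, in seconds.
BAR_LENGTH_S = (1.75, 4.8)   # shortest and longest bar
TAP_INTERVAL_S = (0.8, 2.4)  # shortest and longest interval between taps


def taps_per_bar(bar_s: float, tap_interval_s: float) -> float:
    """Average number of taps falling within one bar."""
    if tap_interval_s <= 0:
        raise ValueError("tap interval must be positive")
    return bar_s / tap_interval_s


# Assumption: extremes are paired (shortest with shortest, longest with longest).
for bar_s, tap_s in zip(BAR_LENGTH_S, TAP_INTERVAL_S):
    print(f"bar {bar_s} s, tap interval {tap_s} s: "
          f"{taps_per_bar(bar_s, tap_s):.2f} taps per bar")
```

Under this pairing the subjects tap roughly twice per bar at both extremes, which is consistent with the idea that the tapped unit is kept within a limited duration whatever the length of the bar.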
The ease of the task of motor accompaniment to musical rhythm brings us back to the perceptual and motor aspects of rhythm that we have found in the most simple rhythmic forms. Musical time, like the foot in poetry, recalls the origins of rhythm in chant, dance, and music, linked by the beat of the foot in a succession of arsis and thesis. In the time of the Greeks, theorists already debated whether the accent corresponded to arsis or to thesis. Aristoxenus of Tarentum, moreover, did not speak of upbeat and of fall but of high times and of low times.
This link between rhythmic perception and movement also appeared in research requiring judgments and not performances. Gatewood (1927) asked subjects to listen to diverse pieces of music and report which of the four following qualities (rhythm, melody, harmony, or pitch) appeared to be dominant. For each of the pieces, the subjects had to indicate the impressions evoked by these pieces. An impression of movement was reported in 64% of the pieces in which rhythm was dominant, 25% of the pieces in which melody was dominant, 15% of the pieces in which harmony was dominant, and 12% of the pieces in which pitch was dominant. Movement thus appeared associated with rhythm but was, moreover, also present in the pieces where melody, harmony, or pitch was judged to be predominant.
More recent work has dealt with the problem of the dimensions of the rhythmic experience. Gabrielsson (1973a) set out from three sorts of musical samples: (a) monophonic structures played on a piano or on a drum, (b) polyphonic structures of diverse dances, in which the rhythm arises from the play of the pitch of percussion instruments and of their duration without melodic intervention, and (c) real music (most often dances).
By using diverse methods (judgments of similarity of two sequences, estimates from adjectives recalling the semantic differential, performance with monophonic rhythms, free verbal descriptions) and by using the methods of factorial analysis and of multidimensional analysis, Gabrielsson (1973b) could sum up his results by distinguishing three groups of dimensions:
1. Structural properties of rhythm. One distinguishes here what we call the relationship between perceived rhythms and the bar: coincidence or not, place of accent(s), and degrees of perceptual prominence of a basic pattern; accentuation versus clearness, simplicity versus complexity, or uniformity versus variation in the pattern. "The more different note values used, the more duration of duration patterns, the more syncopation, the more variation in instrumentation, the more leaps in a melody, the more changes in harmonic functions, etc., the more varied and in most cases, the more complex the rhythm will be judged to be" (Gabrielsson, 1973b, p. 10).
2. Movement properties. This covers rapidity and tempo, forward movement (depending on whether the movement in a pattern seems to accelerate or decelerate), and movement characteristics (i.e., different aspects of experienced movements in relation to rhythmic experience: dancing-walking, floating-stuttering, solemn-swinging, and others).
3. Emotional aspects. Gabrielsson thinks that these are characterized by the dimensions of vital-dull, excited-calm, rigid-flexible, and solemn-playful.
All the above approaches attest to some links that exist between rhythmic perception and movement. This has led us several times to speak, as did Ruckmick (1927), of rhythmic experience, thereby enlarging the perceptual aspect as it is most often accompanied by motor stimuli and by emotional reactions.
Perception is first of a temporal order, that is, of a regularity. It is, moreover, striking that in music the temporal data, at least that of the notes, are always explicit. This necessary condition is not always sufficient. The ordered elements can, in effect, be varied as to duration, intensity, pitch, and contain silences. The rhythmic structures are always perceptually complex.
However, rhythm, differing from melody, is made up, above all, of temporal and intensive patterns. One aspect can predominate over the other. The Gregorian chant is the best example of a temporal structuring without intensity and without periodicity. The march and most dances represent the other extreme where the pattern of accents imposes itself with its regularity and where isochronous patterns repeat each other.
All classical music falls between these two extremes. In order to understand the play of regularity and variety, it is perhaps necessary to mention the most simple studies. Brown (1911) has shown that while tapping successions of rhythmic patterns, the variability of the durations is half as strong as that of the intensities. Let us complete what we said above. If in music the durations are always explicit, the indications of accents are much more vague. The rule that states that the first beat of the bar is always accented is only a convention. We have seen that bars, even those of a classical musician such as Bach, and even when played by the same orchestra, can be perceived in multiple ways.
In all polyphonic musical performances, it is necessary, however, that there be a regularity in order to permit the anticipation of playing and the synchrony of the artists. In reality, all musical performances consist of isochronous repetition and, simultaneously, of more varied patterns, but which, in a complex way, fit into the play of isochronous repetitions. Perhaps Chopin's remark can be generalized: "Let your left hand be an intransigent and rigorous orchestra conductor and your right hand do what it wants." In the same spirit one can cite this passage from a letter written by Wagner to Liszt (Dumesnil, 1949), asking that singers respect the duration of notes "by staying within the indicated bar... if they leave it, in order to go further, let them do so with an intelligent liberty and instil fire rather than caution, thereby entirely making the continuity which the bar imposes disappear. If only they produce the impression of an animated and poetic style, we will have won everything."
It is necessary not to dissociate the motor behaviors linked to rhythms from these complex perceptions. Still, two aspects are to be considered here. The play of music is always based on movements. Some are very voluntary and lengthily learned in order to go beyond simple determinisms. Others, on the contrary, give way to these determinisms. They appear when the repetitions of accented patterns impose themselves on the player and on the listener. This subtle or rough motor component has the effect that rhythmic perception is plurisensorial. This is why we speak of rhythmic experience. The movement brings along more particularly affective reactions which are also a component of this experience. The affective aspect is all the more important, in part, because the anticipation of successive patterns facilitates synchrony between individuals. They are spontaneously realized in marches and in dances. All socialization of behavior, as is well known, reinforces their affective impact.
This said, in spite of the plurality of rhythmic components and of artistic realizations, is there a general sense of rhythm? We have explored this aspect (Hiriartborde & Fraisse, 1968) by using factorial analysis on a series of tests, some predominantly perceptual (discrimination, reproduction, transcription of temporal or intensive structures), others predominantly motor (adaptation to a change in pattern, polyrhythmic hand-foot, synchronization with good or with ambiguous forms), and by using the batteries of musical aptitude tests developed by Seashore and Wing. All of these tests were in positive correlation with each other, except for the tests of intensity discrimination and pitch discrimination from the battery by Seashore, and the tests of choice of the best harmonization and judgment of rhythmic accent by Wing.
Using a centroid analysis and by looking for the orthogonal factors, we found three main factors:
1. Perceptual structuration. The tests that are the most saturated in this factor are based on the discrimination of temporal structures.
2. Rhythmic anticipation. The tests that are the most saturated in this factor are
3. Practo-rhythmic. The tests that are the most saturated in this factor are those of coordination with alternate movements of the hand and foot and those of adaptation to changes in rhythm, both of which require a voluntary control of rhythmic movement. Moreover, we found a musical factor and a factor of discrimination.
Thackray (1969) carried out a similar study based on his experience as a music teacher. He started from a battery of tests which he classified into three categories:
1. Rhythmic perception: perceiving the number of sounds in a pattern, distinguishing
2. Rhythmic performance: reproducing temporal and intensive patterns of sounds and of a short melody, conserving a tempo.
3. Rhythmic movement: rhythmic quality of movements that put the whole body into play (sequences of rhythmic movements, following music in which the tempo, the bar, and so on are varied).
The correlations between all these tests are quite high, especially within the same group of tests. Using these results and applying the method of Spearman-Burt, Thackray, in effect, extracted a general factor. We also could have extracted one had
REFERENCES
Allen, T. W., Walker, K., Symonds, L., & Marcell, M. Intrasensory and intersensory perception of temporal sequences during infancy. Developmental Psychology, 1977, 13, 225-229.
Berg, W. K. Cardiac orienting responses of 6- and 16-week-old infants. Journal of Experimental Child Psychology, 1974, 17, 303-312.
Bovet, P. Quantité d'information transmise dans la perception des durées brèves. Thèse de 3e cycle, Université René Descartes, Paris, 1974 (unpublished).
Brown, W. Temporal and accentual rhythm. Psychological Review, 1911, 18, 336-346.
Brunswik, E. Perception and the representative design of psychological experiments.
Chang, H. W., & Trehub, S. E. Infants' perception of temporal grouping in auditory patterns. Child Development, 1977, 48, 1666-1670.
Crowder, R. G., & Morton, J. Precategorical acoustic storage (PAS). Perception & Psychophysics, 1969, 5, 365-373.
Demany, L., McKenzie, B., & Vurpillot, E. Rhythm perception in early infancy, Nature (
Dietze, G. Untersuchungen über den Umfang des Bewusstseins bei regelmässig auf einander folgenden Schalleindrücken. Philosophische Studien, 1885, 2, 362-393.
Dumesnil, R. Le rythme musical. Paris: La Colombe, 1949.
Ehrlich, S. Le mécanisme de la synchronisation sensori-motrice. L'Année Psychologique, 1958, 58, 7-23.
Ehrlich, S., Oléron, G., & Fraisse, P. La structuration tonale des rythmes. L'Année Psychologique, 1956, 56, 27-45.
Fraisse, P. Mouvements rythmiques et arythmiques. L'Année Psychologique, 1946-1947, 47-48, 11-21.
Fraisse, P. De l'assimilation et de la distinction comme processus fondamentaux de la connaissance. In Miscellanea Psychologica Albert Michotte. Louvain: Institut Supérieur de Philosophie, 1947. Pp. 181-195.
Fraisse, P. Les structures rythmiques. Louvain: Editions Universitaires, 1956.
Fraisse, P. Psychology of time. New York: Harper, 1963.
Fraisse, P. L'anticipation de stimulus rythmiques. Vitesse d'établissement et précision de la synchronisation. L'Année Psychologique, 1966, 66, 15-36.
Fraisse, P. Psychologie du rythme. Paris: Presses Universitaires de France, 1974.
Fraisse, P. Is rhythm a gestalt? In S. Ertel, L. Kemmler, & Stadler (Eds.), Gestalttheorie in der
Fraisse, P. Les synchronisations sensori-motrices aux rythmes. In J. Requin (Ed.), Anticipation et comportement. Paris: Editions C.N.R.S., 1980.
Fraisse, P., & Ehrlich, S. Note sur la possibilité de syncoper en fonction du tempo d'une cadence. L'Année Psychologique, 1955, 55, 61-65.
Fraisse, P., & Fraisse, R. Etudes sur la mémoire immédiate. I. L'appréhension des sons. L'Année Psychologique, 1937, 38, 415-423.
Fraisse, P., & Oléron, G. La structuration intensive des rythmes. L'Année Psychologique, 1954, 54, 35-52.
Fraisse, P., Pichot, P., & Clairouin, G. Les aptitudes rythmiques. Etude comparée des oligophrènes et des enfants normaux. Journal de Psychologie Normale et Pathologique, 1949, 42, 309-330.
Frischeisen-Köhler, I. Das persönliche Tempo. Eine
Gabrielsson, A. Similarity ratings and dimension analyses of auditory rhythm patterns. I. Scandinavian Journal of Psychology, 1973, 14, 138-160. (a)
Gabrielsson, A. Studies in rhythm. Acta Universitatis Upsaliensis 7,
Garner, W. R., & Gottwald, R. L. The perception and learning of temporal patterns. Quarterly Journal of Experimental Psychology, 1968, 20, 97-109.
Gatewood, E. L. An experimental study of the nature of musical enjoyment. In M. Schoen (Ed.), The effects of music.
Handel, S. Perceiving melodic and rhythmic auditory patterns. Journal of Experimental Psychology, 1974, 103, 922-933.
Harrel, T. W. Factors influencing preference and memory for auditory rhythm. Journal of General Psychology, 1937, 17, 63-104.
Hawkes, G. R. Information transmitted via electrical cutaneous stimulus duration. Journal of Psychology, 1961, 51, 293-298.
Hiriartborde, E., & Fraisse, P. Les aptitudes rythmiques. Monographies Françaises de Psychologie. Paris: CNRS, 1968.
James, W. The principles of psychology.
Koffka, K. Experimentelle Untersuchungen zur Lehre vom Rhythmus. Zeitschrift für Psychologie, 1909, 52, 1-109.
Leavitt, L. A., Brown, J. W., Morse, P. A., & Graham, F. K. Cardiac orienting and auditory discrimination in 6-week-old infants. Developmental Psychology, 1976, 12, 514-523.
Lehtovaara, A., Saarinen, P., & Järvinen, J. Psychological studies on twins. II. The psychomotor rhythm: environmental versus hereditary determination. Reports from the Psychological Institute, University of Helsinki, 1966, No. 3.
MacDougall, R. The structure of simple rhythm forms. Psychological Review, Monograph Supplements, 1903, 4, 309-416.
Martin, J. G. Rhythmic (hierarchical) versus serial structure in speech and other behavior. Psychological Review, 1972, 79, 487-509.
Meumann, E. Untersuchungen zur Psychologie und Aesthetik des Rhythmus. Philosophische Studien, 1894, 10, 249-322, 393-430.
Michon, J. A. Studies on subjective duration. I. Differential sensitivity in the perception of repeated temporal intervals. Acta Psychologica, 1964, 22, 441-450.
Miner, J. B. Motor, visual and applied rhythms. Psychological Review, Monograph Supplements, 1903, 5, 1-106.
Mishima, J. Fundamental research on the constancy of "mental tempo". Japanese Journal of Psychology, 1951-1952, 22, 27-28.
Mishima, J. On the factors of the mental tempo. Japanese Psychological Research, 1956, 4, 27-38.
Mishima, J. Introduction to the morphology of human behavior. The experimental study of mental tempo. Tokyo: Tokyo Publishing, 1965.
Miyake, I. Researches on rhythmic activity. Studies from the Yale Psychological Laboratory, 1902, 10, 1-48.
Montpellier, G. de. Les altérations morphologiques des mouvements rapides. Louvain: Institut Supérieur de Philosophie, 1935.
Murphy, L. E. Absolute judgments of duration. Journal of Experimental Psychology, 1966, 71, 260-263.
Oléron, G. Etude de la "perception" des structures rythmiques. Psychologie Française, 1959, No. 4, 176-189.
Preusser, D. The effect of structure and rate on the recognition and description of auditory temporal patterns. Perception & Psychophysics, 1972, 11, 233-240.
Preusser, D., Garner, W. R., & Gottwald, R. L. The effect of starting pattern on descriptions of perceived temporal patterns. Psychonomic Science, 1970, 21, 219-220. (a)
Preusser, D., Garner, W. R., & Gottwald, R. L. Perceptual organization of two-element temporal patterns as a function of their component one-element patterns. American Journal of Psychology, 1970, 83, 151-170. (b)
Plotkin, L. Stimulus generalization in Morse code learning. Archives of Psychology, 1943, No. 287.
Rimoldi, H. J. A. Personal tempo. Journal of Abnormal and Social Psychology, 1951, 46, 283-303.
Royer, F. L., & Garner, W. R. Response uncertainty and perceptual difficulty of auditory temporal patterns. Perception & Psychophysics, 1966, 1, 41-47.
Royer, F. L., & Garner, W. R. Perceptual organization of nine-element auditory temporal patterns. Perception & Psychophysics, 1970, 7, 115-120.
Ruckmick, C. A. The rhythmical experience from the systematic point of view. American Journal of Psychology, 1927, 39, 355-366.
Sander, F. Experimentelle Ergebnisse der Gestaltpsychologie. 10. Kongress für Experimentelle Psychologie, Jena, 1928, p. 23.
Schmidt, E. M. Über den Aufbau rhythmischer Gestalten. Neue Psychologische Studien, 1939, 14, 1-98.
Schultze, F. E. Beitrag zur Psychologie des Zeitbewusstseins. Archiv für die Gesamte Psychologie, 1908, 13, 275-351.
Sears, C. H. A contribution to the psychology of rhythm. American Journal of Psychology, 1902, 13, 28-61.
Steedman, M. J. The perception of musical rhythm and metre. Perception, 1977, 6, 555-569.
Stern, W. Das psychische Tempo. In Über Psychologie der individuellen Differenzen.
Sturges, P. T., & Martin, J. G. Rhythmic structure in auditory temporal pattern perception and immediate memory. Journal of Experimental Psychology, 1974, 102, 377-383.
Temperley, N. M. Personal tempo and subjective accentuation. Journal of General Psychology, 1963, 68, 267-287.
Thackray, R. An investigation into rhythmic abilities. London: Novello, 1969.
Tisserand, M., & Guilhot, J. Etude du tempo de 335 sujets masculins dans la région parisienne. Biotypologie, 1949-1950, 10, 11, 89-94.
Vierordt, K. Der Zeitsinn nach Versuchen. Tübingen: Laupp, 1868.
Vos, P. G. Pattern perception in metrical tone sequences. Unpublished thesis, University of Nijmegen, 1973.
Vos, P. G. Identification of metre in music. Report 76 ON 06. University of Nijmegen, 1976.
Vos, P. G. Temporal duration factors in the perception of auditory rhythmic patterns. Scientific Aesthetics, 1977, 1, 183-199.
Vos, P. G., Leeuwenberg, E. L., & Collard, R. F. What melody tells about meter in music. Report 78 FU 03. University of Nijmegen, 1978.
Vurpillot, E., Ruel, J., & Castrec, A. Y. L'organisation perceptive chez le nourrisson: réponse au tout et à ses éléments. Bulletin de Psychologie, 1977, 30, 396-405.
Wallin, J. E. W. Researches on the rhythm of speech. Studies from the Yale Psychological Laboratory, 1901, 9, 1-142.
Wallin, J. E. W. Experimental studies of rhythm and time. Psychological Review, 1911, 18, 100-131, 202-222.
Wallon, H. Les origines du caractère chez l'enfant. Paris: Presses Universitaires de France, 1949.
Weaver, H. E. Syncopation: a study of musical rhythms. Journal of General Psychology, 1939, 20, 409-429.
Werner, H. Rhythmik, eine mehrwertige Gestaltenverkettung. Zeitschrift für Psychologie, 1919, 81, 198-218.