Residual Masking at Low Frequencies


Electronics Research Laboratories, Columbia University, New York (Received April 9, 1959)


Short duration auditory fatigue, i.e., the temporary shift in the threshold of hearing following the cessation of a masking source, may be termed "residual masking." In this study, curves of residual masking vs frequency have been obtained in a free field, in the low-frequency range, for 250-cps masking tones at sound pressure levels of 90 and 110 db, following the method of Munson and Gardner. These masking patterns were obtained 150 and 200 msec after the cessation of the masking tone. In addition, residual masking at 250 cps was measured as a function of the mel-band intensity level of a "white noise" masking source. Plots of loudness density vs frequency were computed from residual masking patterns, and the loudness of the pure-tone masking source was evaluated.



NUMEROUS investigations have been carried out of the effects on the threshold of hearing of prior exposure to sounds of various sound pressure levels. This paper is concerned with such auditory fatigue for very short exposure times (only 400 msec) and levels up to 110 db. Because the ear recovers rapidly for such short exposures, threshold measurements must be made within a very short time following the cessation of the masking source. Here the measurements were made 150 or 200 msec after the sound source had stopped. The temporary threshold shift which occurs may be termed "residual masking." Since masking is the amount by which the threshold of audibility of a sound is raised by the presence of another (masking) sound, by extension "residual masking" is the masking that persists after the cessation of the masking sound. Munson and Gardner have investigated residual masking as a function of frequency for very short exposure times. In their work, a 1000-cps masking source was employed and the subject listened with an earphone. In contrast, this study of residual masking is at lower frequencies, for binaural listening in a free field. Following their method, plots of loudness density vs frequency were calculated from the residual masking data, and the loudness of the pure-tone masking source was evaluated. The data presented here include residual masking vs frequency for 250-cps masking having sound pressure levels of 90 and 110 db, and residual masking at 250 cps vs level of a "white noise" masking source.



The psychoacoustic testing procedure employed throughout this study was essentially the same as that described by Munson and Gardner. An ABX measurement technique was used which is particularly applicable for threshold measurements only a fraction of a second after the cessation of a masking source; furthermore it provides a convenient method of programming the tests automatically so that threshold levels may be obtained on a statistical basis. It is a constant stimulus method which derives its name from a sequence of three signal conditions presented to the subject. The first is called the "A condition," the second the "B condition," and the third the "X condition." Condition X is the same as either A or B, and the subject is required to vote whether in his judgment X = A or X = B. Once his vote is recorded, a new ABX sequence is presented to him. For example, in Fig. 1, a subject is exposed to a 250-cps masking tone having a sound pressure level of 110 db for 400 msec; his threshold is measured at a time 150 msec following the cessation of the 250-cps tone. This threshold is determined through the use of a probe tone that follows the masking tone at a specified time interval. In a single ABX sequence, the level of the probe tone is selected by a punched tape which has been coded so as to ensure a random presentation of the probe tone levels in discrete steps. The threshold value is then determined from a curve of the percentage of correct votes that are recorded at each probe tone level, an example of which is shown in Fig. 2. When the range of probe levels is properly adjusted, at the highest probe tone level the subject votes correctly 100% of the time; at the lowest probe tone level he votes correctly only 50% of the time and is merely guessing. His threshold is defined as the midpoint between these two values, i.e., the 75% correct-vote level.
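The 75% correct-vote criterion described above can be illustrated with a short sketch. The probe levels and percentages below are hypothetical, not data from this study, and straight-line interpolation between adjacent points is an assumption made for illustration; the text does not say how the curve of Fig. 2 was fitted.

```python
def threshold_75(levels_db, pct_correct, target=75.0):
    """Return the probe level (db) at which the percentage of correct
    votes crosses `target`, by linear interpolation between the two
    adjacent measured points that bracket it (an assumed method)."""
    points = sorted(zip(levels_db, pct_correct))
    for (l0, p0), (l1, p1) in zip(points, points[1:]):
        if p0 <= target <= p1 and p1 != p0:
            return l0 + (target - p0) * (l1 - l0) / (p1 - p0)
    raise ValueError("target percentage is not bracketed by the data")


# Hypothetical results: five probe levels 3 db apart, from pure
# guessing (50% correct) up to always correct (100%).
levels = [40, 43, 46, 49, 52]
correct = [50, 60, 80, 90, 100]
threshold = threshold_75(levels, correct)
print(f"threshold = {threshold:.2f} db")
```

With these illustrative numbers the 75% point falls between the 43-db and 46-db probe levels.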


FIG. 2. Percentage of correct observations vs sound pressure level of probe tone.


If the exposure to a masking source is less than 1 sec, even for a source having a moderately high sound pressure level, the ear recovers rapidly. For example, 100 to 200 msec after the cessation of the source, the threshold shifts may be less than 20 or 30 db. Thus while the experimental equipment must produce distortion-free signals of high level, the system also must be capable of measurement at low-level threshold values. Therefore the system must have an exceedingly high dynamic range, and it must provide a mechanism for measuring thresholds within a small fraction of a second following the cessation of the masking tone.

FIG. 3. Threshold of hearing for binaural listening in a free field (observer facing the source).

The equipment used in this study provides an automatically controlled sequence of "ABX" test tones in a manner predetermined by a coded punched tape and by settings of electronic switches and interval timers in the system, and it provides a means of recording the responses of the subjects. Its principal basic components include the following:

(a) Two audio oscillators which are the sources of the masking and probe tones.

(b) Electronic switches to turn on and off the signals provided by the oscillators. (The rate of buildup and decay of the signals is adjusted to the fastest rate consistent with the lack of detectability of clicks.)

(c) A Western Union Type 2A Teletype Tape Transmitter to select the test condition for each sequence, i.e., whether X equals A or B, and to select the level of the probe tone.

(d) Electronic interval timers to control the time intervals in the series of ABX tones (i.e., the time duration of the masking and probe tones, and the length of the time delay between them).

(e) A "programmer" to select the various output levels required for the probe tone, and to record the votes of the subjects.

(f) A low-frequency loudspeaker as the masking source. This loudspeaker, especially constructed by the Bell Telephone Laboratories, produces but negligible distortion even at very high levels.

(g) A "free-field room" in which the subject is seated in a chair facing the loudspeaker. The position of his head is determined by a headrest. Measurements made with probe-tube microphones indicated that the sound waves in both ears produced by the loudspeaker were exactly in phase at the frequencies employed in these tests.


FIG. 4. Residual masking pattern for 250-cps masking tone, 150 msec after cessation of tone, for masking tone sound pressure levels of 90 and 110 db.


FIG. 5. Residual masking pattern for 250-cps masking tone, 200 msec after cessation of tone, for masking tone sound pressure levels of 90 and 110 db.


(h) A "vote box" which the subject holds in his hands. It contains pushbutton switches that he depresses to indicate whether his vote for a given ABX sequence is X = A or X = B. At a fixed time interval after the subject has voted, the next sequence is presented by the automatic operation of the above equipment.


The threshold of audibility curves for the four subjects were determined in the "free-field" environment between 130 and 1000 cps by means of the ABX technique. These data are summarized in Fig. 3 together with a curve labeled FM, which is the Fletcher-Munson 0-phon contour, and a curve labeled NPL, which is the 4-phon contour obtained more recently at the National Physical Laboratory by Robinson and Dadson. The same temporal sequence of stimulus presentations was followed as that used during the residual masking tests. However, during the threshold of audibility tests, the 250-cps masking tone was not sounded. Thus the subject was "exposed" to a brief interval of silence (condition A), followed by condition B which contained a faint probe tone (probe tone duration = 100 msec); then condition X was either a silent interval as in A, or it contained the probe tone as in B. The subject voted whether in his judgment condition X was the same as condition A or B. (A cueing light flashed at times corresponding to the starts of conditions A, B, and X.) The level of the probe tone determined the threshold of hearing.

A single test contained 50 ABX sequences, 10 at each of 5 fixed probe tone levels, 3 db apart. In 25 of these sequences, X = A, and in the other 25, X = B. The first 10 sequences were at the highest probe tone level, and the probe tone levels of the remaining 40 sequences were random at the lower probe tone levels. The test session itself was preceded by a practice run consisting of 10 sequences. In any one test session a threshold was determined at only one of the following frequencies: 130, 190, 250, 350, 800, and 1000 cps. The order in which these frequencies were selected in some tests was random for each subject, as well as from subject to subject. Each point in Fig. 3 represents the mean of three complete tests of 50 ABX sequences each.
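The test structure just described can be sketched as a small schedule generator. This is only an illustration of the stated constraints (50 sequences, 5 levels 3 db apart, 10 sequences per level, the first 10 at the highest level, 25 sequences each of X = A and X = B). The function name `build_schedule`, the example top level of 52 db, and the choice to assign X = A or X = B independently of probe level are assumptions; the text does not say how the punched tape distributed the two conditions across levels.

```python
import random


def build_schedule(top_level_db, n_levels=5, step_db=3, per_level=10, seed=None):
    """Return a list of (probe_level_db, x_equals) pairs for one test."""
    rng = random.Random(seed)
    levels = [top_level_db - step_db * i for i in range(n_levels)]

    # First block: every sequence at the highest probe tone level.
    first_block = [levels[0]] * per_level

    # Remaining sequences: the lower levels in random order.
    rest = [lvl for lvl in levels[1:] for _ in range(per_level)]
    rng.shuffle(rest)
    probe_levels = first_block + rest

    # Half of the sequences have X = A and half X = B, in random order
    # (assumed to be independent of the probe level).
    n = len(probe_levels)
    x_equals = ["A"] * (n // 2) + ["B"] * (n - n // 2)
    rng.shuffle(x_equals)

    return list(zip(probe_levels, x_equals))


schedule = build_schedule(top_level_db=52, seed=1)
print(len(schedule), schedule[:3])
```

Fixing `seed` makes a schedule reproducible, which plays roughly the role of the pre-coded tape in the original apparatus.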



Figure 4 shows measured values of residual masking as a function of frequency. These data report the shift in the threshold of hearing which results from exposure for 400 msec to a pure tone at 250 cps having levels of 90 and 110 db. The delay time between masking and probe tones was 150 msec. Similar data of threshold shift for a delay time of 200 msec are given in Fig. 5. Note that the greatest masking occurs at a frequency corresponding to the masking tone. This is in contrast to simultaneous masking patterns for pure tones, which show a dip at the frequency of the masking tone as a result of beats between the masking and probe tones. Note that for some observers at even the highest levels there is no evidence of aural harmonics, which tends to confirm the observations of Munson and Gardner, whereas for others there would appear to be small secondary peaks at the second harmonic of the masking tone. It is interesting to speculate as to the reason for these individual differences, and to consider the possibility of their resulting from auditory fatigue susceptibility.


Measurements were made of the residual masking at 250 cps resulting from exposure to white noise. Such data were obtained for various mel-band intensity levels of the source, which approximated "white noise" in the restricted frequency range of interest in this study, at time intervals 150 msec and 200 msec after the cessation of the source. These data are shown in Figs. 6 and 7. The threshold shift in db is plotted against the level of the noise source, expressed as the sound pressure level in a mel band at 250 cps in db re 0.0002 microbar.
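For readers less familiar with the reference quantity quoted above, the following sketch converts between sound pressure level in db re 0.0002 microbar and rms sound pressure, using the standard definition of sound pressure level. The function names are illustrative and do not come from the paper.

```python
import math

P_REF_MICROBAR = 0.0002  # reference rms pressure stated in the text


def spl_db(pressure_microbar):
    """Sound pressure level in db re 0.0002 microbar for an rms pressure."""
    return 20.0 * math.log10(pressure_microbar / P_REF_MICROBAR)


def pressure_from_spl(spl):
    """Rms sound pressure in microbars for a given sound pressure level."""
    return P_REF_MICROBAR * 10.0 ** (spl / 20.0)


for level in (90, 110):
    print(f"{level} db -> {pressure_from_spl(level):.2f} microbar")
```

A 20-db increase in level corresponds to a tenfold increase in sound pressure, so the two masking-tone levels used here (90 and 110 db) differ by a factor of ten in pressure.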


FIG. 7. Residual masking at 250 cps vs mel-band intensity level of white noise, 200 msec delay.


PLOTS OF LOUDNESS DENSITY VS FREQUENCY

According to Munson and Gardner, plots of loudness density (expressed in millisones per mel) versus frequency can be derived from curves of residual masking versus frequency in the following way:

(1) First obtain a relation between residual masking (in db) and loudness density (in millisones per mel). Such a relationship for one subject, shown in Fig. 8, can be determined as follows. Redraw Figs. 6 and 7 so that the intensity level of the white noise source is expressed in terms of the mel-band intensity level in db above threshold, using (a) the relationship between the residual masking at 250 cps produced by white noise and the intensity level of this noise in a mel band at 250 cps (Figs. 6 and 7), and (b) the threshold measurement data given in Fig. 3. Next apply the relationship of Munson and Gardner between loudness density and mel-band intensity level of a white noise expressed in db above threshold (Fig. 14 of reference 12). By eliminating the axis which is common to these two sets of curves (mel-band intensity level in db above threshold) one obtains a curve of residual masking versus loudness density, such as that given in Fig. 8 for one of our subjects.

(2) Next, obtain a loudness density vs frequency plot by converting the ordinate values (residual masking) in Figs. 4 and 5 to loudness density values, using a relationship such as that shown in Fig. 8 for each subject. Such loudness density plots are shown in Figs. 9 and 10 for a 250-cps masking tone.
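The two steps above amount to composing two tabulated curves through their common axis. A minimal sketch follows, assuming piecewise-linear interpolation and a residual masking curve that rises monotonically with level. Every number in the tables is hypothetical and chosen only to make the example run; none is taken from Figs. 3 to 8 or from Munson and Gardner.

```python
from bisect import bisect_left


def interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the tabulated range")
    i = max(1, bisect_left(xs, x))
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Common axis: mel-band intensity level in db above threshold (hypothetical).
level_above_threshold = [20.0, 40.0, 60.0, 80.0]
# Curve (a): residual masking in db at each of those levels (hypothetical).
masking_db = [2.0, 8.0, 16.0, 26.0]
# Curve (b): loudness density in millisones per mel (hypothetical).
loudness_density = [1.0, 5.0, 25.0, 100.0]


def masking_to_density(m_db):
    """Step (1): eliminate the common axis to go from masking to density."""
    level = interp(m_db, masking_db, level_above_threshold)
    return interp(level, level_above_threshold, loudness_density)


# Step (2): convert a residual masking pattern (frequency in cps -> db)
# into a loudness density pattern. The pattern itself is hypothetical.
masking_pattern = {200: 4.0, 250: 16.0, 300: 8.0}
density_pattern = {f: masking_to_density(m) for f, m in masking_pattern.items()}
print(density_pattern)
```

Because the conversion is applied point by point, the frequency at which the masking pattern peaks is also the frequency at which the loudness density pattern peaks.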

FIG. 8. Residual masking vs loudness density for a delay of 150 msec.

Figure 9 shows data for two subjects derived from residual masking tests 150 msec after the cessation of the 250-cps tone. The upper curves in each case represent conditions where the sound pressure level of the 250-cps masking tone was 110 db, and the lower curves are for a masking tone level of 90 db. Similar curves are shown in Fig. 10; however, here the curves are based on data where the interval between the 250-cps masking tone and the probe tone was 200 msec. These curves show a pronounced peak which is very much sharper than the peaks in curves of amplitude of vibration along the basilar membrane vs frequency. This confirms a similar observation by Munson and Gardner for a 1000-cps masking source, indicating that some sharpening process, as yet unknown, takes place even at the lower frequencies.


FIG. 9. Plot of loudness density vs frequency derived from data obtained 150 msec following cessation of the masking tone, for masking tone sound pressure levels of 90 and 110 db.





The loudness of the 250-cps masking tone used in these experiments was determined at sound pressure levels of 90 and 110 db: (a) experimentally, by comparison with an equally loud 1000-cps pure tone, and (b) by computation, from the plots of loudness density vs frequency. In the latter method the loudness was calculated by integrating the area under the loudness density vs frequency plots of Figs. 9 and 10, which had been modified in the following way. The frequency scale in these curves was converted to the mel scale; then, an integration of the area under the curves represented the total loudness of the masking tone, expressed in millisones. The values of loudness computed from this procedure were lower than those values which were determined experimentally by comparison with a 1000-cps tone. While this agreement is not as close as that obtained by Munson and Gardner, who followed a similar procedure at 1000 cps, it is of the order that one might expect for these measurement conditions. The loudness level, in phons, of the 250-cps tone was measured by determining the sound pressure level of an equally loud 1000-cps tone. Then the loudness, in sones, of the 250-cps tone was determined from the loudness level by the following relationship:

$$\log_{10} N = 0.03\,L_N - 1.2$$

where $N$ = loudness in sones and $L_N$ = loudness level in phons. An A-B technique was used in these loudness comparison tests which was similar to that employed in the ABX technique described previously, but the X sequence was omitted. The subject heard a sequence of two tones, A and B; he then indicated, by voting, which of the two tones sounded louder to him. This experimental procedure permitted the use of the same equipment and programming methods described earlier. During a single test, the sound pressure level of the 250-cps tones was fixed. They were compared with 1000-cps tones having any one of five different sound pressure levels which differed by 3 db. Thus, S-shaped curves similar to Fig. 2 were obtained. Fifty A-B sequences were presented in a single test session. The sound pressure levels of the 1000-cps comparison tones were random in these sequences. The duration of both the A and B tones was 400 msec, and the interval between them was 1000 msec.
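The two routes to loudness described in this section can be sketched numerically. The first function applies the sone-phon relationship quoted above. The second integrates a loudness density curve over a mel axis with the trapezoidal rule, which is an assumed numerical method; the text says only that the area under the curve was integrated. The density curve is hypothetical and is taken to be already expressed on the mel scale, since the frequency-to-mel conversion is not specified here.

```python
def sones_from_phons(loudness_level_phons):
    """Loudness N in sones from loudness level L_N in phons, using
    log10(N) = 0.03 * L_N - 1.2 as given in the text."""
    return 10.0 ** (0.03 * loudness_level_phons - 1.2)


def integrate_loudness(mels, density_millisones_per_mel):
    """Total loudness in millisones: trapezoidal area under a loudness
    density curve sampled at the pitch values `mels`."""
    total = 0.0
    for i in range(1, len(mels)):
        mean_height = 0.5 * (density_millisones_per_mel[i]
                             + density_millisones_per_mel[i - 1])
        total += mean_height * (mels[i] - mels[i - 1])
    return total


# By this relationship 40 phons corresponds to 1 sone, and each added
# 10 phons multiplies the loudness by 10**0.3, i.e. roughly doubles it.
print(sones_from_phons(40), sones_from_phons(50))

# Hypothetical triangular loudness density pattern on the mel scale.
mels = [300.0, 350.0, 400.0]
density = [0.0, 200.0, 0.0]
total_sones = integrate_loudness(mels, density) / 1000.0
print(total_sones)
```

Dividing the integrated value by 1000 converts millisones to sones, so that it can be set beside the value obtained from the measured loudness level.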

These loudness level measurements are summarized in Table 1. The first two columns show the results of comparing a 250-cps tone having a sound pressure level of 90 db with a 1000-cps tone to determine the level of an equally loud 1000-cps tone. For example, in the three tests, the sound pressure levels of the 1000-cps tones which subject V.B. judged to be equally loud were 89, 91, and 91.5 db when the 250-cps tones were presented first in each sequence. On the other hand, the sound pressure levels of the 1000-cps tones which were judged to be equally loud were 92, 94, and 92 db when the 1000-cps tones were presented first. Similar data for this subject for a sound pressure level of 110 db are shown in the two columns to the right. For a given subject and for a given order of presentation, the spread in the data is very small. But note that for two of the subjects (V.B. and J.D.), the 250-cps tone was judged to be significantly louder when it was presented second in each of the A-B sequences than when it was presented first. This was observed consistently at both the 90 and 110 db levels of the 250-cps tone, even though the possibility of systematic error had been eliminated.
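The order effect can be made concrete by averaging the three matches quoted above for subject V.B. at the 90-db level. The sketch below uses only those six values; taking the arithmetic mean of the three tests is an assumption made for illustration.

```python
from statistics import mean

# Sound pressure levels (db) of the equally loud 1000-cps tone for
# subject V.B., with the 250-cps tone at 90 db, as quoted in the text.
matches_250_first = [89.0, 91.0, 91.5]   # 250-cps tone presented first
matches_1000_first = [92.0, 94.0, 92.0]  # 250-cps tone presented second

mean_250_first = mean(matches_250_first)
mean_1000_first = mean(matches_1000_first)

# A higher matching level means the 250-cps tone was judged louder.
order_effect_db = mean_1000_first - mean_250_first
print(f"{mean_250_first:.1f} db vs {mean_1000_first:.1f} db, "
      f"difference {order_effect_db:.1f} db")
```

On these figures the 250-cps tone required a 1000-cps comparison tone about 2 db higher when it was presented second, which is the direction of the effect noted in the text.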

The assistance of Winston L. Nelson, Miss Carmen Hurtado, Dr. Eda Bergen, and A. B. Grundy, Jr., who designed the special electronic devices required, tested the overall system, and made many of the preliminary measurements, as well as the measurements contained in this paper, is gratefully acknowledged. The kind cooperation of a number of our Laboratory personnel, who acted as subjects, is appreciated. The author wishes to thank W. A. Munson for many stimulating discussions concerning the subject matter.