Audibility of "Diffusion" in Room Acoustics Auralization: An Initial Investigation

Rendell R. Torres, Mendel Kleiner

Chalmers Room Acoustics Group, Chalmers University of Technology, SE-412 96 Goteborg, Sweden

Bengt-Inge Dalenback

CATT, Mariagatan 16A, SE-414 71 Goteborg, Sweden



Inaccurate modeling of scattering remains a weakness of room acoustics auralization. How well must scattering be modeled for accurate auralization? To evaluate the time-frequency perception of scattering in the binaural room impulse response, one can begin by investigating the audibility of frequency-dependent changes in Lambert diffuse reflection. Listening tests are performed to compare computed auralizations of a Swedish concert hall. In this study one finds the following:

(1) For some signals, changes in the diffusion coefficient are clearly audible within a wide frequency region. Thus, diffuse reflection should be modeled in a frequency-dependent manner, although not all auralization programs currently do this.

(2) The perception of these changes depends on the input signal. For sustained signals (e.g., an organ chord, pink noise), changes are strongly perceived as differences in coloration; for example, increasing low-frequency diffusion is perceived as "decreasing the bass" content or "increasing the treble" content of the signal. For impulsive signals (e.g., string pizzicato), coloration differences are less audible than for sustained signals, whereas spaciousness differences are relatively stronger. It is interesting that listeners, though uninformed of the differences between high- or low-diffusion signals, give consistent answers regarding perceived changes in frequency coloration.

1. Introduction

Inaccurate modeling of scattering is a remaining weakness of room acoustics auralization. Part of the reason is that various mathematical models for scattering have not been fully exploited and optimized for rooms. Instead, binaural room impulse responses are currently simulated using approximate, geometrical models and energy-based "diffusion" methods to approximate surface scattering and diffraction. But even with these methods, reliable input data are lacking. Figure 1a roughly depicts the state of the art and inquires whether improving scattering models in certain frequency regions is more important than in others in order to achieve accurate auralization. In this effort our group has been developing and applying models for diffraction and scattering for room acoustics simulation. Previous and current work includes diffraction effects in auralization, comparisons with scale-model measurements, and auralization of various types of surface scattering via scale-model measurements [1, 2, 3]. This paper stems from a presentation of potential scattering models for room auralization [4] and discusses the subjective tests in detail.

What kinds of approximations can be made to complex scattering models, while still retaining accurate auralization? Full-frequency, exact models may not be necessary (as scattering from certain surfaces may not be equally influential at all frequencies and times) and furthermore may be computationally impractical. For example, it may be sufficient to use high accuracy for the early part of the impulse response to achieve correct early coloration and "source width" of the real hall, such that approximate models may be employed for the later reverberation. ("Correct coloration" in this context refers to the natural filtering that a given room imposes on an anechoic signal.) If a hybrid of numerical models is used, one should determine if and how far one may extend or overlay their formal ranges of validity. Thus, Figure 1b suggests that one must investigate the general audibility of scattering (and of its various types) at different times and frequencies in the room impulse response. One may begin by examining the perception of frequency-dependent degradations of specular reflections (a rough way of envisioning scattering) through binaural listening tests that evaluate changes in surface "diffusion" in different frequency regions. In this way, one may better estimate the most important frequency ranges and the required accuracy for modeling scattering, and better understand its various psychoacoustic effects.

2. "Scattering" and "diffusion"

We now discuss similar but distinct terms for scattering phenomena. The general term "scattering" refers to the redirection of sound when it interacts with a body and, thus, encompasses transmitted, reflected, and diffracted waves [5]. In this paper "scattering" refers primarily to non-specular components of this redirection of sound and their effects on the total field. "Diffraction" ("edge-diffraction" or "edge-scattering") refers to scattering from a wedge of a given angle, including planar "wedges" and interior corners. Diffraction is often envisioned with the receiver in the shadow zone (hidden from the source), but this edge-scattering occurs at all angles from the wedge. "Surface scattering" refers to scattering from an area (possibly with "rough" or periodic profile). One might consider the mildest form of scattering as slight deviations (in coloration, strength, and directivity) from purely specular reflection, whereas more severe scattering is associated with significant coloration and angular spreading.

Figure 1. Figure 1(a) roughly depicts the accuracy of auralization programs and asks where one should concentrate on improving scattering models. The product ka represents the ratio of characteristic dimension to wavelength, where k is the wavenumber and a the characteristic dimension. Figure 1(b) portrays the need to investigate the ear's sensitivity to various types of scattering at different times and frequencies in the room impulse response.


One should note that these definitions overlap. For example, the multiple-order edge-diffraction from an array of wedge-like, rectangular features is essentially "surface scattering" for acoustical wavelengths much longer than the features' periodicity. This not only affects the designing of algorithms for scattering models but also aids in understanding the wavelength-dependent scattering behavior of a given profile. One may also note that scattering can be expressed as complex pressures (e.g., in the time-domain formulation of edge-diffraction in [1]), which add directly to (interfere with) the specular component and yield the associated total directivity pattern and frequency coloration.

The term "diffusion", on the other hand, is typically related to the redirection of a portion of the specular energy into non-specular directions (in its broad usage in room acoustics and auralization). It is sometimes equated with "diffuse reflection", as discussed in [6]. Since phase is ignored, diffuse energy components do not add directly to specular reflections, so impulse responses (for auralization) must be constructed indirectly with some phase assumptions. Diffusion cannot directly model edge-diffraction effects, for example, where the edge contributions interfere destructively with the specular reflection for certain source-receiver orientations and wedge angles. One might thus say that "scattering" describes the behavior more comprehensively, whereas "diffusion" offers a more heuristic picture. The term "scattering" may also be preferable to "diffusion" as the latter can be confused with diffusivity of the sound field, which is related but not equivalent to diffuse reflection [7, p.110]. Nevertheless, thinking in terms of diffusion (and its directivity and frequency coloration) is a basic and practical starting point for modeling scattering, understanding its behavior, and investigating its audibility.

3. Listening tests

Listening tests are employed here to compare simulated binaural signals in a hall with frequency-dependent changes in surface scattering. One might ideally have a concert hall where variable surfaces have different scattering-scales and profiles; the large volume, compared to a small "laboratory" room, would allow early-order scattering effects to be more realistically judged in the presence of a longer reverberation (although the "maximum" audibility can also be of interest). Measured binaural room impulse responses (BRIR) would then be convolved with anechoic signals, yielding reproducible signals for subjective comparison of scattering at different frequencies. One could conceivably perform the same procedure with physical scale-modeling, although spark sources can have too narrow a frequency range and excessive noise for wide-band auralization, and electroacoustic scale-transducer sources often have limited omnidirectionality. Furthermore, scale-model binaural-head microphones can be subject to HRTF-mismatch (with the end listener) and other compromises, although extensive studies have been done in this area [8, 9].

One might also employ computer auralizations using exact models (or finite-/boundary-element methods) for scattering, if such programs were both available and practical for concert-hall dimensions and full-frequency computations [10]. One can begin, however, with first approximations like Lambert's-Law "diffusion" [7, p.84], where the scattered intensity obeys a cosine-law directivity away from the surface normal. This model is hardly exact yet is a fundamental case and would at least indicate the ear's sensitivity to randomly diffusing surfaces.

Binaural impulse responses (with diffusion varied at different frequencies) are computed with the program CATT-Acoustic (version 7.1), based on randomized cone-tracing and developed by Dalenback [11]. This implementation of diffusion, described below, was judged sufficient for initial subjective experiments, allows frequency-dependent diffusion coefficients, and can simulate full binaural impulse responses. As a result, variations in diffusion can be simulated in the context of a realistic room shape and in the presence of a reverberation tail. Details of the results may depend on the implemented algorithm (and various definitions of the diffusion coefficient exist), so conclusions must be drawn with this in mind.

Two standards for a scattering coefficient and diffusion coefficient are currently in preparation by the ISO (International Organization for Standardization) and by the AES (Audio Engineering Society).

Figure 2. "Quasi-step function" for the diffusion coefficient, applied at each frequency region. The abscissa represents the frequency regions as defined in the text; the ordinate, the Lambert diffusion coefficient. Each of the three frequency regions has an associated pair, denoted by the subscript.

3.1. The Lambert diffusion coefficient

For first-order diffuse reflection, an elementary source is generated at each first-order "ray hit-point" with a Lambert's-Law directivity. If surface absorption is denoted by a, the power of such a source is proportional to b(1 - a), corresponding to the proportion b (the diffusion coefficient) that is "diffused". Diffusion for second- and higher-order reflections is treated as in classical ray-tracing (here cone-tracing): If a generated random number from 0 to 1 is less than b, the reflected direction is randomized according to Lambert's Law; otherwise, the reflection is specular. To allow frequency-dependent diffusion, this is performed independently for each of the six octave bands with center frequencies 125 to 4000 Hz.
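The higher-order rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the CATT-Acoustic implementation: the function names are hypothetical, the diffusion coefficient is written b as in the text, the Lambert direction is drawn by a standard cosine-weighted hemisphere sampling, and the first-order elementary sources are not modeled.

```python
import math
import random

def reflect_specular(d, n):
    """Mirror the incoming direction d about the unit surface normal n."""
    dot = sum(di * ni for di, ni in zip(d, n))
    return tuple(di - 2.0 * dot * ni for di, ni in zip(d, n))

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def sample_lambert(n, rng):
    """Draw an outgoing unit direction with cosine-law density about the unit normal n."""
    # Uniform point on the unit disk, projected up to the hemisphere (cosine weighting).
    u1, u2 = rng.random(), rng.random()
    r, phi = math.sqrt(u1), 2.0 * math.pi * u2
    x, y, z = r * math.cos(phi), r * math.sin(phi), math.sqrt(1.0 - u1)
    # Two tangent vectors that form an orthonormal frame with n.
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    t = cross(n, helper)
    norm = math.sqrt(sum(c * c for c in t))
    t = tuple(c / norm for c in t)
    s = cross(n, t)
    return tuple(x * t[i] + y * s[i] + z * n[i] for i in range(3))

def reflect(d, n, b, rng):
    """One second- or higher-order reflection in an octave band with diffusion coefficient b."""
    if rng.random() < b:          # random number in [0, 1) below b: diffuse
        return sample_lambert(n, rng), "diffuse"
    return reflect_specular(d, n), "specular"
```

For frequency-dependent diffusion, `reflect` would be called separately per octave band with that band's value of b, for example `reflect(d, n, 0.60, random.Random(1))` for a band at 60% diffusion.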

Torres et al.: Audibility of diffusion in auralization 921

Figure 3. The hall model, shown in plan and perspective (bold lines for clarity). The source "S" is on stage. The receiver used in listening tests is "R3" (rear, center). The ceiling height varies from 13.5 m (stage area) to 15.5 m (near center of hexagon) to 12 m (near back). The reflector heights vary from 7.5 m to 10.5 m above the floor area. The stage width is 18 m (stage area); the hexagon width varies from 19.6 to 25.5 m. The length of the hall is 32.1 m. The source is located on the centerline, 7 m from the front wall of the hall. The receiver faces the source and is located off the centerline by 1 m, at a distance of approximately 1 m from the rear wall. The balcony is about 3 m above and extends about 2.6 m in front of the receiver (height/depth approximately 1.1).

For these listening tests, the surface diffusion coefficient (expressed in percent) in the entire room is adjusted from 10% to 60% in each of three frequency regions within the 125 to 4000 Hz octave bands: "High" (2 and 4 kHz), "Mid" (500 and 1000 Hz), and "Low" (125 and 250 Hz). (The final binaural impulse responses, however, cover the entire audio range, with extrapolated absorption coefficients above the 4 kHz octave band.) The frequency dependence of the diffusion is described by a quasi-step function (Figure 2), which assumes that the diffusion begins at an onset frequency and does not severely drop above this [12]. Thus, the first BRIR pair is in the "High" frequency region, with diffusion compared at 10% and 60%, while constant in the other regions at 1% (numerically extreme but representative of purely specular reflection). This quasi-step function then slides to the "middle" ("Mid") region where 10% and 60% diffusion is compared, with 1% diffusion below and 60% above. Finally, the difference in diffusion is compared in the "Low" region, while the upper regions have 60% diffusion. In total, three pairs of BRIR are constructed (Figure 2), each having one BRIR with 10% diffusion (Signal "A") and another at 60% (Signal "B"). The values 10% and 60% in the comparisons are somewhat more realistic bounds than the extremes 1% and 99%, although one may later explore all ranges of (and difference limen for) this diffusion coefficient.
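The three BRIR pairs described above can be written out as per-band diffusion coefficients. The following sketch only tabulates the values stated in the text (1% below the tested region, 60% above it, 10% or 60% within it); the names `quasi_step` and `PAIRS` are hypothetical and not part of any auralization program.

```python
# Octave-band center frequencies (Hz) grouped into the three tested regions.
REGIONS = {"Low": (125, 250), "Mid": (500, 1000), "High": (2000, 4000)}
ORDER = ("Low", "Mid", "High")  # ascending frequency

def quasi_step(region, value, below=0.01, above=0.60):
    """Per-band diffusion coefficients for one BRIR of the pair tested in `region`.

    Bands in the tested region get `value`; lower regions get `below`
    (near-specular) and higher regions get `above`.
    """
    tested = ORDER.index(region)
    coeffs = {}
    for index, name in enumerate(ORDER):
        if index < tested:
            level = below
        elif index > tested:
            level = above
        else:
            level = value
        for band in REGIONS[name]:
            coeffs[band] = level
    return coeffs

# Signal "A" has 10% diffusion in the tested region, Signal "B" has 60%.
PAIRS = {
    region: {"A": quasi_step(region, 0.10), "B": quasi_step(region, 0.60)}
    for region in ORDER
}
```

Printing `PAIRS["Mid"]["A"]` shows the step explicitly: 1% at 125 and 250 Hz, 10% at 500 and 1000 Hz, and 60% at 2 and 4 kHz.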

3.2. The hall geometry and listener position

The auralizations are based on a hall with reverse-fan/hexagonal shape ("Tonhallen", 8650 m³, in Sundsvall, Sweden) for 700-900 people. The acoustical design and original computer model were done by consultants at Akustikon (Sweden) [13]. For these listening tests, the side balconies in the computer model are removed to acoustically emphasize the hexagonal plan form. The hall geometry and listener/source positions are given in Figure 3.

Figure 4. Spectrogram of the organ chord, chosen as an example of a sustained musical signal. (The musical notes represent predominant harmonics in the spectrogram, but the organist does not necessarily need to play each note on the keyboard to obtain the spectrum.) The scale in dB is arbitrary.

Listener positions are auralized at positions near reflecting walls at different angles, in addition to one near the center, distant from wall surfaces. However, to limit the listening tests to 30 minutes per person (to avoid fatigue), only the rear-center position ("R3") is used for this initial study. This seat near the rear wall (1.1 m away) is expected to most strongly reveal the effects of varying the diffusion, due to its proximity to a reflecting surface and since the perceived comb-filter effect from the nearest wall is on-axis with the direct sound (and thus greatest). Selecting such a "sensitive" position is appropriate because it presumably yields an upper limit for this study. Later comparisons for positions far from wall surfaces (e.g., near the center) and at different neighboring wall angles should complement this investigation with other reference points.
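To give a rough feeling for the comb-filter effect mentioned above, the sketch below models the direct sound plus a single rear-wall reflection. Everything beyond the 1.1 m wall distance is an assumption made for illustration: normal incidence (extra path of 2.2 m), a sound speed of 343 m/s, no spreading loss, and a heuristic specular amplitude taken as the square root of (1 - a)(1 - b). It is not how the auralization program computes the response.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def comb_notches(extra_path_m, f_max_hz, c=SPEED_OF_SOUND):
    """Notch frequencies (Hz) up to f_max_hz for one reflection delayed by extra_path_m / c."""
    delay = extra_path_m / c
    notches = []
    k = 0
    while (2 * k + 1) / (2.0 * delay) <= f_max_hz:
        notches.append((2 * k + 1) / (2.0 * delay))
        k += 1
    return notches

def comb_gain_db(f_hz, extra_path_m, g, c=SPEED_OF_SOUND):
    """Level in dB (relative to the direct sound) of direct sound plus one reflection of amplitude g."""
    phase = 2.0 * math.pi * f_hz * extra_path_m / c
    magnitude = math.sqrt(1.0 + g * g + 2.0 * g * math.cos(phase))
    return 20.0 * math.log10(max(magnitude, 1e-12))  # guard against log10(0)

def specular_amplitude(a, b):
    """Heuristic (assumed): specular energy fraction (1 - a)(1 - b), amplitude is its square root."""
    return math.sqrt((1.0 - a) * (1.0 - b))
```

Under these assumptions the first notch falls near 78 Hz, and raising b from 10% to 60% (with a = 0.10) makes the notches considerably shallower, which is one simple way to picture why the seat near the wall is sensitive to low-frequency diffusion changes.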

The sound source is modeled as omnidirectional (although the software does allow a specified source directivity). This representation is inaccurate when compared to the directivity of a string quartet or to an organ (i.e., those instruments represented in the listening tests). However, omnidirectionality is acceptable in this case, as the main goal is to create impulse responses with varying diffusivity, not to exactly replicate the instruments in the hall. This may negatively affect, of course, the subjective realism of spaciousness, since the test-listeners inherently make subconscious comparisons to their personal listening experiences. This is evident in the results, where one observes that coloration differences are perceived more consistently than changes in spaciousness.

The absorption values at the six octaves (from center frequencies 125 to 4000 Hz) are simplified, for these tests, into two types, with values according to Beranek [14]: "occupied seats, medium upholstered" (0.68 0.75 0.82 0.85 0.86 0.86) and "Type A residual hall absorption" (0.14 0.12 0.10 0.09 0.08 0.07). The number of cones is 33416, judged more than sufficient, given the algorithm and hall.

3.3. Administration of listening tests

Figure 5. Spectrogram of string quartet with pizzicato (plucked strings), chosen as an example of an impulsive musical signal. The vertical lines correspond to pizzicati; the horizontal lines correspond to the first violin playing with a bow and with vibrato. The excerpt is taken from four bars of "La Cumparsita" by Gerardo Matos Rodriguez.

The three BRIR pairs (for the three frequency regions) are convolved with three anechoic recordings and yield the following four test sounds: two "sustained" (synthesized organ chord, five seconds of pink noise), and two "impulsive" (string quartet with pizzicato, and the unconvolved BRIR "alone"). These test signals are chosen to highlight time vs. frequency effects, evident in Figures 4 and 5, though these effects are (naturally) never entirely separated. The organ chord (the opening of an improvised chorale) was played and recorded using a synthesizer simulating a church organ with no room reverberation. The string quartet passage (pizzicato in the lower three strings) was taken from four measures of "La Cumparsita" (1917), a famous tango by the Uruguayan Gerardo Matos Rodriguez. The Clausen String Quartet recorded this and other music in the anechoic room at the Department of Applied Acoustics, Chalmers University of Technology [15].
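The convolution step described above amounts to filtering the anechoic signal with the left-ear and right-ear impulse responses separately. A minimal sketch follows, with hypothetical function names and a direct (slow) time-domain sum chosen only for clarity; for signals of realistic length an FFT-based routine such as `scipy.signal.fftconvolve` would be the practical choice.

```python
def convolve(x, h):
    """Direct linear convolution of two sample sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def auralize(anechoic, brir_left, brir_right):
    """Binaural signal: the same anechoic input convolved with each ear's impulse response."""
    return convolve(anechoic, brir_left), convolve(anechoic, brir_right)

# Convolving with a unit impulse returns the BRIR itself, which is why
# listening to the BRIR "alone" counts as one of the impulsive test sounds.
left, right = auralize([1.0], [0.5, 0.25], [0.4, 0.2])
```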

Figure 7. Average "perceived difference" (solid circles) and standard deviations (vertical lines). Note that the average levels of "Perceived Difference" depend on the input signal (e.g., "Pink Noise" vs. "Impulse").

An excerpt from the questionnaire is given in Figure 6. Binaural pair comparisons using equalized headphones are performed, where listeners are not told the test's purpose or background and are asked to rate the overall difference between A and B. The four groups of signals (order: strings, organ, pink noise, impulse) are evaluated one at a time, each with three pairs, plus a reference pair. Before ranking the "perceived difference" from 0 to 1 for each pair, each test person is instructed to first listen to all three pairs and to the reference pair for that test signal. (The reference is an example pair of a "clear difference" for the signal, with b = 10% vs. 60% constant diffusion.) The listener is also asked to rank the three scales relative to each other and to the reference pair, thus yielding the perceived difference among pairs, relative to the reference. For example, if the signals in Pair 1 are slightly different, and the signals in Pair 2 are more different, and neither of these is as different as the signals in Pair 3, then their relative distances on the scale should reflect this.

In addition to the difference scaling, the listeners may specify whether they hear a difference in coloration and/or spaciousness and/or any other quality (that they write in the comment area). If the listener hears a spaciousness difference, he/she is asked whether "A" or "B" is more spacious. Note that "coloration" is not used in an absolute or negative context here; it only describes the nature of the difference between two signals.

As stated, the test person is not told that the diffusion is varied or that pairs correspond to frequency ranges. The only given "hint" is to listen for changes in coloration and spaciousness, as they are later asked if the differences can be characterized in these terms. The test takers are allowed to listen to each group of pairs as many times as needed before making their judgements. The 15 listeners consist mainly of acoustics graduate students and professors, several with experience in music and critical listening.

4. Results and discussion

4.1. General perceived difference

In Figure 7 the overall perceived difference when the diffusion coefficient is varied shows a clear dependence on the input signal; for example, the general differences with pink noise are greater than those with string quartet. For some signals the differences are audible at all frequency regions. This suggests that diffusion (and more generalized scattering) must be treated with some frequency dependence. The general trend in "perceived difference" shows a characteristic "sag" (i.e., slight dip) in the "Mid" frequency region, but the standard deviations suggest that one cannot make detailed claims about its importance (or non-importance) relative to the other frequency ranges. In general, the relative ranking of individual frequency regions varies among listeners.

4.2. Detailed characterization of differences

The listeners' characterization of the differences and comments (see Figures 8-11) are more revealing than expected and suggest that the "perceived difference scale" alone is not sufficient. In the figures each of the three frequency regions contains a pair of vertical bars. The left, black bar depicts how many people heard differences in coloration between Signal "A" and Signal "B". The right bar represents how many listeners heard differences in spaciousness; this bar is divided into those who thought "B" was more spacious or "A" was more spacious. (For example, in the organ chord's "Low" region, 2 out of 3 people thought "B" was more spacious than "A".) The horizontal dashed line shows how many people heard differences in both spaciousness and coloration for a given pair.

Figure 8. Listeners described whether they heard differences in coloration, spaciousness, and/or other qualities. In the second column of each pair, the white portion shows how many thought Signal "B" (with 60% diffusion) was more spacious. The dashed line shows how many people heard differences in both spaciousness and coloration. For sustained signals like the organ chord, coloration differences were strongly heard, and comments reflected the frequency-dependence of changes in the diffusion coefficient.

One may first notice that coloration differences are perceived more strongly for sustained signals: about 12 of 15 listeners for the organ chord and 14 of 15 for the pink noise, compared to lower values for the impulsive signals. Furthermore, for the sustained signals, most listeners who heard differences in spaciousness also heard differences in coloration (see dashed lines in Figure 9), but very few heard only spaciousness differences. The significance level for forced-choice tests with two possible choices is given by P(r) < 0.05 for 11 or more repeated answers from 15 total listeners; P(r) < 0.01 for 13 or more, out of 15 listeners. For the parts of the test where the listener is not forced to evaluate (e.g., in the optional characterizations of the difference) or is given more than two choices, repeated answers should have even higher significance. Thus, the significance levels above can be used as approximate lower limits.
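The quoted thresholds for 15 listeners can be checked numerically. The sketch below assumes a one-sided test against chance (p = 0.5), which the text does not state explicitly, and uses hypothetical function names. Under that assumption the counts 11 and 13 are reproduced by a normal approximation without continuity correction, whereas the exact binomial tail is slightly more conservative at the 0.05 level; which calculation was actually used is not stated in the text.

```python
import math

def exact_tail(r, n=15, p=0.5):
    """Exact binomial probability of r or more identical answers out of n."""
    return sum(math.comb(n, k) * p**k * (1.0 - p)**(n - k) for k in range(r, n + 1))

def normal_tail(r, n=15, p=0.5):
    """Normal approximation to the same tail, without continuity correction."""
    mean = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    z = (r - mean) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def smallest_count(tail, alpha, n=15):
    """Smallest number of identical answers whose tail probability falls below alpha."""
    return next(r for r in range(n + 1) if tail(r, n) < alpha)

thresholds = {
    "normal": (smallest_count(normal_tail, 0.05), smallest_count(normal_tail, 0.01)),
    "exact": (smallest_count(exact_tail, 0.05), smallest_count(exact_tail, 0.01)),
}
```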

The optional listener comments are surprisingly consistent in identifying which frequency ranges are adjusted, although the "High" and "Mid" regions are together usually perceived as "high frequencies" (HF). When comparing the organ signals (Figure 8), listeners wrote for the "High" region: "B less treble", "B more bass". This is reasonable since the high-frequency specular component is decreased in signal B. (Again, signal "A" has 10% diffusion; "B", 60%.) Similarly, listeners judged in the "Low" region: "A more bass", "A less treble", "A less HF", "B brighter". When the diffusion is increased, specular reflections are reduced in strength; as a result, increasing low-frequency diffusion in signal "B" - and lowering the specular "bass" component - is perceived as either making signal B "brighter" or endowing signal "A" with "more bass". It is interesting that listeners often connect changes in diffusion with changes in coloration, while not knowing the cause of the differences between A and B, or their frequency-dependence. These consistencies are found for all the test signals, most often in the outer frequency regions.


Figure 9. Similar to the organ chord, coloration differences were strongly heard for the pink noise comparisons. For example, in the "Low" frequency region, increasing the diffusion was perceived as high-pass filtering the signal. Differences in spaciousness were again secondary here, and none heard only spaciousness differences.

In addition, some listeners noted, from the structure of the early reflections in the bare impulse response, that some BRIRs seemed to have more or less diffusion. Again, listeners were not informed that changes were made to diffusion coefficients, but they still used related descriptors (e.g., "clearer early reflections") in their evaluations.

Differences in spaciousness were audible but less obvious to the listeners (at most, 10 of 15 heard a difference) than differences in coloration. This may depend on room geometry (whether the form is inherently "sound diffusing"), non-personal head-related transfer functions (HRTF), and other factors. For example, the modeling of the string quartet and organ by an omnidirectional source is unrealistic (though tolerable, as discussed in section 3.2); moreover, the perceived changes in spaciousness are inherently judged against the listener's own experiences, which can be quite different. Nevertheless, spaciousness differences are generally more present in the impulsive signals than in the sustained ones, e.g., when comparing the strings' pizzicato with the organ chord. On the other hand, coloration differences become less obvious for impulsive signals. For these signals, there are also more people who hear changes "either" as coloration "or" as spaciousness (but not both).

Figure 10. For impulsive signals (e.g., string pizzicato), coloration differences are somewhat weaker and spaciousness differences relatively stronger than for sustained signals. There is also a clearer delineation between those who hear only spaciousness differences and those who hear only coloration differences, as shown by the lower levels of the dashed lines.

Regarding diffusion and spaciousness, people often attributed greater spaciousness to the impulse response with higher diffusion ("B"). This is observed for the string pizzicato ("Low" and "High" regions) and impulse alone ("High" region). It would be inaccurate, however, to always associate "increasing diffusion" with "increasing spaciousness"; for example, for the organ chord in the "Mid" region, four out of four listeners rated the signal with lower diffusion ("A") as being more spacious. (Again, there may also be the issue of HRTF mismatch.) Furthermore, above a certain "optimal" level of surface diffusion, increasingly high diffusion coefficients beget weaker specular reflections and correspondingly hazier "aural" imaging until the specular cues (e.g., strong lateral reflections) theoretically disappear in an anonymous diffuse decay. In real rooms this effect is presumably similar but less extreme.

Another observation is that differences in coloration and spaciousness are audible in the "Low" region, even with 60% diffusion at higher frequencies. Thus, high diffusion in upper frequency regions does not necessarily mask audibility of diffusion in lower regions; this further demonstrates that diffusion (and scattering in general) should be modeled with some frequency dependence.

Figure 11. Listening to the bare impulse response corresponds to convolving with a unit impulse, which accentuates effects at higher frequencies. Nevertheless, for this transient signal, coloration changes are still heard over all frequencies tested. (Figure annotation, 125-250 Hz: "B brighter, thinner", "B smoother transition to late".)

Regarding the computer model, the geometry and absorption are such that reverberation times did not vary unreasonably, despite very low b-values in certain cases. For example, the computed reverberation time at 1 kHz varies from 2.2 to 1.9 seconds when the diffusion is increased from 1% to 60%, and varies less for the pair comparisons, which were always at 10% vs. 60% diffusion. Such differences in reverberation time, though audible, do not obscure other perceptive phenomena (like coloration or spaciousness) caused by varying the diffusion values. This is demonstrated in the evaluation of sustained signals, where most test takers still heard differences in coloration - changes in "bass" or "treble" - that relate to the stationary part of the organ chord or the pink noise (although one listener mentioned additional differences in the reverberation tail).

Finally, the sequence of the test signals (i.e., having musical signals first) had some significance. One sensitive listener with "perfect pitch" said that if one had listened first to the pink noise and impulse response signals, this would have "given away the answers", i.e., helped in hearing similar but less exposed differences in the musical signals. This consideration, and others described under "Future Work", may also help define more standard test signals and methods for evaluating auralizations, currently less straightforward than validating measured room parameters. It is hoped, moreover, that discussing the approach of these listening tests is as useful to future research as the results.

5. Conclusions

This is an initial study for a given implementation of diffusion and a receiver near a rear wall. Thus, these conclusions

the string quartet.

Individual rankings of the frequency regions (e.g., whether the "Mid" region is more or less audible than another region) vary among listeners. In any case, for some signals, changes are clearly audible in every frequency region tested. This implies that diffusion (and scattering) should be modeled in a frequency-dependent manner, although not all auralization programs currently do this.

The character of the perceived difference in diffusion depends on the input signal. Differences can be perceived (1) mainly as changes in coloration (for sustained input signals such as organ chords and pink noise) or (2) more as changes in spaciousness (for transient signals such as plucked strings or impulses). Of course, some listeners heard differences in both qualities, and other perceptive dimensions (e.g., early and late decay) are obviously possible.

Even if uninformed of the differences between high- or low-diffusion signals, listeners give consistent answers regarding perceived changes in frequency-coloration. Moreover, by selecting scattering surfaces to temporally and spatially redistribute early reflected energy at various frequencies, one can possibly tailor the perceived coloration at certain listener positions, such as those near reflecting surfaces.

6. Future work

No initial study is all-encompassing. Future tests should include more listener positions, from those near side walls, where comb-filter effects binaurally decrease, to center positions far from reflecting walls. (Informal listening suggested that a position far from reflecting surfaces is less affected by changes in the diffusion coefficient, which agrees with some studies done in [16].) The source in the tests should likewise be developed into a multiple-source ensemble with more realistic directivities. Test signals could also include noise with "frequency-shaping" to approximate organ or orchestra chords, in addition to musical signals. Furthermore, listening to the BRIR alone corresponds to convolution with a unit impulse, thus emphasizing differences at higher frequencies relative to differences at lower frequencies. The BRIR should instead be passed through a filter that has characteristics of, e.g., pink noise, or musical spectra such as those above.

In this investigation, diffusion was not varied independently below the 125 Hz octave band (although values in lower bands follow from the 125 Hz band). However, initial studies with edge scattering (edge diffraction) in auralization [2] have shown that inclusion of edge diffraction is audible in this range. Future subjective tests should also employ more accurate scattering models like those discussed in [4] or perhaps more elaborate diffusion models [17].

ACUSTICA · acta acustica Vol. 86 (2000)

Figure 12. Qualitative representation of the "optimum" degree of surface scattering (here "diffusion" b) for different frequency regions f. The shape of the curves can also depend on the room geometry (e.g., certain geometries may sound best with relatively smooth walls or vice versa) and on the type of signals played in the room.

Such studies have practical implications as well. It may not be obvious how much diffusion one should specify to achieve a certain sound quality. In addition to varying the distance from surfaces, one should also vary the diffusion coefficient in smaller steps and within narrower frequency bands; this would better establish difference limens and possible "optimal" values. Figure 12 depicts a qualitative approach to such a study. Below the optimum amount, the binaural impulse response may not have the desired timbre or the preferred density of diffuse reflections; above the optimum, the aural "signature" of the room (given by its predominant specular reflections) may degrade toward a more anonymous "wash" of decay. Again, these results should not be generalized to mean that greater surface scattering always gives increased spaciousness. The perceived spaciousness of a room with "too little" surface scattering may naturally improve with more articulated surfaces, but only up to a point. The situation becomes more complex with additional considerations: the curves in the figure also depend on the geometry of the room (e.g., certain geometries may sound better with smooth walls, or vice versa) and on the type of signals played in the room. Of course, the general investigation of "preference" can naturally be complemented by investigations of "modeling accuracy" and other criteria for optimization.

Through such tests one may possibly distill the most significant parameters to achieve a numerically and aurally accurate scattering model. Equally important, however, these tests may refine our practical understanding and use of scattering in room acoustics.


Acknowledgement

The listening tests were done during the first author's hospitable stay with Prof. Michael Vorländer at the Institut für Technische Akustik, Aachen, Germany. Sincere appreciation for discussions and assistance is extended to M. Vorländer, D. Västfjäll, P. Svensson, F. Fricke, I. Dash, T.J. Cox, V. Rioux, E. Mommertz, S. Müller, consultants from Akustikon AB, and all participants in the listening tests. Constructive comments from the anonymous reviewers assisted in clarifying various points in the article. The original computer model of the hall was kindly provided by Akustikon AB, Sweden. Funding was generously awarded by the Axel and Margaret Ax:son Johnson Foundation and Teknikbrostiftelsen. Preliminary findings from this paper have been presented in conference proceedings. Typographical errors in the old table of results have been corrected here.


References

[1] U. P. Svensson, R. I. Fred, J. Vanderkooy: An analytic secondary source model of edge diffraction impulse responses. J. Acoust. Soc. Am. 106 (November 1999) 2331-2344.

[2] R. Torres, M. Kleiner: Audibility of diffraction in auralization of a stage house. Proceedings of International Congress on Acoustics, Seattle, 1998.

[3] M. Kleiner, P. Svensson, B.-I. Dalenbäck: Auralization of QRD and other diffusing surfaces using scale modeling. 93rd AES Convention, Pre-print, 1992.

[4] R. Torres, M. Kleiner: Considerations for including surface scattering in auralization. Proceedings of 137th ASA/2nd EAA (Forum Acusticum)/DAGA, Berlin, March 1999.

[5] H. Medwin, C. Clay: Fundamentals of acoustical oceanography. Academic Press, New York, 1998.

[6] B.-I. Dalenbäck, M. Kleiner, P. Svensson: A macroscopic view of diffuse reflection. J. Audio Eng. Soc. 42 (October 1994).

[7] H. Kuttruff: Room acoustics, third edition. Elsevier Applied Science, London and New York, 1991.

[8] N. Xiang, J. Blauert: Binaural scale modeling for auralisation and prediction of acoustics in auditoria. J. Appl. Acoust. 38 (1993) 267-290.

[9] N. Xiang: A mobile universal measuring system for the binaural room-acoustic modelling-technique. Fb 611 (Forschung), Lehrstuhl für Allgemeine Elektrotechnik und Akustik, Ruhr-Universität Bochum. Schriftenreihe der Bundesanstalt für Arbeitsschutz, Wirtschaftsverlag NW Bremerhaven, 1991.

[10] G. Bartsch: A simulation package for room acoustics with an open interface. Proceedings of 137th ASA/2nd EAA (Forum Acusticum)/DAGA, Berlin, March 1999.

[11] B.-I. Dalenbäck: Verification of prediction based on randomized tail-corrected cone-tracing and array modeling. Proceedings of 137th ASA/2nd EAA/DAGA, Berlin, March 1999.

[12] Rayleigh: The theory of sound, second ed., vol. ii, sec. 272a. Macmillan, London, 1929.

[13] Akustikon AB, Göteborg, Sweden.

[14] L. Beranek: Concert and opera halls: How they sound. Acoustical Society of America, 1996.

[15] Clausenkvartetten (The Clausen Quartet). Göteborg, Sweden. Bjorn Clausen and Eva Johansson (first and second violin), Fredrik Meuller (viola), Rendell Torres (cello), 1998. Sheet music arrangement for string quartet by Merle J. Isaac and published by Carl Fischer, Inc., New York, 1966. If recording is used, this article and the Clausen Quartet should be acknowledged. Organ recording Koralimprovisation was played by Johannes Landgren in 1998 on a synthesized organ. Recordings may not be used for commercial purposes without permission. Sound samples are included on the compact disc corresponding to this issue of Acustica / acta acustica.

[16] R. Heinz: Entwicklung und Beurteilung von computergestützten Methoden zur binauralen Raumsimulation. Dissertation. Inst. für Technische Akustik, RWTH Aachen, 1994.

[17] B.-I. Dalenbäck: Room acoustic prediction based on a unified treatment of diffuse and specular reflection. J. Acoust. Soc. Am. 100 (August 1996).