Dynamic-Range Issues in the Modern Digital Audio Environment*

LOUIS D. FIELDER, AES Fellow

Dolby Laboratories Inc., San Francisco, CA 91403, USA

The peak sound levels of music performances are combined with the audibility of noise in sound reproduction circumstances to yield a dynamic-range criterion for noise-free reproduction of music. This criterion is then examined in light of limitations due to microphones, analog-to-digital conversion, digital audio storage, low-bit-rate coders, digital-to-analog conversion, and loudspeakers. A dynamic range of over 120 dB is found to be necessary in the most demanding of circumstances, requiring the reproduction of sound levels of up to 129 dB SPL. Present audio systems are shown to be challenged to yield these values.


0 INTRODUCTION

The advent of digital audio technology has allowed music recording and reproduction systems to be implemented having equipment noise levels that approach or equal other elements in the recording and reproduction chain. This study examines the degree to which present audio systems allow the reproduction of music without the presence of audible noise. The evaluation of these equipment noise levels will be made by the use of dynamic-range values. Dynamic range is defined as the ratio between the rms maximum undistorted sine-wave level producing peak levels equal to a particular peak level and the rms level of 20-kHz bandwidth-limited white noise that has the same apparent loudness as a particular audio chain’s equipment noise in the absence of a signal. The dynamic range defined here is a channel dynamic-range metric as opposed to one that is a ratio of the loudest to the softest levels of music performances. It will be assumed that no dynamic level compression is used and that the goal is to reproduce the intended music performance as accurately as possible. An upper bound on the requirements for dynamic range using the most demanding of music reproduction circumstances will be established, and then elements in the audio chain will be examined for their effect on the overall dynamic range. All examinations will be made using equivalent acoustic levels and expressed in decibels sound pressure level (SPL). As is standard, 0 dB SPL equals an rms sound pressure of 20 μPa.
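The level conventions above can be illustrated with a minimal Python sketch. It converts between decibels SPL and rms pressure using the 20-μPa reference, and relates the peak level of a sine wave to its rms level, which is the quantity used in the dynamic-range definition. The function names are illustrative assumptions and do not come from the paper.

```python
import math

P_REF_PA = 20e-6  # 0 dB SPL corresponds to an rms pressure of 20 micropascals


def spl_to_pascal(level_db_spl):
    """Convert a level in dB SPL to a pressure in pascals."""
    return P_REF_PA * 10 ** (level_db_spl / 20)


def pascal_to_spl(pressure_pa):
    """Convert a pressure in pascals to a level in dB SPL."""
    return 20 * math.log10(pressure_pa / P_REF_PA)


def sine_rms_level_from_peak(peak_db_spl):
    """Rms level of a sine wave whose peak equals the given peak level.

    A sine wave's rms value is its peak divided by sqrt(2), about 3 dB lower.
    """
    return peak_db_spl - 20 * math.log10(math.sqrt(2))


if __name__ == "__main__":
    print(f"129 dB SPL peak is about {spl_to_pascal(129):.1f} Pa")
    print(f"equivalent sine rms level: {sine_rms_level_from_peak(129):.1f} dB SPL")
```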

The determination of dynamic-range requirements has been a subject of considerable interest in the past due to the severe noise problems in the audio chain. The most accurate study of that period was carried out by Fletcher [ who derived a dynamic-range requirement of 100 dB by comparing the sound pressure at the discomfort level of 120 dB SPL with estimated acoustic noise levels in residential listening environments by Hoth [ The 120 dB SPL was justified because it was greater than the peak acoustic output levels of 100–116 dB SPL for classical music, as determined by Sivian et al. [ The only major flaw in Fletcher’s determination was that the residential-room noise levels had been estimated too high in the Hoth study that he used. More recently the author [ [ studied the question of required dynamic range limited by conventionally dithered digital audio equipment. Peak levels in music performances were compared to just audible levels of white noise, producing limits of 115 dB SPL [ and 122 dB SPL [ This paper will review and expand on these earlier studies in light of the present state of the art in digital audio.

The performances of digital audio elements of the audio chain are compared to the properties of microphones, amplifiers, and loudspeakers. It will be shown that these other elements impose limitations of their own. As a result, the dynamic-range requirements of the various digital components will be modified by these other limitations.

1 MODEL OF RECORDING AND REPRODUCTION CHAIN

The model of music recording and reproduction for this analysis emphasizes simplicity and maximum dynamic-range capability. This model includes recording and reproduction environments, microphones and loudspeakers, analog amplification, analog-to-digital conversion units, and intermediate digital audio elements. The intermediate digital audio elements include a mixing console, storage element, transmission system, and control unit at the playback site. All the digital interconnections are assumed to have 24-bit word lengths and sufficient dither to preserve linearity. The situations of monophonic, stereophonic, and five-channel reproduction will be considered by the appropriate selection of microphones and loudspeakers from this general model. A schematic diagram of this audio system is shown in Fig. 1.

The first element in the audio system is the recording environment and the musicians. This analysis will generally assume that microphones are located at the ideal listener location, but will briefly consider configurations where individual instruments or groups of instruments have microphones associated with them. In either situation it will be assumed that the final recording will closely duplicate the listening experience at a desired audience location. It is also assumed that considerable care is taken to reduce environmental noise during the recording. The properties of the microphones are determined by the microphone-preamplifier combination.

Fig. 1 shows that the microphone preamplifiers are directly connected to analog-to-digital conversion units (ADCs) to maximize the dynamic range. The number of microphones is dependent on the number of reproduction channels and the microphone technique used. In general a simple microphone technique has the same number of microphones as reproduction channels, where m = n. As mentioned earlier, there are two basic approaches to sound recording. One attempts to sample the sound field at an ideal recording location and is designated “natural miking.” The ideal recording location is assumed to have performance sound and noise levels equal to a practical audience location. The other method places microphones in various locations in the performance environment, typically close to specific instruments, and uses signal processing to generate a synthetic listening location. This technique is designated “multimiking.” Although practical recording situations may actually be a combination of the two, this analysis will focus on the natural miking configuration for simplicity’s sake. The ADC section includes any signal processing to improve its performance, but is assumed to produce an output with flat frequency response and a digital word length sufficiently long to introduce no significant noise. 24-bit word lengths are assumed to be sufficiently long to satisfy this criterion.

Next in the chain is the digital mixer, which combines and modifies the microphone inputs to generate the required number of reproduction channels. For this analysis n will be either 1 for monophonic, 2 for stereophonic, or 5 for five-channel reproduction. It is assumed that no equalization is used and that the digital processing device has sufficient internal word lengths to keep internally generated noises significantly lower than the apparent equipment noise of the system.

The two succeeding elements in the chain are storage and transmission. In actual systems either one or both are generally present. Again, it is assumed that the input and output word lengths are sufficient to prevent impairment of performance and that any signal processing specific to improving the dynamic-range performance of each element is included within it. For example, if error-feedback noise shaping is used to improve the apparent dynamic range of a 16-bit storage system, that processing is included within the element. The use of low-bit-rate coders in the transmission system is also discussed.

After transmission or storage there are the control unit and the digital-to-analog converters (DACs). In most present systems the control unit is an analog control preamplifier and is located after the DACs. The disadvantage with this situation is that the dynamic range of most analog control preamplifiers is limited to 90—100 dB, which will be shown to be insufficient. As a result, the order of the control unit is interchanged with the DACs to maximize the possible dynamic range. In this situation the digital control unit is assumed to have sufficiently large internal word lengths not to reduce the dynamic range of the system. It is assumed that no equalization or dynamic compression is done at this point. The DACs that follow the control unit are assumed to include any necessary signal processing within the DAC element.

At the end of the audio system chain are the power amplifiers, loudspeakers, and listening environment. The power amplifiers are matched to the loudspeakers in such a way as to maximize the sound-level capabilities of the loudspeakers chosen, implying amplifiers with outputs in excess of 250 W. Power amplifiers of this size generally have a signal-to-noise ratio or dynamic range in the region of 110 to 120 dB. For a sampling of specific examples see King [ [ and Feldman [ [ Examining the loudspeaker element, both consumer and professional loudspeakers will be considered. The listening environment will be assumed to be a high-fidelity music listening room, appropriate for high-fidelity reproduction of audio, with some attempt to control environmental noise levels. For simplicity of the analysis, the loudspeaker-room combination is assumed to result in a flat frequency response to the listener. Headphone reproduction will not be considered.

This basic model of an audio recording and reproduction system will be used to evaluate the effect of each element on the dynamic range of the music reproduction. Although actual sound recording and reproduction chains are often more complicated, employing smaller digital word lengths or more signal processing steps, the result is to further reduce the dynamic range of the final system. This downward trend in dynamic range will be shown to be undesirable because of the high dynamic-range requirements that will result from this analysis.

2 BASIC DYNAMIC-RANGE REQUIREMENT

Under ideal conditions the dynamic-range requirement is based solely on music levels and the human auditory capabilities of noise detection. Under these conditions a music performance is faithfully reproduced at natural absolute acoustic levels and not marred by the presence of any audible audio equipment noise or frequency response aberrations. Since this study focuses on natural miking configurations, determination of the dynamic-range requirement involves finding the ratio between the highest peak acoustic levels at favored audience locations in actual music performances and the maximum level of noise not yet audible in the quietest of available recording or listening environments and employing that as the proper value.

This analysis applies to monophonic reproduction circumstances where a single acoustic sampling of the sound field is done and a single noise source is examined for audibility. Extension to stereophonic and five-channel reproduction will be made by adaptation of the monophonic results. Performance sound pressure levels are assumed to add in power for the front loudspeakers in both stereophonic and five-channel circumstances. This is plausible since the frontal combination of loudspeakers reproduces the sound emanating from the performance location (in front) whereas the rear loudspeakers are assumed to provide only reverberation or concert-hall ambiance. Therefore it is reasonable to neglect the effect the rear loudspeakers have in creating high peak acoustic levels at the listener’s location. Despite this fact, all loudspeakers are assumed to have equal peak sound-level capabilities because of the possibility that high sound levels may come from the rear in special circumstances.

Equipment noise is treated differently. The acoustic noise perceived by the listener is assumed to be the power sum of all loudspeakers. This is reasonable since the presence of noise from each loudspeaker location feeds noise into the reverberant sound field equally. As a consequence, each loudspeaker’s reproduction of system equipment noise contributes equally to the noise energy at the listener’s location.

The implication of the previous assumptions for the summation of peak sound levels and the reproduction of equipment noise is that stereophonic reproduction circumstances will require the same dynamic-range limits as the monophonic situation. This is true because both peak and noise levels increase by 3 dB. The situation for five-channel reproduction implies an increase in dynamic range required in each audio system channel by the power ratio of 5:3 because three loudspeakers produce the peak levels whereas five loudspeakers produce the audible equipment noise. This 5:3 ratio is equivalent to an increase of 2.2 dB in the channel-by-channel dynamic-range requirement.
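The channel-count corrections follow directly from the power-summation assumptions. The short Python sketch below reproduces them; the function name and argument names are illustrative assumptions, not notation from the paper.

```python
import math


def channel_correction_db(peak_speakers, noise_speakers):
    """Extra per-channel dynamic range needed, in dB.

    Peak levels are assumed to add in power over `peak_speakers`
    loudspeakers and equipment noise over `noise_speakers` loudspeakers,
    so the per-channel requirement rises by the ratio of the two counts.
    """
    return 10 * math.log10(noise_speakers / peak_speakers)


if __name__ == "__main__":
    # (label, loudspeakers producing peaks, loudspeakers producing noise)
    for label, peaks, noise in [("mono", 1, 1), ("stereo", 2, 2), ("five-channel", 3, 5)]:
        print(f"{label}: {channel_correction_db(peaks, noise):+.1f} dB")
```

For stereo, peak and noise levels each rise by 10 log10(2), about 3 dB, so the correction cancels. For five channels the result is about 2.2 dB.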

2.1 Peak Levels in Music Performances

The first step in the determination of the required dynamic range for apparent noise-free reproduction of music signals is the examination of the peak levels found in actual music performances. Although most information on sound levels is focused on average levels and hearing damage, a number of studies have examined peak levels. An early paper by Sivian et al. [ provided absolute peak acoustic level data for classical music with two piano solos measured at 103 and 104 dB SPL, two 15-piece orchestras measured at 112 dB SPL each, four examples of a 75-piece orchestra measured at 107, 112, 113, and 114 dB SPL, and two pipe organ examples measured at 106 and 116 dB SPL. Another study, by Lebo and Oliphant [ presented peak data of 101 dB SPL for classical and 114 or 122 dB SPL for popular music.

Meares and Lansdowne [ in a paper on sound levels in broadcasting studios, gave data for peak sound levels in recording studios at 111, 113, and 118 dB SPL for recital, orchestra, and dance band music, respectively. Ahnert [12] studied sound energies and peak levels of classical music, combining the results of a number of German investigators: peak levels between 86 and 125 dB SPL were reported for recording locations. The author has also examined the question of absolute peak acoustic levels and has performed two studies. The first study [4] measured levels of 122 dB SPL for percussive classical and 124 dB SPL for electronically augmented country music. The second study by the author [ was more extensive, surveying over a 1-year period. In this study 47 music selections played during 36 different performances were examined for peak sound levels. An acoustic peak measuring device capable of measuring 90–130 dB SPL was used by the author sitting at favored listening locations. This produced peak levels ranging between 90 and 129 dB SPL. The 22 classical music selections measured covered a range of 90–118 dB SPL, the 11 rock music selections covered a range of 115–129 dB SPL, and six jazz music selections produced peak levels in the range of 114–127 dB SPL, whereas the remainder produced peak levels of between 116 and 127 dB SPL.

A combination of all the studies mentioned is presented in Fig. 2, which is a histogram showing the number of music examples versus peak acoustic levels. Fig. 2 really consists of two histograms, one on top of the other. The first is a combination of all lightly and darkly shaded bars. It represents the absolute peak level distribution for all 72 examples surveyed. The second histogram is composed only of the heavily shaded bars and represents the 51 examples surveyed that did not include any electronic augmentation (that is, amplifiers and loudspeakers). Inspection of this figure shows that many music performances have peak levels in the range of 120–129 dB SPL and that a number of these do not include electronic augmentation. The major reason for the existence of such high sound levels is the use of instruments capable of producing high sound levels. Most notable are drums, which are capable of producing sound levels of 138 dB SPL at a 1-m distance (see Sivian et al. [ for more details). The major conclusion drawn from the study of Fig. 2 is that the reproduction of music performances at natural levels requires the ability to produce very loud sounds in the range of 120–129 dB SPL.

Fig. 2. Peak acoustic levels of various music performances. Heavily shaded examples were performances that did not include electronic augmentation [0 dB SPL = 20 μPa].

 

2.2 Determination of Just Audible Noise Level

The other factor in the determination of dynamic-range requirements is the ability of listeners to detect audio system equipment noise. The audibility of equipment noise is determined and its level measured in terms of an audibly equivalent level of 20-kHz bandwidth noise that is otherwise white in character. The use of an equivalent white noise level metric is justified because digital audio truncation noise is white and band limited when it is conventionally dithered (see Vanderkooy and Lipshitz [ ). In addition, analog dynamic-range measurements have often assumed a white equipment noise floor.

A determination of the level of white noise that is just audible in quiet listening conditions was performed by the author [ 13 listeners were exposed to a monophonic acoustic noise source that produced noise closely approximating white noise in the frequency region of 1–10 kHz. The noise source cycled the noise on and off smoothly at a 2-s rate while the listener adjusted the level to reach the “just audible” level. Each listener performed this experiment in his or her preferred listening environment. The results of these experiments are shown in Fig. 3.

Fig. 3 is a histogram of the results of the threshold determination for the 13 listeners and indicates that the mean threshold was 3.8 dB SPL for 20-kHz low-pass-filtered white noise. The detection levels span the range from −2 to 9 dB SPL and have a statistical standard deviation of 3 dB. The low level of detected noise and the tight spread in values are surprising in that the listening rooms have wide-band noise levels in the range of 20–35 dBA SPL. In this study it is determined that the listener was detecting the presence of the noise using the noise energy within the 3–7-kHz band, and the effect of room noise was unimportant in most circumstances. This indicates that the spread in threshold values is governed by individual hearing acuity rather than environmental variations.


2.3 Maximum Dynamic-Range Requirement

The combination of measured acoustic peak levels up to a maximum of 129 dB SPL for music performances with a just audible level of white noise at 3.8 dB SPL yields a dynamic-range requirement of 122 dB for monophonic reproduction circumstances. Under the definition of dynamic range given in the Introduction, the 129-dB SPL peak corresponds to a sine wave of about 126 dB SPL rms, which lies 122 dB above the 3.8-dB SPL noise level. Extension to stereophonic or five-channel situations requires correction factors of 0 and 2.2 dB, respectively, to be added to the 122 dB. Therefore music reproduction, limited only by sound levels and human auditory capabilities, requires a dynamic range of 122 dB for monophonic, 122 dB for stereophonic, and 124 dB for five-channel reproduction circumstances. These limits are particularly rigorous because they greatly exceed the 90–100-dB dynamic range of most digital audio systems today. For instance, the attainment of the 124-dB dynamic-range limit would require more than 20-bit word lengths if ideal conversion systems and triangular-probability additive dither were used. (See Vanderkooy and Lipshitz [ for dynamic range versus word lengths.)
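The arithmetic behind these figures can be sketched in Python as follows. The first function applies the dynamic-range definition from the Introduction (rms level of a sine wave with the stated peak, relative to the just audible white-noise level). The second is an assumption of this sketch rather than a formula quoted from the paper: it uses the textbook result for an ideal N-bit quantizer with triangular-probability dither of 2 LSB peak-to-peak, whose total noise power is Δ²/4, giving a full-scale sine-wave dynamic range of about 6.02N − 3.01 dB. The values tabulated in the cited reference may follow a slightly different convention.

```python
import math


def required_dynamic_range_db(peak_db_spl, noise_db_spl, correction_db=0.0):
    """Dynamic range from a peak level and a just audible noise level.

    The peak is converted to the rms level of a sine wave with the same
    peak (10*log10(2), about 3 dB lower) before the noise level is subtracted.
    """
    sine_rms_db = peak_db_spl - 10 * math.log10(2)
    return sine_rms_db - noise_db_spl + correction_db


def tpdf_dynamic_range_db(bits):
    """Assumed ideal N-bit dynamic range with triangular dither.

    Full-scale sine power (2**(bits - 1) * delta)**2 / 2 divided by the
    total noise power delta**2 / 4 equals 2**(2 * bits - 1).
    """
    return 10 * math.log10(2 ** (2 * bits - 1))


def min_bits_for(dynamic_range_db):
    """Smallest word length whose assumed dynamic range meets the target."""
    bits = 1
    while tpdf_dynamic_range_db(bits) < dynamic_range_db:
        bits += 1
    return bits


if __name__ == "__main__":
    mono = required_dynamic_range_db(129, 3.8)
    five = required_dynamic_range_db(129, 3.8, 10 * math.log10(5 / 3))
    print(f"mono/stereo: {mono:.1f} dB, five-channel: {five:.1f} dB")
    print(f"20 bits: {tpdf_dynamic_range_db(20):.1f} dB")
    print(f"bits needed for five-channel: {min_bits_for(five)}")
```

Under these assumptions a 20-bit system falls several decibels short of the five-channel requirement, which is consistent with the statement that more than 20 bits are needed.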

The preceding analysis has derived a requirement for the dynamic range determined solely by the peak acoustic levels of music and the detection of just audible noise in quiet environments. Environmental noise at recording or reproduction may reduce the requirements of the equipment by covering up or masking otherwise audible equipment noises. Limits in microphones and loudspeakers may prevent the reproduction of the full dynamic range, and hence reduce the requirements of intermediate parts of the chain.

3 BASIC PSYCHOACOUSTICS OF NOISE DETECTION

The previous determination used the audibility of bandwidth-limited white noise. However, the examination of environmental acoustic noise, microphone noise, and signal processing for apparent dynamic-range extension requires a comparison of the loudness or detectability of noises that are not white in character. This is most readily done by the use of the critical-band concept and of the detection theory of psychoacoustics. The use of these two concepts will allow the spectral comparison between different noises to assess their audibility and loudness.

3.1 Critical-Band Concept

The critical-band-concept model of the human auditory system was first developed by Fletcher [ to explain why masking experiments showed that signals covering a frequency range less than a certain bandwidth produced the same masking and threshold of detection properties as other signals with smaller bandwidths. The basic idea behind the critical-band concept is that the ear acts as a multifrequency-band real-time analyzer with varying sensitivities and bandwidths throughout the audio range. Since the critical bandwidth represents the appropriate frequency resolution for the detection of acoustic signals, it is extremely important in the analysis of the audibility of sounds that have extended and nonflat frequency spectra. This implies that an accurate measurement of nonwhite acoustic noise must use bandwidths similar to critical bandwidths.

The critical band represents the minimum resolution of masking situations. For instance, the masking of a small signal by a large sine wave nearby in frequency is constant and maximum until the frequency separation between the two exceeds a critical bandwidth. Later workers further quantified this concept. Zwicker et al. [ discussed this resolution bandwidth and compared it to various detection and masking phenomena of the ear. Scharf [ examined the loudness of complex sounds using the critical-band concept and found that the loudness remained constant until the sound exceeded a critical bandwidth. An acoustic signal is detected if the energy within a particular critical band exceeds a certain level, whether the signal is a sine-wave tone, a band of noise, or a combination of the two.

 

Fig. 3. White-noise threshold tests [0 dB SPL = 20 μPa].

 

3.2 Threshold Detection via Critical Bands

There is a question as to the independence of critical bands for the threshold detection of acoustic signals. If critical bands are independent, the detection of noise only requires that the energy within at least one critical band exceed the threshold. If the critical bands are not independent, then a method of combining critical bands must be used. The author [ used 20-kHz bandwidth-limited white and shaped noise to conclude that critical bands were independent at threshold but gradually started sharing energies as the loudness increased above threshold. The sharing of energies between bands was shown to eventually produce an increase in loudness of 8 dB for a noise signal equally exciting all critical bands, when compared to white noise. This experiment would indicate that the single-band detection method is accurate. However, later work by Stuart and Wilson [ performed listening tests that indicated that the single-band model was less accurate than a sharing of detection probabilities between bands. They hypothesized that the probabilities of detection for the individual bands combine to increase the listener’s sensitivity by as much as 8 dB for a noise signal uniformly exciting all critical bands. For further information see Stuart [ or Stuart and Wilson [ As a result, this study will first discuss the results assuming single-band detection and then discuss those results based on a detection model similar to that of Stuart and Wilson.

3.3 Determination of Critical-Bandwidth Values

The relationship between critical bandwidth and frequency has been studied extensively using masking and detection experiments. Zwicker [20] established 24 fixed bands spanning the range of 20 Hz to 15 kHz. Later researchers have further refined the measurement of this important parameter. Some of these more recent experiments give smaller critical bands than earlier works, particularly for frequencies below 500 Hz. Most notable are Fidell et al. [ Moore and Glasberg [ Moore et al. [ and Shailer et al. [ Above 500 Hz there is good agreement between most critical-bandwidth determinations, and they agree within 20–30% bandwidth. Below 500 Hz the disagreement between researchers becomes more marked. Zwicker’s results indicate a minimum critical bandwidth of 100 Hz, whereas the others derive bandwidths as little as 30 Hz. Unfortunately this disagreement between bandwidth estimates is the subject of much debate in the literature and is still unresolved. The later studies by Schorer [ and Fastl and Schorer [26] produced different low-frequency critical-bandwidth values than did Zwicker [ or Moore and Glasberg [ These differences in critical bandwidth are well correlated with the methods used to derive them. For a good overview see Gelfand [27].

3.4 Spectral Method of Threshold Detection

Since the determination of the audibility of noise requires the correct critical bandwidths applicable for detection purposes, the disparity between the various estimates of critical bandwidths needs to be resolved. Unfortunately all the low-frequency studies for critical bandwidth rely on various masking experiments. The critical bandwidths adopted here are a compromise between the bandwidths of Zwicker and those of Moore. These compromise critical bandwidths are used to derive a one-third-octave threshold criterion for noise audibility from diffuse-field sine-wave audibility. One-third-octave spectral comparisons are appropriate because of the relative similarity between one-third octaves and critical bandwidths.

Sine-wave audibility is given by the ISO R226-1961E [ standard for the hearing threshold of sine waves under free-field conditions, and modified to diffuse-field conditions by the ISO R454-1975E [ standard for diffuse to free-field correction. A noise threshold curve is made by modifying the sine-wave threshold curve by a correction factor for the difference in bandwidth between “critical” and one-third-octave bands. A combined sine-wave and noise threshold of audibility curve is generated by taking the minimum of either the sine-wave or the noise threshold curve. This is done as the most conservative solution for the problem that one-third octaves are not equal to the critical bandwidths and may create errors of as much as 2.5 dB for frequencies above 200 Hz. The minimum of either curve guarantees that this criterion curve will not underestimate the audibility of a combination of either noises or sine waves. For further details see Cohen and Fielder [ This compromise curve is shown in Fig. 4 as the threshold of audibility. Also included in this figure is the spectrum of 3.8-dB SPL, 20-kHz band-limited white noise.
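To make the one-third-octave comparison concrete, the Python sketch below computes the level that band-limited white noise places in a single one-third-octave band, and combines two threshold curves by taking their pointwise minimum as described above. It assumes ideal one-third-octave bands with edges at fc·2^(±1/6) and noise that is perfectly flat from 0 to 20 kHz. The function names are illustrative, and no threshold data from the standards are reproduced here.

```python
import math


def third_octave_bandwidth_hz(center_hz):
    """Bandwidth of an ideal one-third-octave band centered at center_hz."""
    return center_hz * (2 ** (1 / 6) - 2 ** (-1 / 6))


def white_noise_band_level_db(total_db_spl, center_hz, noise_bandwidth_hz=20000.0):
    """Level of flat white noise falling inside one one-third-octave band.

    The band receives the fraction bandwidth / noise_bandwidth_hz of the
    total noise power, assuming the band lies fully within the noise bandwidth.
    """
    fraction = third_octave_bandwidth_hz(center_hz) / noise_bandwidth_hz
    return total_db_spl + 10 * math.log10(fraction)


def combined_threshold_db(sine_threshold_db, noise_threshold_db):
    """Pointwise minimum of two threshold curves given as equal-length lists."""
    return [min(s, n) for s, n in zip(sine_threshold_db, noise_threshold_db)]


if __name__ == "__main__":
    for fc in (1000, 2000, 4000, 8000):
        level = white_noise_band_level_db(3.8, fc)
        print(f"{fc} Hz band: {level:.1f} dB SPL")
```

Because one-third-octave bandwidth grows in proportion to center frequency, the band level of white noise rises by about 3 dB per octave, which is why the white-noise spectrum slopes upward on a one-third-octave plot.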

Examination of this figure shows that the hearing threshold curve includes error bars, representing the standard deviation in the threshold values at each one-third-octave frequency point. These standard deviation values are taken from Robinson and Dadson [ for the free-field hearing threshold and from Robinson et al. [ for the diffuse to free-field correction on which the ISO R226-1961E [ and ISO R454-1975E [ standards were based. The totals shown are primarily based on the free-field threshold variations, with an additional 0.5-dB uncertainty in diffuse to free-field correction experiments. Large standard deviations of 5–17 dB exist due to variations in individual hearing acuity. They increase above 2 kHz, where head and outer ear differences influence the response characteristics of the listener. Also shown in Fig. 4 is the spectrum of just audible white noise, with the dashed lines representing the curve displaced upward and downward by 1 standard deviation.

The standard deviation values are included to help explain why simply comparing the hearing acuity to the spectrum of just audible white noise does not predict that the 3.8-dB SPL noise is audible. This is true whether the single-band or the multiband probabilistic detection model is used. If single-band detection is applicable, the two curves should intersect for at least a single one-third octave, and even the probabilistic detection model would require the curves to be within a few decibels of each other. Examination of the standard deviation of the threshold curve and the range of the white-noise levels suggests an explanation: in at least one band an individual listener’s hearing is likely to be more sensitive than average, and the listener detects the presence of the noise at this point. On average this produces a 5-dB lower level for white-noise detection than otherwise expected. Despite the complication of increased sensitivity to noise due to individual variations, the use of the average hearing threshold curve is a useful way to assess the audibility of noise. It is important to remember that noise detection may occur at a 5-dB lower level than the average curve.

3.5 Detection of Changes in Audible One-Third-Octave Bands

Once the possibility of noise detection is analyzed, those one-third-octave bands that are audible are investigated for their sensitivity to the addition of audio system noise. The system equipment noise has no effect on sound reproduction as long as it is substantially below either the hearing threshold or the already existing acoustic noise spectrum of the performance environment for all one-third-octave bands.

If the equipment noise level is higher than −5 dB relative to the hearing threshold, then the noise is audible. If this is so, the equipment noise is only significant if it results in an apparent change in the overall noise. The audibility of changes in noise level is examined by Miller [ who determines that the just noticeable change in noise spectrum is level dependent. At the threshold of hearing, a 2–3-dB change in level is required for detection, while at higher levels a smaller difference is noticeable. At +10 dB relative to the threshold the minimum detectable change is 1 dB, whereas at +20 dB or higher a level change of only 0.5 dB is detectable.
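One way to read these just noticeable changes is as a margin between equipment noise and the noise already present in a band. The Python sketch below assumes the two noises are uncorrelated and therefore add in power; this assumption and the function names belong to the sketch and are not stated in this passage. It computes how much the total rises when equipment noise is added, and how far below the existing noise the equipment noise must stay to keep the rise under a given just noticeable change.

```python
import math


def level_increase_db(equipment_rel_db):
    """Rise in total noise level when equipment noise is added.

    equipment_rel_db is the equipment noise level relative to the existing
    noise in the same band (negative when the equipment noise is lower).
    Uncorrelated power addition is assumed.
    """
    return 10 * math.log10(1 + 10 ** (equipment_rel_db / 10))


def max_equipment_rel_db(just_noticeable_change_db):
    """Highest relative equipment noise level that keeps the rise at the limit."""
    return 10 * math.log10(10 ** (just_noticeable_change_db / 10) - 1)


if __name__ == "__main__":
    for jnd in (3.0, 2.0, 1.0, 0.5):
        margin = max_equipment_rel_db(jnd)
        print(f"change of {jnd} dB reached at {margin:+.1f} dB relative level")
```

Under this assumption a 0.5-dB change is reached when the equipment noise is about 9 dB below the existing noise, and a 1-dB change at about 6 dB below, so louder existing noise demands a larger margin before added equipment noise becomes undetectable.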

4 LIMITATIONS DUE TO RECORDING OR PLAYBACK ENVIRONMENT

At this point dynamic-range requirements of 122—124 dB have been derived for a noise-free recording environment and a system with no other sources of noise. Next the effect of acoustic noise found in recording and reproduction environments on the audibility of audio system noise is investigated. The determination of the effect of environmental noise requires the one-third-octave spectral comparison between the environmental noise, the equipment noise implied by the system dynamic range, and the hearing acuity curve.

4.1 Noise in the Recording Environment

The analysis of recording environment noise begins by determining whether the noise in the recording location is 5 dB below the hearing acuity curve. Since this study is looking for the maximum dynamic-range requirements, focusing on the noise levels of the lowest noise recording rooms is appropriate. Typically the best recording environments are designed specifically for recording purposes or are performance venues that have had modifications in their operation to reduce noise, such as turning off the heating and air-conditioning systems, deactivating sound reinforcement systems, and taking any other appropriate measures. (See Cohen and Fielder [ for more details.) Under these conditions the noise of the best recording venues is quite low. Fig. 5 is a one-third-octave spectral comparison of the noise of two excellent recording venues, a BBC spectral requirement for studios where drama performances are recorded, and the hearing threshold derived earlier. The recording-venue noise spectra are two examples from the measurements of recording environmental noise levels by Cohen and Fielder [ while the BBC drama requirement comes from Meares and Lansdowne’s [11] study of noise levels in recording studios.

Examination of Fig. 5 shows that the two recording venues have noise levels at or below the threshold of hearing. The Davies Symphony Hall has a noise spectrum comparable in level to the hearing threshold curve when used under recording conditions. This means that this hall will produce recordings that include only slightly audible noise. This audible noise occurs because of the existence of spectral components below 1 kHz but not in the ear’s most sensitive frequency region of 3—6 kHz. This low level of noise in the 3—6 kHz region means that the audibility of audio system equipment noise and the dynamic-range requirement derived for ideal conditions are not modified by the presence of recording-venue noise.

Fig. 5 also shows that Lucasfilm’s Skywalker scoring stage has noise levels substantially below those of the hearing threshold, indicating that recordings made there will not include any audible scoring stage noise.

The extremely low levels of noise of this environment are the result of careful design and represent one of the finest examples of noise control in a commercial indoor space. Certainly no reduction of the earlier dynamic-range determinations is indicated for this recording circumstance. Finally a comparison is made of the BBC drama studios and the hearing threshold criterion. Since this standard specifies maximum noise levels approximately at the hearing threshold, studios built to this standard will typically have lower noise levels in most parts of the spectrum. It is unlikely that any reduction of the dynamic-range requirement is justified. Therefore a study of the best recording venues shows that extremely low levels of recording environmental noise can be maintained. This results in no modification of the dynamic-range requirements for audio systems.

4.2 Noise in the Playback Environments

The assessment of the effect of playback environmental noise is made in a manner similar to the recording-venue noise analysis. As before, the analysis of playback noise audibility begins by comparing the acoustic noise to the hearing acuity curve. Unlike the recording-environment analysis, however, the playback noise spectrum is evaluated to see whether it is no higher than 10 dB above the hearing threshold criterion. This higher limit is appropriate because the listener is able to use directional clues to differentiate playback room noises from the equipment noise of the audio system. In the author’s original study [ on dynamic-range requirements, it was determined that the listener was able to perceive audio system noises as much as 15 dB below that of the listening-room noise for monophonic noise sources. This ability to tune out listening-room noise reduces the probability that room noise will have the effect of reducing the dynamic-range requirement.

The level of typical listening-room noise is assessed by two further studies. The first of these, by this author [ examines 10 home listening rooms to produce an average noise curve, while the second, by Cohen and Fielder [ examines 27 home listening rooms and produces minimum, maximum, and average noise spectra. Since both studies produce similar averages for home listening-room noise, Fig. 6 shows the minimum, average, and maximum noise spectrum levels from only the second study.

Fig. 6 shows that the average noise spectrum of the home listening rooms surveyed possesses noise levels above 400 Hz that are no higher than 10 dB above the hearing threshold criterion. This, combined with the fact that the listener is able to employ directional clues, means that generally the home listening-room noise has no effect on reducing the dynamic-range requirements. Examination of the minimum noise levels for each one-third-octave frequency point shows that the quietest home playback conditions have extremely low noise levels in the frequency bands above 2 kHz, critical to the detection of white noise. In this frequency region the lowest room noise situations are at least 10 dB below the hearing acuity curve.

4.3 Summary

Summarizing the effects of environmental noise during recording or playback, it can be stated that the best recording and consumer playback conditions result in no reduction of the listener’s ability to detect audio system equipment noise. This is true for natural miking conditions, which result in the reproduction of recording-venue noises at identical levels. It is also generally true for multimiking configurations, where the higher acoustic levels of “close-miked” instruments result in less than unity-gain acoustic transfer functions and therefore reduced playback levels of environmental noise and self-noise.

5 MICROPHONE AND LOUDSPEAKER LIMITATIONS

Thus far the analysis of the sound levels of music, the levels of acoustic noise, and the ability of listeners to detect noise have indicated that the ideal audio system must have 122–124 dB of dynamic range. Next the limitations of the acoustic transducers at both ends of the audio chain are examined. Microphones limit the maximum sound levels recorded and also possess electric equipment noise. Loudspeakers limit the maximum reproduced sound levels but are not a source of acoustic noise. Power amplifiers generate noise that is reproduced by loudspeakers.

5.1 Microphone Limitations

Dynamic-range reduction due to microphone imperfections is determined by an examination of the best examples of microphone design. To this end, four excellent recording microphones and one ultralow-noise measurement microphone, all of the condenser type, were examined for maximum undistorted sound level reproduction and equivalent acoustic self-noise. As before, the analysis of noise audibility will be made by one-third-octave spectral comparison with the hearing threshold curve. Fig. 7 compares the five microphones with the spectra of just audible white noise and the average hearing threshold.

This figure shows that the noise of the four recording microphones is substantially higher than that of the Bruel and Kjaer 4179 measurement microphone. The noise of the quietest of the recording microphones is lower than the hearing threshold, except in the ear’s most sensitive region of 3—6 kHz, where the noise is higher.

This means that natural miking situations using these recording microphones will not reproduce music performances in a noise-free manner. Instead some microphone noise will be audible in the recording. The degree to which this is true depends on the microphones used. The quietest recording microphone is within 3 dB of the average hearing threshold, whereas the noisiest of the group produces noise as much as 10 dB above the hearing threshold limit. This indicates that the noise of the recording microphone produces the first important limitation on the ability to create noise-free recordings.

A further examination of microphone technology is warranted. This begins by comparing the noise of the Bruel and Kjaer 4179 to the recording microphones and observing that its noise level is substantially lower than the others. This is true because its design lowered the primary source of self-noise, the diaphragm damping element, as shown by Tarnow [34]. The design of the 4179 microphone, as discussed by Frederiksen [35], reduces the amount of noise-inducing diaphragm damping and equalizes the resulting resonant rise in the frequency response. This method could also be applied to the design of a recording microphone, allowing lower noise levels. If the noise levels of the 4179 are scaled to the diaphragm sizes of 12–18 mm for recording microphones, the resultant microphone noise would still be at least 5 dB below the hearing threshold.

Thus far the dynamic-range limitations have been studied for natural miking configurations. Multimiking configurations can be used to lower the presence of audible noise by sampling the higher sound fields near the instruments. This causes a reduction of microphone noise by the ratio of the sampled sound level to the level at the synthesized listener location. This is successful to the extent that the sampled sound levels can be raised above those perceived by the listener without overloading the microphone. The recording microphones included in Fig. 7 have overload sound levels of 130 dB SPL for the Schoeps, 135 dB SPL for the Sennheiser, 136 dB SPL for the Neumann, and 143 dB SPL for the Bruel and Kjaer 4006. Unfortunately the most difficult situations requiring noise-free reproduction at sound levels of above 125 dB cannot be accommodated in this manner since the microphone is already operating near overload and cannot accept an increase in the acoustic level.

The common practice of using microphones to sample the sound field at the listener location with extra microphones to accent portions of the performance ensemble is also not helpful in reducing audible equipment noise of recording. This is true because the noise levels are the sum of the natural miking configuration plus additional noise from the accent microphone locations. Present recording microphones have audible dynamic ranges between 108 and 115 dB and are shown to be significant limiting factors in the creation of noise-free recordings at natural levels. The use of stereophonic or five-channel playback configurations has a mixed effect on the audibility of recording noise. Microphone equipment noise will decrease relative to the music signal when performance sound levels are sampled and reproduced in a phase-coincident manner between microphones, but it will increase as the number of microphones not exposed to the highest sound fields increases.

5.2 Loudspeaker Limitations

Loudspeakers affect the dynamic-range requirements for music reproduction by either having a nonflat frequency response, which emphasizes or deemphasizes parts of the frequency spectrum, or by having limited sound output that prevents the reproduction of performances at natural sound levels. For this analysis the frequency response of the loudspeaker–playback room environment is assumed to be constant with frequency or to be compensated for in the power amplifier section. The primary limitation due to loudspeakers will then be the sound capabilities of the one to three frontal loudspeakers of monophonic, stereophonic, or five-channel configurations. Professional and consumer playback loudspeakers will be considered by the inclusion of examples of both.

The peak sound level capability in the listening environment is determined by the use of peak output specifications for a 1-m distance and conversion from 1-m sound levels to listening-room levels. This conversion factor is determined by assuming that home listening environments are similar to the IEC recommended listening room [ with loudspeakers that have frequency responses and directional characteristics similar to the examples examined by Toole [ in his study of loudspeakers, listening rooms, and listener preferences.

The IEC listening room is a stereophonic playback configuration in a room approximately 7 m long, 4 m wide, and 2.8 m high with a reverberation time of 0.34 s. The listener is located in the center of the room and 3 m from each loudspeaker. Reduction to monophonic reproduction is made by replacing the two loudspeakers with a single one between the stereo pair’s positions, while extension to five channels is accomplished by adding the monophonic and stereophonic loudspeakers together and including rear loudspeakers at the back of the room. Since the placement and type of rear loudspeakers vary considerably, this study will assume that they are far enough from the listener such that the reverberant sound field dominates over the direct sound. The analysis of peak sound level production ignores the rear loudspeakers, while in the monophonic and five-channel configurations this center loudspeaker is assumed to have the same characteristics and coupling to the listening room as the stereophonic ones.

The production of peak sound levels in a typical listening room depends on the loudspeaker directional characteristics, the spectral content of high sound levels, and the effect of room reflections on peak sound levels. Toole [ examined the effect of reflections on the ratio between direct and total sound energy at the listener’s location and determined that the total sound levels were higher than the direct sound levels by the following amounts: 9 dB at 100 Hz, 6 dB at 200 Hz, 3 dB at 500 Hz, and 3 dB at 1 kHz. When these are combined with the author’s conclusion [ that the major sound energy causing acoustic peaks in the 120–129-dB range arises from the audio spectrum below 500 Hz to 1 kHz, a compromise value of 4.5 dB is selected. The total correction factor from 1 m to listening-room levels is obtained by subtracting 9.5 dB, due to the inverse square law for a 3-m distance, and then adding 4.5 dB for reverberant sound field augmentation. This results in a correction factor of -5 dB for monophonic circumstances. Assuming power addition by the frontal loudspeakers, a value of -2 dB is obtained for stereophonic and 0 dB for five-channel reproduction.

Therefore the peak levels in listening environments are derived from the 1-m values and the correction factors. Since consumer loudspeaker manufacturers do not typically supply peak level capabilities for their products, the consumer loudspeaker sound capabilities are estimated by using the data from 23 loudspeaker tests in Audio Magazine. In this magazine Keele [ determined the maximum undistorted peak acoustic capabilities for various loudspeakers as a function of frequency. A single number for each loudspeaker’s 1-m sound output is derived by taking the minimum value between 100 Hz and 1 kHz. The results of this process are shown in Fig. 8.

Examination of this figure shows that the consumer loudspeakers surveyed have 1-m acoustic level capabilities of up to 122 dB SPL, with the average consumer loudspeaker capable of producing 114 dB SPL. Professional monitoring loudspeaker capabilities are greater, as a sampling of studio monitoring loudspeakers shows. The JBL 4435, Meyer Sound 833, Eastern Acoustic Works DS223Hi, and Apogee Sound AE-5 are capable of producing 1-m peak outputs of 128–131 dB SPL. Therefore this analysis assumes that consumer stereophonic reproduction is limited to sound levels of 112–120 dB SPL, limiting the dynamic-range requirements to 108–116 dB, while professional playback circumstances are shown to reproduce performance sound levels and not to reduce the audibility of audio system equipment noise or the dynamic-range requirement. Monophonic reproduction limits both consumer and professional sound level capabilities by 3 dB, whereas five-channel reproduction allows an extra 2 dB sound output capability and dynamic range for consumer applications.
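The correction factors from 1-m specifications to listening-room levels can be checked numerically. The following is a minimal sketch, assuming only what the text states: inverse-square spreading over 3 m, a 4.5-dB reverberant augmentation, and power addition over one, two, or three frontal loudspeakers. The function name and its parameters are illustrative and do not come from the study.

```python
import math

def room_correction_db(front_speakers, distance_m=3.0, reverberant_gain_db=4.5):
    """Correction from 1-m peak level to listening-position level, in dB.

    Assumes inverse-square spreading, a fixed reverberant-field gain, and
    power (incoherent) addition across identical frontal loudspeakers.
    """
    spreading_loss_db = 20.0 * math.log10(distance_m)        # about 9.5 dB at 3 m
    power_sum_gain_db = 10.0 * math.log10(front_speakers)    # 0, 3.0, 4.8 dB
    return -spreading_loss_db + reverberant_gain_db + power_sum_gain_db

for name, count in (("monophonic", 1), ("stereophonic", 2), ("five-channel", 3)):
    print(f"{name:13s} {room_correction_db(count):+5.1f} dB")

# Consumer stereophonic peak levels from the 1-m values of 114 and 122 dB SPL.
stereo = room_correction_db(2)
print(round(114 + stereo), round(122 + stereo))
```

Rounded to the nearest decibel, the sketch gives the -5, -2, and 0 dB factors quoted above and maps the 114 and 122 dB SPL consumer 1-m capabilities to 112 and 120 dB SPL for stereophonic playback.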

5.3 Power Amplifier Limitations

Although loudspeakers do not produce noise, the power amplifiers driving them do. Since power amplifiers with output capabilities of 250 W or greater are needed to produce the high acoustic levels of music performances, the dynamic range of amplifiers of this type is surveyed. Typically such amplifiers have dynamic ranges in the 110–120-dB range and have the potential to produce audible equipment noise. Power amplifier dynamic range is limited because amplifier designers have considered a 110–120-dB dynamic range to be completely adequate for all applications. Fortunately the design of power amplifiers with wider dynamic ranges is possible, given the need to do so.

5.4 Summary

An examination of microphone, power amplifier, and loudspeaker properties indicates that limitations in dynamic range occur. The best microphones are shown to have audible noise, giving an effective dynamic range of only 108–115 dB in natural miking situations. Consumer and professional loudspeaker acoustic outputs indicate that consumer loudspeakers are not adequate to reproduce the 129-dB peak levels of live music, whereas the high-output professional monitors are. This limitation of consumer loudspeaker acoustic levels reduces the dynamic-range requirement to 108–116 dB at the expense of the loss of realism. The noise of power amplifiers is a small but significant contributor to total audio system noise since their dynamic range is rarely above 120 dB.

6 DIGITAL AUDIO LIMITATIONS

Digital audio elements are limited in their dynamic range by inadequate performance of the digital-to-analog conversion processes and the use of digital word lengths that are insufficiently long. The analysis of the dynamic-range capabilities of audio systems employing digital technology is performed by first examining the ADC’s performance, investigating the consequences of using 16-bit storage technology or low-bit-rate coding, and finally exploring DAC limitations. All other elements in the digital audio chain are assumed to use 24-bit word lengths or better so that their noise contributions can be ignored. Digital audio dynamic-range characteristics are investigated in light of the previously determined capabilities of the rest of the audio chain.

6.1 Analog-to-Digital Converter Limitations

A survey of the dynamic-range capabilities of ADCs shows values of 90–110 dB, with the highest value for the best configurations of 20-bit word length converters. Analog Devices, Crystal Semiconductor, and UltraAnalog all make ADCs with dynamic ranges of 106–110 dB. Unfortunately these values of dynamic-range performance are inadequate to meet the professional and most demanding of the consumer requirements, and techniques to increase the apparent dynamic-range characteristics are necessary. One technique to improve the perceived dynamic-range performance is the use of pre- and postemphasis as specified by the CCITT J17 standard [ The author [ studied the usefulness of this emphasis and demonstrated that a J17 emphasis with unity gain at 1 kHz is very useful in extending the apparent dynamic range. Fig. 9 shows the gain characteristic of this preemphasis.

This figure shows that this preemphasis characteristic attenuates the lowest frequencies by as much as 7 dB and boosts the highest ones by 12 dB. Despite the apparent asymmetry in boost and loss for this proposed preemphasis, this author determined that its use results in no increase in the effective peak acoustic levels, implying that the low-frequency content of music performances is primarily responsible for the highest peak acoustic levels. An audible increase in apparent dynamic range results because complementary postemphasis is a good match to the shape of the hearing threshold and causes a decrease in ADC noise levels above 1 kHz. This results in a modification of the audibility of the ADC noise by shaping it to the hearing threshold curve and lowering noise levels in the listener’s most sensitive frequency range.
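The 7-dB cut and 12-dB boost can be reproduced from the emphasis curve itself. The sketch below assumes the closed-form insertion-loss expression commonly quoted from Recommendation J.17, 10 log10[(75 + (ω/3000)²)/(1 + (ω/3000)²)] dB. That expression is not given in this paper and should be checked against the standard. The function names are illustrative only.

```python
import math

def j17_insertion_loss_db(freq_hz):
    """Pre-emphasis network insertion loss in dB (assumed J.17 expression)."""
    x = (2.0 * math.pi * freq_hz / 3000.0) ** 2
    return 10.0 * math.log10((75.0 + x) / (1.0 + x))

def j17_preemphasis_gain_db(freq_hz, unity_hz=1000.0):
    """Pre-emphasis gain in dB, normalized to 0 dB at unity_hz."""
    return j17_insertion_loss_db(unity_hz) - j17_insertion_loss_db(freq_hz)

for f in (20, 100, 1000, 5000, 20000):
    print(f"{f:6d} Hz  {j17_preemphasis_gain_db(f):+6.2f} dB")
```

Under that assumption the normalized gain is roughly -7 dB at 20 Hz, 0 dB at 1 kHz, and between +11 and +12 dB at 20 kHz, in line with the characteristic described for Fig. 9.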

The improvement in ADC dynamic-range capability is investigated by the use of an ADC with a white-noise floor and 108-dB dynamic range. It is calibrated to record in an undistorted manner a music performance with 129-dB peak sound levels. The ADC output uses a 24-bit word length, and the two situations of analog-to-digital conversion with and without J17 emphasis are considered. The two resultant equipment noise spectra are shown in Fig. 10 and compared to the hearing threshold criterion.

Examination of this figure shows that the use of emphasis greatly improves the audible equipment noise characteristic of the ADC system by depressing noise above 1 kHz. This results in a noise floor very nearly at the hearing threshold and slightly lower than the equipment noise of the lowest noise recording microphone examined. As a result the dynamic range of the ADC is increased from 108 to 117 dB. This indicates that the best ADC configurations have noise levels comparable to the best recording microphones when J17 emphasis is used.

6.2 Storage and Transmission Limitations

Dynamic-range limitations arise from the use of insufficiently long word lengths to manipulate and transfer audio signals. In most elements of the digital audio chain it is possible to avoid these problems by using 24-bit word lengths, since even word lengths of 20 bits can cause a noticeable degradation in high-dynamic-range systems. Unfortunately, in the area of data transmission and storage, the high data rate implied by long word lengths is expensive or prohibitive. Typically, digital transmission and storage media limit word lengths to 16 bits, a value resulting in a dynamic-range performance of below 100 dB. The situation may be even more complicated in that audio bit-rate reduction coding is used to further reduce the data rate.
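The word-length figures can be put in context with the textbook ratio of a full-scale sine wave to uniform quantization noise. The sketch below assumes an ideal quantizer and, optionally, triangular (TPDF) dither of ±1 LSB, which triples the noise power. It yields unweighted wideband values, not the audibility-based dynamic range used in this study, so the two should not be expected to coincide. The function name is illustrative.

```python
import math

def quantization_dynamic_range_db(bits, tpdf_dither=False):
    """Full-scale sine rms over quantization noise rms, in dB (unweighted)."""
    # Sine power (A^2 / 2) over noise power (q^2 / 12) with A = 2**(bits - 1) * q
    # gives 1.5 * 2**(2 * bits), i.e. the familiar 6.02 * bits + 1.76 dB.
    dr_db = 20.0 * math.log10(2.0 ** bits) + 10.0 * math.log10(1.5)
    if tpdf_dither:
        # TPDF dither adds q^2 / 6, so total noise power is q^2 / 4 (3x).
        dr_db -= 10.0 * math.log10(3.0)
    return dr_db

for bits in (16, 20, 24):
    plain = quantization_dynamic_range_db(bits)
    dithered = quantization_dynamic_range_db(bits, tpdf_dither=True)
    print(f"{bits} bits: {plain:6.1f} dB undithered, {dithered:6.1f} dB with TPDF dither")
```

On these assumptions a 16-bit word length gives about 98 dB undithered and about 93 dB with dither, consistent with the statement that 16-bit media fall below 100 dB. The 20-bit value of about 122 dB sits at the lower edge of the 122–124-dB target, leaving no margin.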

Techniques to improve the audible performance of 16-bit word lengths are considered. Emphasis can be used to increase the apparent dynamic range from 95 dB without the use of J17 emphasis to a value of 104 dB with emphasis. Unfortunately this is not enough for the more limited consumer playback applications requiring at least 108 dB.

Instead, the technique of noise shaping via error feedback of the word length reduction process is used, as discussed by Lipshitz et al. [ Wannamaker [ Gerzon and Craven [ and Stuart and Wilson [ [ This method redistributes the quantization energy as a function of frequency to minimize its audibility. This is done by modifying the quantization noise to have the same frequency characteristic as the hearing threshold in the frequency region where the listener is most sensitive. Although these investigators have used slightly different hearing threshold criteria and methods of determining the loudness of the resulting noise, their results are similar. They indicate that noise shaping is able to reduce the audibility of word-length-reduction induced quantization noise by approximately 15–20 dB, thus increasing the dynamic-range capabilities of 16-bit storage media to 110–115 dB. Fig. 11 shows a spectral comparison between the hearing threshold with the noise of a conventionally dithered, 16-bit word-length reduction and one employing the 24-coefficient noise shaper proposed by Wannamaker [ Reproduction of music performances with 129-dB peak levels is assumed.
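The error-feedback principle can be illustrated with a deliberately simple case. The sketch below is not the 24-coefficient psychoacoustic shaper shown in Fig. 11. It assumes a first-order feedback filter, which gives the requantization error a 1 - z^-1 high-pass shape, and TPDF dither of ±1 LSB. All names and parameter values are illustrative.

```python
import math
import random

def requantize_error_feedback(samples, step, dither=True, seed=0):
    """Reduce word length with first-order error feedback (illustrative only).

    The total error made on each sample is subtracted from the next input,
    so the output error is e[n] - e[n-1], which is small at low frequencies.
    """
    rng = random.Random(seed)
    out = []
    prev_error = 0.0
    for x in samples:
        v = x - prev_error                                   # feed back last error
        d = (rng.random() - rng.random()) * step if dither else 0.0  # TPDF dither
        y = math.floor((v + d) / step + 0.5) * step          # round to the 16-bit grid
        prev_error = y - v                                   # total error on this sample
        out.append(y)
    return out

# Requantize a 1-kHz sine (48-kHz sampling) to a 16-bit grid spanning -1 to +1.
step = 2.0 / 65536
sine = [0.25 * math.sin(2.0 * math.pi * 1000.0 * n / 48000.0) for n in range(4800)]
shaped = requantize_error_feedback(sine, step)
print(abs(sum(shaped) - sum(sine)) / step)   # accumulated error in LSBs
```

Because the errors telescope, the accumulated error over any run of samples never exceeds the last single-sample error, at most 1.5 LSB here. That is the time-domain sign that the noise has been moved away from low frequencies. A practical shaper replaces the single delay with a longer filter fitted to the hearing threshold.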

Examination of this figure shows the tremendous advantage of this technique since it has reduced the noise by 25 dB in the listener’s most sensitive frequency region. Instead of generating a noise floor considerably above the hearing threshold, the shaped noise floor follows it at a level that stays at or below the hearing threshold criterion. This results in a noise contribution that is only slightly audible since the shaped noise follows the hearing threshold curve above 2 kHz and does not have the 5-dB margin to guarantee noise inaudibility. This results in a 16-bit storage element dynamic range of 110–115 dB, depending on the detection model used to estimate loudness. This dynamic range is comparable to that needed by consumer end use but does not satisfy the strictest requirements of professional use.

The use of bit-rate-reduction coding complicates the situation in that the coding process is highly nonlinear and does not preserve the noise-shaping properties through the coder if 16-bit word lengths are used. Fortunately the ability to preserve the noise-shaping properties is not a problem if the bit-rate reduction coder accepts and outputs longer word lengths. Under these conditions the low apparent noise levels are sustained as long as the internal dynamic range of the coder is large enough. Two examples of bit-rate-reduction coding with greater than 95-dB dynamic range are given by van der Waal and Griffiths [ for the DCC system and by Fielder and Davidson [ for the AC-2 system. Both coding systems are shown to have internal dynamic ranges in excess of 107 dB. In particular van der Waal and Griffiths showed that a noise-shaped 16-bit source preserved much of its dynamic-range improvement when the 18-bit output of the DCC decoder was used. It is important to note that the limitation of 107-dB dynamic range is merely the result of parameters defined by their intended applications. The design of bit-rate reduction coders with dynamic ranges in excess of 125 dB is possible when extremely wide-dynamic-range signals must be transmitted.

6.3 Digital-to-Analog Converter Limitations

Finally the effect of DAC imperfections and limitations of the dynamic range is investigated. Ideally the DAC element should have a low enough level of equipment noise so that it is lower than the target dynamic-range limit of 122–124 dB for professional applications. In addition, the DAC should be linear enough so that the benefits of noise shaping earlier in the audio chain do not disappear. The best of the present DACs examined by Benjamin [ were found to possess 20-bit input word lengths, idle channel noises as low as 119 dB, and linearities sufficient to allow effective noise shaping to a perceptible dynamic range of 112 dB. In addition, the best of the DACs measured were linear enough over the entire input range not to result in audible modulation noise or distortion.

The results of Benjamin’s study show that commercial DACs have usable dynamic ranges of 112 dB under a variety of conditions. Since 112 dB is not as large as the previously determined dynamic-range limits, pre- and postemphasis can be used to lower the apparent noise level of the DAC by an additional 9 dB, in a manner similar to that demonstrated for ADCs. Digitally implemented J17 preemphasis is performed in front of the DAC, and postemphasis implemented on its analog output causes the apparent dynamic range to increase to 121 dB.

6.4 Summary

Dynamic-range limitations in the digital audio portion of the recording and reproduction chain have been examined and found to have limits lower than 122–124 dB. ADCs, 16-bit storage systems, bit-rate-reduction coders, and DACs are examined. ADCs benefited from the use of J17 emphasis, increasing the apparent dynamic range of the best units surveyed to 117 dB from an unemphasized 108 dB. DACs also benefited from the use of emphasis, but had a higher original dynamic-range performance of 112 dB raised to 121 dB. The intermediate elements of storage were examined using a few practical examples. The dynamic-range performance of 16-bit storage with noise shaping results in a dynamic range up to 115 dB. Finally it was shown that the dynamic-range performance of two bit-rate reduction coders designed for 18-bit systems produces a limit of 107 dB, with extensions to a much larger dynamic range shown to be possible.

 

7 CONCLUSIONS

This study examined the audibility of the equipment noise of various parts of the recording and reproduction chain when music performances are played back at actual sound levels. To do this, the peak acoustic levels at music performances were compared to the level of just audible noise in typical listening environments. The peak acoustic level was 129 dB SPL or its 126-dB SPL rms equivalent, while the average level of just audible noise was determined to be 3.8 dB SPL. The comparison of the ratio of these two acoustic levels produced the criterion for noise-free reproduction of music of 122-dB dynamic range for each channel. Five-channel discrete sound reproduction situations were found to be slightly more sensitive due to noise production in the surround loudspeaker, requiring a dynamic range of 124 dB for each channel.
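The arithmetic behind the 122-dB criterion can be written out explicitly. The short sketch below assumes only the definitions given in the paper: the peak level is converted to the rms level of a sine wave with the same peak (a 3-dB crest factor), and the just audible noise level is subtracted. The variable names are illustrative.

```python
import math

peak_spl_db = 129.0                                # peak level of music performances
sine_crest_factor_db = 20.0 * math.log10(math.sqrt(2.0))   # about 3.01 dB
rms_spl_db = peak_spl_db - sine_crest_factor_db    # rms of the equal-peak sine wave
just_audible_noise_db = 3.8                        # average just audible noise, dB SPL

dynamic_range_db = rms_spl_db - just_audible_noise_db
print(f"rms equivalent: {rms_spl_db:.1f} dB SPL")
print(f"dynamic-range criterion: {dynamic_range_db:.1f} dB")
```

The result is an rms equivalent of about 126 dB SPL and a criterion of about 122 dB per channel, matching the values stated above.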

Next a model for the detection of noise was developed to allow a comparison between various sources of noise and assessment of their relative importance. To this end a one-third-octave spectral comparison using equivalent acoustic noise levels was developed from critical-band theory. It was shown that noise is audible when it exceeds an individual’s hearing threshold in a single one-third-octave band, or through a limited sharing of energy between bands. The use of an average hearing threshold curve was shown to be useful and accurate as long as a 5-dB guard band is employed to account for typical individual increased sensitivity in one or more bands. This meant that noise signals with a spectrum just below that of the average hearing threshold are actually audible due to the individual variations in frequency sensitivity.

The effect of environmental noise in the recording or the playback was investigated using this spectral comparison method. This demonstrated that the best recording and consumer playback conditions resulted in no reduction in the listener’s ability to detect audio system equipment noise. This was true for natural miking configurations sampling the sound field at the audience location; it was also generally true for multimiking configurations.

Next, microphones, power amplifiers, and loudspeakers were investigated for their effect on the dynamic range. The best microphones were shown to have audible equipment noise levels for natural sound-level music reproduction, limiting the effective dynamic range to 108–115 dB. Consumer loudspeaker acoustic output also was shown to limit the available dynamic range because of inadequate acoustic output capability, whereas professional loudspeakers were shown not to have this limitation. Limitations in consumer loudspeaker acoustic levels reduced the dynamic-range requirement to 108–116 dB at the expense of the loss of realism. Power amplifier noise was a small but significant contributor to the total audio system noise, with dynamic ranges rarely in excess of 120 dB.

These limitations were combined to provide the correct context for the assessment of the importance of digital audio element dynamic-range capabilities. ADCs, 16-bit storage systems, bit-rate-reduction coders, and DACs were examined. ADCs benefited from the use of J17 emphasis and had a perceived dynamic range of up to 117 dB. DACs also benefited from the use of emphasis, but had a slightly better performance of 121-dB dynamic range. The dynamic-range performance of 16-bit storage with noise shaping was shown to have an apparent dynamic range of up to 115 dB. Two examples of bit-rate-reduction coders designed for 18-bit systems were shown to have dynamic-range limits of at least 107 dB, but higher values of above 125 dB are possible.

Combining the noise sources from the entire recording and reproduction system is now appropriate. A system assembled from the combination of elements will possess a total noise level that is more audible than that of any of the parts. Consumer playback conditions result in acoustic equipment noise levels that are at least 7 dB lower, owing to the limited playback levels. Fig. 12 shows the noise spectra of a complete system composed of the best examples of recording microphones, ADCs, 16-bit noise-shaped storage, a wide-dynamic-range low-bit-rate coder, DACs, and power amplifiers. The low-bit-rate coding system is assumed to have 24-bit dynamic range and therefore produces negligible levels of noise. Resultant noise will be calibrated for professional and consumer playback capabilities of 129 and 122 dB, respectively.

Examination of this figure shows that a complete system assembled from the best elements still produces noise levels that are audible. The professional playback circumstance results in a total system noise level that is 7 dB higher than the average hearing threshold, having an equivalent dynamic range of 109 dB. The consumer playback circumstances produce lower levels of noise and are much less audible. Since the resultant noise spectrum is higher than the limit set by the hearing threshold and the -5-dB factor, which accounts for individual variations, consumer playback will still be slightly noisy.
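A rough cross-check of the combined figure is possible by adding the element noise powers. The sketch below is a simplification and not the method of Fig. 12: it treats each element's best-case dynamic range from Sections 5 and 6 as a single wideband number and assumes the noise sources are uncorrelated, whereas the study compares one-third-octave spectra against the hearing threshold. The dictionary keys and the function name are illustrative.

```python
import math

def combined_dynamic_range_db(element_ranges_db):
    """Dynamic range of a chain whose uncorrelated noise powers add."""
    relative_noise_power = sum(10.0 ** (-dr / 10.0) for dr in element_ranges_db)
    return -10.0 * math.log10(relative_noise_power)

best_elements_db = {
    "recording microphone": 115.0,
    "ADC with J17 emphasis": 117.0,
    "16-bit noise-shaped storage": 115.0,
    "DAC with J17 emphasis": 121.0,
    "power amplifier": 120.0,
}

total_db = combined_dynamic_range_db(best_elements_db.values())
print(f"combined dynamic range: {total_db:.1f} dB")
```

Under these assumptions the chain comes out at about 110 dB, several decibels worse than its weakest element and close to the 109-dB equivalent dynamic range quoted for professional playback. Exact agreement should not be expected, since the spectral shapes are ignored here.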

In conclusion, noise-free reproduction of music at actual sound levels is not possible with today’s recording and reproduction technology. Although improvements in digital audio dynamic-range capabilities have much improved the situation compared to 10 years ago, most elements, digital and analog, need to be improved if this goal is to be obtained. Consumer use, with its lower playback levels, is in a better situation in that the total system noise levels are within 5 dB of noise-free performance.

8 REFERENCES

[ H. Fletcher, “Hearing, the Determining Factor for High Fidelity Transmission,” Proc. IRE, vol. 30, pp. 266—277 (1942 June).

[ D. F. Hoth, “Room Noise Spectra at Subscribers’ Telephone Locations,” J. Acoust. Soc. Am., vol. 12, pp. 499—504 (1941 Apr.).

[ L. J. Sivian, H. K. Dunn, and S. D. White, “Absolute Amplitudes and Spectra of Certain Musical Instruments and Orchestras,” J. Acoust. Soc. Am., vol. 2, pp. 330–371 (1931 Jan.).

[ L. D. Fielder, “Dynamic-Range Requirement for Subjectively Noise-Free Reproduction of Music,” J. Audio Eng. Soc., vol. 30, pp. 504–511 (1982 July/Aug.).

[ L. D. Fielder, “Pre- and Postemphasis Techniques as Applied to Audio Recording Systems,” J. Audio Eng. Soc., vol. 33, pp. 649—658 (1985 Sept.).

[ B. H. King, “Equipment Profile: Boulder 500 AE Amplifier and Ultimate Preamplifier,” Audio Mag., vol. 74, pp. 54—70 (1990 Feb.).

[ B. H. King, “Equipment Profile: McIntosh MC2600 Amplifier,” Audio Mag., vol. 76, pp. 44—61 (1992 Feb.).

[ L. Feldman, “Equipment Profile: McIntosh MC7200 Power Amplifier,” Audio Mag., vol. 74, pp. 72—78 (1990 Jan.).

[ L. Feldman, “Equipment Profile: Bryston 4B NRB Amplifier,” Audio Mag., vol. 76, pp. 42–44 (1992 Aug.).

[ C. P. Lebo and K. P. Oliphant, “Music as a Source of Acoustic Trauma,” J. Audio Eng. Soc., vol. 17, pp. 535—538 (1969 Oct.).

[ D. J. Meares and K. F. L. Lansdowne, “Revised Background Noise Criteria for Broadcasting Studios,” BBC Research Rep. RD1980/8 (1980).

[ W. Ahnert, “The Sound Power of Different Acoustic Sources and Their Influence in Sound Engineering,” presented at the 75th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 32, p. 464 (1984 June), preprint 2079.

[ J. Vanderkooy and S. P. Lipshitz, “Resolution Below the Least Significant Bit in Digital Systems with Dither,” J. Audio Eng. Soc., vol. 32, pp. 106–113 (1984 Mar.).

[ H. Fletcher, “Auditory Patterns,” Rev. Mod. Phys., vol. 12, pp. 47—65 (1940 Jan.).

[ E. Zwicker, G. Flottorp, and S. S. Stevens, “Critical Band Width in Loudness Summation,” J. Acoust. Soc. Am., vol. 29, pp. 548—557 (1957 May).

[ B. Scharf, “Critical Bands and the Loudness of Complex Sounds Near Threshold,” J. Acoust. Soc. Am., vol. 31, pp. 365—370 (1959 Mar.).

[ J. R. Stuart and R. J. Wilson, “A Search for Efficient Dither for DSP Applications,” presented at the 92nd Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 40, p. 431 (1992 May), preprint 3334.

[ J. R. Stuart, “Noise: Methods for Estimating Detectability and Threshold,” presented at the 94th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 41, p. 387 (1993 May), preprint 3477.

[ J. R. Stuart and R. J. Wilson, “Dynamic Range Enhancement Using Noise-Shaped Dither Applied to Signals with and without Preemphasis,” presented at the 96th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 42, p. 400 (1994 May), preprint 3871.

[ E. Zwicker, “Subdivision of the Audible Frequency Range into Critical Bands (Frequenzgruppen),” J. Acoust. Soc. Am., vol. 33, p. 248 (1961 Feb.).

[ S. Fidell, R. Horonjeff, S. Teffeteller, and D. M. Green, “Effective Masking Bandwidths at Low Frequencies,” J. Acoust. Soc. Am., vol. 73, pp. 628—638 (1983 Feb.).

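[ B. C. J. Moore and B. R. Glasberg, “Formulae Describing Frequency Selectivity as a Function of Frequency and Level, and Their Use in Calculating Excitation Patterns,” Hear. Res., vol. 28, pp. 209–225 (1987).

[ B. C. J. Moore, R. W. Peters, and B. R. Glasberg, “Auditory Filter Shapes at Low Center Frequencies,” J. Acoust. Soc. Am., vol. 88, pp. 132–140 (1990 July).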

[ M. J. Shailer, B. C. J. Moore, B. R. Glasberg, N. Watson, and S. Harris, “Auditory Filter Shapes at 8 and 10 kHz,” J. Acoust. Soc. Am., vol. 88, pp. 141—148 (1990 July).

[ E. Schorer, “Critical Modulation Frequency Based on Detection of AM versus FM Tones,” J. Acoust. Soc. Am., vol. 79, pp. 1054–1057 (1986 Apr.).

[ H. Fastl and E. Schorer, “Critical Bandwidth at Low Frequencies Reconsidered,” in B. C. J. Moore and R. D. Patterson (Eds.), Auditory Frequency Selectivity (Plenum Press, New York, 1986), pp. 311–318.

[ S. A. Gelfand, Hearing: An Introduction to Psychological and Physiological Acoustics (Marcel Dekker, New York, 1990), pp. 353—382, 389—392.

[ ISO R226-1961E, “Normal Equal-Loudness Contours for Pure Tones and Normal Threshold of Hearing under Free Field Listening Conditions,” International Organization for Standardization, Geneva, Switzerland (1961 Dec.).

[ ISO 454-1975E, “Acoustics – Relation between Sound Pressure Levels of Narrow Bands of Noise in a Diffuse Field and in a Frontally-Incident Free Field for Equal Loudness,” International Organization for Standardization, Geneva, Switzerland (1975 Jan.).

[ E. A. Cohen and L. D. Fielder, “Determining Noise Criteria for Recording Environments,” J. Audio Eng. Soc., vol. 40, pp. 384—402 (1992 May).

[ D. W. Robinson and R. S. Dadson, “A Redetermination of the Equal-Loudness Relations for Pure Tones,” Brit. J. Appl. Phys., vol. 7, pp. 166–181 (1956 May).

[ D. W. Robinson, L. S. Whittle, and J. M. Bowsher, “The Loudness of Diffuse Sound Fields,” Acustica, vol. 11, pp. 397—404 (1961).

[ G. A. Miller, “Sensitivity to Changes in the Intensity of White Noise and Its Relation to Masking and Loudness,” J. Acoust. Soc. Am., vol. 19, pp. 609—619 (1947 July).

[ V. Tarnow, “Thermal Noise in Microphones and Preamplifiers,” Bruel & Kjaer Tech. Rev., no. 3, pp. 3—14 (1972).

[ E. Frederiksen, “Microphone System for Extremely Low Sound Levels,” Bruel & Kjaer Tech. Rev., no. 3, pp. 16—22 (1984).

[ IEC Publ. 268-13, “Sound System Equipment, Part 13, Listening Tests on Loudspeakers,” Geneva, Switzerland (1986).

[ F. E. Toole, “Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 2,” J. Audio Eng. Soc., vol. 34, pp. 323—348 (1986 May).

[ D. B. Keele, “Equipment Profile: NHT Model 11 Speaker,” Audio Mag., vol. 74, pp. 60—78 (1990 July).

[ D. B. Keele, “Equipment Profile: Dahlquist [...] Speaker,” Audio Mag., vol. 74, pp. 92–112 (1990 Nov.).

[ D. B. Keele, “Equipment Profile: Dynaudio Special One Speaker,” Audio Mag., vol. 74, pp. 90–106 (1990 Dec.).

[ D. B. Keele, “Equipment Profile: Boston Acoustics T1030 Speaker,” Audio Mag., vol. 75, pp. 78–94 (1991 Jan.).

[ D. B. Keele, “Equipment Profile: Thiel CS5 Speaker,” Audio Mag., vol. 75, pp. 56—76 (1991 Feb.).

[ D. B. Keele, “Equipment Profile: Meridian D600B Speaker,” Audio Mag., vol. 75, pp. 62—76 (1991 Mar.).

[ D. B. Keele, “Equipment Profile: Monitor Audio Studio 10 Speaker,” Audio Mag., vol. 75, pp. 44—54 (1991 July).

[ D. B. Keele, “Equipment Profile: Wharfedale Diamond IV Speaker,” Audio Mag., vol. 75, pp. 60—69 (1991 Aug.).

[ D. B. Keele, “Equipment Profile: PSB Stratus Gold Speaker,” Audio Mag., vol. 75, pp. 46—58 (1991 Nov.).

[ D. B. Keele, “Equipment Profile: Westlake Audio BBSM 6F Speaker,” Audio Mag., vol. 75, pp. 78—96 (1991 Dec.).

[ D. B. Keele, “Equipment Profile: AR-Mi Speaker,” Audio Mag., vol. 76, pp. 68—82 (1992 Jan.).

[ D. B. Keele, “Equipment Profile: Quart 490MCS Speaker,” Audio Mag., vol. 76, pp. 62–74 (1992 Feb.).

[ D. B. Keele, “Equipment Profile: Canton Ergo 100 Speaker,” Audio Mag., vol. 76, pp. 62—73 (1992 Mar.).

[ D. B. Keele, “Equipment Profile: Bright Star Altair Speaker,” Audio Mag., vol. 76, pp. 60—69 (1992 July).

[ D. B. Keele, “Equipment Profile: Tannoy 615 Speaker,” Audio Mag., vol. 76, pp. 58—67 (1992 Aug.).

[ D. B. Keele, “Equipment Profile: Celestion 300 Loudspeaker,” Audio Mag., vol. 77, pp. 50—56 (1993 Mar.).

[ D. B. Keele, “Equipment Profile: Paradigm Studio Monitor Loudspeaker,” Audio Mag., vol. 77, pp. 52—62 (1993 Apr.).

[ D. B. Keele, “Equipment Profile: Dahlquist DQ 30i Loudspeaker,” Audio Mag., vol. 77, pp. 68–78 (1993 June).

[ D. B. Keele, “Equipment Profile: Counterpoint Clearfield Metropolitan Loudspeaker,” Audio Mag., vol. 77, pp. 56—65 (1993 July).

[ D. B. Keele, “Equipment Profile: Genesis Genre I Loudspeaker,” Audio Mag., vol. 77, pp. 52—62 (1993 Sept.).

[ D. B. Keele, “Equipment Profile: DGX Audio DD1-1 Speaker and DDA-1 Digital Processing Amp,” Audio Mag., vol. 77, pp. 48—57, 88 (1993 Nov.).

[ D. B. Keele, “Equipment Profile: NHT Model 3.3,” Audio Mag., vol. 78, pp. 49–56, 62 (1994 Feb.).

[ “Pre-emphasis Used on Sound Program Circuits in Group Links,” CCITT, Red Book III, Fascicle III.4, Rec. J17 (1972).

[ S. P. Lipshitz, J. Vanderkooy, and R. A. Wannamaker, “Minimally Audible Noise Shaping,” J. Audio Eng. Soc., vol. 39, pp. 836–852 (1991 Nov.).

[ R. A. Wannamaker, “Psychoacoustically Optimal Noise Shaping,” J. Audio Eng. Soc., vol. 40, pp. 611—620 (1992 July/Aug.).

[ M. Gerzon and P. Craven, “Optimal Noise Shaping and Dither of Digital Signals,” presented at the 87th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 37, p. 1072 (1989 Dec.), preprint 2822.

[ R. G. van der Waal, A. W. J. Oomen, and F. A. Griffiths, “Performance Comparison of CD, Noise-Shaped CD, and DCC,” presented at the 96th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 42, pp. 406, 407 (1994 May), preprint 3845.

[ L. D. Fielder and G. Davidson, “AC-2: A Family of Low Complexity Transform-Based Music Coders,” presented at the 10th Int. AES Conference, London, 1991 Sept.

[ E. Benjamin, “Effects of DAC Nonlinearity on Reproduction of Noise Shaped Signals,” presented at the 95th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 41, p. 1053 (1993 Dec.), preprint 3778.

                                                                                                            
