Parametric Sound Field Auralization of Small Room Acoustics for Perceptual Research on Room Reflections

  • Alan Pawlak

Student thesis: Doctoral Thesis

Abstract

Parametric sound field synthesis methods can produce auralizations of room acoustics based on a priori data obtained from measurements. They can be used to create a virtual listening room for headphone-based sound reproduction, addressing the inherent limitations of headphones in presenting spatial auditory cues. Arguably, the spatial data obtained through parametric methods could be limited to providing only the essential room-induced spatial cues. This thesis creates a foundation for the development of optimized headphone-based reproduction based on data collected from real rooms using parametric sound field reproduction techniques.

Different approaches to sound field reproduction were reviewed. Parametric methods were found to be particularly plausible, as they employ real measurements while also providing spatial data associated with the room acoustics. Nonetheless, several research gaps related to these methods were identified. A review of spatial audio psychoacoustics presented the important factors affecting the perception of spatial and timbral attributes. The perception of acoustics in small rooms, the most common listening environments for loudspeaker-based sound reproduction, received particular attention. It was hypothesized that only a limited amount of room data may be necessary to reproduce desirable spatial attributes in headphone-based reproduction.

A comparative study was conducted, evaluating the performance of parametric frameworks in the auralization of a critical listening room. The influence of the following factors on perceived spatial and timbral fidelity was considered: the rendering framework, the direction of arrival estimation method, the microphone array structure, and the pressure signal used with the Spatial Decomposition Method (SDM). While all systems were distinguishable from the reference, Higher-Order Spatial Impulse Response Rendering and SDM showed comparable spatial fidelity, with SDM performing slightly better in terms of timbral fidelity. Temporal artefacts affected conditions utilizing SDM with optimizations for binaural rendering.

To address SDM's limitations, a novel method named Spatial Segmentation of Impulse Response (SSIR) was introduced. It segments the early part of SDM data into consecutive sound events, allowing for a more intuitive presentation, grouping, and manipulation of the room data. It can facilitate further psychoacoustic research and lead to identifying the principal components of the room data that contribute to the perception of desirable spatial attributes. The effectiveness of SSIR was evaluated through listening tests and objective metrics, which showed positive results. SSIR performed better than methods utilizing extracted reflections and as well as or better than the SDM auralizations.

Finally, the impact of the spatial resolution used to reproduce early reflections was examined. Perceptual thresholds for grid sparsity were established, and the impact on these thresholds of two synthesis approaches, (i) mapping to K-Nearest Neighbour (KNN) and (ii) Vector Base Amplitude Panning (VBAP), was examined through listening tests. A new Weighted Directional Error (WDE) metric was developed to quantify direction of arrival (DOA) quantization errors for early reflections, showing strong correlation with subjective results. Results suggested that a WDE within 10° to 15° may be tolerable for faithful reproduction of early reflections.
Date of Award: 2 Jun 2025
Original language: English
Sponsors: Genelec Oy
Supervisor: Hyunkook Lee (Main Supervisor) & Braham Hughes (Co-Supervisor)
