Subjective Evaluation of 3D Microphone Array in Classical Music Recording

  • Miki Morinaga

Student thesis: Master's Thesis

Abstract

With the advancement of 3D audio, a variety of perceptually motivated 3D microphone arrays have been developed, yet perceptual differences between them remain underexplored. The present study used two subjective evaluation experiments to explore which perceptual attributes reveal differences among 3D microphone arrays and how audible those differences are. Based on these experiments, salient attributes for evaluating 3D microphone arrays and their relationships with the arrays were identified.
The first experiment compared four 3D microphone arrays (OCT-3D, 2L-Cube, and PCMA-3D with 0 m and 1 m vertical spacing) using a 4+5+0 loudspeaker setup. Elicitation and audibility grading tests identified the salient attributes. Six sound sources (string quartet, piano, a cappella, pipe organ, conga, and trumpet) were evaluated at three listening positions: the sweet spot and two off-centre positions. The results revealed perceptual differences in conventional spatial and timbral attributes, although their audibility varied considerably depending on the sound source and listening position. Listener Envelopment (LEV) emerged as a salient spatial attribute, with Apparent Source Width (ASW) also being audible for many sources and positions. Attributes related to localisation (e.g., localisability, source position, and distance) showed higher audibility for single sources and at off-centre positions. For height perception, Vertical Image Spread (VIS) was salient only for the pipe organ and a cappella sources. Timbral attributes such as Clarity, Fullness, and Brightness were especially prominent at off-centre positions, with the pipe organ showing unique trends.
The second experiment evaluated nine 3D microphone arrays using broad attributes (preference, overall spatial quality, and tonal quality) through multiple-comparison tests and detailed attribute ranking tests. The stimuli, a string quartet and an orchestra, were assessed with various combinations of base-layer and upper-layer microphone arrays. Ratings showed considerable variability, with few clear differences between the microphone arrays, particularly for the string quartet, where no significant differences were found. For the orchestra source, the C O 1m array (cardioid base + omni upper layer with 1 m vertical spacing) received higher ratings for preference and tonal quality than the omni-base arrays. The O O 1m and O SC0 1m arrays (omni base + omni or forward-facing supercardioid upper layer with 1 m vertical spacing) achieved high ratings for spatial quality. Ranking results indicated that Clarity was the most important attribute for evaluating preference and tonal quality, while LEV and Spaciousness were key for spatial quality.
These subjective evaluation results were discussed from objective perspectives, such as Interchannel Crosstalk (ICXT), which may influence the perceptual impressions of 3D microphone arrays.
Date of Award: 9 Oct 2025
Original language: English
Supervisor: Hyunkook Lee (Main Supervisor) & Dale Johnson (Co-Supervisor)
