Towards a Perceptual Model of Clarity in Music Mixes

  • Andrew Parker

Student thesis: Doctoral Thesis


This thesis presents the process undertaken towards producing models of musical mix clarity perception. ‘Clarity’ is a term commonly used by listeners when describing the perceptual qualities of a mixed piece of music, reflecting the perceptual effects of various signal characteristics. Determining the relationship between these objective signal characteristics and the perception of mix clarity allows the clarity attribute to be modelled. A detailed literature review was conducted to highlight potentially important signal features and to outline potential methods of modelling. These observations were then confirmed and explored through an exploratory investigation of music stimuli rated in controlled subjective listening tests. Three novel approaches to mix clarity prediction are proposed, based on inter-band relationship (IBR & IBR MR), signal separation and component masking (L2PM MC & L3PM MC), and a semi-supervised convolutional neural network (CNN) classifier.
The proposed models were evaluated using a second perceptually relevant data set elicited from a controlled listening test. This showed the IBR model had a strong relationship with the perceptual data (r = 0.6 & rho = 0.6079), which was improved using a novel parameter-optimised multi-resolution approach, IBR MR (r = 0.7635 & rho = 0.8024). The L2PM MC (r = −0.6103 & rho = −0.7964) and L3PM MC (r = −0.6483 & rho = −0.6930) models also achieved a strong relationship with the perceptual data. As a classifier, the CNN approach showed good accuracy (80%) on unseen data. When coupled with an evaluation of the learned representation, this suggested the CNN model had learned, from the perceptual data, a representation in agreement with, and as comprehensive as, the aforementioned models. An objective model of mix clarity perception would be useful as a measure to supplement the judgement of engineers producing music, and in automatic mixing/mastering systems, as a target to guide them towards perceptually meaningful results. Indeed, to the author's knowledge, this work represents the first perceptually relevant objective models of musical mix clarity.
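The r and rho values quoted above are Pearson and Spearman correlation coefficients between model output and listener ratings. As a rough illustration of how such figures are obtained, the sketch below computes both using only the Python standard library. The function names and the sample numbers are hypothetical and are not taken from the thesis; the thesis's own evaluation procedure may differ. A negative coefficient, as reported for the masking-based models, simply means the model output falls as the rated clarity rises.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data for illustration only (not from the thesis):
# one model score and one mean listener clarity rating per stimulus.
model_scores = [0.12, 0.35, 0.40, 0.58, 0.77, 0.90]
listener_ratings = [22.0, 30.0, 41.0, 39.0, 65.0, 80.0]

print(f"r   = {pearson_r(model_scores, listener_ratings):.4f}")
print(f"rho = {spearman_rho(model_scores, listener_ratings):.4f}")
```

Reporting both coefficients is informative because Pearson's r measures linear agreement, while Spearman's rho only requires that the model orders the stimuli in the same way as the listeners.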
Date of Award: 23 Jan 2023
Original language: English
Supervisor: Steven Fenton (Main Supervisor) & Hyunkook Lee (Co-Supervisor)
